Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-07-11 Thread Gilles Sadowski
On Mon, Jul 11, 2022 at 20:03, Sumanth Rajkumar wrote:
>
> Hi,
>
> I have finished updating the test classes, but I am encountering a problem
> in the ComplexEdgeCaseTest class.
>
> private static void assertComplex(double a, double b,
>   String name, UnaryOperator operation,
>   ComplexUnaryOperator operation2,
>   double x, double y, long maxUlps) {
> }
>
>
> I added my ComplexUnaryOperator as a parameter and am getting the error of
> having more than 7 parameters in this method.
> Is there anything I can do?

Assuming that the error is raised by "CheckStyle" (?), this check can be
disabled on a class-by-class basis in this configuration file:
  src/main/resources/checkstyle/checkstyle-suppressions.xml

Regards,
Gilles

> > [...]

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [collections] JMH results for IndexedLinkedList

2022-07-11 Thread Gilles Sadowski
On Mon, Jul 11, 2022 at 09:51, Rodion Efremov wrote:
>
> Hi Gilles,
>
> Would it be sufficient to include .txt files for the tables and the .ods
> file provided by Matt?

It would be handier for readers if the tables were inserted "inline"
within a comment (using the JIRA syntax) and the plots were uploaded
as PNG images.  See e.g. this report:
https://issues.apache.org/jira/browse/NUMBERS-156

Regards,
Gilles

>
> Best regards,
> rodde
>
> On Mon, Jul 11, 2022 at 10:33, Gilles Sadowski wrote:
>
> > Hi.
> >
> > On Mon, Jul 11, 2022 at 07:23, Rodion Efremov wrote:
> > >
> > > Hello Matt and community,
> > >
> > > I created an ASF JIRA issue a while back:
> > > https://issues.apache.org/jira/projects/COLLECTIONS/issues/COLLECTIONS-797?filter=allopenissues
> >
> > I suggest that the benchmark tables and figures be copied over there.
> > [Preferably, in separate files in formats that can be readily
> > displayed in a browser.]
> >
> > Gilles
> >
> > >
> > > Best regards,
> > > rodde
> > >
> > > On Mon, Jul 11, 2022 at 6:23 AM Matt Juntunen wrote:
> > >
> > > > Hello rodde,
> > > >
> > > > Thanks for your patience while I looked at this. I've made a PR [1] on
> > > > your benchmark project with an updated benchmark class. (I used the
> > > > completely uninspired class name of IndexedLinkedListPerformance2 :-)
> > > > The results back up what you've been saying about the performance of
> > > > this list implementation. I've attached a spreadsheet summarizing the
> > > > data for a number of different operations along with some images of
> > > > some of the most interesting comparisons. I've compared results for
> > > > java.util.ArrayList, java.util.LinkedList,
> > > > org.apache.commons.collections4.list.TreeList, and
> > > > com.github.coderodde.util.IndexedLinkedList (the list in question
> > > > here) using JDK 18 on list sizes of 10, 100, 1000, and 1.
> > > >
> > > > Below are some notes on the attached images.
> > > > - get-random.png - Displays timings for element access at random
> > > > indices. As expected, ArrayList is by far the best. TreeList and
> > > > IndexedLinkedList are relatively close to each other but
> > > > IndexedLinkedList is consistently faster. LinkedList was too terrible
> > > > to even include on the graph.
> > > > - iterate.png - Displays timings for list traversal using the list's
> > > > iterator. This was unexpectedly bad for TreeList, which performed far
> > > > worse than the others. The performance of IndexedLinkedList was on par
> > > > with the JDK lists overall.
> > > > - iterate-and-modify.png - Displays timings for iterating through the
> > > > list while randomly adding and removing elements via the iterator.
> > > > IndexedLinkedList did extraordinarily well here, with performance very
> > > > close to LinkedList. Surprisingly, TreeList did worse than all of the
> > > > others, including ArrayList.
> > > >
> > > > Overall, I think this list implementation would be a good option to
> > > > include in commons-collections. Does anyone have any objections to
> > > > opening a Jira ticket to pursue this?
> > > >
> > > > Regards,
> > > > Matt J
> > > >
> > > > [1] https://github.com/coderodde/IndexedLinkedListBenchmark/pull/3
> > > >
> > > > > [...]




Re: [collections] JMH results for IndexedLinkedList

2022-07-11 Thread Gilles Sadowski
Hi.

On Mon, Jul 11, 2022 at 07:23, Rodion Efremov wrote:
>
> Hello Matt and community,
>
> I created an ASF JIRA issue a while back:
> https://issues.apache.org/jira/projects/COLLECTIONS/issues/COLLECTIONS-797?filter=allopenissues

I suggest that the benchmark tables and figures be copied over there.
[Preferably, in separate files in formats that can be readily
displayed in a browser.]

Gilles

>
> Best regards,
> rodde
>
> On Mon, Jul 11, 2022 at 6:23 AM Matt Juntunen wrote:
>
> > Hello rodde,
> >
> > Thanks for your patience while I looked at this. I've made a PR [1] on
> > your benchmark project with an updated benchmark class. (I used the
> > completely uninspired class name of IndexedLinkedListPerformance2 :-)
> > The results back up what you've been saying about the performance of
> > this list implementation. I've attached a spreadsheet summarizing the
> > data for a number of different operations along with some images of
> > some of the most interesting comparisons. I've compared results for
> > java.util.ArrayList, java.util.LinkedList,
> > org.apache.commons.collections4.list.TreeList, and
> > com.github.coderodde.util.IndexedLinkedList (the list in question
> > here) using JDK 18 on list sizes of 10, 100, 1000, and 1.
> >
> > Below are some notes on the attached images.
> > - get-random.png - Displays timings for element access at random
> > indices. As expected, ArrayList is by far the best. TreeList and
> > IndexedLinkedList are relatively close to each other but
> > IndexedLinkedList is consistently faster. LinkedList was too terrible
> > to even include on the graph.
> > - iterate.png - Displays timings for list traversal using the list's
> > iterator. This was unexpectedly bad for TreeList, which performed far
> > worse than the others. The performance of IndexedLinkedList was on par
> > with the JDK lists overall.
> > - iterate-and-modify.png - Displays timings for iterating through the
> > list while randomly adding and removing elements via the iterator.
> > IndexedLinkedList did extraordinarily well here, with performance very
> > close to LinkedList. Surprisingly, TreeList did worse than all of the
> > others, including ArrayList.
> >
> > Overall, I think this list implementation would be a good option to
> > include in commons-collections. Does anyone have any objections to
> > opening a Jira ticket to pursue this?
> >
> > Regards,
> > Matt J
> >
> > [1] https://github.com/coderodde/IndexedLinkedListBenchmark/pull/3
> >
> > > [...]




Re: [collections] Recording the BloomFilter package in changes

2022-07-05 Thread Gilles Sadowski
Hi.

On Tue, Jul 5, 2022 at 14:54, Alex Herbert wrote:
>
> Claude is currently working through changes to the Bloom filter package.
> There are a number of Jira tickets for each change. However this is
> unreleased code and so I believe we do not have to put them all in the
> changes.xml.

I agree that "changes.xml" should only contain updates wrt officially
released code.

> Currently there is only this entry:
>
> <action [...] due-to="Claude Warren">
>   BloomFilter contribution.
> </action>
>
> I have linked the 728 ticket to all the child tickets. Is this enough for
> the new bloomfilter package within changes.xml?
>
> I resolved a recent ticket as 'Fixed'. However it will then appear in the
> changes report on the site. Perhaps these should be resolved as 'Done' so
> that they will be ignored in the report.

There is an automatically generated "JIRA report" (whereas the contents
of "changes.xml" are updated "manually").
Am I right that "changes.xml" should be user-oriented while JIRA is mainly
developer-oriented?

> The new bloomfilter package will then be
> recorded in changes as a single entry noting the contribution (ignoring all
> the modifications that have been made during development).

+1

Gilles




Re: [CRYPTO] Problem with JNA on macOS and Windows

2022-06-30 Thread Gilles Sadowski
On Thu, Jun 30, 2022 at 23:35, sebb wrote:
>
> On Thu, 30 Jun 2022 at 16:21, Alex Remily  wrote:
> >
> > [...] loading the dll, whereas Java apparently does not.
> >
> > I experience the same behavior.  What's more interesting is that when I run
> > the main function from Eclipse as a Run Configuration with the
> > LD_LIBRARY_PATH set  as an Environment variable appended to the native
> > environment, it runs as expected.  As of yet I haven't found a way to get
> > the java CLI to recognize the LD_LIBRARY_PATH environment variable, even
> > though it echoes out correctly when queried.  Strange.

Did you try with "java.library.path" (cf. [1])?
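
To make the distinction concrete, here is a minimal sketch that prints both
values as seen from inside the JVM. The class name "LibraryPathProbe" is made
up for illustration; only standard JDK calls are used. The environment
variable is consulted by the operating system's dynamic loader, while the
"java.library.path" property is what "System.loadLibrary" searches.

```java
import java.io.File;

public class LibraryPathProbe {
    /** Returns the entries of the JVM's native-library search path. */
    public static String[] javaLibraryPath() {
        final String path = System.getProperty("java.library.path", "");
        return path.isEmpty() ? new String[0] : path.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        // Value of the environment variable as inherited by this JVM (may be null).
        System.out.println("LD_LIBRARY_PATH (environment): "
                           + System.getenv("LD_LIBRARY_PATH"));
        // Search path used by System.loadLibrary(...).
        System.out.println("java.library.path (JVM property):");
        for (String entry : javaLibraryPath()) {
            System.out.println("  " + entry);
        }
    }
}
```

The property can be set on the command line with
"java -Djava.library.path=<directory> ..."; whether that helps in the case
discussed here is not established by this thread.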

Gilles

[1] https://stackoverflow.com/questions/27945268/difference-between-using-java-library-path-and-ld-library-path

> [...]




Re: [collections] JMH results for IndexedLinkedList

2022-06-30 Thread Gilles Sadowski
Hello.

On Thu, Jun 30, 2022 at 07:54, Rodion Efremov wrote:
>
> Hi Matt and community,
>
> Have you taken a look at my work? If so, what do you think?

Perhaps you missed that feedback:
  https://markmail.org/message/y3tozjdke2ivz3dr

>
> Best regards,
> rodde
>
> On Thu, Jun 23, 2022 at 19:06, Matt Juntunen wrote:
>
> > Hello,
> >
> > Thanks for providing the data here. I will hopefully have time to take
> > a look at this over the weekend. Send me a ping here on the dev list
> > if you don't hear back from me (or someone else) within a week.
> >
> > Regards,
> > Matt J
> >
> > On Tue, Jun 21, 2022 at 7:22 AM Rodion Efremov wrote:
> > >
> > > Hi,
> > >
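> > > Data structure: IndexedLinkedList
> > > <https://github.com/coderodde/IndexedLinkedList>
> > > Benchmark: IndexedLinkedListBenchmark
> > > <https://github.com/coderodde/IndexedLinkedListBenchmark>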
> > >
> > > Benchmark output:
> > > https://github.com/coderodde/indexedLinkedList/#benchmark-output
> > >
> > > From the benchmark output, we can judge that IndexedLinkedList outperforms
> > > both java.util.LinkedList and the Apache Commons Collections TreeList. It,
> > > however, does not seem to supersede the java.util.ArrayList.
> > >
> > > Basically, I would expect the IndexedLinkedList to beat the ArrayList where
> > > we have a lot of following operations:
> > >
> > > - addFirst(E)
> > > - add(int, E)
> > > - remove(int)
> > >
> > > So, what do you think? Am I getting anywhere with that?
> > >
> > > Best regards,
> > > rodde




Re: [ALL] Add *.json to RAT excludes in parent POM ?

2022-06-27 Thread Gilles Sadowski
Hello.

On Mon, Jun 27, 2022 at 15:30, Gary Gregory wrote:
>
> Test fixtures in Commons Configuration and SCXML for example.
>
> Visual Code project settings in Commons Daemon
>
> RevApi configuration in Commons RDF.

Can't some other format be used that doesn't have this shortcoming?

Gilles

>
> Just search for *.json...
>
> Gary
>
>
> On Mon, Jun 27, 2022, 08:17 Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Mon, Jun 27, 2022 at 11:34, sebb wrote:
> > >
> > > JSON files don't support comments, so it's not feasible to add an AL
> > > header to them.
> >
> > What are files in this format used for (in a "Commons" component)?
> >
> > Gilles
> >
> > >
> > > So rather than have to add such excludes at component level, it might
> > > be a good idea to update the parent POM.
> > >
> > > Any objections?
> > >
> > > Sebb




Re: [Math] GA Design

2022-06-27 Thread Gilles Sadowski
Hello Avijit.

As the OP of the proposal on improving this part of the "Commons
Math" codebase, are you still willing to reach a consensus about
the purpose and design of that functionality?

Regards,
Gilles

On Tue, May 31, 2022 at 23:52, Gilles Sadowski wrote:
>
> Hello.
>
> I dug further into the refactoring of the "genetic algorithm" (GA)
> functionality of [Math] (in package "o.a.c.math4.legacy.genetics"
> currently in the "master" branch of the repository).
>
> This post is a reboot of the discussion thread with Subject:[1]
>   [Math] Review of "genetic algorithm" module
> which is only very slowly converging on the intended usage of the GA
> implemented in "Commons Math" and on the minimal (public) API for it.
>
> I thus followed up on my initial view[2] of a concise implementation
> that avoids the issues which I raised during the previous discussion.
>
> Please have a look at the code committed in branch
>   feature__MATH-1563__genetic_algorithm
> of the repository[3] (in a new "commons-math-ga2" maven module,
> for easier comparison with Avijit's proposal).
>
> AFAICT, this design can provide all the functionalities mentioned in the
> discussion (a.o. the "adaptive rate"[4]) although it is not complete (and
> perhaps contains a few bugs):
>  * Better names for some classes?
>  * Should "Population" allow duplicates?
>  * Port unit tests suite.
>  * Which genotype representation(s) to support?
>  * ...
>
> Regards,
> Gilles
>
> [1] https://markmail.org/message/2mzdbozc6nwobc37
> [2] https://issues.apache.org/jira/browse/MATH-1618
> [3] https://gitbox.apache.org/repos/asf?p=commons-math.git;a=shortlog;h=refs/heads/feature__MATH-1563__genetic_algorithm
> [4] https://issues.apache.org/jira/browse/MATH-1563




Re: [Math][Numbers] Regression (due to a change in "BigFraction" class?)

2022-06-27 Thread Gilles Sadowski
Hello.

On Mon, Jun 27, 2022 at 01:18, Alex Herbert wrote:
>
> On Sun, 26 Jun 2022 at 21:27, Gilles Sadowski  wrote:
>
> >
> >
> > >
> > > Strangely I am not receiving emails from GH actions (or jenkins) to inform
> > > me that the build fails after a commit. There may be a setting for the GH
> > > actions that is missing to enable e-mail to the committer after a build
> > > failure.
> >
> > I also did not see the failure until trying a local build of [Math].
> > Isn't it related to the fact that such test failures entail that the build
> > is tagged as "unstable" rather than "failed"?
> >
>
> For GH actions I think we can add this to the .asf.yaml [1,2]:
>
> notifications:
>   jobs:   dev@commons.apache.org
>
> I am not sure if we should try this with dev@ or use another e-mail list.

There is a "notificati...@commons.apache.org" ML.

>
> For Jenkins the post build editable e-mail notification section had
> 'Disable Extended Email Publisher' selected. I have unchecked this box (as
> per the RNG config); we will wait to see if it now sends emails on a build
> error. Note that statistics and math also have this setting unchecked.
> Geometry has the setting checked (perhaps it should be updated).
>
> However this may not be the setting to fix this since it was not checked in
> the math config and math had a recent build failure after it was triggered
> by a change in numbers. The email is targeted at the developer who created
> the commit. So the math build failure should have been sent to me since I
> have the most recent commit on master. But I received no email. The math
> build is logged as a failed build but the log output shows that maven still
> completes remaining modules and uploads the SNAPSHOT artifacts.
>
> Also note that the Jenkins build for numbers continues to deploy all
> modules after a module has failed tests (see [3] for the failed build after
> the offending commit). So there is something in the Jenkins setup that is
> ignoring test failures and continuing with the build to deploy artifacts
> from all the modules.

I also vaguely noticed it.
It's not the first time that Jenkins changes behaviour without
any action on our part (and no notice from INFRA that this
could happen)...

Regards,
Gilles

>
> Alex
>
> [1]
> https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-GitHubActionsbuildstatusemails
> [2]
> https://cwiki.apache.org/confluence/display/INFRA/Git+-+.asf.yaml+features#Git.asf.yamlfeatures-Notificationsettingsforrepositories
> [3] https://ci-builds.apache.org/job/Commons/job/commons-numbers/150/




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-27 Thread Gilles Sadowski
Hello.

> > [...]
> > I have raised PR #113 after rebasing to the master branch with Alex's
> > checkstyle changes
> >
> > As per feedback, I have made the following changes
> > a) Added javadoc comments
> > b) Ensured test coverage
> > c) Renamed accessors on the interface
> >
>
> [...]
>
> >
> >
> > > In "DComplex", I propose that the accessors be named "real()" and
> > > "imag()" (or just
> > > "re()" and "im()").  ["DComplex" is not a very satisfying name either...]
> >
> > For the interface name, shall I change it to Complex64 from DComplex?
> >
>
> In c the 'complex' keyword is a suffix:
>
> double complex c1;
> float complex c2;
> long double complex c3;
>
> In c++ the type is generic (and read as a suffix):
>
> complex<double> c1;
> complex<float> c2;
>
> Either of these would be my preference over DComplex or Complex64.

Just to be sure: Are we discussing this because "Complex" is
already taken?

> > > Are we sure that all this code needs to be part of the public API?
> > > If not, I'd suggest limiting accessibility to "package-private".
> >
> > Are you referring to the static methods in ComplexFunctions and
> > ComplexBiFunctions classes?
> > I think they would need to be public for developers to be able to compose
> > multiple operations...
> >
>
> The static helper functions have been extracted to support all the ISO c99
> operations on the list structure of complex numbers.
>
> A list will ideally implement a generic foreach operation. So to apply a
> single function only requires making the static functions public. The
> alternative is to make the list expose all the ISO c99 operations in its
> public API.
>
> To create a composite function that eventually writes back to the list can
> be implemented by writing intermediate values to a result which is then
> passed to the next operation. This can be satisfied by using the Complex
> class. This already exposes all the ISO c99 functions. So perhaps it is not
> required to make all the helper functions public for the purpose of
> composing multiple operations. But it would be helpful for all the single
> operations.
>

I may be one or more steps behind, sorry, but I still cannot figure
out how the API is supposed to be applied (IOW, the "use cases").
I'm still at "provide functions that operate on a list of complex numbers".
But then the subsequent question is: for what purpose?
Some weeks ago (IIRC), I asked the same and whether the only use
case was FFT...
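
For readers following the thread, here is a minimal sketch of what
"functions that operate on a list of complex numbers" could look like. It is
not the API of the PR under discussion: the names "ComplexListSketch",
"ComplexSink", "ComplexOp" and the interleaved "double[]" layout are all
assumptions made for illustration.

```java
import java.util.Arrays;

public class ComplexListSketch {
    /** Hypothetical receiver of a complex result (not the [numbers] API). */
    @FunctionalInterface
    public interface ComplexSink {
        void accept(double re, double im);
    }

    /** Hypothetical operation on a single complex number (re, im). */
    @FunctionalInterface
    public interface ComplexOp {
        void apply(double re, double im, ComplexSink out);
    }

    /** Static helper: complex conjugate, (re, im) -> (re, -im). */
    public static void conj(double re, double im, ComplexSink out) {
        out.accept(re, -im);
    }

    /** Static helper: multiplication by i, (re, im) -> (-im, re). */
    public static void multiplyByI(double re, double im, ComplexSink out) {
        out.accept(-im, re);
    }

    /** Applies "op" in place to interleaved data [re0, im0, re1, im1, ...]. */
    public static void forEach(double[] data, ComplexOp op) {
        for (int i = 0; i < data.length; i += 2) {
            final int k = i;
            // The sink writes the result back to the slot it was read from.
            op.apply(data[k], data[k + 1], (re, im) -> {
                data[k] = re;
                data[k + 1] = im;
            });
        }
    }

    public static void main(String[] args) {
        final double[] data = {1, 2, 3, 4}; // 1 + 2i, 3 + 4i
        forEach(data, ComplexListSketch::conj);
        System.out.println(Arrays.toString(data)); // [1.0, -2.0, 3.0, -4.0]
    }
}
```

With such a layout, applying a single function only needs the static helpers
to be accessible; it does not by itself answer the question of which use
cases justify a public API.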

Regards,
Gilles




Re: [ALL] Add *.json to RAT excludes in parent POM ?

2022-06-27 Thread Gilles Sadowski
Hello.

On Mon, Jun 27, 2022 at 11:34, sebb wrote:
>
> JSON files don't support comments, so it's not feasible to add an AL
> header to them.

What are files in this format used for (in a "Commons" component)?

Gilles

>
> So rather than have to add such excludes at component level, it might
> be a good idea to update the parent POM.
>
> Any objections?
>
> Sebb




Re: [Math][Numbers] Regression (due to a change in "BigFraction" class?)

2022-06-26 Thread Gilles Sadowski
Hello.

On Sun, Jun 26, 2022 at 22:17, Alex Herbert wrote:
>
> On Sun, 26 Jun 2022 at 21:12, Gilles Sadowski  wrote:
>
> > Hello.
> >
> > Jenkins (as well as local build of [Math]) is failing:
> >   https://ci-builds.apache.org/job/Commons/job/commons-math/336/
> >
> > Two regressions appeared in unit tests untouched for ages:
> >   org.apache.commons.math4.legacy.analysis.polynomials.PolynomialsUtilsTest
> >   org.apache.commons.math4.legacy.ode.nonstiff.AdamsNordsieckTransformerTest
> >
> > The impacted classes
> >   org.apache.commons.math4.legacy.analysis.polynomials.PolynomialsUtils
> >   org.apache.commons.math4.legacy.ode.nonstiff
> > both depend on class "BigFraction" from [Numbers] that has been modified
> > in commit 1497df18dfb77f454450d71733c31a47560c6845.
> > There, parentheses have been removed in logical tests.
> > Could this seemingly innocuous change be causing this issue?
> >
>
> Yes. It is a rounding error I introduced when erasing the wrong set of
> parentheses. I have corrected the error. Sorry for the mistake.

Once in a (long) while. ;-)
It's great to see that the test suites are working!

>
> Strangely I am not receiving emails from GH actions (or jenkins) to inform
> me that the build fails after a commit. There may be a setting for the GH
> actions that is missing to enable e-mail to the committer after a build
> failure.

I also did not see the failure until trying a local build of [Math].
Isn't it related to the fact that such test failures entail that the build
is tagged as "unstable" rather than "failed"?

Regards,
Gilles




[Math][Numbers] Regression (due to a change in "BigFraction" class?)

2022-06-26 Thread Gilles Sadowski
Hello.

Jenkins (as well as local build of [Math]) is failing:
  https://ci-builds.apache.org/job/Commons/job/commons-math/336/

Two regressions appeared in unit tests untouched for ages:
  org.apache.commons.math4.legacy.analysis.polynomials.PolynomialsUtilsTest
  org.apache.commons.math4.legacy.ode.nonstiff.AdamsNordsieckTransformerTest

The impacted classes
  org.apache.commons.math4.legacy.analysis.polynomials.PolynomialsUtils
  org.apache.commons.math4.legacy.ode.nonstiff
both depend on class "BigFraction" from [Numbers] that has been modified
in commit 1497df18dfb77f454450d71733c31a47560c6845.
There, parentheses have been removed in logical tests.
Could this seemingly innocuous change be causing this issue?
If so, there is probably an unwanted side effect (either in [Numbers] or in
[Math]).
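
As a generic illustration of how removing parentheses in a logical test can
change behaviour, consider the sketch below. It is not the code of the commit
in question; the class and method names are hypothetical. In Java, "&&" binds
more tightly than "||", so dropping the parentheses around an "||" changes how
the expression is grouped.

```java
public class PrecedenceSketch {
    /** Original form: the "or" is evaluated first. */
    public static boolean withParentheses(boolean a, boolean b, boolean c) {
        return (a || b) && c;
    }

    /** After removing the parentheses: parsed as a || (b && c). */
    public static boolean withoutParentheses(boolean a, boolean b, boolean c) {
        return a || b && c;
    }

    public static void main(String[] args) {
        // The two forms disagree whenever a is true and c is false.
        System.out.println(withParentheses(true, false, false));    // false
        System.out.println(withoutParentheses(true, false, false)); // true
    }
}
```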

Regards,
Gilles




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-24 Thread Gilles Sadowski
Hello.

On Fri, Jun 24, 2022 at 16:59, Sumanth Rajkumar wrote:
>
> Hi Alex, Gilles, and Matt,
>
> I have raised a PR to the complex-gsoc-22 branch and it has been linked to
> the NUMBERS-188 jira.

One tenet of a project such as "Commons" is that everything must be
documented.[1]
For the Javadoc comments, please apply the same style as in other source files.

Formatting should also be taken care of (to help review, and future
maintenance).[2]

Are we sure that all this code needs to be part of the public API?
If not, I'd suggest limiting accessibility to "package-private".

In "DComplex", I propose that the accessors be named "real()" and
"imag()" (or just "re()" and "im()").
["DComplex" is not a very satisfying name either...]

Thanks,
Gilles

[1] I'm a bit surprised that the build succeeds despite the missing comments.
[2] E.g. one argument per line improves readability (IMHO).

>
> [...]




Re: [collections] JMH results for IndexedLinkedList

2022-06-24 Thread Gilles Sadowski
Hello.

On Tue, Jun 21, 2022 at 13:21, Rodion Efremov wrote:
>
> Hi,
>
> Data structure: IndexedLinkedList
> <https://github.com/coderodde/IndexedLinkedList>
> Benchmark: IndexedLinkedListBenchmark
> <https://github.com/coderodde/IndexedLinkedListBenchmark>
>
> Benchmark output:
> https://github.com/coderodde/indexedLinkedList/#benchmark-output
>
> From the benchmark output, we can judge that IndexedLinkedList outperforms
> both java.util.LinkedList and the Apache Commons Collections TreeList. It,
> however, does not seem to supersede the java.util.ArrayList.
>
> Basically, I would expect the IndexedLinkedList to beat the ArrayList where
> we have a lot of following operations:
>
> - addFirst(E)
> - add(int, E)
> - remove(int)
>
> So, what do you think? Am I getting anywhere with that?

Do you have use-cases?

Could you perhaps extend the benchmarks to all the JDK collections
that could target the same use-cases?
Also, it may be worth actually testing different sizes.
[For easier reading, please make a table with the JMH results.]
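
As a starting point for such a comparison, the sketch below applies the three
operations listed above to any "java.util.List" at several sizes and checks
that two implementations end up in the same state. It is not a benchmark
(timings should come from JMH); the class name, the sizes and the seed are
arbitrary choices made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Random;

public class ListWorkloadSketch {
    /**
     * Fills the list with "size" elements, then applies a seeded mix of
     * insertions at the front, insertions at a random index and removals
     * at a random index.
     */
    public static List<Integer> run(List<Integer> list, int size, long seed) {
        final Random rng = new Random(seed);
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        for (int i = 0; i < size; i++) {
            list.add(0, -i);                           // same effect as addFirst(E)
            list.add(rng.nextInt(list.size() + 1), i); // add(int, E)
            list.remove(rng.nextInt(list.size()));     // remove(int)
        }
        return list;
    }

    public static void main(String[] args) {
        for (int size : new int[] {10, 100, 1000}) {
            final List<Integer> a = run(new ArrayList<>(), size, 42L);
            final List<Integer> b = run(new LinkedList<>(), size, 42L);
            System.out.println(size + ": same contents = " + a.equals(b)
                               + ", final size = " + a.size());
        }
    }
}
```

The same "run" method could be called from a JMH benchmark method, with the
list size supplied as a benchmark parameter.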

Thanks,
Gilles




Re: [numbers] Complex Float Support

2022-06-24 Thread Gilles Sadowski
Hello.

On Thu, Jun 23, 2022 at 21:27, Sumanth Rajkumar wrote:
>
> Hi,
>
> As part of NUMBERS-187 (Enhanced Support for complex numbers), should we
> add support for float types?

I'd say no.
At least, any computation should be done with "double" precision.

Converting from/to "float" would only be the first/last step during
in-place processing of large lists of complex numbers.
Hence, it would be oblivious of the "complex" nature of a pair of
"double"s, whatever the input/output layout.

Anyway, this would seem very low priority (unless you have a use
case).

> If yes, for operations on float types, can we reuse functions (part of
> NUMBERS-188) that operate on double types

An example?

> and use Java float-double
> widening and narrowing conversions?

That would be a cast (from "float" to "double" on input and vice-versa
on output), no?
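
A minimal sketch of that idea, assuming a hypothetical double-precision
helper (the names "FloatDoubleSketch", "multiply" and "multiplyFloat" are not
from NUMBERS-188): the "float" inputs are widened to "double" (an exact
conversion), the computation is done in "double", and the result is narrowed
with an explicit cast only at the end.

```java
public class FloatDoubleSketch {
    /** Double-precision product (a + ib)(c + id), returned as {re, im}. */
    public static double[] multiply(double a, double b, double c, double d) {
        return new double[] {a * c - b * d, a * d + b * c};
    }

    /** Float entry point: widen on input, compute in double, narrow on output. */
    public static float[] multiplyFloat(float a, float b, float c, float d) {
        // float -> double is a widening conversion: implicit and exact.
        final double[] r = multiply(a, b, c, d);
        // double -> float is a narrowing conversion: explicit cast, may round.
        return new float[] {(float) r[0], (float) r[1]};
    }

    public static void main(String[] args) {
        final float[] r = multiplyFloat(1f, 2f, 3f, 4f); // (1 + 2i)(3 + 4i)
        System.out.println(r[0] + " + " + r[1] + "i");   // -5.0 + 10.0i
    }
}
```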

Regards,
Gilles

>
> https://docs.oracle.com/javase/specs/jls/se10/html/jls-5.html#jls-5.1.2
>
> Thanks,
> Sumanth




Re: [CANCELLED] [VOTE] Release Apache Commons Configuration 2.8 based on RC1

2022-06-22 Thread Gilles Sadowski
Hello.

On Wed, Jun 22, 2022 at 20:59, Matt Juntunen wrote:
>
> Gary,
>
> I was unaware of this. Is this a new convention that we've decided on?

Although Gary's suggestion would be a slight improvement, most
of the components indeed do not follow that convention.
This is the kind of common practice that should be voted on, and
eventually enforced.

> If not, I'd prefer to wait for the next release since "2.8" is
> consistent with previous commons-configuration releases and the vote
> has already started on rc2.

Sure.

Regards,
Gilles

>
> Regards,
> Matt J
>
> On Wed, Jun 22, 2022 at 7:45 AM Gary Gregory  wrote:
> >
> > Please use 2.8.0, I've been using the 3 part version format for all recent
> > releases. I think it would be nice to follow this naming here.
> >
> > Gary
> >
> > > [...]




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-20 Thread Gilles Sadowski
Hello Sumanth.

On Mon, Jun 20, 2022 at 07:05, Sumanth Rajkumar wrote:
>
> [...]
>
> Also, should we add support for float data type for complex numbers?

In the "dev" ML, we customarily aim at focussing each discussion, so
that people can easily decide (ideally, from the "Subject:" line) if they
want to voice some opinion.  [This also improves searches in the ML
archive.]
Thus could you please start a new thread with this question?
[Note that the relationships between all technical issues related to the
extension of the functionalities around "Complex" is better managed
through links between JIRA issues than by lengthy threads here.]

Thanks,
Gilles

>
> [...]




Re: [commons-configuration] branch master updated: fixing binary incompatibilities with v2.7

2022-06-20 Thread Gilles Sadowski
Hello.

On Mon, Jun 20, 2022 at 04:25, Matt Juntunen wrote:
>
> Is someone able to confirm that the
> METHOD_NO_LONGER_THROWS_CHECKED_EXCEPTION japicmp compatibility change
> is something we're ok with between releases? Note that this is a
> binary compatible change but not a source compatible one.

Is there any advantage in distinguishing between source and binary
compatibility?
IOW, what do we gain by changing this now, rather than in the next
major release?

Is the general rule (for any "Commons" release) that if a developer
recompiles his application, he should always expect to make changes
in his code?
If so, shouldn't we always mention both kinds of compatibility in the
release notes?
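
To illustrate the difference, here is a sketch with hypothetical classes
(they are not taken from [Configuration]). It compiles against a "library"
class whose method no longer declares a checked exception; the comments
indicate which user sources, written against the earlier signature, would
have to change on recompilation even though already-compiled binaries keep
linking.

```java
public class ThrowsClauseSketch {
    /** Stands in for a library class after the "throws" clause was removed. */
    public static class Library {
        // The earlier release is assumed to have declared "throws java.io.IOException".
        protected String load() {
            return "loaded";
        }
    }

    /** User code that overrides the protected method. */
    public static class UserSubclass extends Library {
        // Source written as "protected String load() throws java.io.IOException"
        // no longer compiles: an override may not declare a checked exception
        // that the overridden method does not declare.
        @Override
        protected String load() {
            return "user " + super.load();
        }
    }

    public static String callSite() {
        // A caller that kept "try { ... } catch (java.io.IOException e) { ... }"
        // around this call would also fail to compile, because javac rejects
        // a catch clause for a checked exception that the try body cannot throw.
        return new UserSubclass().load();
    }

    public static void main(String[] args) {
        System.out.println(callSite()); // user loaded
    }
}
```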

Regards,
Gilles

>
> Regards,
> Matt J
>
> On Sun, Jun 19, 2022 at 10:50 AM Matt Juntunen wrote:
> >
> > Hello,
> >
> > The throw clauses in question are on protected methods. If a user had
> > overridden these and then thrown an exception, they may have to modify
> > their source in order to compile against 2.8. Is this ok from the
> > point of view of our backwards compatibility guarantees?
> >
> > Regards,
> > Matt J
> >
> > On Sun, Jun 19, 2022 at 2:26 AM Gary Gregory  wrote:
> > >
> > > This change is incorrect, binary compatibility was NOT broken as the JLS
> > > specifies that:
> > >
> > > "Changes to the throws clause of methods or constructors do not break
> > > > compatibility with pre-existing binaries; these clauses are checked only
> > > > at compile time."
> > >
> > > See
> > > https://docs.oracle.com/javase/specs/jls/se8/html/jls-13.html#jls-13.4.21
> > >
> > > The Maven default goal runs JApiCmp which checks this.
> > >
> > > > This frees us to clean up our code.
> > > >
> > > > If a user is actually recompiling sources, then, yes, they may have to
> > > > adjust call sites, which is OK. Binary compatibility is maintained.
> > >
> > > Gary
> > >
> > > On Sat, Jun 18, 2022, 23:59  wrote:
> > >
> > > > This is an automated email from the ASF dual-hosted git repository.
> > > >
> > > > mattjuntunen pushed a commit to branch master
> > > > in repository
> > > > https://gitbox.apache.org/repos/asf/commons-configuration.git
> > > >
> > > >
> > > > The following commit(s) were added to refs/heads/master by this push:
> > > >  new 2e39ef6b fixing binary incompatibilities with v2.7
> > > > 2e39ef6b is described below
> > > >
> > > > commit 2e39ef6b3909425db1ccf6c1bb58d76f953b5f9a
> > > > Author: Matt Juntunen 
> > > > AuthorDate: Sat Jun 18 23:59:17 2022 -0400
> > > >
> > > > fixing binary incompatibilities with v2.7
> > > > ---
> > > >  .../apache/commons/configuration2/YAMLConfiguration.java  | 15
> > > > ---
> > > >  .../configuration2/builder/ConfigurationBuilderEvent.java |  2 +-
> > > >  .../org/apache/commons/configuration2/event/Event.java|  2 +-
> > > >  .../commons/configuration2/interpol/ConstantLookup.java   |  4 ++--
> > > >  4 files changed, 12 insertions(+), 11 deletions(-)
> > > >
> > > > > diff --git a/src/main/java/org/apache/commons/configuration2/YAMLConfiguration.java b/src/main/java/org/apache/commons/configuration2/YAMLConfiguration.java
> > > > > index 705c2a21..4732e3f3 100644
> > > > > --- a/src/main/java/org/apache/commons/configuration2/YAMLConfiguration.java
> > > > > +++ b/src/main/java/org/apache/commons/configuration2/YAMLConfiguration.java
> > > > @@ -17,6 +17,12 @@
> > > >
> > > >  package org.apache.commons.configuration2;
> > > >
> > > > +import java.io.IOException;
> > > > +import java.io.InputStream;
> > > > +import java.io.Reader;
> > > > +import java.io.Writer;
> > > > +import java.util.Map;
> > > > +
> > > >  import org.apache.commons.configuration2.ex.ConfigurationException;
> > > >  import org.apache.commons.configuration2.ex.ConfigurationRuntimeException;
> > > >  import org.apache.commons.configuration2.io.InputStreamSupport;
> > > > @@ -27,12 +33,6 @@ import org.yaml.snakeyaml.Yaml;
> > > >  import org.yaml.snakeyaml.constructor.Constructor;
> > > >  import org.yaml.snakeyaml.representer.Representer;
> >
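
A minimal sketch of the JLS point quoted above, assuming two hypothetical
classes ("OldApi" and "NewApi") that stand for the same method before and
after its throws clause is removed. A call site is linked by method name and
descriptor, and the descriptor carries no exception information, so both
variants resolve identically:

```java
import java.io.IOException;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class ThrowsClauseDemo {
    /** Hypothetical "old" API: the method declares a checked exception. */
    static class OldApi {
        public String read() throws IOException {
            return "old";
        }
    }

    /** Hypothetical "new" API: same name and signature, throws clause removed. */
    static class NewApi {
        public String read() {
            return "new";
        }
    }

    /**
     * Looks up read() using only its name and descriptor, and returns
     * that descriptor. The throws clause plays no part in the lookup.
     */
    static String descriptorOfRead(Class<?> type) {
        MethodType descriptor = MethodType.methodType(String.class);
        try {
            MethodHandles.lookup().findVirtual(type, "read", descriptor);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return descriptor.toMethodDescriptorString();
    }
}
```

Calling descriptorOfRead with either class succeeds and yields the same
descriptor. Source that catches the removed exception may still need
adjusting when it is recompiled.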

Re: [PARENT] Update Apache pom version to 26?

2022-06-17 Thread Gilles Sadowski
On Fri, 17 Jun 2022 at 15:16, sebb wrote:
>
> As the subject says - maybe we should update from v24 to v26

Why not?

Gilles




Re: [CRYPTO] Multiple Docker files - are they both needed?

2022-06-16 Thread Gilles Sadowski
Hello.

On Thu, 16 Jun 2022 at 19:36, Alex Remily wrote:
>
> Do you think we could simply use CRYPTO-120
> <https://issues.apache.org/jira/browse/CRYPTO-120>?

Maybe.
Or open a new one with an up-to-date description,
and signal that it replaces old/obsolete reports...

Gilles

> On Thu, Jun 16, 2022 at 1:31 PM Gilles Sadowski 
> wrote:
>
> > Hello.
> >
> > Since there is an issue to be solved, could you file a report on JIRA?
> > [And post there patches or new files, and instructions.]
> >
> > Thanks,
> > Gilles
> >
> > On Thu, 16 Jun 2022 at 19:18, Alex Remily wrote:
> > >
> > > I just ran [2], and whatever it does, it doesn't appear to do a build of
> > > commons-crypto.  I'd appreciate it if any developers who have the time
> > > would take a look at the dockerfile here:
> > >
> > > https://github.com/aremily/commons-crypto
> > >
> > > If you're copying the dockerfile into your own fork, you'll need the
> > > makefile.common file as well.
> > >
> > > Alex
> > >
> > > On Thu, Jun 16, 2022 at 1:07 PM Jochen Wiedmann <jochen.wiedm...@gmail.com> wrote:
> > >
> > > > On Thu, Jun 16, 2022 at 7:00 PM sebb  wrote:
> > > >
> > > > > [1] src/docker/Dockerfile
> > > > > [2] src/conf/Docker/Dockerfile-luw
> > > >
> > > > Have to admit, that I wasn't aware of [2], when I created [1]. Mine is
> > > > incomplete, and can easily be removed. Was basically just an attempt
> > > > to reproduce the build instructions in the hope, that others would
> > > > verify, and fix my errors.
> > > >
> > > > Jochen




Re: [CRYPTO] Multiple Docker files - are they both needed?

2022-06-16 Thread Gilles Sadowski
Hello.

Since there is an issue to be solved, could you file a report on JIRA?
[And post there patches or new files, and instructions.]

Thanks,
Gilles

On Thu, 16 Jun 2022 at 19:18, Alex Remily wrote:
>
> I just ran [2], and whatever it does, it doesn't appear to do a build of
> commons-crypto.  I'd appreciate it if any developers who have the time
> would take a look at the dockerfile here:
>
> https://github.com/aremily/commons-crypto
>
> If you're copying the dockerfile into your own fork, you'll need the
> makefile.common file as well.
>
> Alex
>
> On Thu, Jun 16, 2022 at 1:07 PM Jochen Wiedmann 
> wrote:
>
> > On Thu, Jun 16, 2022 at 7:00 PM sebb  wrote:
> >
> > > [1] src/docker/Dockerfile
> > > [2] src/conf/Docker/Dockerfile-luw
> >
> > Have to admit, that I wasn't aware of [2], when I created [1]. Mine is
> > incomplete, and can easily be removed. Was basically just an attempt
> > to reproduce the build instructions in the hope, that others would
> > verify, and fix my errors.
> >
> > Jochen




Re: [Crypto] Should "Commons" provide platform-specific binaries ?

2022-06-14 Thread Gilles Sadowski
Hello.

On Tue, 14 Jun 2022 at 17:21, Gary Gregory wrote:
>
> That would make it pretty painful for users IMO

The price to pay for playing outside the FLOSS ecosystem.

> and we'd need to make
> sure users are pointed to a "safe" and authentic place to get the
> binaries in addition to the jars.

No, we don't need to be sure; that's the point about Commons
not being responsible for remediating a security issue in source
code that doesn't come from "here".

>
> We can leave it up to the RM as to what to do on a per release basis I
> suppose, but I would not like us to build code and extra gadgetry to
> support this.

The idea was to reduce the burden.

>
> I did the previous release and would do the next one if no one else
> can. You must use macOs hardware to legally produce macOS binaries and
> you must use a legal copy of Windows for the Windows binary, that's
> the only hurdle I think.

Of course, that is the problem.

> Linux/Ubuntu is free and anyone can do that
> with Docker.

Or without it.

Gilles

>
> Gary
>
> On Tue, Jun 14, 2022 at 9:21 AM Gilles Sadowski  wrote:
> >
> > Hello.
> >
> > Given the trouble it entails and the very few people who can or want
> > to be involved in (the maintenance of) cross-compilation, wouldn't it
> > be safer to make all binaries optional?
> > It would be the application developers' responsibility to drop them to
> > a location where the [Crypto] wrapper can find them.
> >
> > From a security POV, it seems (?) that this approach could dramatically
> > lower (or even remove) Commons' responsibility (and ensuing burden)
> > in case of vulnerabilities in the native code(s).
> >
> > Regards,
> > Gilles




Re: [Crypto] What is it ?

2022-06-14 Thread Gilles Sadowski
On Tue, 14 Jun 2022 at 16:38, Alex Remily wrote:
>
> As a user and past contributor, my view of Commons Crypto is that it is a
> Java wrapper around certain common features of OpenSSL, full stop.

Maybe the mistake was in the name (?).

> As
> such, it provides near-native performance to the Java developer for
> processor-intensive operations via a Java API.  It is a set of tools for
> developing cryptographic applications in Java only to the extent that it
> exposes underlying OpenSSL functionality.  The latest contribution offer
> seems consistent with that scope because it relates specifically to
> functionality that currently exists in the underlying native library.

IMO, this description more clearly delineates the scope than what is
currently on the web site.[1]

> I don't have a strong opinion as to whether the project should be converted
> to multi-module, but I'm unclear as to whether or not it would confer any
> additional benefit.  The existing project structure already separates the
> native source from the Java source, and the maven build process already
> compiles and packages the native and Java source and binaries
> appropriately. There is a Docker file currently in the repository, used
> for the 1.1 release, that manages the cross-compilation for the JNI
> libraries

From recent conversation, the build process didn't look very smooth,
and ultra-fragile given the few people (1 ?) who feel comfortable
managing it.

> and packages them into the commons-crypto jar.

See also my other post, about the possible (?) security implications.

Regards,
Gilles

[1] https://commons.apache.org/proper/commons-crypto

> Anyway, that's my $0.02.
>
> Alex
>
> On Tue, Jun 14, 2022 at 9:10 AM Gilles Sadowski 
> wrote:
>
> > Hello.
> >
> > Contradicting comments about the latest contribution offer[1] suggest
> > that the scope of the [Crypto] component is ill-defined.
> >
> > Is it a Java wrapper around a specific library ("openssl")?
> > Is it a set of tools (a.o. strong random number generators) for developing
> > cryptographic applications in Java?
> > Is it both?  Does it intend to be more?
> >
> > In order to simplify maintenance (and clarify expectations), shouldn't it
> > become a (maven) multi-module project, with explicit separation between
> > platform-agnostic functionality and platform-specific (native) codes?
> >
> > Regards,
> > Gilles
> >
> > [1] https://issues.apache.org/jira/browse/CRYPTO-162




[Crypto] Should "Commons" provide platform-specific binaries ?

2022-06-14 Thread Gilles Sadowski
Hello.

Given the trouble it entails and the very few people who can or want
to be involved in (the maintenance of) cross-compilation, wouldn't it
be safer to make all binaries optional?
It would be the application developers' responsibility to drop them to
a location where the [Crypto] wrapper can find them.

From a security POV, it seems (?) that this approach could dramatically
lower (or even remove) Commons' responsibility (and ensuing burden)
in case of vulnerabilities in the native code(s).

Regards,
Gilles




[Crypto] What is it ?

2022-06-14 Thread Gilles Sadowski
Hello.

Contradicting comments about the latest contribution offer[1] suggest
that the scope of the [Crypto] component is ill-defined.

Is it a Java wrapper around a specific library ("openssl")?
Is it a set of tools (a.o. strong random number generators) for developing
cryptographic applications in Java?
Is it both?  Does it intend to be more?

In order to simplify maintenance (and clarify expectations), shouldn't it
become a (maven) multi-module project, with explicit separation between
platform-agnostic functionality and platform-specific (native) codes?

Regards,
Gilles

[1] https://issues.apache.org/jira/browse/CRYPTO-162




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-11 Thread Gilles Sadowski
Hello.

> [...]
>
> interface ComplexDoubleArray {
>     Stream<ComplexDoubleArray> stream(int start, int length);
> }
>
> ComplexDoubleArray a;
> // Will use the Java 8 ForkJoinPool.commonPool() for parallel execution
> a.stream(start, length).parallel().forEach(x -> ComplexFunctions.conj(x, x));
>
> class ComplexFunctions {
>     static void conj(ComplexDoubleArray in, ComplexDoubleArray out);
> }
>
> [...]

I have a hard time figuring out whether these bits of code are
intended to become the application developer API...
What data-structure(s) will be visible (from the application)?
What will be hidden ("implementation details")?
Do we have use-cases of non-trivial processing of N-dimensional
cubes of complex numbers?  [I imagine that the same API should
be able to also process cubes of real numbers (without storing the
"0" imaginary parts).]

Gilles




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-10 Thread Gilles Sadowski
Hello.

On Fri, 10 Jun 2022 at 15:10, Sumanth Rajkumar wrote:
>
> For 1) I ran mvn verify and found the following errors
>
> org.apache.commons.numbers.complex.Complex.getImaginary():METHOD_NOW_ABSTRACT,
> org.apache.commons.numbers.complex.Complex.getReal():METHOD_NOW_ABSTRACT,
> org.apache.commons.numbers.complex.Complex:CLASS_NOW_ABSTRACT,
> org.apache.commons.numbers.complex.Complex:CLASS_TYPE_CHANGED,
> org.apache.commons.numbers.complex.ComplexCartesianImpl:METHOD_ABSTRACT_ADDED_IN_IMPLEMENTED_INTERFACE,
> org.apache.commons.numbers.complex.ComplexList:METHOD_ABSTRACT_ADDED_IN_IMPLEMENTED_INTERFACE,
> org.apache.commons.numbers.complex.ImmutableComplexList:METHOD_ABSTRACT_ADDED_IN_IMPLEMENTED_INTERFACE
>
> These are expected changes... Is there a compatibility issue here?
>
> 2) Regarding usage of arrays in functional interfaces
> @FunctionalInterface
> public interface ComplexFunction3 {
>   void apply(Complex input, int offset, double[] result);
> }
>
> The underlying implementations of Complex and ComplexList all use arrays
> and there would be no additional array instantiation /RAM allocation just
> to apply the functional interface functions

The current implementation of "Complex" encapsulates two "double" fields (not
a "double[]").
Should we make two, at first separate, discussions: One for the implementation
of the "complex number" concept, and another for (n-dimensional) lists of them?

> A) ComplexCartesianImpl data structure will change to double[]
> realAndImgPair

What gain do you expect from involving an array here?

> B) ComplexList can use single double[] for realAndImg parts (similar to all
> external implementations such as jtransform)

Yes (because of the gain in RAM usage).
But AFAICT, the goal would be to make the "double[]" an "implementation
detail" of "ComplexList" and have all operators handle "ComplexList" (or
any n-dimensional "cube") as their input/output (?).

Regards,
Gilles
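
A minimal sketch of what "double[] as an implementation detail" could look
like, assuming a hypothetical class name ("InterleavedComplexList") and
hypothetical accessors; none of this is the actual [Numbers] API. The
interleaved array (re0, im0, re1, im1, ...) never escapes the class, and the
operator takes and returns the list type:

```java
final class InterleavedComplexList {
    /** Interleaved storage: re0, im0, re1, im1, ... (never exposed). */
    private final double[] data;

    private InterleavedComplexList(double[] data) {
        this.data = data;
    }

    /** Creates a list from (real, imaginary) pairs; the input is copied. */
    static InterleavedComplexList of(double... realAndImaginaryPairs) {
        if (realAndImaginaryPairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected an even number of values");
        }
        return new InterleavedComplexList(realAndImaginaryPairs.clone());
    }

    int size() {
        return data.length / 2;
    }

    double real(int index) {
        return data[2 * index];
    }

    double imag(int index) {
        return data[2 * index + 1];
    }

    /** Returns a new list holding the conjugate of every element. */
    InterleavedComplexList conj() {
        double[] out = data.clone();
        for (int i = 1; i < out.length; i += 2) {
            out[i] = -out[i]; // negate each imaginary part
        }
        return new InterleavedComplexList(out);
    }
}
```

Since callers only see the list type, the storage layout could later change
(e.g. to two separate arrays) without affecting them.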

>
> Thanks
> Sumanth
>
> On Fri, 10 Jun 2022 at 08:58, Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Fri, 10 Jun 2022 at 14:43, Sumanth Rajkumar wrote:
> > >
> > > Thanks for the quick response
> > >
> > > 1) I will run the mvn checks as suggested and get back to you
> > >
> > > 2) Yes, I realized the inefficiency and that it would not work. I was
> > > following up on another alternative but the email got sent prematurely
> > >
> > > Please ignore my previous email and consider this approach or some
> > > variation of it?
> > >
> > > @FunctionalInterface
> > > public interface ComplexFunction {
> > >   void apply(Complex input, int offset, double[] result);
> > > }
> > >
> > > Example Conjugate implementation
> > >
> > > public static void conj(Complex in, int offset, double[] result) {
> > >     result[offset] = in.getReal();
> > >     // The conjugate negates the imaginary part.
> > >     result[offset + 1] = -in.getImaginary();
> > > }
> >
> > My first feeling would be to steer away from (ab)using array as a pair.
> > We may have to use arrays for interfacing with external tools (or perhaps
> > internally too, e.g. to minimize RAM usage when processing a large list
> > of complex numbers) but from a OO point of view, we should come up
> > with an encapsulation that ensures robustness (e.g. featuring
> > immutability).
> > Also the type(s) should be easily usable in functional programming style.
> >
> > Gilles
> >
> > >
> > > [...]




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-10 Thread Gilles Sadowski
Hello.

On Fri, 10 Jun 2022 at 14:43, Sumanth Rajkumar wrote:
>
> Thanks for the quick response
>
> 1) I will run the mvn checks as suggested and get back to you
>
> 2) Yes, I realized the inefficiency and that it would not work. I was following up
> on another alternative but the email got sent prematurely
>
> Please ignore my previous email and consider this approach or some
> variation of it?
>
> @FunctionalInterface
> public interface ComplexFunction {
>   void apply(Complex input, int offset, double[] result);
> }
>
> Example Conjugate implementation
>
> public static void conj(Complex in, int offset, double[] result) {
>     result[offset] = in.getReal();
>     // The conjugate negates the imaginary part.
>     result[offset + 1] = -in.getImaginary();
> }

My first feeling would be to steer away from (ab)using array as a pair.
We may have to use arrays for interfacing with external tools (or perhaps
internally too, e.g. to minimize RAM usage when processing a large list
of complex numbers) but from a OO point of view, we should come up
with an encapsulation that ensures robustness (e.g. featuring
immutability).
Also the type(s) should be easily usable in functional programming style.

Gilles
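
For illustration only, a sketch of the kind of encapsulation meant above: an
immutable value type that plugs directly into java.util.function. The names
("ComplexValue", "mapAll") are hypothetical and not a proposal for the
actual API.

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

final class ComplexValue {
    private final double re;
    private final double im;

    private ComplexValue(double re, double im) {
        this.re = re;
        this.im = im;
    }

    static ComplexValue of(double re, double im) {
        return new ComplexValue(re, im);
    }

    double real() {
        return re;
    }

    double imag() {
        return im;
    }

    /** Returns a new instance; this one is never modified. */
    ComplexValue conj() {
        return new ComplexValue(re, -im);
    }

    /** Applies any unary operator (e.g. ComplexValue::conj) to each element. */
    static List<ComplexValue> mapAll(List<ComplexValue> input,
                                     UnaryOperator<ComplexValue> operator) {
        return input.stream().map(operator).collect(Collectors.toList());
    }
}
```

Because instances cannot change after construction, they can be shared
between threads or reused as operator inputs without defensive copies.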

>
> [...]




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-06-10 Thread Gilles Sadowski
Hello.

> [...]
>
> >
> > We could split complex unary operators into two primitive functions (
> > ToDoubleFunction) one returning the real part of result and other
> > for imaginary part
> >
> > interface ComplexFunction  {
> >  ToDoubleFunction getReal() ;
> >  ToDoubleFunction getImaginary() ;
> > }
> >
> >
> This has concerns for efficiency.

First thought that came to my mind, being confirmed when looking
at the "conj" example (where "applyAsDouble" is called twice)...

Gilles
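
A small sketch of that efficiency concern, using exp rather than conj as the
example (my choice, since there both parts share an intermediate value). All
names are hypothetical. With one function per part, the shared e^re is
computed twice; a single fused operator computes it once.

```java
import java.util.function.ToDoubleFunction;

final class SplitOperatorCost {
    /** Minimal immutable (re, im) holder; hypothetical, for illustration only. */
    static final class Z {
        final double re;
        final double im;

        Z(double re, double im) {
            this.re = re;
            this.im = im;
        }
    }

    /** Counts how often the shared intermediate e^re is computed. */
    static int expCalls;

    private static double countedExp(double x) {
        expCalls++;
        return Math.exp(x);
    }

    // exp(re + i*im) = e^re * (cos(im) + i*sin(im)), split into two functions.
    static final ToDoubleFunction<Z> EXP_REAL = z -> countedExp(z.re) * Math.cos(z.im);
    static final ToDoubleFunction<Z> EXP_IMAG = z -> countedExp(z.re) * Math.sin(z.im);

    /** Split approach: each part is evaluated separately from the input. */
    static Z applySplit(Z z) {
        return new Z(EXP_REAL.applyAsDouble(z), EXP_IMAG.applyAsDouble(z));
    }

    /** Fused approach: the shared intermediate is computed only once. */
    static Z applyFused(Z z) {
        double e = countedExp(z.re);
        return new Z(e * Math.cos(z.im), e * Math.sin(z.im));
    }
}
```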

> [...]




Re: [collections] EOL 3.x?

2022-06-07 Thread Gilles Sadowski
Hello.

On Mon, 6 Jun 2022 at 22:44, Bruno Kinoshita wrote:
>
> Hi Gary,
>
> No objections from me. I can't recall if we did that for other components.
>
> We could also, I think, have a document for Commons to reference when
> releasing a major version update of a component. In that document we could
> explain we are a team of volunteers, that if 4.x is out, users can only
> expect security updates to 3.x if there are volunteers with time to work on
> the fixes, etc. So for future questions of this type, one could just point
> to this doc. WDYT?

IMO, such information belongs on that page[1]:
  https://commons.apache.org/index.html

[1] There we should probably remove the link to the article meant to
introduce "Commons" components...

Regards,
Gilles

>
> -Bruno
>
> On Tue, 7 Jun 2022 at 00:33, Gary Gregory  wrote:
>
> > Hi All:
> >
> > Should we formally announce that the 3.x line is EOL and encourage
> > users to migrate to 4.x?
> >
> > Gary




Re: [Math] Review of "genetic algorithm" module

2022-05-31 Thread Gilles Sadowski
Responding below to some of my own questions following commit
   ddfd5bf859d04cc5da604b20021ceaba9de7def6
in branch
  feature__MATH-1563__genetic_algorithm

On Tue, 24 May 2022 at 01:54, Gilles Sadowski wrote:
>
> Hello.
>
> On Wed, 18 May 2022 at 16:34, Avijit Basak wrote:
> >
> > Hi All
> >
> > Please find my comments below.
> >
> > Comments related to new model:
> >
> > 1) Hierarchy of GeneticOperator: In the proposed model the genetic
> > operators are designed hierarchically implementing a common interface and
> > operators are accepted as a list.
> > This enables interchangeability of operators. Library users would be able
> > to use crossover and mutation operators interchangeably.
> > However, in genetic algorithms or other population based search algorithms
> > the operators are used for broadly two purposes, exploration and
> > exploitation.
> > In GA crossover is used for exploitation and mutation is used for
> > exploration. Keeping them in a common hierarchy and allowing
> > interchangeable operation violates that purpose.
>
> I'm not sure that the semantics of "exploitation" and "exploration"
> should drive the API design.
> Saying it differently: We don't need to know how various operators
> will be used in order to implement them; hence IMO, there is no
> need to discriminate at the API level.

The "core" GA algorithm (see class "GeneticAlgorithmFactory") is
oblivious to whether a genetic operator "is-a" crossover or mutation.

> >
> > 2) Chromosome Fitness: In the new design the chromosome fitness is
> > maintained as a hashmap where the key is the chromosome itself.
> > Are we going to retain the fitness value in chromosome too otherwise
> > implementation of Comparable won't be possible?
>
> Sorry, I don't follow.

Implementing "Comparable" is not necessary.

> > Assuming the chromosome representation would be used to calculate hashcode,
> > it would be a very time consuming process depending on the length of
> > chromosome.
>
> Is this assumption correct?
> For what purpose do we need to compute a custom hash code?

A custom "hashCode" method is not necessary.
The only consequence seems that 2 different instances of a genotype that is
logically the same (genotype-wise) could both be added to a population while
the same (in-memory) instance would only appear once in the hash map.
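
A minimal sketch of that consequence, assuming a hypothetical "Genotype"
class that (like the point above) defines neither "equals" nor "hashCode".
The hash map then falls back on object identity:

```java
import java.util.HashMap;
import java.util.Map;

final class IdentityKeyDemo {
    /** Hypothetical genotype: no custom equals/hashCode, so identity is used. */
    static final class Genotype {
        private final boolean[] genes;

        Genotype(boolean... genes) {
            this.genes = genes.clone();
        }
    }

    /** Returns the number of distinct map keys after inserting all arguments. */
    static int distinctKeys(Genotype... keys) {
        Map<Genotype, Double> fitness = new HashMap<>();
        for (Genotype g : keys) {
            fitness.put(g, 0.0); // placeholder fitness value
        }
        return fitness.size();
    }
}
```

Two separately created genotypes with identical genes count as two keys,
whereas passing the same instance twice yields a single key.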

>
> > E.g. in binary chromosomes we have provision to allow chromosome length
> > upto (2^31 - 1).
>
> That's a lot. ;-)
> Do you have use cases where such long representations can be handled?

For now, I've used "BitSet" as the internal representation (but this could
be changed if necessary, because it is not part of the public API).

> > Along with that the chromosome implements Comparable
> > interface.
> > Equality by Comparable interface would be decided by fitness
>
> Sure.
>
> > however
> > equality by equals() method would depend on representation.
>
> Do we need a custom "equals"?

We don't.

> > As chromosomes with different representations may also have the same
> > fitness, this would produce a conflict.
>
> Please provide an example.
>
> >
> > 3) The model does not consider anything related to adaptive approaches.
> > Would it be a separate factory?
>
> What I've proposed is an alternative "skeleton" for the API.  Of course,
> more classes will provide specific functionality.
> An adaptive operator "is-a" specialized genetic operator (perhaps the
> notion of "Adaptive" will be defined through an interface?).

We don't need anything that complex (IIUC).
See interface "ApplicationRate" and its implementations in package "rate".

> >
> > 4) Comparison of chromosomes: In the current model two chromosomes are
> > compared after decoding the genotype to phenotype using the internal
> > decoder.
> > How can we address this in the new model?
>
> As mentioned above: Do we really need to compare representations?

Within the library, it does not seem necessary.

> >
> > 5) Chromosome String representation: Currently we use the toString() method
> > to print the chromosome's phenotype. In the new model we would need to
> > avoid this approach as decoders won't be available to chromosomes.
>
> This seems like a minor issue (or perhaps no issue at all?) unless I'm
> missing something.

The GA does not need to print the phenotype.
The decoder is user-defined, hence the user can obviously apply it whenever
a printable version of the chromosomes is needed.

> >
> > 6) 

[Math] GA Design

2022-05-31 Thread Gilles Sadowski
Hello.

I dug further into the refactoring of the "genetic algorithm" (GA)
functionality of [Math] (in package "o.a.c.math4.legacy.genetics"
currently in the "master" branch of the repository).

This post is a reboot of the discussion thread with Subject:[1]
  [Math] Review of "genetic algorithm" module
that is only very slowly converging to what the intended usage is of
the GA implemented in "Commons Math" and the minimal (public)
API for that.

I thus followed up on my initial view[2] of a concise implementation
that avoids the issues which I raised during the previous discussion.

Please have a look at the code committed in branch
  feature__MATH-1563__genetic_algorithm
of the repository[3] (in a new "commons-math-ga2" maven module,
for easier comparison with Avijit's proposal).

AFAICT, this design can provide all the functionalities mentioned in the
discussion (a.o. the "adaptive rate"[4]) although it is not complete (and
perhaps contains a few bugs):
 * Better names for some classes?
 * Should "Population" allow duplicates?
 * Port unit tests suite.
 * Which genotype representation(s) to support?
 * ...

Regards,
Gilles

[1] https://markmail.org/message/2mzdbozc6nwobc37
[2] https://issues.apache.org/jira/browse/MATH-1618
[3] 
https://gitbox.apache.org/repos/asf?p=commons-math.git;a=shortlog;h=refs/heads/feature__MATH-1563__genetic_algorithm
[4] https://issues.apache.org/jira/browse/MATH-1563




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-05-26 Thread Gilles Sadowski
Hello.

On Thu, 26 May 2022 at 07:04, Sumanth Rajkumar wrote:
>
> Hi,
>
> My proposal was accepted into GSoC 2022 to work on the Numbers-186 [1] Jira
> of the Commons Numbers project.
>
> I first want to ask if I can be assigned to this Jira

Done.

> since my GSoC
> proposal was accepted.
>
> Next, I wanted to mention how I plan to start this project and was hoping
> to get some feedback.
>
> As per my proposal, the first thing I wanted to start with was the API
> design which would have interfaces to represent complex numbers, methods to
> convert to/from linear primitive arrays, Java 8 functional interfaces for
> unary/binary operators and for functions for complex operations and
> transforms involving: complex number and real numbers, complex vectors and
> scalars, complex matrix, vectors and scalars.

For each of the mentioned functionalities, types and operations,
please provide a concrete example of what you propose, indicating
whether it is a new feature or a change and whether it is backward
compatible (BC) or not.  [These details can be further discussed on
the JIRA page.]

Whenever possible, please ensure that the proposed changes do
not break the build.  [You can test this by opening a PR, which should
trigger an automated build.]

>
> Working on the API, am I on the right track to start with refactoring all
> the existing methods in the Complex class as static functions for use as
> lambdas?
>
> I already refactored some methods which can be viewed here [2]

Renaming is typically something to be first discussed here and/or
on JIRA.
Unless other non BC changes are foreseen for the next release,
such "cosmetic" changes must be avoided.

Thanks,
Gilles

>
> Thanks,
>
> Sumanth
>
> [1] https://issues.apache.org/jira/browse/NUMBERS-186
>
> [2]
> https://github.com/sumanth-rajkumar/commons-numbers/tree/sumanth-gsoc-22/commons-numbers-complex/src/main/java/org/apache/commons/numbers/complex
>
> On Mon, Mar 28, 2022, 7:01 PM Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Mon, 28 Mar 2022 at 00:32, Sumanth Rajkumar wrote:
> > >
> > > Thanks Alex and Gilles for your feedback
> > >
> > > So currently Commons Math transform depends on Common complex numbers, and
> > > the API involves usage of Complex Object Arrays instead of primitive array
> > > data structures
> > >
> > > I also briefly looked into other library implementations besides Jtransform
> > > and EJML that are not pure java but have java bindings such as JBLAS[1] and
> > > NAG[2]
> > > All of the implementation use single array data structures to represent
> > > Complex Lists and higher dimensional matrices
> > >
> > > Since these involve parallel data pipelines I looked into libraries that
> > > use SIMD [3] operations that use GP GPU (jcuda [4][5] /aparapi [6]) and CPU
> > > (Java 17 Vector API [7])
> >
> > Thanks for the investigation!
> >
> > >
> > > Given all the alternative implementations, I agree it does not make sense
> > > to re implement transforms here.
> >
> > Some transforms are (already) implemented here.
> > Of course, it makes sense to wonder whether to keep maintaining those
> > codes, or rely on external dependencies.  The decision would depend on
> > performance comparisons and whether users are able (or allowed) to
> > interface with native libraries.  [I know of one project where "pure Java"
> > was a requirement...]
> >
> > >
> > > So instead would it be useful to provide users with
> > > 1) a complex numbers Linear Algebra and transforms API to compile against
> > > and run with any of existing providers (apache commons, jtransform, EJML,
> > > jblas)
> > > AND
> > > 2) a service provider interface to allow adapter implementations to
> > > integrate existing and future providers such as jcudas/aparapi/vector APIs
> > >
> > > Do the commons library modularization dependency requirements apply to
> > > compile time dependencies only or runtime also?
> >
> > Which "dependency requirements" are you referring to?
> >
> > > To minimize bloat, the runtime dependencies could be made optional and need
> > > not be transitively included by default
> >
> > Flexibility would be ideal, indeed.
> >
> > >
> > > Providing a Complex linear algebra and transforms API that can run with
> > > different runtime providers would allow users to take advantage of hardware
> > 

Re: [Math] Review of "genetic algorithm" module

2022-05-23 Thread Gilles Sadowski
lazy" evaluation?
> >> >Dropping it would make the instance immutable (and "evaluate()"
> >> >should be renamed to "getFitness()").
> >> >
> >> >Why should the "FitnessFunction" be stored in every chromosome?
> >> >
> >> -- I have modified the fitness as final and initialized the same in the
> >> constructor.
> >
> >Better, but did you check my proposal in MATH-1618, where
> >Chromosome and fitness are decoupled, and their relationship
> >is held within a "Population" instance?
> --Mentioned earlier.

I still don't know whether you agree that my proposal makes it
simpler to express a GA.

> >
> >> [...]
> >
> >>
> >> >(8)
> >> >@SuppressWarnings("unchecked")
> >> >
> >> >By default, I'm a bit suspicious about having to resort to these annotations,
> >> >especially for the kind of algorithms we are trying to implement.
> >> >What do you think of the alternative approach outlined in the ZIP file
> >> >attached in MATH-1618:
> >> >https://issues.apache.org/jira/browse/MATH-1618
> >> >?
> >> -- This annotation is required because we have kept an option to use
> >> different types of genotypes including primitive.
> >> Because of that our base interfaces only declares phenotype not genotype.
> >> This introduced a kind of hierarchy in all operators and chromosome classes
> >> which required us to use the mentioned annotation.
> >
> >I may again be missing something.
> >Could you please explain the case that makes these annotations
> >necessary.
> -- This has been only used to avoid the warning in the place of typecasting.
> However, I can work to minimize this following your new model.

"Minimize"?

> >
> >> >
> >> >(9)
> >> >Naming of factory methods should be harmonized to match the convention
> >> >adopted in components like [RNG] and [Numbers].
> >> >E.g. instead of "newChromosome(...)", please use "of(...)" or "from(...)"
> >> >for "value object", and "create(...)" otherwise.
> >> >
> >> -- I have renamed the same for Chromosome classes.
> >> What about the nextGeneration() method of ListPopulation class. Renaming
> >> this to create() or from() won't convey the purpose of it.
> >
> >I agree, and that's why the new "Population" class (in MATH-1618) does
> >not provide a factory method (see also the "GeneticAlgorithmFactory"
> >class).
> -- We can avoid the same in the current model if we agree to use a default
> implementation of population and remove the Population interface following
> your new model.

So, do we adopt that "new model"?
Or do you still have objections?

> >
> >> >(10)
> >> >o.a.c.m.ga.chromosome.AbstractListChromosome
> >> >
> >> >Constructor is called with an argument that is a previously instantiated
> >> >"representation".  If the latter is mutable, the caller will be able to modify
> >> >the underlying data structure of the newly created chromosome.  [The
> >> >doc assumes immutability of the representation but this cannot be
> >> >enforced, and mixed ownership can entail subtle bugs.]
> >> -- I think this applies to both representation as well as generic parameter
> >> type T. But I don't see any other option but to rely on the user.
> >
> >The Javadoc (at line 84) is misleading in its mention of "immutable".
> >
> >> If you have any suggestions kindly share.
> >
> >I may not understand all the implications, but I'd suggest that the
> >"representation" be instantiated within the control of the library (e.g.
> >through a "builder"/"factory").
> -- Currently we have the ChromosomeRepresentationUtils for the same. Its
> methods are designed to generate the representations.

My suggestion is that this design can be improved (a.o. according to my
above suggestion).

> >
> >> >
> >> >(11)
> >> >Do we agree that, in a GA, the most time-consuming task is the fitness
> >> >computation?  Hence IMO, it should be the focus of the multithreading
> >> >tools (i.e. "ExecutorService"), probably keeping the other parts (namely
> >> >the genetic operators) within a simple sequential loop (as in class
> >> >"GeneticAlgorithmFactory" in MATH-1618).
> >> -- Current implementation uses separate threads for applying crossover
> and
> >> mutation operators for each pair of selected chromosomes.
> >> I think this ensures better utilization of multi-core processors compared
> >> to use of multi-threading only for the fitness calculation.
> >
> >I have the opposite intuition: Parallel application of the genetic
> >operators would only provide marginal gains wrt the fitness
> >computation.
> >In any case, I think that it will be fairly easy to modify my proposed
> >"OffspringGenerator" class to use an "ExecutorService" (if benchmarks
> >show that a substantial gain could indeed be achieved).
> >
> >> -- Some codes are checked in. But there is a conflict in the pull
> request.
> >> So I shall create a new one and delete the old branch itself.
> >
> >IMHO, we could make more substantial progress if you could
> >first point to issues with my proposal in MATH-1618.
> --Mentioned earlier.

Well, I don't know where we stand...

Regards,
Gilles




Re: [parent][rng] japicmp binary compatibility for interface default methods

2022-05-10 Thread Gilles Sadowski
On Tue, 10 May 2022 at 09:53, Alex Herbert wrote:
>
> On Tue, 10 May 2022 at 02:32, Matt Juntunen 
> wrote:
>
> > Sounds reasonable to me. Are there any arguments against this change
> > other than the fact that it is not a japicmp default setting?
> >
>
> I do not know why the setting default is at the current value. Unlike the
> documentation for revapi there is limited explanation of the default
> settings in japicmp and why compatibility for binary or source will be
> broken. It may be that the developer explicitly wished to be informed of
> additions to interfaces.
>
> Since this is unlikely to affect much at all it may be fine left as is in
> commons parent. The configuration can be added to the relevant POM in
> Commons RNG.

IIUC, it should be the other way around: If BC is not broken, the
common Commons settings should not report otherwise, and if
some specific component has additional requirements, let it modify
its own POM.

Regards,
Gilles




Re: [crypto] How to get the native library?

2022-05-05 Thread Gilles Sadowski
On Thu, 5 May 2022 at 22:24, Jochen Wiedmann wrote:
>
> Hi,
>
> trying to run the unit tests for commons-crypto, I get the error message
>
> Native library is not loaded
>
> I understand, what it tells me. However, I do not find any hints on
> how to get the native library. Not on the website, not in the
> top-level docs.

File "BUILDING.txt" at the top-level seems to contain
related information.

>
> Can anyone help with that?
>
> Thanks,
>
> Jochen
>
> P.S: My OS would be Windows, or Fedora Linux.

File referred to above contains instructions for Ubuntu.

Gilles




Re: [Math] Review of "genetic algorithm" module

2022-05-01 Thread Gilles Sadowski
ome classes.
> What about the nextGeneration() method of ListPopulation class. Renaming
> this to create() or from() won't convey the purpose of it.

I agree, and that's why the new "Population" class (in MATH-1618) does
not provide a factory method (see also the "GeneticAlgorithmFactory"
class).

> >(10)
> >o.a.c.m.ga.chromosome.AbstractListChromosome
> >
> >Constructor is called with an argument that is a previously instantiated
> >"representation".  If the latter is mutable, the caller will be able to
> modify
> >the underlying data structure of the newly created chromosome.  [The
> >doc assumes immutability of the representation but this cannot be
> >enforced, and mixed ownership can entail subtle bugs.]
> -- I think this applies to both representation as well as generic parameter
> type T. But I don't see any other option but to rely on the user.

The Javadoc (at line 84) is misleading in its mention of "immutable".

> If you have any suggestions kindly share.

I may not understand all the implications, but I'd suggest that the
"representation" be instantiated within the control of the library (e.g.
through a "builder"/"factory").
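
For illustration, a minimal sketch of what a library-controlled
instantiation could look like, using only the JDK.  The class
"OwnedListChromosome" and its "create" factory are hypothetical names
(they do not exist in the module); the point is only that the list is
built inside the factory, so that the caller never holds a reference
to it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical chromosome whose representation is built by the library. */
final class OwnedListChromosome<T> {
    /** Read-only view of a list that no caller has ever seen. */
    private final List<T> representation;

    private OwnedListChromosome(List<T> representation) {
        this.representation = representation;
    }

    /**
     * @param length Number of alleles.
     * @param alleleAt Function that provides the allele at a given index.
     * @return a new instance.
     */
    static <T> OwnedListChromosome<T> create(int length, IntFunction<T> alleleAt) {
        final List<T> genes = new ArrayList<>(length);
        for (int i = 0; i < length; i++) {
            genes.add(alleleAt.apply(i));
        }
        return new OwnedListChromosome<>(Collections.unmodifiableList(genes));
    }

    /** @return an unmodifiable view of the representation. */
    List<T> getRepresentation() {
        return representation;
    }
}
```

Note that this only protects the list itself; if the type "T" is
mutable, the alleles can still be modified by whoever created them.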

> >
> >(11)
> >Do we agree that, in a GA, the most time-consuming task is the fitness
> >computation?  Hence IMO, it should be the focus of the multithreading
> >tools (i.e. "ExecutorService"), probably keeping the other parts (namely
> >the genetic operators) within a simple sequential loop (as in class
> >"GeneticAlgorithmFactory" in MATH-1618).
> -- Current implementation uses separate threads for applying crossover and
> mutation operators for each pair of selected chromosomes.
> I think this ensures better utilization of multi-core processors compared
> to use of multi-threading only for the fitness calculation.

I have the opposite intuition: Parallel application of the genetic
operators would only provide marginal gains wrt the fitness
computation.
In any case, I think that it will be fairly easy to modify my proposed
"OffspringGenerator" class to use an "ExecutorService" (if benchmarks
show that a substantial gain could indeed be achieved).
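
A minimal sketch of that split, assuming that only the fitness
evaluation goes through an "ExecutorService" while everything else
stays in the calling thread.  "ParallelFitness" is a hypothetical
helper (JDK only), not the "OffspringGenerator" class from MATH-1618.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper: evaluates the fitness of a population in parallel. */
final class ParallelFitness {
    private ParallelFitness() {}

    /**
     * @param population Individuals to evaluate.
     * @param fitness Fitness function (must be thread-safe).
     * @param executor Executor that runs the evaluations.
     * @return the fitness values, in the same order as the population.
     */
    static <T> double[] evaluate(List<T> population,
                                 ToDoubleFunction<T> fitness,
                                 ExecutorService executor) {
        final List<Callable<Double>> tasks = new ArrayList<>();
        for (T individual : population) {
            tasks.add(() -> fitness.applyAsDouble(individual));
        }
        try {
            // "invokeAll" blocks until all tasks are done and keeps the task order.
            final List<Future<Double>> results = executor.invokeAll(tasks);
            final double[] values = new double[results.size()];
            for (int i = 0; i < values.length; i++) {
                values[i] = results.get(i).get();
            }
            return values;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

Whether the same treatment is worthwhile for crossover and mutation
is what a benchmark would have to show.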

> -- Some codes are checked in. But there is a conflict in the pull request.
> So I shall create a new one and delete the old branch itself.

IMHO, we could make more substantial progress if you could
first point to issues with my proposal in MATH-1618.

Thanks,
Gilles




[All] Web access to repository borked?

2022-04-29 Thread Gilles Sadowski
Hello.

From a link on the web page of a component, say
https://commons.apache.org/proper/commons-geometry/
clicking on the "Source repository (current)" link (in
the menu on the left of the web page) leads to
https://gitbox.apache.org/repos/asf?p=commons-geometry.git

Clicking on "tree" (at the top of the page) and then on
any link that should point to a file in the repository, returns
a page that says
---CUT---
Reading blob failed.
---CUT---

Furthermore, clicking on the "tree" links (in the lines that
read "commit | commitdiff | tree | snapshot" on the right of
commit message) leads to GitHub.

Regards,
Gilles




Re: [Math] Review of "genetic algorithm" module

2022-04-14 Thread Gilles Sadowski
Hello.

> > > [...]

(1)
o.a.c.m.ga.GeneticAlgorithmTestPermutations
(under "src/test")

As per your comment in that class, it is a usage example.
Given that its name does not end with "Test", it is not run by the
test suite.  Please move it to the "examples" module.

(2)
I'm missing a high-level doc that would enable a newbie to figure
out what to implement in order to get going.
E.g. what is the interplay between
 * genotype
 * allele
 * phenotype
 * decoder
 * fitness function
?
Several classes do not provide explanations (or links) about the
concept which they represent.  For example, there is no doc about
what a "RandomKeyDecoder" is, and the reason for using it (or not).

(3)
o.a.c.m.ga.utils.ChromosomeRepresentationUtils

It seems to be a "mixed-bag" kind of class (that is being frowned
upon nowadays).
Its comment refers to "random" but some methods are not using
any randomization.  Most methods are only used in unit tests.

(4)
o.a.c.m.ga.RandomProviderManager

As already discussed, this class should not be part of the public
API, namely because the "getRandomProvider()" method returns
an object that is not thread-safe.
If used internally as "syntactic sugar", it should be located in a
package named "internal"; however I'd tend to remove it
altogether, and call "ThreadLocalRandomSource.current(...)"
explicitly.
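
To illustrate the thread-safety concern, here is a minimal JDK-only
sketch of the per-thread pattern.  "PerThreadRandom" is a hypothetical
name and "SplittableRandom" merely stands in for the [RNG] generator;
in the module, the call would be "ThreadLocalRandomSource.current(...)".

```java
import java.util.SplittableRandom;

/** JDK-only stand-in for a per-thread generator cache (hypothetical). */
final class PerThreadRandom {
    /** One generator per thread: an instance is never shared across threads. */
    private static final ThreadLocal<SplittableRandom> CURRENT =
        ThreadLocal.withInitial(SplittableRandom::new);

    private PerThreadRandom() {}

    /** @return the generator owned by the calling thread. */
    static SplittableRandom current() {
        return CURRENT.get();
    }
}
```

The returned object must still not be handed over to another thread;
each thread should call "current()" itself.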

(5)
Why does a "Chromosome" need an "identifier"?
Method "getId()" is only used in "PopulationStatisticalSummaryImpl"
that is an internal class, where it seems that the chromosome itself
(rather than its "id") could serve as the map's key.
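
A minimal sketch of the idea, assuming that the summary only needs to
look up a value (here, a rank) for each chromosome.  Both classes are
hypothetical and use only the JDK; an "IdentityHashMap" is used so that
the chromosome need not even override "equals" and "hashCode".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical chromosome without any identifier. */
final class IdFreeChromosome {
    private final double fitness;

    IdFreeChromosome(double fitness) {
        this.fitness = fitness;
    }

    double getFitness() {
        return fitness;
    }
}

/** Hypothetical summary: maps each chromosome (not an "id") to its rank. */
final class RankSummary {
    /** Keys are compared by reference. */
    private final Map<IdFreeChromosome, Integer> ranks = new IdentityHashMap<>();

    RankSummary(List<IdFreeChromosome> population) {
        final List<IdFreeChromosome> sorted = new ArrayList<>(population);
        sorted.sort(Comparator.comparingDouble(IdFreeChromosome::getFitness));
        for (int i = 0; i < sorted.size(); i++) {
            ranks.put(sorted.get(i), i);
        }
    }

    /** @return the rank (0 is the lowest fitness) of the given chromosome. */
    int rankOf(IdFreeChromosome chromosome) {
        final Integer rank = ranks.get(chromosome);
        if (rank == null) {
            throw new IllegalArgumentException("Not in population");
        }
        return rank;
    }
}
```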

(6)
o.a.c.m.ga.chromosome.AbstractChromosome

Field "fitness" is not "final", yet it could be: a "FitnessFunction"
object (used in "evaluate()" to compute that field) is passed to the
constructor.  Is there a reason for the "lazy" evaluation?
Dropping it would make the instance immutable (and "evaluate()"
should be renamed to "getFitness()").

Why should the "FitnessFunction" be stored in every chromosome?
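
A minimal sketch of the immutable alternative, assuming that the fitness
can be computed at construction.  "EvaluatedChromosome" is a hypothetical
class (JDK only): the function is applied once in the factory method and
is not retained by the instance.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical immutable chromosome: the fitness is computed exactly once. */
final class EvaluatedChromosome<T> {
    private final T representation;
    private final double fitness;

    private EvaluatedChromosome(T representation, double fitness) {
        this.representation = representation;
        this.fitness = fitness;
    }

    /**
     * @param representation Representation of the chromosome.
     * @param fitnessFunction Function applied here, and not stored.
     * @return a new instance.
     */
    static <T> EvaluatedChromosome<T> of(T representation,
                                         ToDoubleFunction<T> fitnessFunction) {
        return new EvaluatedChromosome<>(representation,
                                         fitnessFunction.applyAsDouble(representation));
    }

    T getRepresentation() {
        return representation;
    }

    /** @return the fitness (no computation happens here). */
    double getFitness() {
        return fitness;
    }
}
```

The trade-off is that chromosomes that are discarded before being
compared would still be evaluated; whether that matters depends on how
the population is built.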

(7)
Spurious "@since" tags: In the new code (in "commons-math-ga"
module), none should refer to a version < 4.0.

(8)
@SuppressWarnings("unchecked")

By default, I'm a bit suspicious about having to resort to these annotations,
especially for the kind of algorithms we are trying to implement.
What do you think of the alternative approach outlined in the ZIP file
attached in MATH-1618:
https://issues.apache.org/jira/browse/MATH-1618
?

(9)
Naming of factory methods should be harmonized to match the convention
adopted in components like [RNG] and [Numbers].
E.g. instead of "newChromosome(...)", please use "of(...)" or "from(...)"
for "value object", and "create(...)" otherwise.
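
As an illustration of the naming (my reading of the convention, not a
quote from [RNG] or [Numbers]): "of" takes the components of the value,
"from" converts another representation.  "BinaryGenes" is a hypothetical
value object.

```java
/** Hypothetical value object holding a fixed sequence of bits. */
final class BinaryGenes {
    private final boolean[] bits;

    private BinaryGenes(boolean[] bits) {
        this.bits = bits;
    }

    /** "of": builds the value from its components (the array is copied). */
    static BinaryGenes of(boolean... bits) {
        return new BinaryGenes(bits.clone());
    }

    /** "from": converts another representation (a string of '0' and '1'). */
    static BinaryGenes from(String text) {
        final boolean[] bits = new boolean[text.length()];
        for (int i = 0; i < bits.length; i++) {
            final char c = text.charAt(i);
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("Not a bit: " + c);
            }
            bits[i] = c == '1';
        }
        return new BinaryGenes(bits);
    }

    int size() {
        return bits.length;
    }

    boolean get(int index) {
        return bits[index];
    }
}
```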

(10)
o.a.c.m.ga.chromosome.AbstractListChromosome

Constructor is called with an argument that is a previously instantiated
"representation".  If the latter is mutable, the caller will be able to modify
the underlying data structure of the newly created chromosome.  [The
doc assumes immutability of the representation but this cannot be
enforced, and mixed ownership can entail subtle bugs.]
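
A minimal sketch of one way to avoid the shared ownership, assuming a
defensive copy is acceptable performance-wise.  "CopyingListChromosome"
is a hypothetical class (JDK only), not a proposed replacement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical chromosome that never shares its list with the caller. */
final class CopyingListChromosome<T> {
    private final List<T> representation;

    /**
     * @param representation Alleles (copied, so later changes to the
     * argument do not affect this instance).
     */
    CopyingListChromosome(List<T> representation) {
        this.representation =
            Collections.unmodifiableList(new ArrayList<>(representation));
    }

    /** @return an unmodifiable view of the private copy. */
    List<T> getRepresentation() {
        return representation;
    }
}
```

This protects the list structure only; mutable elements can still be
changed through references kept by the caller.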

(11)
Do we agree that, in a GA, the most time-consuming task is the fitness
computation?  Hence IMO, it should be the focus of the multithreading
tools (i.e. "ExecutorService"), probably keeping the other parts (namely
the genetic operators) within a simple sequential loop (as in class
"GeneticAlgorithmFactory" in MATH-1618).

To be continued...

Regards,
Gilles




Re: [Math] Review of "genetic algorithm" module

2022-04-13 Thread Gilles Sadowski
Hello.

> [...]
> -- Created a new PR https://github.com/apache/commons-math/pull/209.

Merged in branch "feature__MATH-1563__genetic_algorithm".

> > [...]




Re: New component proposal: commons-plugins

2022-04-12 Thread Gilles Sadowski
On Tue, 12 Apr 2022 at 13:23, Gary Gregory wrote:
>
> Commons components can and do depend on other runtime libraries, for
> example, VFS, Configuration, JCS, and so on. There are libraries that are
> naturally lower level where we do want to keep zero dependencies like IO and
> Lang. If an app has a plugin system it seems evident to me that it would be
> the kind of app that depends on other libraries anyway.

Question remains: Start something here or not (advantages vs drawbacks)?

>
> Gary
>
> On Tue, Apr 12, 2022, 05:57 Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Tue, 12 Apr 2022 at 08:58, Romain Manni-Bucau wrote:
> > >
> > > Sounds like https://svn.apache.org/repos/asf/commons/sandbox/ can be a
> > > ready to start place even if I still think incubator is the real place
> > for
> > > such a project since it will quickly overpass commons standard case with
> > a
> > > lot of modules if it gets a community and adopted (for integrations).
> > >
> >
> > "Commons" components are supposed to not depend on anything
> > (except other "Commons" components and optional dependencies).
> > If some of them need the functionality being discussed in this thread
> > (as has been mentioned by Matt S, Matt J and Gary), but it is defined
> > in another TLP, reuse will be "forbidden".
> >
> > A modular (maven) project could contain
> > * modules that abide by the "no-dependency" policy (providing "core"
> >   functionality that can be reused here), and
> > * modules with external dependencies whenever required.
> > [If the latter modules are considered out-of-scope for Commons, then
> > the glue code would be left for the respective projects to implement.]
> >
> > Regards,
> > Gilles
> >
> > >> [...]




Re: [All] Repository sync quirk?

2022-04-12 Thread Gilles Sadowski
FTR: Reported to INFRA:
https://issues.apache.org/jira/browse/INFRA-23133

On Mon, 11 Apr 2022 at 23:30, Gilles Sadowski wrote:
>
> Hello.
>
> Fetching the "same" branch from either "github" or "gitbox", I don't
> end up with the same contents; see the sequence of commands,
> below, where the last commit is different.
>
> $ git remote -v
> github  https://github.com/apache/commons-math.git (fetch)
> github  https://github.com/apache/commons-math.git (push)
> origin  https://gitbox.apache.org/repos/asf/commons-math.git (fetch)
> origin  https://gitbox.apache.org/repos/asf/commons-math.git (push)
> $ git fetch origin
> feature__MATH-1563__genetic_algorithm:feature__MATH-1563__genetic_algorithm__GB
> $ git log feature__MATH-1563__genetic_algorithm__GB
> commit c573e368d7c3e0275fe374b460560d368dcd737c
> Author: avbasak1 
> Date:   Sun Mar 13 00:54:18 2022 +0100
>
> MATH-1563: Introducing new implementation of GA functionality (WIP).
>
> commit 57dda85533fbac18389a3ddc70e3640aa4484a91
> [...]
> $ git fetch github
> feature__MATH-1563__genetic_algorithm:feature__MATH-1563__genetic_algorithm__GH
> $ git log feature__MATH-1563__genetic_algorithm__GH | head
> commit 99ca99198449c9ccfc28fbd0987e9c6a2611e0e6
> Author: avbasak1 
> Date:   Sun Mar 13 00:54:18 2022 +0100
>
> MATH-1563: Introducing new implementation of GA functionality (WIP).
>
> Closes #208.
>
> commit 57dda85533fbac18389a3ddc70e3640aa4484a91
> [...]
>
>
> Any idea of what is going on?
>
> Regards,
> Gilles




Re: New component proposal: commons-plugins

2022-04-12 Thread Gilles Sadowski
Hello.

On Tue, 12 Apr 2022 at 08:58, Romain Manni-Bucau wrote:
>
> Sounds like https://svn.apache.org/repos/asf/commons/sandbox/ can be a
> ready to start place even if I still think incubator is the real place for
> such a project since it will quickly overpass commons standard case with a
> lot of modules if it gets a community and adopted (for integrations).
>

"Commons" components are supposed to not depend on anything
(except other "Commons" components and optional dependencies).
If some of them need the functionality being discussed in this thread
(as has been mentioned by Matt S, Matt J and Gary), but it is defined
in another TLP, reuse will be "forbidden".

A modular (maven) project could contain
* modules that abide by the "no-dependency" policy (providing "core"
  functionality that can be reused here), and
* modules with external dependencies whenever required.
[If the latter modules are considered out-of-scope for Commons, then
the glue code would be left for the respective projects to implement.]

Regards,
Gilles

>> [...]




[All] Repository sync quirk?

2022-04-11 Thread Gilles Sadowski
Hello.

Fetching the "same" branch from either "github" or "gitbox", I don't
end up with the same contents; see the sequence of commands,
below, where the last commit is different.

$ git remote -v
github  https://github.com/apache/commons-math.git (fetch)
github  https://github.com/apache/commons-math.git (push)
origin  https://gitbox.apache.org/repos/asf/commons-math.git (fetch)
origin  https://gitbox.apache.org/repos/asf/commons-math.git (push)
$ git fetch origin
feature__MATH-1563__genetic_algorithm:feature__MATH-1563__genetic_algorithm__GB
$ git log feature__MATH-1563__genetic_algorithm__GB
commit c573e368d7c3e0275fe374b460560d368dcd737c
Author: avbasak1 
Date:   Sun Mar 13 00:54:18 2022 +0100

MATH-1563: Introducing new implementation of GA functionality (WIP).

commit 57dda85533fbac18389a3ddc70e3640aa4484a91
[...]
$ git fetch github
feature__MATH-1563__genetic_algorithm:feature__MATH-1563__genetic_algorithm__GH
$ git log feature__MATH-1563__genetic_algorithm__GH | head
commit 99ca99198449c9ccfc28fbd0987e9c6a2611e0e6
Author: avbasak1 
Date:   Sun Mar 13 00:54:18 2022 +0100

MATH-1563: Introducing new implementation of GA functionality (WIP).

Closes #208.

commit 57dda85533fbac18389a3ddc70e3640aa4484a91
[...]


Any idea of what is going on?

Regards,
Gilles




Re: [Math] Review of "genetic algorithm" module

2022-04-11 Thread Gilles Sadowski
> >final SelectionPolicy selectionPolicy,
> >ConvergenceListener... convergenceListeners) {
> >this.crossoverPolicy = crossoverPolicy;
> >this.mutationPolicy = mutationPolicy;
> >this.selectionPolicy = selectionPolicy;
> >updateListenerRigistry(convergenceListeners);
> >}
> >---CUT---
> >Readers of the HTML-generated doc can already click on the various
> >arguments within the signature; so there is no need to add visual noise
> >in the source code just to be able to click from within the Javadoc part
> >just above that signature.
> >The Javadoc block above should be
> >---CUT---
> >/**
> > * @param crossoverPolicy Crossover policy.
> > * @param mutationPolicy Mutation policy.
> > * @param selectionPolicy Selection policy.
> > * @param convergenceListeners Collection of user-defined listeners.
> > */
> >---CUT---
> >[Note the absence of "The" and the presence of a final "period".]
> >
> -- Are we following this standard in the commons-math project?

Yes.
Not everybody follows strict rules, and certainly not all "Commons"
components have the same rules (unfortunately), but since it was
decided that this module belongs in "Commons Math", I'd like to
minimize inconsistent coding styles (converging on what is used in
non-"legacy" source files).

> This would
> initiate changes in almost all classes.

Do you mean "all classes" in the GA module?
Then so be it. ;-)

>
> >A blank line is welcome to separate ideas ("logical" blocks of code)
> >However, there should not be an empty line after a closing brace if
> >it is followed by another closing brace.
> >Also, in all recent codes, there is no blank line between the instance
> >fields; the (mandatory) Javadoc is enough to logically (and visually)
> >separate the fields.
> -- I have rectified this.

Thanks!

> [...]
>
> >* Are annotations (@SafeVarargs, ...) necessary?  Please document.
> -- This annotation is necessary for any parameterized vararg. This is also
> used in legacy classes like o.a.c.m.l.a.i.FieldHermiteInterpolator and
> o.a.c.m.l.o.n.RungeKuttaFieldStepInterpolator.

Hmm, another point to discuss later.

>
> >In "AdaptiveGeneticAlgorithm":
> >* There should be a single constructor (same remark as above).
> -- Removed the constructor with default argument.
>
> >* Why the use of reflection ("isAssignableFrom")?
> -- Replaced it by instanceof.

Marginally better ;-) but it still does not say why the statistics are
disabled depending on the operator type...

Regards,
Gilles

>> [...]




Re: New component proposal: commons-plugins

2022-04-07 Thread Gilles Sadowski
On Thu, 7 Apr 2022 at 14:34, Gary Gregory wrote:
>
> A slight tangent: a smaller simpler component idea? The log4j variable
> interpolation, the ${lookup:variable} type of logic is in many places:
> deprecated in Commons Lang, now in Commons Text, also implemented in
> Commons Configuration. We could bring in the Log4j version, now safer than
> other implementations into Commons Text or a new component and everyone
> depends on this new version.

+1

TBD in another thread (name, etc.) ?

Regards,
Gilles




Re: [Math] Review of "genetic algorithm" module

2022-04-03 Thread Gilles Sadowski
Hello.

On Tue, 29 Mar 2022 at 17:08, Avijit Basak wrote:
>
> Hi All
>
>  Please find my comments below.
>
> [...]
>
> --I have made the changes and created a new PR. Kindly review the same and
> share your thoughts.
> https://github.com/apache/commons-math/pull/208

I've merged PR #208 into the feature branch (please open a
new one for changes entailed by the comments below).
I again had to delete the branch (and recreate it with the merged
changes from PR #208).  [I must be missing something about the
correct git workflow...]

There seems to be something wrong in the "examples-ga-tsp"
application (fitness does not change).

At the end of the run, one should be able to quickly assess the
goodness of the solution; the new code prints a line with many
"Node [...]" elements while the "--legacy" switch prints the "best"
fitness and a list of indices.  In either case, the solution should
consist of the list of visited cities (one per line) and the total
distance.

I can't seem to find how the logger is configured.  Currently, all
"INFO" messages are logged to the "standard error" console; one
should be able to e.g. redirect output to a file, or set the log level.

There is still a mix between library code and application code (but
this is to be discussed in MATH-1643).

From browsing the library code, I'm tempted to believe that the
dependency towards a logging framework is not necessary (or
underused).  I think that such a feature could be left to the application
layer (per the "ConvergenceListener" registry).
Likewise, the "PopulationStatisticsLogger" is not general enough to
be worth being part of the library.
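
A minimal sketch of that separation, assuming a listener callback is
all that the library exposes.  All names are hypothetical stand-ins
("ProgressListener" for the module's listener type, "ToyOptimizer" for
the algorithm); only the application class refers to a logging API
(here "java.util.logging").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical callback, standing in for the module's listener type. */
interface ProgressListener {
    void onGeneration(int generation, double bestFitness);
}

/** Library side: reports progress but knows nothing about logging. */
final class ToyOptimizer {
    private final List<ProgressListener> listeners;

    ToyOptimizer(List<ProgressListener> listeners) {
        this.listeners = new ArrayList<>(listeners);
    }

    /** Dummy loop: the "best fitness" is simply halved at each generation. */
    double run(int generations) {
        double best = 1;
        for (int g = 1; g <= generations; g++) {
            best /= 2;
            for (ProgressListener listener : listeners) {
                listener.onGeneration(g, best);
            }
        }
        return best;
    }
}

/** Application side: decides whether, where and how messages are logged. */
final class LoggingApp {
    public static void main(String[] args) {
        final Logger logger = Logger.getLogger("ga.example");
        final ProgressListener log =
            (g, f) -> logger.info(() -> "generation=" + g + " best=" + f);
        new ToyOptimizer(List.of(log)).run(3);
    }
}
```

With this layout, redirecting the output to a file or changing the log
level is entirely a matter of configuring the application's logger.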

A few (nit-pick) remarks about code style in general.
Javadoc is incomplete: All methods must be documented.
Please avoid redundant links like e.g.
---CUT---
/**
 * @param crossoverPolicy      The {@link CrossoverPolicy}
 * @param mutationPolicy       The {@link MutationPolicy}
 * @param selectionPolicy      The {@link SelectionPolicy}
 * @param convergenceListeners An optional collection of
 *                             {@link ConvergenceListener} with variable arity
 */
@SafeVarargs
protected AbstractGeneticAlgorithm(final CrossoverPolicy crossoverPolicy,
                                   final MutationPolicy mutationPolicy,
                                   final SelectionPolicy selectionPolicy,
                                   ConvergenceListener... convergenceListeners) {
    this.crossoverPolicy = crossoverPolicy;
    this.mutationPolicy = mutationPolicy;
    this.selectionPolicy = selectionPolicy;
    updateListenerRigistry(convergenceListeners);
}
---CUT---
Readers of the HTML-generated doc can already click on the various
arguments within the signature; so there is no need to add visual noise
in the source code just to be able to click from within the Javadoc part
just above that signature.
The Javadoc block above should be
---CUT---
/**
 * @param crossoverPolicy Crossover policy.
 * @param mutationPolicy Mutation policy.
 * @param selectionPolicy Selection policy.
 * @param convergenceListeners Collection of user-defined listeners.
 */
---CUT---
[Note the absence of "The" and the presence of a final "period".]

A blank line is welcome to separate ideas ("logical" blocks of code).
However, there should not be an empty line after a closing brace if
it is followed by another closing brace.
Also, in all recent codes, there is no blank line between the instance
fields; the (mandatory) Javadoc is enough to logically (and visually)
separate the fields.

In "AbstractGeneticAlgorithm":
* There should be a single constructor (handling default values should
   be left to the application layer).  [This would allow the removal of
   "updateListenerRigistry" method (note: There is a typo in that name).]
* Are annotations (@SafeVarargs, ...) necessary?  Please document.

In "AdaptiveGeneticAlgorithm":
* There should be a single constructor (same remark as above).
* Why the use of reflection ("isAssignableFrom")?

Regards,
Gilles




Re: [ALL] components still using Travis

2022-03-29 Thread Gilles Sadowski
Hello.

On Tue, 29 Mar 2022 at 15:23, Alex Herbert wrote:
>
> [...]
>
> The feature I liked from Travis was the integration of coverage reports
> from coveralls. This would red light a PR in the main page if coverage had
> dropped. There is no coveralls GHA. I installed the Codecov action for
> [Collections] as a test a few months ago and it has been sending coverage
> emails to our dev list since. You can click through and scroll the report.
> I do not think it red lights a PR for low coverage so you have to read the
> report. I am undecided if I prefer the report. No-one else seems to have
> commented either way. But I think it important that any PR has automated
> checks that the new code is executed.

IMO, it would be a regression if the move to GH has removed that feature.

Gilles

>
> [...]

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [ALL] components still using Travis

2022-03-29 Thread Gilles Sadowski
Hello.

On Tue, 29 Mar 2022 at 13:22, sebb wrote:
>
> It looks like there is a general move to switch from Travis to GitHub Actions.

AFAIK, there has never been a clear explanation for that move.

> AFAICT the following components are still using Travis:
>
> geometry
> jelly
> jxpath
> math
> numbers
> rng
> weaver

statistics

>
> Do we need to move these as well?

For consistency, or ease of maintenance, provided it is indeed
the case that everything can be configured from ".asf.yaml" (i.e.
no "login in on GH" required).

Regards,
Gilles

>
> BTW, emails from GHA runs can now be directed to project mailing
> lists, which is great (*)
> See: https://s.apache.org/asfyaml-gha
>
> e.g. update .asf.yaml to include:
> notifications:
> ...
>   jobs: notificati...@commons.apache.org
>
> Sebb.
> (*) Travis always had this, but recently switched to a new email
> system which means all such mails have to be moderated.
>




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-03-28 Thread Gilles Sadowski
Hello.

On Mon, 28 Mar 2022 at 00:32, Sumanth Rajkumar wrote:
>
> Thanks Alex and Gilles for your feedback
>
> So currently Commons Math transform depends on Common complex numbers, and
> the API involves usage of Complex Object Arrays instead of primitive array
> data structures
>
> I also briefly looked into other library implementations besides Jtransform
> and EJML that are not pure java but have java bindings such as JBLAS[1] and
> NAG[2]
> All of the implementation use single array data structures to represent
> Complex Lists and higher dimensional matrices
>
> Since these involve parallel data pipelines I looked into libraries that
> use SIMD [3] operations that use GP GPU (jcuda [4][5] /aparapi [6]) and CPU
> (Java 17 Vector API [7])

Thanks for the investigation!

>
> Given all the alternative implementations, I agree it does not make sense
> to re implement transforms here.

Some transforms are (already) implemented here.
Of course, it makes sense to wonder whether to keep maintaining those
codes, or rely on external dependencies.  The decision would depend on
performance comparisons and whether users are able (or allowed) to
interface with native libraries.  [I know of one project where "pure Java"
was a requirement...]

>
> So instead would it be useful to provide users with
> 1) a complex numbers Linear Algebra and transforms API to compile against
> and run with any of existing providers (apache commons, jtransform, EJML,
> jblas)
> AND
> 2) a service provider interface to allow adapter implementations to
> integrate existing and future providers such as jcudas/aparapi/vector APIs
>
> Do the commons library modularization dependency requirements apply to
> compile time dependencies only or runtime also?

Which "dependency requirements" are you referring to?

> To minimize bloat, the runtime dependencies could be made optional and need
> not be transitively included by default

Flexibility would be ideal, indeed.

>
> Providing a Complex linear algebra and transforms API that can run with
> different runtime providers would allow users to take advantage of hardware
> capabilities and gracefully fallback to reference implementations
> It could allow users to take advantage of Java 17 Vector APIs when
> available without refactoring their existing libraries

That looks great.
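
For the record, a minimal sketch of how such a provider lookup with a
fallback could work, using the JDK's "ServiceLoader".  The interface
"ComplexArrayProvider" and both classes are hypothetical (nothing like
them exists in [Numbers]); complex values are assumed to be stored as
interleaved (real, imaginary) pairs in a primitive array.

```java
import java.util.ServiceLoader;

/** Hypothetical SPI: element-wise product of interleaved complex arrays. */
interface ComplexArrayProvider {
    double[] multiply(double[] a, double[] b);
}

/** Reference implementation in pure Java, used when nothing else is found. */
final class PureJavaProvider implements ComplexArrayProvider {
    @Override
    public double[] multiply(double[] a, double[] b) {
        final double[] out = new double[a.length];
        for (int i = 0; i < a.length; i += 2) {
            // (x + iy)(u + iv) = (xu - yv) + i(xv + yu)
            out[i] = a[i] * b[i] - a[i + 1] * b[i + 1];
            out[i + 1] = a[i] * b[i + 1] + a[i + 1] * b[i];
        }
        return out;
    }
}

/** Hypothetical entry point: optional providers first, fallback otherwise. */
final class Providers {
    private Providers() {}

    static ComplexArrayProvider load() {
        // Finds implementations registered on the class path, if any.
        for (ComplexArrayProvider p : ServiceLoader.load(ComplexArrayProvider.class)) {
            return p;
        }
        return new PureJavaProvider();
    }
}
```

An adapter for a native or GPU library would then be an optional
artifact that registers its own implementation; user code compiled
against the interface would not change.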

>
> Also do the Apache projects/ license allow for integration with non Apache
> software (jtransform /jblas do not use Apache license, jcuda uses MIT
> license) ?

Licence issues are detailed here:
  https://www.apache.org/legal/resolved.html

Best regards,
Gilles

>
> Thanks
> Sumanth
>
> [1] http://jblas.org/
>
> [2] https://www.nag.com/numeric/nl/nagdoc_latest/clhtml/c06/c06conts.html
>
> [3] https://blogs.oracle.com/javamagazine/post/programming-the-gpu-in-java
>
> [4] http://jcuda.org/jcuda/jcufft/JCufft.html
>
> [5]
> https://github.com/jcuda/jcuda-samples/tree/master/JCudaSamples/src/main/java/jcuda/jcufft/samples
>
> [6] http://aparapi.github.io/
>
> [7] https://openjdk.java.net/jeps/414
>
>
> On Tue, 22 Mar 2022 at 10:07, Gilles Sadowski  wrote:
>
> > Hello.
> >
> > > [...]
> > > >
> > > > Are we expecting complex-numbers to be an efficient pure java library
> > that
> > > > could be used by other java libraries such as commons-imaging for data
> > > > compression (DCT /JPEG lossy compression)?
> > > >
> > >
> > > Numbers should be seen as a toolbox to be used by other Java
> > applications.
> > > The best location for routines is something to discuss on the mailing
> > list.
> > > In the example of DCT, I am not aware if imaging currently has an encoder
> > > implementation for this. There is a decoder:
> > > org/apache/commons/imaging/formats/jpeg/decoder/Dct.jav
> >
> > Also:
> >
> > https://gitbox.apache.org/repos/asf?p=commons-math.git;a=blob;f=commons-math-transform/src/main/java/org/apache/commons/math4/transform/FastCosineTransform.java
> >
> > It would be a maintenance boon if "Commons" could come up with
> > a consensus about which components must be dependency-free and
> > which could depend on other (lower-level) "Commons" components.
> >
> > [Imaging] is clearly higher-level than [Math], and such non-obvious
> > algorithms should be maintained in a single place.  Through the process
> > of modularizing [Math], we have "commons-math-transform" module,
> > with zero dependency, so it would bring zero bloat to [Imaging] users if
> > we consolidate usage.
> >
> > Of course, that would imply testing and benchmarking all current
> > implementations, and retain the best (taking various axes into account:
> > performance, robustness, flexibility).
> >
> > Regards,
> > Gilles
> >
> > > [...]




Re: [Math] Review of "genetic algorithm" module

2022-03-28 Thread Gilles Sadowski
Hello.

On Mon, 28 Mar 2022 at 10:15, Avijit Basak wrote:
>
> [...]
>
> >The various "Standalone" classes also look quite similar; consolidating the
> >"examples-ga" module (including full Javadoc) is necessary.
> -- Could you please elaborate? IMHO, as the StandAlone classes are
> dedicated to their specific module only, they should remain separate. Since we
> have used a single domain to show the utility of the different
> types (adaptive/simple) of GA, some classes have become similar.
>
> >I still don't
> >understand why there are "...-legacy" modules in module "examples-ga".
> >If you want to offer the option of running the "old" implementation, you
> >could add a "legacy" flag (as "@Option" in the "Standalone" application).
> -- There was a discussion on this some time back. The sole purpose of
> keeping the legacy example module is for comparison with the new
> implementation. It will be easier for anyone to visualize the quality
> improvement we achieved here. I don't want to mix this (via a legacy flag)
> in any way with the new implementation.
>

Just quickly commenting on this point.

IIUC, your purpose is for users to be able to run (an example
application of) the old implementation.

This can be achieved by having all the "legacy" codes within
module
  commons-math-examples/examples-ga/examples-ga-math-functions
(note: No "legacy" in the module's name), within a dedicated
  o.a.c.m.examples.ga.mathfunctions.legacy
package.

This code is then called by the exact same code/application as
for the new implementation (with the corresponding command
line switch):
  $ java -jar examples-ga-app.jar --legacy ... rest of the args ...

Users can thus perform 2 runs, one with "--legacy" and one
without it, and reach some conclusions.

The duplicate codes only bring maintenance burden (to ensure
that the "legacy" and non-"legacy" modules do indeed aim at
solving the same problem).
Whenever we then decide that the new code has been thoroughly
tested, removal of the
  o.a.c.m.examples.ga.mathfunctions.legacy
package will be a minimal change (as compared to the removal
of a module).
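
To make the suggestion concrete, here is a minimal sketch of such a
dispatch, using only the JDK (no CLI library).  All names below
("GaLauncherSketch", "runCurrent", "runLegacy") are hypothetical
stand-ins, not code from the "examples-ga" module; the actual
application would presumably declare the switch as an "@Option".

```java
import java.util.Arrays;

/** Hypothetical launcher: one application, two implementations selected by a flag. */
final class GaLauncherSketch {
    /** Stand-in for the new implementation (hypothetical). */
    static String runCurrent(String[] args) {
        return "current:" + String.join(",", args);
    }

    /** Stand-in for the code kept in the "legacy" package (hypothetical). */
    static String runLegacy(String[] args) {
        return "legacy:" + String.join(",", args);
    }

    /** Removes the "--legacy" switch and calls the matching implementation. */
    static String dispatch(String[] args) {
        final boolean legacy = Arrays.asList(args).contains("--legacy");
        final String[] rest = Arrays.stream(args)
            .filter(a -> !"--legacy".equals(a))
            .toArray(String[]::new);
        return legacy ? runLegacy(rest) : runCurrent(rest);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(args));
    }
}
```

Both runs share the same argument handling, so any difference in the
results comes from the implementation that was selected.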

Regards,
Gilles




[Math] Remove "NullArgumentException"

2022-03-25 Thread Gilles Sadowski
Hello.

Consistent with recent discussions (e.g. the new implementation of
"genetic algorithms"), we've informally agreed to stick with the
standard exceptions (defined by the JDK).
Case in point: "NullArgumentException"[1] should be removed from
the code base, in favour of "NullPointerException".[2]
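
As an illustration of what call sites could look like after the
removal, here is a minimal sketch relying on the JDK's
"Objects.requireNonNull".  The class and method names are
hypothetical; this is not code from the [Math] code base.

```java
import java.util.Objects;

/** Hypothetical example of a null check that uses only JDK exceptions. */
final class NullCheckSketch {
    /**
     * Sums the given values.
     *
     * @throws NullPointerException if "values" is null.
     */
    static double sum(double[] values) {
        // Throws NullPointerException with the message "values".
        Objects.requireNonNull(values, "values");
        double s = 0;
        for (final double v : values) {
            s += v;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(new double[] {1, 2, 3}));
        try {
            sum(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: " + e.getMessage());
        }
    }
}
```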

Regards,
Gilles

[1] 
https://gitbox.apache.org/repos/asf?p=commons-math.git;a=blob;f=commons-math-legacy-exception/src/main/java/org/apache/commons/math4/legacy/exception/NullArgumentException.java;h=ec371bb20a7acc41692789029c9bfa424cf14fb9;hb=HEAD
[2] 
https://docs.oracle.com/javase/8/docs/api/java/lang/NullPointerException.html




Re: Re: [geometry] PointMap and PointSet

2022-03-23 Thread Gilles Sadowski
Hi.

On Wed, 23 Mar 2022 at 03:27, Matt Juntunen wrote:
>
> Gilles,
>
> > Say, for example, that "V" holds a single (floating-point) value.  We
> > insert entries
> >  map.put(x, 2);
> >  map.put(y, 8);
> > assuming that "x" and "y" are just barely different, according to the
> > chosen "precision context".  Then:
> > z = (x + y) / 2; // Pseudo-code.
> > m = map.get(z);
> > Does "m" equal "2" or "8", depending on whether "z" is (however
> > slightly) closer to either "x" or "y"?  Or is it "5" (interpolated value)?
>
> It would be either 2 or 8. In the current implementations the first
> matching entry is returned and since entries are typically searched
> low to high, the entry corresponding to the lower of the two keys
> would be returned. However, I do not consider this "lowest match wins"
> behavior to be part of the public API since it depends on the
> implementation details.

For sure, any functionality must start from some low-level data
structure with some prescribed behaviour.  Here, we assume that
the "mechanism" returns "2" or "8" (depending on the "details").

My point is rather that the "cloud of points" abstraction seems to
require a higher-level API (for which "PointMap" would, in turn,
be an "implementation detail" too).
Within that abstraction, querying the value at location between
"x" and "y" would return some interpolation (i.e. any user-defined
"combiner") of the data stored within a given "radius" of the
queried location.
This would make more sense (IMO) than the application having
to deal with a result ("2" or "8") that is implementation-dependent:
Such an additional API layer would allow the caller to specify the
"combiner" as, for example, "the average of the values"; the result
would then be unambiguously defined ("5").
[Obviously, when specifying a radius smaller than the "precision
context", the behaviour would be identical to a direct query to the
underlying "PointMap".]
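
A rough sketch of what I mean, reduced to 1D and to plain JDK classes:
a "TreeMap" stands in for the underlying "PointMap", and the names
"RadiusQuerySketch", "query" and "average" are hypothetical (nothing
here is part of the [Geometry] API).

```java
import java.util.Collection;
import java.util.OptionalDouble;
import java.util.TreeMap;
import java.util.function.ToDoubleFunction;

/** Hypothetical higher-level layer: combines the values stored near a location. */
final class RadiusQuerySketch {
    /** Stand-in for the low-level map (1D locations, one value each). */
    private final TreeMap<Double, Double> data = new TreeMap<>();

    void put(double location, double value) {
        data.put(location, value);
    }

    /**
     * Combines all values stored within "radius" of "location".
     * The result is empty if no point is stored in that range.
     */
    OptionalDouble query(double location,
                         double radius,
                         ToDoubleFunction<Collection<Double>> combiner) {
        final Collection<Double> inRange =
            data.subMap(location - radius, true, location + radius, true).values();
        return inRange.isEmpty() ?
            OptionalDouble.empty() :
            OptionalDouble.of(combiner.applyAsDouble(inRange));
    }

    /** Example of a user-defined combiner: the average of the values. */
    static double average(Collection<Double> values) {
        return values.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        final RadiusQuerySketch cloud = new RadiusQuerySketch();
        cloud.put(1.0, 2);
        cloud.put(1.001, 8);
        System.out.println(cloud.query(1.0005, 0.01, RadiusQuerySketch::average));
    }
}
```

With the two entries above, a query midway between them with a radius
that covers both yields the average of "2" and "8", whatever order the
underlying map searches its entries in.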

>
> > But is it the right behaviour in all cases, or should there be a
> > "replacement policy" (to apply whenever points are already stored
> > within the "precision context" neighbourhood)?
>
> This seems to me like additional logic that could be built on top of
> PointMap/Set, probably using the distance query methods in
> GEOMETRY-146.

Indeed; my question aimed at pointing to the importance of providing
such an API.

> Do you have a use case in mind here?

In 2D: create an image (i.e. rectangular regular grid) that represents
the (interpolated) data associated with the (scattered) points.

Another (unrelated to the above discussion) feature: Allow different
precision contexts in different regions of the space (cf. [1]).

Best,
Gilles

[1] https://en.wikipedia.org/wiki/Unstructured_grid

>
> Regards,
> Matt
>
> On Tue, Mar 22, 2022 at 1:05 PM Gilles Sadowski  wrote:
> >
> > On Tue, 22 Mar 2022 at 14:46, Matt Juntunen wrote:
> > >
> > > Hello,
> > >
> > > Unless there are any other comments on the PR, I'm going to plan on
> > > merging it into master within the next couple of days.
> > >
> >
> > Thanks for providing this new functionality.
> >
> > Do you envision that [Geometry] will also provide ways to manipulate
> > data stored in the map (the "V" in e.g. "PointMap")?
> >
> > Say, for example, that "V" holds a single (floating-point) value.  We
> > insert entries
> >   map.put(x, 2);
> >   map.put(y, 8);
> > assuming that "x" and "y" are just barely different, according to the
> > chosen "precision context".  Then:
> >   z = (x + y) / 2; // Pseudo-code.
> >   m = map.get(z);
> > Does "m" equal "2" or "8", depending on whether "z" is (however
> > slightly) closer to either "x" or "y"?  Or is it "5" (interpolated value)?
> >
> > This is related to the feature which I mentioned in GEOMETRY-146.
> > I get that the low-level data-structure cannot "make up" a value that
> > is not actually stored but it seems that the next step would be an API
> > that lets the user specify what it means to retrieve data from the map.
> >
> > Then, there is also
> >   map.put(z, 10);
> > Currently "10" will replace either the value at "x" or the value at "y".
> > But is it the right behaviour in all cases, or should there be a
> > "replacement policy" (to apply whenever points are already stored
> > within the "precision context" neighbourhood)?
> >
> > Does this make sense?
> >
> > Gilles




Re: Re: [geometry] PointMap and PointSet

2022-03-22 Thread Gilles Sadowski
On Tue, 22 Mar 2022 at 14:46, Matt Juntunen wrote:
>
> Hello,
>
> Unless there are any other comments on the PR, I'm going to plan on
> merging it into master within the next couple of days.
>

Thanks for providing this new functionality.

Do you envision that [Geometry] will also provide ways to manipulate
data stored in the map (the "V" in e.g. "PointMap")?

Say, for example, that "V" holds a single (floating-point) value.  We
insert entries
  map.put(x, 2);
  map.put(y, 8);
assuming that "x" and "y" are just barely different, according to the
chosen "precision context".  Then:
  z = (x + y) / 2; // Pseudo-code.
  m = map.get(z);
Does "m" equal "2" or "8", depending on whether "z" is (however
slightly) closer to either "x" or "y"?  Or is it "5" (interpolated value)?

This is related to the feature which I mentioned in GEOMETRY-146.
I get that the low-level data-structure cannot "make up" a value that
is not actually stored but it seems that the next step would be an API
that lets the user specify what it means to retrieve data from the map.

Then, there is also
  map.put(z, 10);
Currently "10" will replace either the value at "x" or the value at "y".
But is it the right behaviour in all cases, or should there be a
"replacement policy" (to apply whenever points are already stored
within the "precision context" neighbourhood)?
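
To illustrate what a "replacement policy" could look like, here is a
minimal 1D sketch with plain JDK classes.  A "TreeMap" whose comparator
treats keys closer than "eps" as equal is used as a crude stand-in for
a precision context (such a comparator is not transitive in general, so
this is for illustration only).  All names are hypothetical.

```java
import java.util.TreeMap;
import java.util.function.BinaryOperator;

/** Hypothetical map wrapper: a policy decides what "put" does on a neighbouring key. */
final class ReplacementPolicySketch {
    /** The new value replaces the stored one. */
    static final BinaryOperator<Double> REPLACE = (oldValue, newValue) -> newValue;
    /** The stored value is kept. */
    static final BinaryOperator<Double> KEEP = (oldValue, newValue) -> oldValue;
    /** The stored value becomes the mean of the old and new values. */
    static final BinaryOperator<Double> AVERAGE = (oldValue, newValue) -> 0.5 * (oldValue + newValue);

    private final TreeMap<Double, Double> data;
    private final BinaryOperator<Double> policy;

    ReplacementPolicySketch(double eps, BinaryOperator<Double> policy) {
        // Keys closer than "eps" compare as equal (stand-in for a precision context).
        this.data = new TreeMap<>((a, b) -> Math.abs(a - b) <= eps ? 0 : Double.compare(a, b));
        this.policy = policy;
    }

    /** Stores the value, or applies the policy if an equivalent key is present. */
    void put(double key, double value) {
        data.merge(key, value, policy);
    }

    /** Returns the stored value for an equivalent key, or null. */
    Double get(double key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        final ReplacementPolicySketch map = new ReplacementPolicySketch(1e-6, AVERAGE);
        map.put(1.0, 2);
        map.put(1.0 + 1e-9, 10);
        System.out.println(map.get(1.0));
    }
}
```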

Does this make sense?

Gilles




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-03-22 Thread Gilles Sadowski
Hello.

> [...]
> >
> > Are we expecting complex-numbers to be an efficient pure java library that
> > could be used by other java libraries such as commons-imaging for data
> > compression (DCT /JPEG lossy compression)?
> >
>
> Numbers should be seen as a toolbox to be used by other Java applications.
> The best location for routines is something to discuss on the mailing list.
> In the example of DCT, I am not aware if imaging currently has an encoder
> implementation for this. There is a decoder:
> org/apache/commons/imaging/formats/jpeg/decoder/Dct.jav

Also:
https://gitbox.apache.org/repos/asf?p=commons-math.git;a=blob;f=commons-math-transform/src/main/java/org/apache/commons/math4/transform/FastCosineTransform.java

It would be a maintenance boon if "Commons" could come up with
a consensus about which components must be dependency-free and
which could depend on other (lower-level) "Commons" components.

[Imaging] is clearly higher-level than [Math], and such non-obvious
algorithms should be maintained in a single place.  Through the process
of modularizing [Math], we have "commons-math-transform" module,
with zero dependency, so it would bring zero bloat to [Imaging] users if
we consolidate usage.

Of course, that would imply testing and benchmarking all current
implementations, and retain the best (taking various axes into account:
performance, robustness, flexibility).

Regards,
Gilles

> [...]




Re: [numbers][gsoc] GSoC 2022 - NUMBERS-186 Proposal

2022-03-21 Thread Gilles Sadowski
On Mon, 21 Mar 2022 at 23:12, Alex Herbert wrote:
>
> Hi,
>
> This lost the dev@commons in the to address. I am forwarding to the list to
> include the history.

From a quick read of the quoted messages below, I believe I must
point out that there is an FFT implementation in Commons Math.[1]
It could be construed as a (high priority) use case.  Thus, it should
be included in benchmarks and possibly adapted to work with the
proposed data-structure(s).

Regards,
Gilles

[1] 
https://gitbox.apache.org/repos/asf?p=commons-math.git;a=blob;f=commons-math-transform/src/main/java/org/apache/commons/math4/transform/FastFourierTransform.java

>
> On Sun, 20 Mar 2022 at 16:49, Sumanth Rajkumar 
> wrote:
>
> > Thanks for the feedback Alex!
> >
> > As suggested, I reviewed the JTransforms and ComplexUtils class in the
> > complex streams package.
> >
> > The existing complex utils class has methods to convert to and from Array
> > data structures (the forms used in JTransform) to Complex class.
> >
> > I can come up with a Java 8/Streams based API for implementing complex
> > FFT algorithms of the types in JTransforms and support various methods in
> > ComplexUtils
> > The streams based complex operations API should allow for decoupling the
> > backing data structures.
> > This should make it possible to use a single API to create a unit test
> > suite to benchmark/compare different backing data structures such as single
> > arrays, floats or even polar representations
> >
> > As part of the project, I could implement a subset of the FFT operations
> > in JTransform using the new streams based Complex Numbers API and
> > benchmark it against JTransform implementation
> >
> >
> > I understand that we are in the GSOC discussion phase. I am trying to
> > understand the background of this project and the requirements in order to
> > come up with my GSOC proposal
> >
> > Can you provide more information on the envisaged usage of
> > Commons-Numbers (especially Complex Numbers), its current usage/users and
> > the vision/roadmap for future enhancements
> >
> > Are we expecting complex-numbers to be an efficient pure java library that
> > could be used by other java libraries such as commons-imaging for data
> > compression (DCT /JPEG lossy compression)?
> >
> > Are there other Java/Non-Java (C/Python) libraries that provide similar
> > features that I can look into for design inspiration and also benchmark
> > Complex Numbers with?
> >
> > Thanks
> > Sumanth
> >
> >
> > On Tue, 15 Mar 2022 at 20:50, Alex Herbert 
> > wrote:
> >
> >> Hi Sumanth,
> >>
> >> These changes to use static methods with functional interfaces are an
> >> improvement. However I would advise that we consider the use cases for this
> >> functionality to ensure that any design does not prevent extension and also
> >> allows full flexibility to achieve various tasks.
> >>
> >> For example:
> >>
> >> - multiply all the complex numbers in one list with another
> >> - wrap an existing complex number data structure, for example the FFT
> >> result produced by JTransforms [1]
> >>
> >> This project originates from a previous enhancement request that was made
> >> to store a large set of complex numbers efficiently. The argument was that
> >> the 16 bytes to store 2 doubles is inflated by the object allocation to
> >> store a Complex, perhaps by even double the 16 bytes. The natural storage
> >> would be two arrays of doubles, but what about 1 linear array packed as
> >> real/imag for each number. This will be able to store half as many numbers
> >> but access to each will take advantage of efficient caching when
> >> reading/writing memory. The JTransforms library (and others) may have ideas
> >> for useful data structures.
> >>
> >> Unfortunately I cannot find if there was a Jira ticket for this or it is
> >> only in the mailing archives. I've added links to the GSoC ticket for the
> >> other tickets that mention complex number array utils and streams. However
> >> these do not have a use case. Perhaps an investigation of the functionality
> >> in the unreleased commons-number-complex-streams package would be the place
> >> to start. The original author of that package is not actively involved in
> >> the development any more.
> >>
> >> I should also point out the process for GSoC. It is outlined here [2]. In
> >> short the initial period is about understanding 

Re: [ALL] consider moving to a directory per release, rather than binaries and source

2022-03-16 Thread Gilles Sadowski
On Wed, 16 Mar 2022 at 19:00, Mark Thomas wrote:
>
> On 16/03/2022 17:53, sebb wrote:
> > As the subject says.
> >
> > We currently use separate directories for binaries and source, each of
> > which may contain multiple versions.
> >
> > This is a bit awkward to maintain compared with a directory per
> > release which would contain both binaries and source.
> >
> > I think we should consider moving to individual release directories.
> >
> > This would mean changes to various scripts etc, so would not be trivial.
> >
> > If we do decide to do so, it would make sense to try this on a
> > component that normally only has one current version on release.
> >
> > WDYT?
>
> I like the idea in general. It makes managing releases a little easier.
>
> However, there would be an impact on users that have scripted
> downloads. The change in the location will require changes to all of
> those scripts. Does the benefit (primarily for us) justify the cost of
> those changes (primarily for users)? This might be something to think
> about when we have a new major version.

What about
 * changing the location if it makes things simpler for us, and
 * writing a script that generates the old layout (symbolic links)
   to keep things simple for users.
?

Gilles




Re: [geometry] PointMap and PointSet

2022-03-16 Thread Gilles Sadowski
Hi.

On Wed, 16 Mar 2022 at 15:42, Matt Juntunen wrote:
>
> Hello,
>
> > I suggest to carefully consider whether to return a "SimpleEntry"
> (and prominently note that any sort of concurrent modification is
> a caller responsibility).
>
> I see what you mean and I think that would be a good idea. However,
> the sticking point is that the 1D implementations for both Euclidean
> and spherical space internally use the JDK's TreeMap class to store
> entries, due to its superior performance when compared to the
> AbstractBucketPointMap class used for other dimensions. TreeMap
> returns immutable Map.Entry instances from its entry-access methods
> (e.g., ceilingEntry, floorEntry),

The apidocs[1] state:
---CUT---
Map.Entry ceilingEntry(K key)
   Returns a key-value mapping associated with the least key greater
than or equal to the given key, or null if there is no such key.
---CUT---

> so there is not a straightforward
> way for us to implement the behavior you propose for these dimensions.

AFAICT, (im)mutability of the returned entry is not part of the
JDK-mandated API.
So, assuming that the behaviour is implementation-dependent,
it can be chosen to be different for different dimensions on the
basis of which behaviour is most "natural" for applications.

> The options I see are:
> 1. Have all returned entries be immutable (the current behavior).
> 2. Return specialized Map.Entry implementations for the 1D cases that
> call the "put" method on the underlying map when "setValue" is called.
>
> Option #2 seems less than ideal so unless there is another approach
> that I'm missing, I vote for #1.

I agree that the situation is somewhat unsatisfying.  But, as said, I'd
favour #1 only if there were an actual security promise.  Otherwise,
immutability is a false claim.
Unless I'm mistaken, calling "put" in order to update the "value" is
necessarily less performant than calling "setValue" (map search in
the former, no-op in the latter).
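
For reference, option #2 could be sketched as below, with plain JDK
classes.  The names "WriteThroughEntrySketch" and "writeThrough" are
hypothetical, and this is not the code of the PR; it only shows where
the extra map search occurs.

```java
import java.util.AbstractMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: an entry whose "setValue" forwards to the backing map. */
final class WriteThroughEntrySketch {
    /**
     * Wraps an entry obtained from "map" so that "setValue" also updates the map.
     *
     * @param map Backing map.
     * @param snapshot Entry returned by the map, e.g. by "ceilingEntry".
     * @return a mutable view of the mapping.
     */
    static <K, V> Map.Entry<K, V> writeThrough(Map<K, V> map, Map.Entry<K, V> snapshot) {
        return new AbstractMap.SimpleEntry<K, V>(snapshot) {
            @Override
            public V setValue(V value) {
                map.put(getKey(), value); // Requires a search in the map.
                return super.setValue(value); // Returns the previous value.
            }
        };
    }

    public static void main(String[] args) {
        final TreeMap<Double, String> map = new TreeMap<>();
        map.put(1.0, "a");
        // The entry returned by "ceilingEntry" is a snapshot: calling
        // "setValue" on it throws UnsupportedOperationException.
        final Map.Entry<Double, String> entry = writeThrough(map, map.ceilingEntry(0.5));
        entry.setValue("b");
        System.out.println(map.get(1.0));
    }
}
```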

Regards,
Gilles

[1] 
https://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html#ceilingEntry-K-

> Regards,
> Matt
>
>
> On Wed, Mar 16, 2022 at 9:48 AM Gilles Sadowski  wrote:
> >
> > Hi.
> >
> > On Wed, 16 Mar 2022 at 03:17, Matt Juntunen wrote:
> > >
> > > Hello,
> > >
> > > I've made the following changes to the PR:
> > > - removed the "resolveKey" method from PointMap
> > > - renamed PointMap.resolveEntry to PointMap.getEntry and
> > > PointSet.resolve to PointSet.get
> > > - added an entry on PointMap/PointSet to the user guide
> > > - addressed Github comments (thanks, Bruno!)
> > >
> > > I ran some performance tests regarding the immutable entry instance
> > > created in the PointMap.getEntry method and there seems to be no
> > > impact.
> > >
> > > > Furthermore, what is actually meant here by "immutable
> > > instance" (since the "value" could be mutable without the
> > > map being aware of the fact)?
> > >
> > > It is immutable in that the object reference used as the entry value
> > > cannot be changed. This reference could point to a mutable object.
> > > This is the same behavior as other Map implementations.
> >
> > I don't see that "reference immutability" is mandated by the
> > "Map" interface (see e.g. [1]).
> >
> > I've noted many times that I generally favour (true) immutability:
> > It makes much sense for "small" data-structures (e.g. for future
> > potential optimizations[2]).
> >
> > However, the "cloud of points" data-structure is at the opposite
> > of the spectrum from this POV:  It is intended to contain a large
> > number of points whose "key" should indeed be (truly) immutable
> > but whose value would likely need to be mutable for many actual
> > use cases.
> > If a "SimpleImmutableEntry" is returned, then in order to modify
> > the map's "value" contents, one has to (IIUC)
> >  * retrieve the entry,
> >  * create a new value,
> >  * call "put" (on the map)
> > rather than
> >  * retrieve the entry
> >  * call "setValue" (on the entry).
> > So we have a somewhat crippled API that does not bring any
> > advantage since reference immutability doesn't provide any
> > security to the map's user (any other caller who is being passed
> > the same map, is able to change its contents anyways).
> >
> > I suggest to carefully consider whether to return a "SimpleEntry"
> > (and prominently note that any sort of concurrent modification is
> > a caller responsibility).
> >
> > Regards,
> > Gilles
> >
> > [1] 
> > https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--
> > [2] 
> > https://cr.openjdk.java.net/~briangoetz/valhalla/sov/02-object-model.html
> >
> > >>> [...]




Re: [geometry] PointMap and PointSet

2022-03-16 Thread Gilles Sadowski
Hi.

On Wed, 16 Mar 2022 at 03:17, Matt Juntunen wrote:
>
> Hello,
>
> I've made the following changes to the PR:
> - removed the "resolveKey" method from PointMap
> - renamed PointMap.resolveEntry to PointMap.getEntry and
> PointSet.resolve to PointSet.get
> - added an entry on PointMap/PointSet to the user guide
> - addressed Github comments (thanks, Bruno!)
>
> I ran some performance tests regarding the immutable entry instance
> created in the PointMap.getEntry method and there seems to be no
> impact.
>
> > Furthermore, what is actually meant here by "immutable
> instance" (since the "value" could be mutable without the
> map being aware of the fact)?
>
> It is immutable in that the object reference used as the entry value
> cannot be changed. This reference could point to a mutable object.
> This is the same behavior as other Map implementations.

I don't see that "reference immutability" is mandated by the
"Map" interface (see e.g. [1]).

I've noted many times that I generally favour (true) immutability:
It makes much sense for "small" data-structures (e.g. for future
potential optimizations[2]).

However, the "cloud of points" data-structure is at the opposite
of the spectrum from this POV:  It is intended to contain a large
number of points whose "key" should indeed be (truly) immutable
but whose value would likely need to be mutable for many actual
use cases.
If a "SimpleImmutableEntry" is returned, then in order to modify
the map's "value" contents, one has to (IIUC)
 * retrieve the entry,
 * create a new value,
 * call "put" (on the map)
rather than
 * retrieve the entry
 * call "setValue" (on the entry).
So we have a somewhat crippled API that does not bring any
advantage since reference immutability doesn't provide any
security to the map's user (any other caller who is being passed
the same map, is able to change its contents anyways).

I suggest to carefully consider whether to return a "SimpleEntry"
(and prominently note that any sort of concurrent modification is
a caller responsibility).
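
The difference can be seen with the two entry classes that the JDK
provides in "AbstractMap".  The sketch below (hypothetical class name,
"double[]" chosen as an example of a mutable value) shows that an
"immutable" entry only fixes the reference, not the contents.

```java
import java.util.AbstractMap;
import java.util.Map;

/** Hypothetical demonstration of "reference immutability" of map entries. */
final class EntryMutabilitySketch {
    /** Returns whether the entry accepts a call to "setValue". */
    static boolean supportsSetValue(Map.Entry<String, double[]> entry) {
        try {
            entry.setValue(entry.getValue());
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        final Map.Entry<String, double[]> fixed =
            new AbstractMap.SimpleImmutableEntry<>("x", new double[] {2});
        final Map.Entry<String, double[]> open =
            new AbstractMap.SimpleEntry<>("x", new double[] {2});

        System.out.println(supportsSetValue(fixed)); // The reference cannot be replaced.
        System.out.println(supportsSetValue(open));  // The reference can be replaced.

        // Allowed in both cases: the object behind the reference is mutable.
        fixed.getValue()[0] = 8;
        System.out.println(fixed.getValue()[0]);
    }
}
```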

Regards,
Gilles

[1] https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#entrySet--
[2] https://cr.openjdk.java.net/~briangoetz/valhalla/sov/02-object-model.html

>>> [...]




Re: [geometry] PointMap and PointSet

2022-03-15 Thread Gilles Sadowski
Hi.

On Tue, 15 Mar 2022 at 00:47, Matt Juntunen wrote:
>
> Hello,
>
> > Do I understand correctly that the "resolveEntry" method which
> you added behaves as my above "getEntry"?
>
> Correct.
>
> > If so, the latter can
> replace both "resolve" methods, for a (c)leaner API.
>
> That would work. I would need to add a matching "get" method to
> PointSet to provide the same functionality there. One consideration
> here is that the "resolveEntry" method creates and returns an
> immutable Entry instance with each call. The "resolveKey" method
> avoids this.

I had missed that subtlety; but it entails the question of what
this functionality's intended usage is; e.g. would a user often
need to access the "key" but not the associated "value"?

Furthermore, what is actually meant here by "immutable
instance" (since the "value" could be mutable without the
map being aware of the fact)?

> I'm not sure if this will have an impact on performance.
> I'll try reducing the API as you suggest and include it in the PR if
> it doesn't make a difference in performance.
>
> Do you prefer the "get" verb over "resolve",

Yes (I'm missing what is being resolved; it's just something
being accessed).

Best,
Gilles

> e.g. "getEntry" vs "resolveEntry"?
>
> Regards,
> Matt
>
> On Mon, Mar 14, 2022 at 2:19 PM Gilles Sadowski  wrote:
> >
> > Hello.
> >
> > On Mon, 14 Mar 2022 at 16:19, Matt Juntunen wrote:
> > >
> > > Gilles,
> > >
> > > > it would be great to keep the tutorials/userguide in sync.
> > >
> > > Sounds good. I'll update the user guide in this PR.
> > >
> > > > I'm a little bit confused: Isn't it always the case that
> > >   getEntry(p).getKey()
> > > will return the originally inserted (i.e. "canonical") point (i.e. not "p")?
> > >
> > > Map does not contain a "getEntry" method. If it did, that would indeed
> > > be preferable.
> >
> > Do I understand correctly that the "resolveEntry" method which
> > you added behaves as my above "getEntry"?  If so, the latter can
> > replace both "resolve" methods, for a (c)leaner API.
> >
> > > > Unless I'm missing a standard use-case, the specialized methods
> > > "closestFirst" and "farthestFirst" don't seem useful (and wasteful
> > > of computing resources: If iterating over the whole set, why would
> > > one want to start from some particular point?).
> > >
> > > Could you post this comment on the JIRA issue and we can continue the
> > > discussion there?
> >
> > Done.
> >
> > Regards,
> > Gilles




Re: [All] New JIRA fields

2022-03-15 Thread Gilles Sadowski
Hello.

FTR: I've asked on the "d...@community.apache.org" ML[1]
whether it would be more appropriate to add those additional
fields to the COMDEV JIRA project, but no clear feedback
there either...

In the absence of a clarification or alternative suggestions,
I'll assume "lazy consensus", and ask INFRA to proceed[2]
for the "math-related" components of "Commons" i.e.
  Commons RNG
  Commons Numbers
  Commons Geometry
  Commons Statistics
  Commons Math

Please list ASAP any other Commons sub-project for which
those additional JIRA fields may be useful.

Regards,
Gilles

[1] https://lists.apache.org/thread/zdp9ln3gpdq6l5fpflx1j7v0hzv1s8o8
[2] https://issues.apache.org/jira/browse/INFRA-22893

On Sat, 19 Feb 2022 at 01:00, Gilles Sadowski wrote:
>
> On Fri, 18 Feb 2022 at 23:56, sebb wrote:
> >
> > Wrong list?
>
> No, why?  Answers would primarily serve to more precisely define the
> tasks in the Commons project's JIRA, even though the motivation
> is the info collected for the global list of tasks (across all projects).
>
> FTR:
> https://issues.apache.org/jira/browse/INFRA-22893
>
> >
> > On Fri, 18 Feb 2022 at 12:49, Gilles Sadowski  wrote:
> > >
> > > Hello.
> > >
> > > To help with automatic collection of tasks for GSoC[1] INFRA is
> > > willing to add fields to our JIRA template.
> > >
> > > Proposal:
> > > * (multi-valued) "potential-mentor" [DONE]
> > > * (enumeration) "difficulty-level"
> > >
> > > Please suggest which choices should be selectable as "difficulty-level".
> > >
> > > A "Skill Level" field was already available but I think that the
> > > current list does not provide useful hints to a newcomer (either
> > > a GCoC candidate or a would-be contributor) about what prior
> > > knowledge would be required to handle the task.
> > > IMHO, there should be a way to define (free entry?) fields akin
> > > to (perhaps with a better name):
> > > * "required-knowledge" (things that candidates can learn by
> > > themselves like how to build the project, etc.)
> > > * "nice-to-have-knowledge" (i.e. for GSoC something that could
> > > be acquired during the "bonding period")
> > >
> > > Regards,
> > > Gilles
> > >
> > > [1] 
> > > https://cwiki.apache.org/confluence/display/COMDEV/GSoC+2022+Ideas+list




Re: [geometry] PointMap and PointSet

2022-03-14 Thread Gilles Sadowski
Hello.

On Mon, 14 Mar 2022 at 16:19, Matt Juntunen wrote:
>
> Gilles,
>
> > it would be great to keep the tutorials/userguide in sync.
>
> Sounds good. I'll update the user guide in this PR.
>
> > I'm a little bit confused: Isn't it always the case that
>   getEntry(p).getKey()
> will return the originally inserted (i.e. "canonical") point (i.e. not "p")?
>
> Map does not contain a "getEntry" method. If it did, that would indeed
> be preferable.

Do I understand correctly that the "resolveEntry" method which
you added behaves as my above "getEntry"?  If so, the latter can
replace both "resolve" methods, for a (c)leaner API.
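
To show the behaviour I have in mind, here is a 1D sketch with plain
JDK classes: a single "getEntry" gives access to both the canonical key
and the value.  The "TreeMap" comparator that treats keys closer than
"eps" as equal is only a crude stand-in for a precision context (it is
not transitive in general), and all names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical 1D map: a single "getEntry" gives access to key and value. */
final class CanonicalKeySketch {
    private final TreeMap<Double, String> map;

    CanonicalKeySketch(double eps) {
        // Keys closer than "eps" compare as equal (stand-in for a precision context).
        map = new TreeMap<>((a, b) -> Math.abs(a - b) <= eps ? 0 : Double.compare(a, b));
    }

    void put(double key, String value) {
        map.put(key, value);
    }

    /** Returns the stored entry whose key is equivalent to "p", or null. */
    Map.Entry<Double, String> getEntry(double p) {
        final Map.Entry<Double, String> e = map.ceilingEntry(p);
        return e != null && map.comparator().compare(e.getKey(), p) == 0 ? e : null;
    }

    public static void main(String[] args) {
        final CanonicalKeySketch points = new CanonicalKeySketch(1e-6);
        points.put(1.0, "a");
        final double p = 1.0 + 1e-9;
        // The key of the entry is the originally inserted point, not "p".
        System.out.println(points.getEntry(p).getKey());
    }
}
```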

> > Unless I'm missing a standard use-case, the specialized methods
> "closestFirst" and "farthestFirst" don't seem useful (and wasteful
> of computing resources: If iterating over the whole set, why would
> one want to start from some particular point?).
>
> Could you post this comment on the JIRA issue and we can continue the
> discussion there?

Done.

Regards,
Gilles




Re: [numbers] GSoC 2022 - NUMBERS-186 Proposal

2022-03-13 Thread Gilles Sadowski
Hello.

On Sun, 13 Mar 2022 at 02:08, Sumanth Rajkumar wrote:
>
> SUBJECT: A proposal for Commons numbers  (NUMBERS-186)
>
> Hi Alex,
>
> I am Sumanth and new to Open source development. I would like to start by
> participating in GSOC 22.
>
> I came across your NUMBERS-186 GSOC mini project idea and took an initial
> stab at it

Thanks for your interest.

>
> If I understand the wish list correctly, we need to implement an efficient
> Complex List collection that supports all operations defined in the Complex
> class
>
> In my first attempt, I have implemented a ComplexList that is backed by two
> primitive double arrays for real and imaginary parts. The backing arrays
> grow as needed similar to the ArrayList implementation and
> supports all operations of the List interface
>
> I have added methods for operations from the Complex class with the
> following variations for each operation
>   a) Operation for a single complex number at given index
>   b) Operation for range of complex numbers with startIndex and length
>
> ComplexList applies all the operations in place modifying the backing
> arrays as per the requirement. I have also added support for an
> ImmutableComplexList that returns a copy of the List for each of the
> operations.

Preliminary remarks/questions:
* In this project, we avoid "import static".
* What is the purpose of having "protected" methods/fields?
* "final" should be used to declare every constant.

>
> My partial implementation (with TODOs for many operations) is available
> here.
> https://github.com/sumanth-rajkumar/commons-numbers/blob/sumanth-gsoc-22/commons-numbers-complex/src/main/java/org/apache/commons/numbers/complex/ComplexList.java

The many "TODO" methods should be left out to ease review.
[Better than having to wonder about code that looks (hopefully
temporarily) wrong.]

Unit tests should provide full coverage, also to simplify review.

> I would like to re-use the implementations from Complex class. For some of
> the simpler operations I exposed the private static methods from Complex
> class. However, in order to fully reuse all operations, the Complex class
>  will require more refactoring...

What do you suggest (provide one concrete example)?

>
> Am I on the right track?
> Will it be ok to refactor the Complex class to allow reuse of operation
> methods between Complex and ComplexList classes or should I just copy the
> implementations to List class?

No, code duplication should be avoided.

>
> I have also added a sample unit test for ComplexList similar to the
> existing ComplexTest.
>
> Based on feedback, I plan to complete all the TODO implementations &
> comments, add full unit test coverage and any other tasks required to raise
> a PR
>
> Further, if there is interest, I also plan to extend the ComplexList for
> higher dimensions (2D, 3D, 4D etc..)

This class could be useful:
  
https://commons.apache.org/proper/commons-numbers/commons-numbers-arrays/apidocs/org/apache/commons/numbers/arrays/MultidimensionalCounter.html

Alex will probably further comment on whether this is going in the
right direction.  In particular, we should look for a way to "apply"
the various complex functions to all the numbers in a list, without
repeating the loop "boiler-plate" code.
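
A minimal sketch of one way to do that, under assumptions: the names
"ComplexFunction", "ComplexSink" and "applyInPlace" are hypothetical
(they are not part of Commons Numbers), and the two backing arrays
follow the layout described in the proposal.  The loop is written once;
each operation only deals with a (real, imaginary) pair:

```java
/** Illustrative only: all names here are hypothetical, not the NUMBERS-186 API. */
public class ComplexListSketch {
    /** Receives the real and imaginary parts of a result. */
    @FunctionalInterface
    interface ComplexSink {
        void accept(double re, double im);
    }

    /** A complex function expressed on primitive parts. */
    @FunctionalInterface
    interface ComplexFunction {
        void apply(double re, double im, ComplexSink out);
    }

    private final double[] re;
    private final double[] im;

    ComplexListSketch(double[] re, double[] im) {
        this.re = re.clone();
        this.im = im.clone();
    }

    /** The only place where the loop over the backing arrays is written. */
    void applyInPlace(ComplexFunction f) {
        for (int i = 0; i < re.length; i++) {
            final int k = i;
            // A real implementation would reuse one sink instead of allocating per element.
            f.apply(re[i], im[i], (x, y) -> {
                re[k] = x;
                im[k] = y;
            });
        }
    }

    double real(int i) {
        return re[i];
    }

    double imag(int i) {
        return im[i];
    }

    /** Conjugate: (a, b) -> (a, -b). */
    static void conj(double a, double b, ComplexSink out) {
        out.accept(a, -b);
    }

    /** Square: (a + ib)^2 = (a^2 - b^2) + i(2ab). */
    static void square(double a, double b, ComplexSink out) {
        out.accept(a * a - b * b, 2 * a * b);
    }

    public static void main(String[] args) {
        final ComplexListSketch list =
            new ComplexListSketch(new double[] {1, 3}, new double[] {2, -4});
        list.applyInPlace(ComplexListSketch::conj);
        list.applyInPlace(ComplexListSketch::square);
        System.out.println(list.real(0) + " " + list.imag(0));
    }
}
```

Whether the functions should write through a sink, as assumed here, or
return a value is exactly the kind of design choice to be discussed.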

Regards,
Gilles

>
> -Sumanth
>
> [1] https://issues.apache.org/jira/browse/NUMBERS-186
> [2]
> https://markmail.org/message/n4zpcxh7d7knq5tb?q=NUMBERS-186+list:org%2Eapache%2Ecommons%2Edev/
>
>
> I have posted two ideas for GSoC mini projects under:
> >
> > https://issues.apache.org/jira/browse/STATISTICS-54
> > https://issues.apache.org/jira/browse/NUMBERS-186
> >
> > Alex
> >




Re: [geometry] PointMap and PointSet

2022-03-13 Thread Gilles Sadowski
Hello Matt.

On Sun, 13 Mar 2022 at 15:41, Matt Juntunen wrote:
>
> Hello,
>
> > Is there a gentle introduction to how it works and/or the intended
> use cases?
>
> Not specifically. The implementations are used the same way as JDK
> Maps and Sets so usage should be very familiar. As far as the internal
> implementation details, I've tried to describe that in the javadocs
> for the implementing classes.
>
> One example use case is construction of meshes from a stream of
> triangles. This is used internally in
> o.a.c.geometry.euclidean.threed.mesh.SimpleTriangleMesh. Another use
> case is finding unique entries in a cloud of points, where many points
> are close but not exactly equal to each other. This case was actually
> posted on the user mailing list (I believe) way back when I started
> implementing this feature.

I know; but as the code base provides more and more functionality
(thank you!) it would be great to keep the tutorials/userguide in sync.
A simple "How to..." is often enough (and faster than browsing the
Javadoc) in order to get at the most common usage.

>
> > Does it entail issues about some use cases or applications that
> need this functionality?  Or do they not generally care about that
> contract?
> If so, maybe this collection shouldn't implement the standard JDK
> interfaces (?).
>
> No, there shouldn't be any issues. java.util.TreeMap documents that
> its behavior is well-defined and consistent even when a Comparator
> that doesn't match equals is given, such as
> String.CASE_INSENSITIVE_ORDER. This is the same sort of situation. The
> map/set is still quite useful even without the strict contract.
>
> > Where does the anticipation come from?
>
> The approach I used for helping to maintain somewhat balanced trees in
> Euclidean 2D and 3D and spherical 2D regardless of insertion order is
> not based on a well-known algorithm or paper since I was unable to
> find one. The literature on the subject seems to focus on situations
> where the inserted points are all known beforehand and can be inserted
> in a particular order. I did not want to enforce this condition on the
> API. What I ended up with is just an idea I had for tree balancing
> that seems to work pretty well. As such, I fully expect that there
> will be a better option discovered later on.

IMHO, the above two Q & A are worth mentioning in the userguide.
The second especially may attract the attention of some user who could
provide the missing info.  [Of course, it should also appear at the
relevant places in the Javadoc.]

>
> > I don't quite follow; which are the corresponding "non-canonical"
> accessors?
>
> My thought here is that there will be situations where a set of points
> is placed into a map/set and then these points are queried using
> values determined from some other source, such as through computations
> of some sort.

Indeed.

> These query points may vary from the originally inserted
> points by distances allowed by the Precision.DoubleEquivalence. In
> these cases, it's useful to be able to obtain the exact value of the
> originally inserted (i.e. "canonical") point. This is the purpose of
> the "resolve" methods.

I'm a little bit confused: Isn't it always the case that
  getEntry(p).getKey()
will return the originally inserted (i.e. "canonical") point (i.e. not "p")?

Anyways, I'd suggest that this be illustrated in the userguide (linked
to a working application in "commons-geometry-examples").
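
For illustration only, the "canonical key" behaviour can be reproduced
with a JDK "TreeMap" and a tolerance-based comparator.  This is a sketch
under assumptions: the names "EPS", "FUZZY" and "getEntry" are
hypothetical, it is not the PointMap implementation, and such a
comparator is not strictly transitive:

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

public class CanonicalKeyDemo {
    /** Hypothetical tolerance; not a value taken from the library. */
    static final double EPS = 1e-9;

    /** Treats values closer than EPS as equal (so it is inconsistent with equals). */
    static final Comparator<Double> FUZZY =
        (a, b) -> Math.abs(a - b) <= EPS ? 0 : Double.compare(a, b);

    /** Returns the stored entry whose key is equivalent to "query", or null. */
    static <V> Map.Entry<Double, V> getEntry(TreeMap<Double, V> map, double query) {
        final Map.Entry<Double, V> e = map.floorEntry(query);
        return e != null && FUZZY.compare(e.getKey(), query) == 0 ? e : null;
    }

    public static void main(String[] args) {
        final TreeMap<Double, String> map = new TreeMap<>(FUZZY);
        map.put(1.0, "a");
        // Query value computed elsewhere; not exactly the inserted key.
        final double p = 1.0 + 1e-12;
        System.out.println(map.get(p));                // prints "a"
        System.out.println(getEntry(map, p).getKey()); // prints "1.0" (the inserted key, not p)
    }
}
```

The value is found through "p", but the key of the returned entry is the
originally inserted one.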

>
> > Is there a notion of neighbours (as in: return the "n" entries that
> are closest to a given point)?
>
> I am picturing that functionality being implemented in a follow-up issue. [1]

Thanks.
However, my impression is that the API should be more general:
---CUT---
public Iterable closestInRange(P point, double radius);
---CUT---

Unless I'm missing a standard use-case, the specialized methods
"closestFirst" and "farthestFirst" don't seem useful (and wasteful
of computing resources: If iterating over the whole set, why would
one want to start from some particular point?).
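
To make the intended semantics concrete, here is a brute-force sketch
(hypothetical names; points are plain "double[]" and the distance is
Euclidean, both assumptions made only to keep the example
self-contained).  An actual implementation would presumably query the
tree rather than scan every point:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class RangeQuerySketch {
    /** Euclidean distance between two points of the same dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            final double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Returns the points within "radius" of "center", closest first. */
    static List<double[]> closestInRange(List<double[]> points, double[] center, double radius) {
        final List<double[]> result = new ArrayList<>();
        for (double[] p : points) {
            if (distance(p, center) <= radius) {
                result.add(p);
            }
        }
        result.sort(Comparator.comparingDouble((double[] p) -> distance(p, center)));
        return result;
    }

    public static void main(String[] args) {
        final List<double[]> points = new ArrayList<>();
        points.add(new double[] {3, 0});
        points.add(new double[] {10, 0});
        points.add(new double[] {1, 0});
        for (double[] p : closestInRange(points, new double[] {0, 0}, 5)) {
            System.out.println(Arrays.toString(p));
        }
    }
}
```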

Regards,
Gilles

>
> Regards,
> Matt
>
> [1] https://issues.apache.org/jira/browse/GEOMETRY-146
>
> > [...]




Re: [Math] Review of "genetic algorithm" module

2022-03-12 Thread Gilles Sadowski
Hello.

On Mon, 28 Feb 2022 at 07:11, Avijit Basak wrote:
>
> Hi All
>
> Please see my comments below.
>
> > [...]
> >I just had a very quick look.
> >IIUC, you always provide "convenience" methods (e.g. the various
> >signatures for the "evolve" functionality).
> >Prior to merging into "master", we should simplify and limit the
> >discussion to the core functionality, i.e. not try and make decisions
> >for the user (like default values, ...).  Please keep the API as simple
> >as possible
> -- I have removed the mentioned evolve method.
> However, I had to catch two checked exceptions (InterruptedException,
> ExecutionException) and rethrow them. As of now I have handled them using
> the GeneticIllegalArgumentException. I think we need to introduce another
> exception class to handle this. Please share your thought regarding this.

I don't think that it's the right way to go; instantiating an "ExecutorService"
belongs to the GA application, not the GA library (whose relevant classes
need "only" be thread-safe).
There is some misunderstanding to be clarified in a dedicated discussion
(please file a new JIRA ticket).
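
A sketch of what "belongs to the application" could look like, assuming
a thread-safe library call (the "fitness" method below is a hypothetical
stand-in, not a class of the GA module).  The application owns the
"ExecutorService" and deals with the checked exceptions itself, so the
library needs no exception type for them:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ApplicationLevelExecutor {
    /** Hypothetical stand-in for a thread-safe (stateless) library computation. */
    static double fitness(double[] genes) {
        double sum = 0;
        for (double g : genes) {
            sum += g * g;
        }
        return sum;
    }

    /** Application code: creates the pool, submits the work, handles failures. */
    static double[] evaluateAll(List<double[]> population, int nThreads) {
        final ExecutorService executor = Executors.newFixedThreadPool(nThreads);
        try {
            final List<Future<Double>> futures = new ArrayList<>();
            for (double[] genes : population) {
                final Callable<Double> task = () -> fitness(genes);
                futures.add(executor.submit(task));
            }
            final double[] result = new double[futures.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = futures.get(i).get();
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status.
            throw new IllegalStateException("Interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Fitness computation failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        final List<double[]> population = new ArrayList<>();
        population.add(new double[] {1, 2});
        population.add(new double[] {0});
        for (double f : evaluateAll(population, 2)) {
            System.out.println(f);
        }
    }
}
```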

Side note: Conflicts and duplicate commits have accumulated in the
dedicated "feature__MATH-1563__genetic_algorithm" branch.
I did not know how to proceed in order to avoid ending up with a messy
history in "master"; so I created a new branch (with the same name) with
all the new GA-related files added as a single commit.

Currently, this branch (based on your PR #205) fails the default goal,
because of a CheckStyle issue.  You should always check locally that
running "mvn" without arguments does not generate any errors.

I also noticed that classes in "examples-ga" use "forbidden" library
classes: "GeneticIllegalArgumentException" is an "internal" class; we
must not advertise such classes in the example applications.
In general, it seems that "examples-ga" contains several classes and
methods that do not need to be "public".  This is especially true for
classes like "MathFunction" and "Coordinate".  [Having those "private"
helps users to tell what is part of the library's functionality from what is
just "dummy" placeholder code.]

Finally (for now), I've just noticed that there exist several classes named
"MathFunction", with the same implementation!
Code duplication must be avoided, especially where we purport to display
best practices.
The various "Standalone" classes also look quite similar; consolidating the
"examples-ga" module (including full Javadoc) is necessary.  I still don't
understand why there are "...-legacy" modules in module "examples-ga".
If you want to offer the option of running the "old" implementation, you
could add a "legacy" flag (as "@Option" in the "Standalone" application).

Please use the new branch for all these ("cleanup") changes, as the basis
of a PR (with a *single* commit).  Thanks.

Regards,
Gilles




Re: [geometry] PointMap and PointSet

2022-03-12 Thread Gilles Sadowski
Hello.

On Fri, 11 Mar 2022 at 16:18, Matt Juntunen wrote:
>
> Hello,
>
> I recently posted a PR [1] for GEOMETRY-142 [2], which is for adding
> PointMap and PointSet implementations. These are Map and Set
> implementations specifically designed to use Points as keys.

Is there a gentle introduction to how it works and/or the intended
use cases?

> They
> support fuzzy key comparison, meaning that points do not have to be
> exactly equal to each other in order to be considered equal by the
> map/set. (Note that this means these types do not follow the strict
> Map/Set contracts since they are not consistent with equals. This is
> documented in the PointMap/PointSet javadocs.)

Does it entail issues about some use cases or applications that
need this functionality?  Or do they not generally care about that
contract?
If so, maybe this collection shouldn't implement the standard JDK
interfaces (?).

> I've completely hidden
> the implementation details from the public API

Thanks.

> since I anticipate
> changes in the future with regard to the algorithms used.

Where does the anticipation come from?

> Instances
> are created through factory classes in each space. Ex:
>
> PointMap map = EuclideanCollections.pointMap3D(precision);
> PointSet set = SphericalCollections.pointSet2S(precision);
>
> Since fuzzy key comparison is used, I've added the following methods
> to the interfaces to allow access to the exact, "canonical" version of
> the key stored in the collection.
>
> PointMap  {
> // return the key corresponding to pt, or null if not found
> P resolveKey(P pt);
>
> // return the map entry corresponding to pt, or null if not found
> Map.Entry resolveEntry(P pt);
> }
>
> PointSet {
> // return the key corresponding to pt, or null if not found
> P resolve(P pt);
> }

I don't quite follow; which are the corresponding "non-canonical"
accessors?

>
> Reviews and comments are welcome.

Is there a notion of neighbours (as in: return the "n" entries that
are closest to a given point)?

Regards,
Gilles

>
> Regards,
> Matt Juntunen
>
>
> [1] https://github.com/apache/commons-geometry/pull/194
> [2] https://issues.apache.org/jira/projects/GEOMETRY/issues/GEOMETRY-142
>




Re: [All] GSoC 2022

2022-02-25 Thread Gilles Sadowski
On Fri, 25 Feb 2022 at 04:39, Matt Juntunen wrote:
>
> I just added a similar placeholder issue for geometry:

Thanks!
I've added GEOMETRY-144 to the list.

Regards,
Gilles

> [...]




Re: [All] GSoC 2022

2022-02-23 Thread Gilles Sadowski
Ping.

Nothing for "Geometry", "Statistics", ... (?)
;-)

Regards,
Gilles

On Wed, 9 Feb 2022 at 14:57, Gilles Sadowski wrote:
>
> Hi.
>
> >>> [...]
> > > >
> > > > Shall we open a "GSoC 2022" report in each concerned JIRA project?
> > >
> > > Yes. I think we just create some tickets and tag them with the
> > > appropriate tag (GSOC 2022 ?). There should be some left over from
> > > last time to repurpose or use as templates for new ones.
> >
> > Actually, I was thinking of creating one global "GSoC 2022" issue
> > in each component, that would list all the topics and a complete
> > description of their respective goal,
>
> Done for "Commons Math":
>https://issues.apache.org/jira/browse/MATH-1641
>
> Gilles




Re: [Math] Review of "genetic algorithm" module

2022-02-21 Thread Gilles Sadowski
Hello.

On Mon, 21 Feb 2022 at 06:56, Avijit Basak wrote:
>
> Hi All
>
> Please find my comments below:
>
> [...]
> >
> >Another misunderstanding (probably); we must figure out where
> >the parallelism will be implemented.
> >IIUC the current state of the code, optimizing multiple populations
> >in parallel would be the same as launching multiple JVMs; I'd want
> >to explore low-level parallelism (i.e. at the "Chromosome" level).
> -- I have implemented both multi-threading and multi-population parallelism.

I just had a very quick look.
IIUC, you always provide "convenience" methods (e.g. the various
signatures for the "evolve" functionality).
Prior to merging into "master", we should simplify and limit the
discussion to the core functionality, i.e. not try and make decisions
for the user (like default values, ...).  Please keep the API as simple
as possible.

Thanks,
Gilles

>>>> [...]




Re: [Math] Review of "genetic algorithm" module

2022-02-21 Thread Gilles Sadowski
Hello.

On Mon, 21 Feb 2022 at 06:56, Avijit Basak wrote:
>
> Hi All
>
> Please find my comments below:
>
> >The build fails because of CheckStyle errors:
> >https://app.travis-ci.com/github/apache/commons-math/builds/246683712
> --Fixed the issues

Could you please squash all the new commits?
[Of course, it's great to spell out the various changes, in the
"long" description of that commit.  Note that this grouping is
in contrast to what would be done in "master", where each
type of change should have its own commit.  Here the
intention is to keep the history clean until the agreed-on
code is merged to "master".]

Thanks,
Gilles

> [...]




Re: [Math] Review of "genetic algorithm" module

2022-02-19 Thread Gilles Sadowski
upport from the library,
and can be implemented at the application level.

> The code snippet below shows the body of the method which will be executed
> inside the task.
> --CUT--
>
> // selection
> ChromosomePair pair = getSelectionPolicy().select(current);
>
> // crossover
> if (randGen.nextDouble() < getCrossoverRate()) {
>     // apply crossover policy to create two offspring
>     pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond());
> }
>
> // mutation
> if (randGen.nextDouble() < getMutationRate()) {
>     // apply mutation policy to the chromosomes
>     pair = new ChromosomePair(
>         getMutationPolicy().mutate(pair.getFirst()),
>         getMutationPolicy().mutate(pair.getSecond()));
> }
>
> return pair;
>
> --CUT--

One of the issues with the above code is, again, that "randGen"
must be thread-safe (and it is not, usually).
Also, it doesn't say anything about how to ensure that
the fitness computation is thread-safe (and if you assume
that it will be computed outside that "task", then the
performance gain will be very low).
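
One possible way around the shared generator, sketched under
assumptions: each task gets its own generator, created sequentially by
the submitting thread.  The JDK's "SplittableRandom" is used here only
to keep the example self-contained; the class and method names are
hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.SplittableRandom;

public class PerTaskRandom {
    /** Runs "nTasks" tasks, each drawing one number from its own generator. */
    static List<Double> draw(int nTasks, long seed) {
        final SplittableRandom root = new SplittableRandom(seed);
        final List<Callable<Double>> tasks = new ArrayList<>();
        for (int i = 0; i < nTasks; i++) {
            // Split in the submitting thread: no generator is ever shared between tasks.
            final SplittableRandom rng = root.split();
            tasks.add(() -> rng.nextDouble());
        }
        final ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            final List<Double> out = new ArrayList<>();
            // "invokeAll" returns the futures in the same order as the tasks.
            for (Future<Double> f : executor.invokeAll(tasks)) {
                out.add(f.get());
            }
            return out;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(draw(4, 42L));
    }
}
```

A side benefit of this layout is that the results depend only on the
seed, not on how the threads happen to be scheduled.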

> >
> >I'd surmise that "multiple instances of AbstractGeneticAlgorithm"
> >is an application concern; unless I'm missing something, it's
> >not what I've in mind when talking about multi-threading.
> >Actually, I was wondering whether we could implement the
> >analog of what is in the "commons-math-neuralnet" module,
> >where
> >* "Neuron" is the counterpart of "Chromosome"
> >* "Network" is the counterpart of "Population".
> --"multiple instance of AbstractGeneticAlgorithm" is related to parallel GA
> with multiple populations not multi-threading.

Yes, as I also mentioned above.
But I'm interested in where multi-threading can be implemented to
be used in both cases (single population and multiple populations).

> Users can also implement parallel GA in a synchronous manner although that
> won't be a recommended way.
> Multi-threading is only a way to improve performance using a user's multi
> core CPU.
> The threads in the thread pool would only be used to execute the task as
> mentioned in the previous comment.

That's where I've some doubt.
But feel free to implement benchmarks that demonstrate the
expected performance improvement.

> I think we have some misunderstanding over here. It is better to do an
> implementation first and start the discussion.
> It would be more productive.

Agreed. ;-)

Gilles




Re: [All] New JIRA fields

2022-02-18 Thread Gilles Sadowski
On Fri, 18 Feb 2022 at 23:56, sebb wrote:
>
> Wrong list?

No, why?  Answers would primarily serve to more precisely define the
tasks in the Commons project's JIRA, even though the motivation
is the info collected for the global list of tasks (across all projects).

FTR:
https://issues.apache.org/jira/browse/INFRA-22893

>
> On Fri, 18 Feb 2022 at 12:49, Gilles Sadowski  wrote:
> >
> > Hello.
> >
> > To help with automatic collection of tasks for GSoC[1] INFRA is
> > willing to add fields to our JIRA template.
> >
> > Proposal:
> > * (multi-valued) "potential-mentor" [DONE]
> > * (enumeration) "difficulty-level"
> >
> > Please suggest which choices should be selectable as "difficulty-level".
> >
> > A "Skill Level" field was already available but I think that the
> > current list does not provide useful hints to a newcomer (either
> > a GCoC candidate or a would-be contributor) about what prior
> > knowledge would be required to handle the task.
> > IMHO, there should be a way to define (free entry?) fields akin
> > to (perhaps with a better name):
> > * "required-knowledge" (things that candidates can learn by
> > themselves like how to build the project, etc.)
> > * "nice-to-have-knowledge" (i.e. for GSoC something that could
> > be acquired during the "bonding period")
> >
> > Regards,
> > Gilles
> >
> > [1] https://cwiki.apache.org/confluence/display/COMDEV/GSoC+2022+Ideas+list




[All] New JIRA fields

2022-02-18 Thread Gilles Sadowski
Hello.

To help with automatic collection of tasks for GSoC[1] INFRA is
willing to add fields to our JIRA template.

Proposal:
* (multi-valued) "potential-mentor" [DONE]
* (enumeration) "difficulty-level"

Please suggest which choices should be selectable as "difficulty-level".

A "Skill Level" field was already available but I think that the
current list does not provide useful hints to a newcomer (either
a GSoC candidate or a would-be contributor) about what prior
knowledge would be required to handle the task.
IMHO, there should be a way to define (free entry?) fields akin
to (perhaps with a better name):
* "required-knowledge" (things that candidates can learn by
themselves like how to build the project, etc.)
* "nice-to-have-knowledge" (i.e. for GSoC something that could
be acquired during the "bonding period")

Regards,
Gilles

[1] https://cwiki.apache.org/confluence/display/COMDEV/GSoC+2022+Ideas+list




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-17 Thread Gilles Sadowski
Hi.

On Thu, 17 Feb 2022 at 21:33, Itamar C wrote:
>
> Hello,
>
> My suggestion: I'll work on the methods without renaming and after the
> migration is completed, if we decide to rename, it's not difficult to
> rename all test methods with a script and put in a new PR.
>
> A simple regexp like
> "^\\s*@Test\\s*\n\\s*(.+)\\s+test(\\w)(\\w+)\\s*\\(.*"
> changing to
> m.group(1) + " " + m.group(2).toLowerCase() + m.group(3) + "("
> would do the trick.
>
> Maybe it's time to create a ticket in Jira for this discussion to move
> there?

Which discussion (since this thread covered more than one subject)?
If you mean the "migration to Junit 5" task for [Codec], it's already
there.[1]
If you mean the method rename (to remove the "test" prefix), then
the "dev" ML is where to continue the discussion (and/or start a vote
if there is no clear agreement).

Regards,
Gilles

[1] https://issues.apache.org/jira/projects/CODEC/issues/CODEC-285

>>> [...]




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-17 Thread Gilles Sadowski
Hello.

On Thu, 17 Feb 2022 at 16:18, Gary Gregory wrote:
>
> Well, it is explicit in the sense that I would guess that 95% of the test
> methods in Commons follow that style and that one of our documented
> guidelines is "follow the style of the file you are editing".

When migrating to the newer Junit, the "same style" rule is
intentionally broken; hence it is *not* obvious that one should
not also change the method name.
It certainly would not hurt to add a sentence to that effect, and
it would avoid repeating ourselves.

Gilles

>
> Gary
>
> On Thu, Feb 17, 2022, 09:16 Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Thu, 17 Feb 2022 at 13:11, Gary Gregory wrote:
> > >
> > > I have encountered what Sebb mentions more than once, I do like the
> > "test"
> > > prefix to make it obvious what is and is not intended to be a test. Same
> > > reason I like to make test methods public: clear intent. I know Junit 5
> > > proposes to change these conventions, the benefits do not outweigh the
> > > convention we use in Commons today for me.
> >
> > OK.
> > But shouldn't we make that explicit somewhere (or is it already?), in
> > order to let people know that we considered it and made a choice,
> > (thus reducing the chance that a contribution is based on another
> > convention that's perhaps becoming more natural for new developers)?
> >
> > Thanks,
> > Gilles
> >
> > > > > [...]




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-17 Thread Gilles Sadowski
Hello.

On Thu, 17 Feb 2022 at 13:11, Gary Gregory wrote:
>
> I have encountered what Sebb mentions more than once, I do like the "test"
> prefix to make it obvious what is and is not intended to be a test. Same
> reason I like to make test methods public: clear intent. I know Junit 5
> proposes to change these conventions, the benefits do not outweigh the
> convention we use in Commons today for me.

OK.
But shouldn't we make that explicit somewhere (or is it already?), in
order to let people know that we considered it and made a choice,
(thus reducing the chance that a contribution is based on another
convention that's perhaps becoming more natural for new developers)?

Thanks,
Gilles

> > > [...]




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-16 Thread Gilles Sadowski
Hello.

> [...]
>
> One more practical question: since the tests are not anymore based on the
> methods name and are indicated by annotations now, I've seen tests without
> this "test" in the beginning. Looks like common practice (including it's
> the way it's presented in the JUnit 5 docs). Since I'll dig into all the
> tests, I can make this change as well. I like this style, because it looks
> more "clean" to me. What do you think, should I change the methods names as
> well?
>

Gary notes the practical reason for not mixing types of changes
but you can certainly start a discussion about changing the
convention.  I agree that, in
---CUT---
@Test
public void testSomething() {
// ...
}
---CUT---
there is one "test" too many.
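
Should the convention ever change, the rename itself could be scripted.
A sketch only, assuming the common "@Test public void testXxx()" layout;
the class name and the pattern are hypothetical and would need checking
against the actual sources (other annotations, comments, etc.):

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestPrefixRenamer {
    /** "@Test", optional "public", "void", then a method name starting with "test". */
    private static final Pattern TEST_METHOD =
        Pattern.compile("(@Test\\s+(?:public\\s+)?void\\s+)test(\\w)(\\w*)(\\s*\\()");

    /** Drops the "test" prefix and lower-cases the next letter; the rest is kept as is. */
    static String rename(String source) {
        final Matcher m = TEST_METHOD.matcher(source);
        final StringBuffer sb = new StringBuffer();
        while (m.find()) {
            final String replacement = m.group(1)
                + m.group(2).toLowerCase(Locale.ROOT)
                + m.group(3)
                + m.group(4);
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(rename("@Test\npublic void testSomething() {\n    // ...\n}"));
    }
}
```

Methods that are not preceded by "@Test" are left untouched.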

Regards,
Gilles

> > [...]




Re: [Math] Review of "genetic algorithm" module

2022-02-16 Thread Gilles Sadowski
and mutation rates?
> >> -- The difference between GeneticAlgorithm and AdaptiveGeneticAlgorithm
> is
> >> the ability to adapt crossover and mutation probability. However,  as per
> >> my understanding enum encapsulation is appropriate with the same set and
> >> type of constructor arguments, where the arguments can be provided during
> >> enum declaration. In our case the arguments would be provided by the
> client
> >> program and cannot be pre-initialized as part of an enum declaration.
> >
> >Why not?
> >A constant rate seems like a (trivial) type of adaptive rate.
>
> -- Our algorithm classes accept various user provided arguments like
> crossover, mutation and selection operators. Enum declaration requires all
> of the arguments to be provided at the declaration time in the class itself
> like
> ---CUT---
> public enum RandomSource {
>   JDK(ProviderBuilder.RandomSourceInternal.JDK),
> ...
> }
> ---CUT---
> I am not sure how to achieve this for our algorithm classes.

I'm a bit lost here; I don't get the relationship of this with my
remark above (constant vs adaptive)...  Maybe you could file
a JIRA report to clarify and further the discussion?

> >
> >>
> >> (7)
> >> >The currently available GA implementations are sequential.
> >> >IIUC, the "nextGeneration" methods should provide an option
> >> >that processes the population using multiple threads.
> >> --This needs to be done. However,  I would like to address this along
> with
> >> parallel GA i.e. convergence of multiple populations together.
> >
> >The two features (multi-thread vs multiple populations) should
> >be implemented independently:  Users that only need the "basic"
> >GA should also be able to take advantage of their machine's
> >multiple CPUs.
> >[This is related to the design issue which I mentioned previously.]
> >
> -- I am thinking to leverage user's multiple CPUs for doing
> multi-population GA.

OK (sort-of, since "the devil is in the details", and I'm not sure
that we mean the same thing by "multi", see below).

> It would be a global approach where the same thread pool
> would be used for both purposes. Another class would be introduced for
> executing parallel genetic algorithm which would accept multiple instances
> of AbstractGeneticAlgorithm class and converge them in parallel. Users who
> do not care for robustness would go for current implementations of the
> algorithm with single population. For a better optimization quality users
> would choose the new class.

As hinted by my comment in the previous message, I've still to
clarify my own expectations; but I vaguely sense some lost
opportunity for simpler usage and increased performance,
where the caller just needs to specify the number of "worker
threads".

Do we at least agree that
1. Adding/retrieving a "Chromosome" to/from a "Population"
must be thread-safe (and is not trivial)
2. Fitness computation is where most time is usually spent
(so that multi-threading must be achieved at that granularity)
?
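
To illustrate the two points (a sketch only; "Chromosome" below is a
hypothetical stand-in, not the library class): the fitness is computed
inside the parallel loop, and the chromosomes are added to a collection
that tolerates concurrent insertion:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.IntStream;

public class ParallelFitnessSketch {
    /** Immutable stand-in: the (costly) fitness is computed once, at construction. */
    static final class Chromosome {
        final double[] genes;
        final double fitness;

        Chromosome(double[] genes) {
            this.genes = genes.clone();
            this.fitness = computeFitness(this.genes);
        }

        /** Placeholder for an expensive, stateless computation. */
        private static double computeFitness(double[] g) {
            double sum = 0;
            for (double x : g) {
                sum += x * x;
            }
            return sum;
        }
    }

    /** Builds the population; fitness evaluations run on the common fork-join pool. */
    static Queue<Chromosome> evaluate(double[][] candidates) {
        final Queue<Chromosome> population = new ConcurrentLinkedQueue<>();
        IntStream.range(0, candidates.length)
            .parallel()
            .forEach(i -> population.add(new Chromosome(candidates[i])));
        return population;
    }

    public static void main(String[] args) {
        final double[][] candidates = {{1, 2}, {3}, {0, 0}};
        System.out.println(evaluate(candidates).size());
    }
}
```

The insertion order is not deterministic here, and the number of worker
threads is whatever the common pool provides; a real design would have
to address both.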

I'd surmise that "multiple instances of AbstractGeneticAlgorithm"
is an application concern; unless I'm missing something, it's
not what I've in mind when talking about multi-threading.
Actually, I was wondering whether we could implement the
analog of what is in the "commons-math-neuralnet" module,
where
* "Neuron" is the counterpart of "Chromosome"
* "Network" is the counterpart of "Population".

Regards,
Gilles

>>> [...]




Re: [commons-numbers] 03/04: Use Double.MIN_VALUE for the SAFE_MIN constant.

2022-02-16 Thread Gilles Sadowski
Hello.

On Wed, 16 Feb 2022 at 16:01, Alex Herbert wrote:
>
> On Wed, 16 Feb 2022 at 14:07, Gilles Sadowski  wrote:
>
> > >
> > > Use Double.MIN_VALUE for the SAFE_MIN constant.
> >
> > Nit-pick: s/VALUE/NORMAL/ ?
>
> Oops. However the value is correct (the existing unit test for it was
> unchanged). The commit log is wrong which I cannot change now without
> some variant of force push.

Oh, yes; maybe with "revert" (?).  But don't bother, actually. ;-)

Gilles




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-16 Thread Gilles Sadowski
Hello.

On Wed, 16 Feb 2022 at 15:05, Itamar C wrote:
>
> That's my first message on this list, but I've been reading the messages
> for some months.
>
> I'm interested in contributing to Apache Commons, but it was a little
> confusing to find where to help. Some of the issues in Jira are very old,
> but not solved. It's difficult for a newcomer like me to understand if that
> issue is not relevant (and maybe should have been discarded) or if it's
> only a matter of lack of someone to get it and have it done. Maybe using
> some labels, much like it's used in GH, like "help wanted", "bug",
> "enhancement" or even "good first issue".

Well, that this information can be inaccurate also results from
"not enough maintainers".
Help is certainly welcome on that front too, i.e. review JIRA
reports and comment there about your findings ("inconsistent",
"already solved", "PR available (and current)", ...).

>
> There is too much information and some of it is inconsistent. For
> example:
> https://issues.apache.org/jira/projects/CODEC/issues/CODEC-286 is open
> but
> https://github.com/apache/commons-codec/pull/40 is closed
>
> Maybe there should be a better integration between GH and Jira. Use GH
> Actions to change the Jira issue status on some events.

I'm not sure I understand what this means in practice, but there
is already too much noise caused by automatic messages (whose
information contents is lower than the time it takes to figure that
out!).

> There are also some issues that were closed and reopened, because there is
> no consensus on what should be done, like CODEC-253 [open] (and CODEC-257
> [closed]), about moving from Java 7 to 8.

Oops. :-}

> That's the kind of thing that
> scares newcomers.

"Commons" is a strange beast...

> I'm even afraid that this message can be misinterpreted
> as somehow aggressive. That's not my intention, I only want to give you a
> view from outside.

If you are a user, please consider yourself "inside". :-)
You are welcome to start a discussion on the ML, comment on
your preference about some issue, ...

>
> I've finally found that CODEC-285 is an issue that maybe I can help
> (Upgrade to JUnit v5.6.0). But there are already 4 PRs there. I'm not sure
> from where I should start: create a branch from the master or from some
> branch in those PRs? Starting from the master it's possible to have
> conflicts when merging all PRs. My plan is really to convert *all* tests in
> CODEC to Junit 5. Can I do it in a single massive PR or should I create a
> PR for each package?

A good start for another post (changing the "Subject:" line).

>
> Should I step in on this issue?

Sure. Welcome!

> Can someone guide me in those small doubts
> I have?

Hopefully yes.

Best,
Gilles

>
> Regards,
>
> Itamar Carvalho
>
>
>
>>> [...]

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [commons-numbers] 03/04: Use Double.MIN_VALUE for the SAFE_MIN constant.

2022-02-16 Thread Gilles Sadowski
Hello.

On Wed, Feb 16, 2022 at 14:47, [...] wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> aherbert pushed a commit to branch master
> in repository https://gitbox.apache.org/repos/asf/commons-numbers.git
>
> commit 3be4da9c24b0bde0ffc8e68387d2b03c966365cd
> Author: aherbert 
> AuthorDate: Wed Feb 16 12:33:14 2022 +
>
> Use Double.MIN_VALUE for the SAFE_MIN constant.

Nit-pick: s/VALUE/NORMAL/ ?
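
The distinction matters for the documented contract ("1 / SAFE_MIN does
not overflow"). A minimal sketch of the check, where the class and
method names are only illustrative and not part of [Numbers]:

```java
final class SafeMinCheck {
    /** Rebuilds the value the way the removed static initializer did. */
    static double safeMinFromBits() {
        final long exponentOffset = 1023L;
        // Biased exponent 1 with a zero mantissa, i.e. 2^-1022.
        return Double.longBitsToDouble((exponentOffset - 1022L) << 52);
    }

    public static void main(String[] args) {
        // The bit-level construction and the JDK constant are the same number.
        System.out.println(Double.MIN_NORMAL == safeMinFromBits());
        // The reciprocal of the smallest normalized number (2^1022) is finite.
        System.out.println(Double.isInfinite(1 / Double.MIN_NORMAL));
        // MIN_VALUE is the smallest subnormal number (2^-1074): its reciprocal overflows.
        System.out.println(Double.isInfinite(1 / Double.MIN_VALUE));
    }
}
```

So "Double.MIN_VALUE" in the log message would contradict the Javadoc,
whereas the code itself (using "MIN_NORMAL") is consistent with it.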

Regards,
Gilles

> ---
>  .../main/java/org/apache/commons/numbers/core/Precision.java  | 11 +++
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/commons-numbers-core/src/main/java/org/apache/commons/numbers/core/Precision.java b/commons-numbers-core/src/main/java/org/apache/commons/numbers/core/Precision.java
> index 323c338..0d1ccb0 100644
> --- a/commons-numbers-core/src/main/java/org/apache/commons/numbers/core/Precision.java
> +++ b/commons-numbers-core/src/main/java/org/apache/commons/numbers/core/Precision.java
> @@ -43,8 +43,10 @@ public final class Precision {
>   * Safe minimum, such that {@code 1 / SAFE_MIN} does not overflow.
>   * In IEEE 754 arithmetic, this is also the smallest normalized
>   * number 2<sup>-1022</sup>.
> + *
> + * @see Double#MIN_NORMAL
>   */
> -public static final double SAFE_MIN;
> +public static final double SAFE_MIN = Double.MIN_NORMAL;
>
>  /** Exponent offset in IEEE754 representation. */
>  private static final long EXPONENT_OFFSET = 1023L;
> @@ -59,13 +61,6 @@ public final class Precision {
>   *  constants: MATH-721
>   */
>  EPSILON = Double.longBitsToDouble((EXPONENT_OFFSET - 53L) << 52);
> -
> -/*
> - * This was previously expressed as = 0x1.0p-1022
> - * However, OpenJDK (Sparc Solaris) cannot handle such small
> - * constants: MATH-721
> - */
> -SAFE_MIN = Double.longBitsToDouble((EXPONENT_OFFSET - 1022L) << 52);
>  }
>
>  /**




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-14 Thread Gilles Sadowski
On Mon, Feb 14, 2022 at 16:11, Xeno Amess wrote:
>
> >  Code not actively developed does not attract newcomers.
> Well I have to say the reason for "codes not actively developed" is a
> strong lack of alive committers, or more detailed, reviewers.

In part, yes, but they are the consequence of each other (i.e.
a vicious circle).

The main cause, IMHO, is that we lose[1] the spirit of being
dedicated to a project/component.  Contributions are "dropped"
as PR (usually) without follow-up (either here or on JIRA).

> I still have 100+ unresolved PRs in Commons projects, some of them 1 or 2
> years old, but it seems there just are not enough reviewers, and the PR lists
> in every repo grow longer and longer.

That's the downside (in plain view?) of GH for a project that lacks
human resources (like "Commons") necessary for trimming the list
of PRs in a timely manner.
As you said in another comment, the PRs look like a stack; if there
is just one active committer, they rapidly become stale (because
"master" changes).  Yet the OP quite often just lets them rot there.
A "dedicated" contributor should follow development, update the
PRs, collect them in JIRA, based on the type of issue that they fix.
IMHO, that work would establish a natural priority, speed up the
review (and become a measure of dedication[2]).

> But on the other hand, free reviewers who have both the ability and the
> willingness to review, well, are really lacking; that is the truth.

[1] Largely "thanks" to GitHub IMO.
[2] Which the number of PRs cannot be by itself.

>
> On Mon, Feb 14, 2022 at 22:34, Gilles Sadowski wrote:
>
> > On Mon, Feb 14, 2022 at 14:34, Gary Gregory wrote:
> > >
> > > My guess is that this is a combination of the maturity of the components
> >
> > The "maturity" rationale is not an explanation; it is a cause.
> > Code not actively developed does not attract newcomers.
> > It is not an "opinion" anymore; it is backed by the fact that
> > "Commons Math" API modernization had stalled on the basis
> > of that rationale; yet since the path has been unblocked, work
> > on [RNG], [Numbers], [Geometry], [Statistics] and [Math] itself
> > demonstrated how much room there was for improving[1] those
> > "mature" codes.
> >
> > Regards,
> > Gilles
> >
> > [1] Thanks to all who did it!
> >
> > > and people having moved on to jobs or hobbies that no longer require these
> > > components.
> > >
> > > Gary
> > >
> > >>> [...]




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-14 Thread Gilles Sadowski
Hello.

On Mon, Feb 14, 2022 at 16:46, Xeno Amess wrote:
>
> (sigh) Do you think organizing some public activities would help? Like holding
> an online summer camp or something?

Well, there is, at least, GSoC.
Yet, AFAIK, there is no prior thinking about how to respond to
such initiatives, not even whether to respond.[1]
It occurs to me that it is not necessary to be a PMC member, or
even a committer, in order to help within GSoC (or just team with
newcomers until all the issues with their PR are ironed out).

Regards,
Gilles

[1] https://markmail.org/message/2qckwxw2x4ue36sd




Re: [All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-14 Thread Gilles Sadowski
On Mon, Feb 14, 2022 at 14:34, Gary Gregory wrote:
>
> My guess is that this is a combination of the maturity of the components

The "maturity" rationale is not an explanation; it is a cause.
Code not actively developed does not attract newcomers.
It is not an "opinion" anymore; it is backed by the fact that
"Commons Math" API modernization had stalled on the basis
of that rationale; yet since the path has been unblocked, work
on [RNG], [Numbers], [Geometry], [Statistics] and [Math] itself
demonstrated how much room there was for improving[1] those
"mature" codes.

Regards,
Gilles

[1] Thanks to all who did it!

> and people having moved on to jobs or hobbies that no longer require these
> components.
>
> Gary
>
>>> [...]




[All] Maintenance (Re: [GitHub] [... PR] #104: Maven Wrapper [...])

2022-02-14 Thread Gilles Sadowski
Hello.

On Mon, Feb 14, 2022 at 00:23, GitBox wrote:
>
>
> nhojpatrick commented on pull request #104:
> URL: https://github.com/apache/commons-codec/pull/104#issuecomment-1038472244
>
>
>@garydgregory I agree it could be considered clutter. If all projects are
> kept active and current, it's never an issue.
>From experience, I'm coming from working with dead or hibernated projects,
> when moving company/job/team/project and having to kick-start something
> building. It used to work on the CI/CD server, but someone updated that to a
> newer Maven version or Java version, so the project I'm working on doesn't
> work anymore. So the first thing I now always set up is the Maven wrapper
> (Takari before it was merged) and enforcer, so the project itself knows what
> it was last configured to build under.
>

Thanks for the feedback.
It has been my understanding that one purpose of "Commons"
(and any other community project) is to gather (human) resources
in order to keep the code bases "alive".[1]
So IMHO, the top priority should be to extend the maintenance
team(s).  The shift of focus from the community's still official forum
(this ML) to GitHub is aggravating[2][3] the maintenance problem:
Most components now rely on less than the 3 required votes for
release, and can thus easily become "attic" candidates.

Regards,
Gilles

[1] The concept of "mature" library (often floated around here) has
been proven (in light of the JDK evolutions) to be a hindrance rather
than the sine qua non of user code stability.
[2] Backed by the numbers provided in the project's report to the ASF
board (where the number of "committers" is utterly misleading wrt
its actual effect on maintenance capacity).
[3] Despite other advantages (not TBD in this thread) brought by the
platform (mainly for itself IMO).




Re: [Math] Review of "genetic algorithm" module

2022-02-14 Thread Gilles Sadowski
[...]
> >The currently available GA implementations are sequential.
> >IIUC, the "nextGeneration" methods should provide an option
> >that processes the population using multiple threads.
> --This needs to be done. However,  I would like to address this along with
> parallel GA i.e. convergence of multiple populations together.

The two features (multi-thread vs multiple populations) should
be implemented independently:  Users that only need the "basic"
GA should also be able to take advantage of their machine's
multiple CPUs.
[This is related to the design issue which I mentioned previously.]
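
To make the independence concrete, here is a minimal sketch, assuming
that the costly step is the fitness evaluation of a single population.
All names ("ParallelEvaluation", "evaluate", the "parallel" flag) are
hypothetical and do not refer to the code in the PR:

```java
import java.util.List;
import java.util.function.ToDoubleFunction;
import java.util.stream.Stream;

final class ParallelEvaluation {
    /**
     * Computes the fitness of every chromosome of ONE population.
     * The "parallel" flag is the only switch; no second population is involved.
     * Results are returned in the order of the input list in both modes.
     */
    static <C> double[] evaluate(List<C> population,
                                 ToDoubleFunction<? super C> fitness,
                                 boolean parallel) {
        final Stream<C> chromosomes = parallel
            ? population.parallelStream()
            : population.stream();
        return chromosomes.mapToDouble(fitness).toArray();
    }
}
```

The "multiple populations" feature could then be layered on top of this
(or not be used at all) without the "basic" GA losing the benefit of
the machine's CPUs.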

>
> (8)
> >Do not use explicit "\n" and "\r" characters.[1]
> --Done

Thanks,
Gilles

>> [...]




Re: [commons-compress] branch master updated: Address CodeQL issues in pack200/unpack200 packages.

2022-02-10 Thread Gilles Sadowski
Hello.

On Thu, Feb 10, 2022 at 20:40, Gary Gregory wrote:
>
> On Wed, Feb 9, 2022 at 11:06 AM Gilles Sadowski 
> wrote:
>
> > On Wed, Feb 9, 2022 at 16:32, Gary Gregory wrote:
> > >
> > > Crafting a compressed file for a test fixture that causes an integer
> > > overflow deep in our code would be quite hard to reverse engineer.
> >
> > I of course agree that it would be overkill; hence the question, again:
> > Why
> >   int size = ...
> >   size = ExactMath.add(size, nameSize);
> > instead of
> >   int size = ...
> >   size = Math.toIntExact(Math.addExact(size, nameSize));
> > ?
> >
>
> Uh... you tell me why the code you propose is better.

It is better because it is self-documenting.

> >
> > The method in "ExactMath" considers that "nameSize"
> > must not be of type "long".  If so, the fix does not belong
> > in that method.
> > Alternatively, it must be documented that the class is a
> > shortcut workaround for a specific issue (to be clarified)
> > and should not be mistaken as a general-purpose utility
> > to add two numbers (in which case "ExactMath" and
> > "add" are bad names for the intended purpose).
> >
>
> I've updated the Javadoc. Note that the class is reused in 13 call sites
> already to avoid pattern duplication.

OK, this eventually makes it clear that your sense of "better"
is tied to the number of spared characters...

It still does not explain why the second argument cannot be
a "long".  If it's true, why not deprecate the method and define
one with the appropriate signature?

> Also note that the class is
> documented as private, not that this will stop people from using it.

IIUC, the class is unnecessary, but if it doesn't bother anyone
else, the thread has been long enough.

Regards,
Gilles

> Gary
>
>
> > >
> > > Gary
> > >
> > > On Wed, Feb 9, 2022, 09:38 Gilles Sadowski  wrote:
> > >
> > > > On Wed, Feb 9, 2022 at 15:16, Gary Gregory wrote:
> > > > >
> > > > > Observe
> > > >
> > > > Perhaps I'm dense, but my detailed remark about the commit
> > > > reflects that I did.
> > > >
> > > > > that ExactMath delegates to Math after performing the necessary
> > > > > additional Math calls.
> > > >
> > > > As per the first part of the remark, it is not obvious why those
> > > > hoops are necessary; thus documenting the rationale would
> > > > prevent someone scrapping them (with just as terse a commit
> > > > message).
> > > > The other part of the remark signals a potential bug and/or
> > > > unintended behaviour; this also calls for clarification (and/or
> > > > unit tests).
> > > >
> > > > Thanks,
> > > > Gilles




Re: [commons-compress] branch master updated: Address CodeQL issues in pack200/unpack200 packages.

2022-02-09 Thread Gilles Sadowski
On Wed, Feb 9, 2022 at 16:32, Gary Gregory wrote:
>
> Crafting a compressed file for a test fixture that causes an integer
> overflow deep in our code would be quite hard to reverse engineer.

I of course agree that it would be overkill; hence the question, again:
Why
  int size = ...
  size = ExactMath.add(size, nameSize);
instead of
  int size = ...
  size = Math.toIntExact(Math.addExact(size, nameSize));
?

The method in "ExactMath" considers that "nameSize"
must not be of type "long".  If so, the fix does not belong
in that method.
Alternatively, it must be documented that the class is a
shortcut workaround for a specific issue (to be clarified)
and should not be mistaken as a general-purpose utility
to add two numbers (in which case "ExactMath" and
"add" are bad names for the intended purpose).

>
> Gary
>
> On Wed, Feb 9, 2022, 09:38 Gilles Sadowski  wrote:
>
> > On Wed, Feb 9, 2022 at 15:16, Gary Gregory wrote:
> > >
> > > Observe
> >
> > Perhaps I'm dense, but my detailed remark about the commit
> > reflects that I did.
> >
> > > that ExactMath delegates to Math after performing the necessary
> > > additional Math calls.
> >
> > As per the first part of the remark, it is not obvious why those
> > hoops are necessary; thus documenting the rationale would
> > prevent someone scrapping them (with just as terse a commit
> > message).
> > The other part of the remark signals a potential bug and/or
> > unintended behaviour; this also calls for clarification (and/or
> > unit tests).
> >
> > Thanks,
> > Gilles




Re: [commons-compress] branch master updated: Address CodeQL issues in pack200/unpack200 packages.

2022-02-09 Thread Gilles Sadowski
On Wed, Feb 9, 2022 at 15:16, Gary Gregory wrote:
>
> Observe

Perhaps I'm dense, but my detailed remark about the commit
reflects that I did.

> that ExactMath delegates to Math after performing the necessary
> additional Math calls.

As per the first part of the remark, it is not obvious why those
hoops are necessary; thus documenting the rationale would
prevent someone scrapping them (with just as terse a commit
message).
The other part of the remark signals a potential bug and/or
unintended behaviour; this also calls for clarification (and/or
unit tests).

Thanks,
Gilles




Re: [All] GSoC 2022

2022-02-09 Thread Gilles Sadowski
Hi.

>>> [...]
> > >
> > > Shall we open a "GSoC 2022" report in each concerned JIRA project?
> >
> > Yes. I think we just create some tickets and tag them with the
> > appropriate tag (GSOC 2022 ?). There should be some left over from
> > last time to repurpose or use as templates for new ones.
>
> Actually, I was thinking of creating one global "GSoC 2022" issue
> in each component, that would list all the topics and a complete
> description of their respective goal,

Done for "Commons Math":
   https://issues.apache.org/jira/browse/MATH-1641

Gilles




Re: [commons-compress] branch master updated: Address CodeQL issues in pack200/unpack200 packages.

2022-02-09 Thread Gilles Sadowski
Hello.

On Wed, Feb 9, 2022 at 02:59, [...] wrote:
>
> This is an automated email from the ASF dual-hosted git repository.
>
> ggregory pushed a commit to branch master
> in repository https://gitbox.apache.org/repos/asf/commons-compress.git
>
>
> The following commit(s) were added to refs/heads/master by this push:
>  new 666e787  Address CodeQL issues in pack200/unpack200 packages.
> 666e787 is described below
>
> commit 666e787a17e4e7321b70e99e55acf27b6382ab17
> Author: Gary Gregory 
> AuthorDate: Tue Feb 8 20:59:31 2022 -0500
>
> Address CodeQL issues in pack200/unpack200 packages.
>
> Throw ArithmeticException instead of silently overflowing.
> ---
>  .../compress/archivers/cpio/CpioArchiveEntry.java  |  3 +-
>  .../compress/harmony/pack200/BHSDCodec.java|  6 ++-
>  .../compress/harmony/pack200/FileBands.java|  3 +-
>  .../commons/compress/harmony/pack200/RunCodec.java |  8 ++--
>  .../compress/harmony/unpack200/BandSet.java|  3 +-
>  .../apache/commons/compress/utils/ExactMath.java   | 44 ++
>  6 files changed, 59 insertions(+), 8 deletions(-)
>
> diff --git a/src/main/java/org/apache/commons/compress/archivers/cpio/CpioArchiveEntry.java b/src/main/java/org/apache/commons/compress/archivers/cpio/CpioArchiveEntry.java
> index 57c77f5..5e5e7ad 100644
> --- a/src/main/java/org/apache/commons/compress/archivers/cpio/CpioArchiveEntry.java
> +++ b/src/main/java/org/apache/commons/compress/archivers/cpio/CpioArchiveEntry.java
> @@ -30,6 +30,7 @@ import java.util.Objects;
>  import java.util.concurrent.TimeUnit;
>
>  import org.apache.commons.compress.archivers.ArchiveEntry;
> +import org.apache.commons.compress.utils.ExactMath;

Why is this class used rather than "Math.addExact"[1]?
[If there is a reason, then the Javadoc of method "add(int,long)"
should document the purpose and rationale of the odd signature
(and its caveat that it can throw even for computations that would
not overflow).]
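
To illustrate the caveat, a minimal sketch; "addNarrowFirst" is a
hypothetical stand-in that ASSUMES the helper narrows its "long"
argument before adding (I have not copied the actual implementation),
while "addWidenFirst" uses only the JDK calls:

```java
final class ExactAddDemo {
    /** Hypothetical "add(int, long)": narrows the second argument, then adds. */
    static int addNarrowFirst(int x, long y) {
        return Math.addExact(x, Math.toIntExact(y));
    }

    /** JDK-only alternative: adds as "long", then checks that the sum fits in an "int". */
    static int addWidenFirst(int x, long y) {
        return Math.toIntExact(Math.addExact((long) x, y));
    }

    public static void main(String[] args) {
        final int x = -10;
        // "y" alone does not fit in an "int", but "x + y" does.
        final long y = Integer.MAX_VALUE + 5L;

        System.out.println("widen first: " + addWidenFirst(x, y));
        try {
            addNarrowFirst(x, y);
        } catch (ArithmeticException e) {
            // Thrown although the mathematical result is representable.
            System.out.println("narrow first threw: " + e.getMessage());
        }
    }
}
```

Under that assumption, the two variants only differ for inputs where the
"long" argument is out of the "int" range while the sum is not.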

Regards,
Gilles

[1] 
https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#addExact-int-int-

>
>  /**
>   * A cpio archive consists of a sequence of files. There are several types of
> @@ -572,7 +573,7 @@ public class CpioArchiveEntry implements CpioConstants, ArchiveEntry {
>  }
>  int size = this.headerSize + 1; // Name has terminating null
>  if (name != null) {
> -size += nameSize;
> +size = ExactMath.add(size, nameSize);
>  }
>  final int remain = size % this.alignmentBoundary;
>  if (remain > 0) {
> diff --git a/src/main/java/org/apache/commons/compress/harmony/pack200/BHSDCodec.java b/src/main/java/org/apache/commons/compress/harmony/pack200/BHSDCodec.java
> index 8bd7020..5117481 100644
> --- a/src/main/java/org/apache/commons/compress/harmony/pack200/BHSDCodec.java
> +++ b/src/main/java/org/apache/commons/compress/harmony/pack200/BHSDCodec.java
> @@ -22,6 +22,8 @@ import java.io.InputStream;
>  import java.util.ArrayList;
>  import java.util.List;
>
> +import org.apache.commons.compress.utils.ExactMath;
> +
>  /**
>   * A BHSD codec is a means of encoding integer values as a sequence of bytes or vice versa using a specified "BHSD"
>   * encoding mechanism. It uses a variable-length encoding and a modified sign representation such that small numbers are
> @@ -243,7 +245,7 @@ public final class BHSDCodec extends Codec {
>  band[i] -= cardinality;
>  }
>  while (band[i] < smallest) {
> -band[i] += cardinality;
> +band[i] = ExactMath.add(band[i], cardinality);
>  }
>  }
>  }
> @@ -260,7 +262,7 @@ public final class BHSDCodec extends Codec {
>  band[i] -= cardinality;
>  }
>  while (band[i] < smallest) {
> -band[i] += cardinality;
> +band[i] = ExactMath.add(band[i], cardinality);
>  }
>  }
>  }
> diff --git a/src/main/java/org/apache/commons/compress/harmony/pack200/FileBands.java b/src/main/java/org/apache/commons/compress/harmony/pack200/FileBands.java
> index 746b900..a394978 100644
> --- a/src/main/java/org/apache/commons/compress/harmony/pack200/FileBands.java
> +++ b/src/main/java/org/apache/commons/compress/harmony/pack200/FileBands.java
> @@ -25,6 +25,7 @@ import java.util.TimeZone;
>
>  import org.apache.commons.compress.harmony.pack200.Archive.PackingFile;
>  import org.apache.commons.compress.harmony.pack200.Archive.SegmentUnit;
> +import org.

[Math] Review of "genetic algorithm" module

2022-02-06 Thread Gilles Sadowski
Hello.

A few remarks (as of PR #205) and questions:

(1)
A commit log message should strive to be informative
for the reviewer; saying the like of "fixed minor bugs" does
not convey anything.
Even minor changes, like e.g. formatting cleanup, should be
designated as such.
For this PR, the message (which I've amended) was misleading
because the change was not about bugs, but about removing
GUI code (and its dependency).

(2)
The "GeneticException" class seems to mostly deal with "illegal"
arguments; hence it should be a subclass of the JDK's standard
"IllegalArgumentException" (and be renamed accordingly).
If other condition types are needed, then another internal class
should be defined with the corresponding standard semantics.
[Exception messages need review for spelling and formatting.]

(3)
IMO Javadoc should avoid redundant phrases like "This class" as
the first words of a class description.
A similar remark holds for fields in "GeneticException" class: Since
the name of the field is self-documenting, duplication in the Javadoc
is visual noise ("Message template" is concise and clear enough).
Similarly, simple accessors don't need the exact same sentence
repeated twice (a single "@return ..." tag is sufficient).

(4)
Class "ConvergenceListenerRegistry" is generic but its code
contains undocumented "@SuppressWarnings" annotations.
Moreover, it is a singleton, and not thread-safe.
Why should there be such a global "registry"?
Since it is only accessed by the "AbstractGeneticAlgorithm" class,
it could be defined as a private inner class.

(5)
In class "AbstractGeneticAlgorithm", methods "getCrossoverPolicy"
"getMutationPolicy", "getElitismRate" are public, yet they are only
ever called by a subclass.

(6)
Why support inheritance for "AbstractGeneticAlgorithm"?
Why would users need their own subclass, rather than call those
implemented within the library (currently, "GeneticAlgorithm" and
"AdaptiveGeneticAlgorithm")?
Couldn't we encapsulate the choice of algorithm in an "enum",
similar to "RandomSource" in [RNG].
Do I understand correctly that the (only?) difference between the
two classes is the ability to adapt crossover and mutation rates?

(7)
The currently available GA implementations are sequential.
IIUC, the "nextGeneration" methods should provide an option
that processes the population using multiple threads.

(8)
Do not use explicit "\n" and "\r" characters.[1]
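
Points (2) and (6) above could be sketched as follows. This is only an
illustration under assumptions: every name below (the exception class,
"Algorithm", "AlgorithmType", the "create" signature) is hypothetical
and would have to be adapted to the actual classes of the module.

```java
import java.text.MessageFormat;

final class GaSketch {
    /** Point (2): illegal-argument condition, subclass of the standard JDK exception. */
    static final class GeneticIllegalArgumentException extends IllegalArgumentException {
        private static final long serialVersionUID = 1L;
        /** Message template. */
        static final String OUT_OF_RANGE = "Value {0} is out of range [{1}, {2}]";

        GeneticIllegalArgumentException(String template, Object... args) {
            super(MessageFormat.format(template, args));
        }
    }

    /** Placeholder for whatever the implementations have in common. */
    interface Algorithm {
        String describe();
    }

    /** Point (6): the choice of algorithm is an enum that acts as a factory. */
    enum AlgorithmType {
        SIMPLE {
            @Override
            Algorithm create(double elitismRate) {
                checkRate(elitismRate);
                return () -> "simple GA, elitism rate " + elitismRate;
            }
        },
        ADAPTIVE {
            @Override
            Algorithm create(double elitismRate) {
                checkRate(elitismRate);
                return () -> "adaptive GA, elitism rate " + elitismRate;
            }
        };

        /** Creates an instance; users never subclass the implementations. */
        abstract Algorithm create(double elitismRate);

        /** Rejects rates outside [0, 1]; "!(...)" also rejects NaN. */
        static void checkRate(double rate) {
            if (!(rate >= 0 && rate <= 1)) {
                throw new GeneticIllegalArgumentException(
                    GeneticIllegalArgumentException.OUT_OF_RANGE, rate, 0, 1);
            }
        }
    }
}
```

Usage would then be along the lines of
"GaSketch.AlgorithmType.ADAPTIVE.create(0.1)", in the spirit of
"RandomSource" in [RNG].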


Regards,
Gilles

[1] See 
https://docs.oracle.com/javase/8/docs/api/java/lang/System.html#lineSeparator--
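
As a minimal sketch of what point (8) asks for (class and method names
are illustrative only):

```java
import java.util.Locale;

final class ReportFormat {
    /** Joins lines with the platform's separator instead of a hard-coded "\n" or "\r\n". */
    static String join(String... lines) {
        return String.join(System.lineSeparator(), lines);
    }

    /** In a format string, "%n" expands to the same platform-specific separator. */
    static String header(int generation) {
        return String.format(Locale.ROOT, "Generation %d%n", generation);
    }
}
```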




Re: [All] GSoC 2022

2022-02-02 Thread Gilles Sadowski
Hello.

On Wed, Feb 2, 2022 at 10:47, Alex Herbert wrote:
>
> On Mon, 31 Jan 2022 at 15:06, Gilles Sadowski  wrote:
> >
> > Hello.
> >
> > On Thu, Jan 27, 2022 at 18:09, Alex Herbert wrote:
> > >
> > > I would be willing to go through GSOC again.
> >
> > Thanks; I know that back in 2020, it had been a disproportionate
> > amount of work...
> >
> > > I think that the
> > > statistics component could again serve as a project. There are some
> > > packages in Math that could be moved to make use of the updated
> > > distributions (e.g. math.stat.inference)
> >
> > That would be great, although I seem to notice that there
> > might be some dependency issues...
> >
> > > or perhaps a reworking of the
> > > math.stat.descriptive package to support using them with streams.
> >
> > +1
> >
> > > In the last iteration (GSOC 2020) we failed to get enough of a picture
> > > of the competence of candidates in the 'bonding phase' before places
> > > were formally allocated. I think we should require that a candidate
> > > can:
> > >
> > > - Open a PR on GitHub to add a feature in the topic area. It should be
> > > of non-trivial complexity and delivered to a quality ready to merge.
> >
> > Do you think that the above "stream support" could be that task?
>
> Yes. A simple class to compute a summary statistic such as:
>
> public interface Statistic {
> void add(R x);
> }
> public interface DoubleStatistic extends Statistic,
> DoubleConsumer, DoubleSupplier {
> // Composite interface
> }
>
> public Mean implements DoubleStatistic {
>   static Mean create();
>   // Overrides
>   public void accept(double x);
>   public void add(Mean m);
>   public double getAsDouble();
> }
>
> Used as:
>
> DoubleStream s;
> double u = s.collect(Mean::create, Mean::accept, Mean::add).getAsDouble();

To simplify the above, would we also provide
---CUT---
public Mean ... {
  // ...
  public static double collect(DoubleStream s) {
    return s.collect(Mean::create, Mean::accept, Mean::add).getAsDouble();
  }
}
---CUT---
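
For the sake of discussion, here is a self-contained sketch that
compiles as is. It assumes the simplest possible algorithm (plain sum
and count) and drops the "Statistic"/"DoubleStatistic" interfaces;
whether empty input should yield NaN is also my assumption.

```java
import java.util.function.DoubleConsumer;
import java.util.function.DoubleSupplier;
import java.util.stream.DoubleStream;

final class Mean implements DoubleConsumer, DoubleSupplier {
    /** Number of values seen so far. */
    private long n;
    /** Plain (non-compensated) sum of the values. */
    private double sum;

    static Mean create() {
        return new Mean();
    }

    @Override
    public void accept(double x) {
        ++n;
        sum += x;
    }

    /** Merges a partial result (needed when the stream is parallel). */
    public void add(Mean other) {
        n += other.n;
        sum += other.sum;
    }

    @Override
    public double getAsDouble() {
        return n == 0 ? Double.NaN : sum / n;
    }

    /** Convenience method hiding the three-argument "collect" call. */
    static double collect(DoubleStream s) {
        return s.collect(Mean::create, Mean::accept, Mean::add).getAsDouble();
    }
}
```

The caller would then write "Mean.collect(DoubleStream.of(1, 2, 3, 4))",
and the same call works unchanged on a parallel stream.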

>
> The implementation(s) can be updated and expanded later using
> different underlying algorithms (simple sum, extended precision sum,
> rolling mean) by passing a choice to the create method.
>
> The project will involve how to move from this simple statistic to
> supporting IntStream, LongStream, DoubleStream as appropriate and
> allow combining statistics efficiently to obtain a customised summary
> statistic, perhaps by enum.
>
> This is for the StorelessUnivariateStatistic in Commons Math. A more
> detailed examination of the existing functionality would be required
> and use cases generated for each to understand how these can be
> supported in streams.

This study could be indeed started in the "bonding" period and will
fairly clearly indicate the candidate's potential.

> >
> > > - Show knowledge of the topic area beyond this single feature,
> > > demonstrating ability to continue to significantly contribute through
> > > a 3 month period in the subject area.
> >
> > That seems more fuzzy to define and assess (?).
>
> I agree; choosing candidates is a fuzzy area. This was meant to
> summarise my understanding of how we chose candidates last time. It is
> based on their proposal submitted to GSOC but also impressions from
> the bonding period.

As you noted in your post-GSoc 2020 suggestions, the issue
stemmed from not having a concrete way to evaluate the bonding
period.
This should be solved (for "[Statistics]") with your proposal above.

I'd be glad to get help with defining concrete tasks for the ideas
below. :-)

> >
> > Some ideas (for "Commons Math"):
> > 1. Redesign and modularization of the "ml" package
> >   -> main goal: enable multi-thread usage
> > 2. Abstracting the linear algebra utilities
> >   -> main goal: allow (runtime?) switch to alternative implementations
> > 3. Redesign and modularization of the "random" package
> >   -> main goal: general support of low-discrepancy sequences
> > 4. Refactoring and modularization of the "special" package
> >  -> main goal: ensure accuracy and performance and better API,
> >  add other functions (?).
> >
> > > Without this set of skills there will be little progress in the formal
> > > code period.
> >
> > :-}
> >
> > Shall we open a "GSoC 2022" report in each concerned JIRA project?
>
> Yes. I think we just cr

Re: [MATH][GA] Build Failure for PR #204

2022-02-02 Thread Gilles Sadowski
Hi.

On Wed, Feb 2, 2022 at 09:29, Avijit Basak wrote:
>
> Hi All
>
> Please see my comments below.
>
> [...]
>
>
> And there was this old issue that the "<artifactId>" should contain
> the name of the top-level package, i.e. "math4", not "math".
> -- There was a review comment for PR#197 to remove 4 from artifactid.
> "aherbert <https://github.com/aherbert> on Sep 25, 2021
> <https://github.com/apache/commons-math/pull/197#discussion_r716075956>
>
> Remove the 4 from math4. The version is specified separately from the
> artifact ID."

Indeed, it seems that there are discrepant expectations or a
misunderstanding about how to compose the "<artifactId>".
In "Commons Math", it contains "math4" as (IIUC) a unique
identifier of the top-level package (that is updated with every
major version).  Because of that latter convention, it is true that
the "4" is redundant with the (major) version number.
However, it could also be construed that the redundancy may
be useful for stressing that artefacts with different major versions
can be used together (without "JAR hell").
That view of having the "packageId" as part of the artifact's name
is used in some other components (e.g. "[Lang]"[1]) but not all
(e.g. "[IO]"[2])...

>
> I've updated the feature branch with those changes. Please rebase.
>
> I've not yet looked at the code, but a question arose from looking at
> the dependencies: What is "jfreechart" used for in the "examples"?
> -- jfreechart is used to do a graphical plot of the optimization process.
>
> I've just updated the "k-means" example, removing the GUI along
> the way.  In general, I think that the example applications should
> follow the KISS principle (which here translates to:  Only write to the
> console or to files).  Since we don't intend to write full-fledged
> applications, building/testing should be as smooth as possible: GUIs
> entail unnecessary hassle for someone working from a remote
> (text) terminal.
> -- I shall remove that and the corresponding part of the code.

Thanks.
Please note that I don't suggest that you remove the tracking of
the optimization process (it is useful to have a trace in order to
check that evolution proceeds as expected).  Instead of displaying
a GUI, you can save snapshots (either in text form or, if the
check is more easily done graphically, by using the "[Imaging]"
component[3]).
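
For the text form, a minimal sketch could look like the following; the
class name, the file layout (one line per generation) and the recorded
quantity are all assumptions for illustration:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

final class ConvergenceTrace implements AutoCloseable {
    private final BufferedWriter out;

    /** Opens (or overwrites) the trace file and writes a header line. */
    ConvergenceTrace(Path file) throws IOException {
        out = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
        out.write("# generation best_fitness");
        out.newLine(); // Platform line separator.
    }

    /** Appends one snapshot of the optimization process. */
    void record(int generation, double bestFitness) throws IOException {
        out.write(String.format(Locale.ROOT, "%d %.6e", generation, bestFitness));
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

Such a file can be inspected from a remote terminal, or plotted later
with any external tool, without the example itself depending on a GUI
library.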

Regards,
Gilles

> [...]
>

[1] 
https://gitbox.apache.org/repos/asf?p=commons-lang.git;a=blob;f=pom.xml;h=4f12fdf537fd56a69d1b94567e22de99761ec775;hb=HEAD#l28
[2] 
https://gitbox.apache.org/repos/asf?p=commons-io.git;a=blob;f=pom.xml;h=8f61ca0177a056a80dda656dbb70a9774adac548;hb=HEAD#l26
[3] See e.g. the "kmeans/image" module.

>
> Thanks & Regards
> --Avijit Basak
>
> On Tue, 1 Feb 2022 at 05:24, Gilles Sadowski  wrote:
>
> > Hello.
> >
> > On Mon, Jan 31, 2022 at 06:27, Avijit Basak wrote:
> > >
> > > Hi All
> > >
> > > Please find my comments below.
> > >
> > > >There is no attachment (I think that the ML manager strips those).
> > > >Please copy/paste the relevant part of the console log (or provide
> > > >a link to it).
> > > --The build was done locally with a fresh clone of the feature branch.
> >
> > Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
> > Commons Numbers, instead of version 1.1-SNAPSHOT.
> > This creates many "NoClassDefFound" errors that were fixed with
> > commit 7e2213f2e5a536ad49d549d21f9eed9e71db5638 in branch
> > "feature__MATH-1563__genetic_algorithm" branch 6 days ago.
> >
> > Anyways, after fetching your PR and rebasing on that branch, the
> > build is successful.
> >
> > Nevertheless, I had to fix/consolidate many POM files that contained
> > a slew of duplicate declarations (the "dependency management" is
> > done at the highest possible level, to ensure version consistency).
> > Also, please use the same formatting rules as in existing files (in
> > POM files, the indentation is 2 spaces).
> >
> > And there was this old issue that the "" should contain
> > the name of the top-level package, i.e. "math4", not "math".
> >
> > I've updated the feature branch with those changes. Please rebase.
> >
> > I've not yet looked at the code, but a question arose from looking at
> > the dependencies: What is "jfreechart" used for in the "examples"?
> > I've just updated the "k-means" ex

Re: [All][Math] Jenkins: multiple branches (how to ...?)

2022-02-01 Thread Gilles Sadowski
Le dim. 30 janv. 2022 à 19:34, Gilles Sadowski  a écrit :
>
> I filed a JIRA report:
>https://issues.apache.org/jira/browse/INFRA-22818

No answer yet; I created another Jenkins job in the meantime:
  https://ci-builds.apache.org/job/Commons/job/commons-math__ga_branch/

Gilles




Re: [MATH][GA] Build Failure for PR #204

2022-01-31 Thread Gilles Sadowski
Hello.

Le lun. 31 janv. 2022 à 06:27, Avijit Basak  a écrit :
>
> Hi All
>
> Please find my comments below.
>
> >There is no attachment (I think that the ML manager strips those).
> >Please copy/paste the relevant part of the console log (or provide
> >a link to it).
> --The build was done locally with a fresh clone of the feature branch.

Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
Commons Numbers, instead of version 1.1-SNAPSHOT.
This causes many "NoClassDefFound" errors that were fixed with
commit 7e2213f2e5a536ad49d549d21f9eed9e71db5638 in the
"feature__MATH-1563__genetic_algorithm" branch 6 days ago.

Anyway, after fetching your PR and rebasing on that branch, the
build is successful.

Nevertheless, I had to fix/consolidate many POM files that contained
a slew of duplicate declarations (the "dependency management" is
done at the highest possible level, to ensure version consistency).
Also, please use the same formatting rules as in existing files (in
POM files, the indentation is 2 spaces).

And there was this old issue that the "" should contain
the name of the top-level package, i.e. "math4", not "math".

I've updated the feature branch with those changes. Please rebase.

I've not yet looked at the code, but a question arose from looking at
the dependencies: What is "jfreechart" used for in the "examples"?
I've just updated the "k-means" example, removing the GUI along
the way.  In general, I think that the example applications should
follow the KISS principle (which here translates to:  Only write to the
console or to files).  Since we don't intend to write full-fledged
applications, building/testing should be as smooth as possible: GUIs
entail unnecessary hassle for someone working from a remote
(text) terminal.

> Please find the log below. Kindly let me know once the build is successful.
> The command was "*mvn clean verify apache-rat:check checkstyle:check
> pmd:check spotbugs:check javadoc:javadoc*".

As Alex noted, you should ensure that the build is successful with
the supported version of the JDK (i.e. Java 8 currently).
[If you encounter problems with a later version, it's always nice to
file a JIRA report, but fixing such issues is probably low priority.]

Regards,
Gilles

[1] 
https://github.com/apache/commons-math/blob/7a6c35b7396dbf0fb06cde5c29ce405821416d68/pom.xml#L66



> ...
> main:
> [INFO] Executed tasks
> [INFO]
> [INFO] <<< maven-javadoc-plugin:3.2.0:javadoc (default-cli) <
> generate-sources @ commons-math4-legacy <<<
> [INFO]
> [INFO]
> [INFO] --- maven-javadoc-plugin:3.2.0:javadoc (default-cli) @
> commons-math4-legacy ---
> [INFO] No previous run data found, generating javadoc.
> [INFO]
> 19 errors
> [INFO]
> 
> [INFO] Reactor Summary for Apache Commons Math 4.0-SNAPSHOT:
> [INFO]
> [INFO] Apache Commons Math ........................ SUCCESS [ 11.692 s]
> [INFO] Miscellaneous core classes ................. SUCCESS [ 34.790 s]
> [INFO] Artificial neural networks ................. SUCCESS [ 26.235 s]
> [INFO] Transforms ................................. SUCCESS [ 29.626 s]
> [INFO] genetic algorithm .......................... SUCCESS [ 34.566 s]
> [INFO] Exception classes (Legacy) ................. SUCCESS [ 25.376 s]
> [INFO] Miscellaneous core classes (Legacy) ........ SUCCESS [ 37.143 s]
> [INFO] Apache Commons Math (Legacy) ............... FAILURE [02:31 min]
> [INFO] Example applications ....................... SKIPPED
> [INFO] SOFM ....................................... SKIPPED
> [INFO] SOFM: Chinese Rings ........................ SKIPPED
> [INFO] SOFM: Traveling Salesman Problem ........... SKIPPED
> [INFO] K-Means .................................... SKIPPED
> [INFO] K-Means: Image Clustering .................. SKIPPED
> [INFO] examples-genetic-algorithm ................. SKIPPED
> [INFO] examples-ga-math-functions ................. SKIPPED
> [INFO] examples-ga-tsp ............................ SKIPPED
> [INFO]
> 
> [INFO] BUILD FAILURE
> [INFO]
> 
> [INFO] Total time:  05:51 min
> [INFO] Finished at: 2022-01-30T11:35:44+05:30
> [INFO]
> 
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-javadoc-plugin:3.2.0:javadoc (default-cli)
> on project

Re: [All] GSoC 2022

2022-01-31 Thread Gilles Sadowski
Hello.

Le jeu. 27 janv. 2022 à 18:09, Alex Herbert  a écrit :
>
> I would be willing to go through GSOC again.

Thanks; I know that back in 2020, it had been a disproportionate
amount of work...

> I think that the
> statistics component could again serve as a project. There are some
> packages in Math that could be moved to make use of the updated
> distributions (e.g. math.stat.inference)

That would be great, although I seem to notice that there
might be some dependency issues...

> or perhaps a reworking of the
> math.stat.descriptive package to support using them with streams.

+1
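
[For illustration only: a minimal sketch of what "stream support"
could mean for descriptive statistics.  The class "RunningMoments"
is a hypothetical name, not an existing Commons Math or Commons
Statistics class; it assumes that an accumulator with a "combine"
step (so that partial results of a parallel stream can be merged)
is the intended shape.]

```java
import java.util.stream.DoubleStream;

/**
 * Hypothetical accumulator (not an existing Commons API): computes the
 * mean and the sample variance in one pass, and can merge partial results.
 */
final class RunningMoments {
    /** Number of values seen so far. */
    private long n;
    /** Current mean. */
    private double mean;
    /** Sum of squared deviations from the current mean. */
    private double m2;

    /** Adds one value (Welford's update). */
    void accept(double x) {
        n++;
        final double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Merges another accumulator into this one (pairwise update). */
    RunningMoments combine(RunningMoments other) {
        if (other.n == 0) {
            return this;
        }
        if (n == 0) {
            n = other.n;
            mean = other.mean;
            m2 = other.m2;
            return this;
        }
        final long total = n + other.n;
        final double delta = other.mean - mean;
        m2 += other.m2 + delta * delta * ((double) n * other.n / total);
        mean += delta * other.n / total;
        n = total;
        return this;
    }

    long getN() {
        return n;
    }

    double getMean() {
        return n > 0 ? mean : Double.NaN;
    }

    /** Sample variance (divisor n - 1); NaN for fewer than 2 values. */
    double getVariance() {
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }

    /** Convenience entry point: consumes a (possibly parallel) stream. */
    static RunningMoments of(DoubleStream values) {
        return values.collect(RunningMoments::new,
                              RunningMoments::accept,
                              RunningMoments::combine);
    }
}
```

Usage would be along the lines of
"RunningMoments.of(DoubleStream.of(data).parallel()).getVariance()".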

> In the last iteration (GSOC 2020) we failed to get enough of a picture
> of the competence of candidates in the 'bonding phase' before places
> were formally allocated. I think we should require that a candidate
> can:
>
> - Open a PR on GitHub to add a feature in the topic area. It should be
> of non-trivial complexity and delivered to a quality ready to merge.

Do you think that the above "stream support" could be that task?

> - Show knowledge of the topic area beyond this single feature,
> demonstrating ability to continue to significantly contribute through
> a 3 month period in the subject area.

That seems more fuzzy to define and assess (?).

Some ideas (for "Commons Math"):
1. Redesign and modularization of the "ml" package
  -> main goal: enable multi-thread usage
2. Abstracting the linear algebra utilities
  -> main goal: allow (runtime?) switch to alternative implementations
3. Redesign and modularization of the "random" package
  -> main goal: general support of low-discrepancy sequences
4. Refactoring and modularization of the "special" package
  -> main goal: ensure accuracy and performance and better API,
     add other functions (?).
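
[For illustration only, regarding item 3: a minimal sketch of a
low-discrepancy sequence, assuming the simplest construction
(van der Corput radical inverse, combined into a 2D Halton point).
The class "VanDerCorput" is a hypothetical name; it says nothing
about how a redesigned "random" package should expose such
sequences.]

```java
/**
 * Hypothetical illustration (not an existing Commons API) of a
 * low-discrepancy sequence based on the radical inverse function.
 */
final class VanDerCorput {
    private VanDerCorput() {}

    /**
     * Radical inverse: the digits of "index" in the given base are
     * mirrored around the radix point.
     *
     * @param index Non-negative index in the sequence.
     * @param base Base (at least 2).
     * @return a value in [0, 1).
     */
    static double radicalInverse(long index, int base) {
        double result = 0;
        double fraction = 1.0 / base;
        long i = index;
        while (i > 0) {
            // Least significant digit becomes the most significant fraction digit.
            result += (i % base) * fraction;
            i /= base;
            fraction /= base;
        }
        return result;
    }

    /**
     * 2D Halton point: one radical inverse per dimension, using
     * coprime bases (2 and 3).
     */
    static double[] halton2d(long index) {
        return new double[] {radicalInverse(index, 2), radicalInverse(index, 3)};
    }
}
```

In base 2, the indices 1, 2, 3, 4 yield 0.5, 0.25, 0.75, 0.125:
each new point falls into the largest gap left by the previous ones.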

> Without this set of skills there will be little progress in the formal
> code period.

:-}

Shall we open a "GSoC 2022" report in each concerned JIRA project?

Regards,
Gilles




Re: [VOTE] Release Apache Commons Daemon 1.2.5 based on RC1

2022-01-31 Thread Gilles Sadowski
>
> [...]
>
> On the same topic, should we increase the version number to 1.3.0?

Yes.

Regards,
Gilles

> [...]




Re: [All][Math] Jenkins: multiple branches (how to ...?)

2022-01-30 Thread Gilles Sadowski
Hello.

Le sam. 29 janv. 2022 à 22:41, Bruno Kinoshita
 a écrit :
>
> [...]
>
> Just logged in now and confirmed that there is an "asfinfra" room. Over the
> weekend I only found messages from the bots about JIRA/GitHub for INFRA.
> But they were chatting in the channel Wed/Thu/Friday. So if you join you
> should be able to get it answered pretty quickly over the next week, or use
> JIRA if you don't want to use Slack, they normally respond quickly over
> there too.

I filed a JIRA report:
   https://issues.apache.org/jira/browse/INFRA-22818

Gilles

>> [...]



