Re: [MATH][GA] Issues in "commons-math4-ga2" design
Hi All

Please see my comments below. Kindly share further thoughts.

> [...]
>I'm not sure what you mean: The examples just run a GA-like algorithm,
>but (AFAICT) do not compare the output to some expected outcome.

-- I have some code changes in the "examples-ga-math-functions" module to compare results of two modules "commons-math4-ga" and "commons-math4-ga2". A graphical approach using JFreeChart has been adopted for the same. A new value "COMPARE" has been introduced for the "--api" input argument to initiate the comparison. The "commons-math4-ga" module consistently provided better results than "commons-math4-ga2". The code is kept in my repo https://github.com/avijit-basak/commons-math/tree/feature/MATH-1563_comparison . I did not raise a PR till now. This is only kept in my repo for comparison.
Could you please check if the feature__MATH-1563__genetic_algorithm branch does contain changes from master of apache repo.

>> This variant design is more appropriate for a *generalized population based
>> stochastic optimizer* which can accommodate other algorithms like
>> multi-agent gradient descent/simulated annealing, genetic algorithm(already
>> implemented), particle swarm optimization and large neighbourhood search
>> etc.
>> If we want to stick to this new design I would rather suggest *renaming* of
>> the existing interfaces so that the API can be more generic and can be used
>> for all other algorithms. GA should be a specific implementation for that
>> API.
>> However, we might have to think more on the multiple operator scenarios.
>
>An interesting suggestion. If the generalized API can be achieved
>easily, I'm all for it.
>However, I wonder how useful it will be, as every actual optimizer
>implementation may
> * require substantial adaptations to fit the common API
> * need extensions to provide access to specific features (which
>   would decrease the usefulness of the common API for users).

[...]

-- We can avoid that for now as that will be a bigger task.
> >[1] My main argument for the "GA variant" is that it is much simpler, for > what seems equivalent functionality (bugs, or misinterpretation of > expected behaviour, notwithstanding): Current counts of lines of > code is 696 vs 2038. -- The variant only contains options for binary genotype but the "commons-math4-ga" module provides options for other genotypes too. So, we may not compare the lines of code. However, considering the optimization result and options of genotypes I would still vote for "commons-math4-ga" instead of its new variant. Thanks & Regards --Avijit Basak On Thu, 29 Sept 2022 at 22:42, Gilles Sadowski wrote: > Hello. > > Le jeu. 29 sept. 2022 à 14:07, Avijit Basak a > écrit : > > > > Hi All > > > > Please find my comments below: > > > > > > > >> Hi All > > >> > > >> The newly proposed design of "commons-math4-ga2" has two > primary > > >> issues which I would like to mention here. > > >> > > >> *1) GA logic*: The design does not conform to the basic genetic > algorithm > > >I understand the concern about providing the standard ("historical") GA. > > >The theorem assumes the standard GA, but the example shows that > > >convergence is also achieved with the variant. > > > > -- Yes the new variant can accommodate the standard GA too. > > > > > > > >> However, the new design proposed as part of "commons-math4-ga2" > > >> deviates from the basic logic. It does not distinguish the operators > i.e. > > >> crossover and mutation and treats them uniformly. The order of > > >> operator application is also not considered. > > > > > >All intended as "features". ;-) > > >[One being that, in the variant implementation, it is possible to apply > > >any number of operators, not just one specific crossover followed by > > >one mutation.] 
> > > > > >Shouldn't we be able (IIUC) to define the standard GA procedure by > > >an extension of the API like the following (untested): > > >---CUT--- > > >public class CrossoverThenMutate > > >extends AbstractCrossover { > > >private AbstractCrossover c; > > >private AbstractMutation m; > > > [...] > > >private List mutate(G parent, > > > UniformRandomProvider rng) { > > >final List p = new ArrayList(1); > > >p.add(parent); > > >
Re: [MATH][GA] Issues in "commons-math4-ga2" design
Hi All

Please find my comments below:

>
>> Hi All
>>
>> The newly proposed design of "commons-math4-ga2" has two primary
>> issues which I would like to mention here.
>>
>> *1) GA logic*: The design does not conform to the basic genetic algorithm
>I understand the concern about providing the standard ("historical") GA.
>The theorem assumes the standard GA, but the example shows that
>convergence is also achieved with the variant.

-- Yes the new variant can accommodate the standard GA too.

>
>> However, the new design proposed as part of "commons-math4-ga2"
>> deviates from the basic logic. It does not distinguish the operators i.e.
>> crossover and mutation and treats them uniformly. The order of
>> operator application is also not considered.
>
>All intended as "features". ;-)
>[One being that, in the variant implementation, it is possible to apply
>any number of operators, not just one specific crossover followed by
>one mutation.]
>
>Shouldn't we be able (IIUC) to define the standard GA procedure by
>an extension of the API like the following (untested):
>---CUT---
>public class CrossoverThenMutate
>    extends AbstractCrossover {
>    private AbstractCrossover c;
>    private AbstractMutation m;
>    [...]
>    private List mutate(G parent,
>                        UniformRandomProvider rng) {
>        final List p = new ArrayList(1);
>        p.add(parent);
>        return m.apply(p, rng);
>    }
>}
>---CUT---
>
>AFAICT, a standard GA would thus be performed if this combined
>operator would be used as a unique operator in the GA variant.

--If we consider this approach we may need to modify our examples which assume the standard GA.
This variant design is more appropriate for a *generalized population based stochastic optimizer* which can accommodate other algorithms like multi-agent gradient descent/simulated annealing, genetic algorithm(already implemented), particle swarm optimization and large neighbourhood search etc.
If we want to stick to this new design I would rather suggest *renaming* of the existing interfaces so that the API can be more generic and can be used for all other algorithms. GA should be a specific implementation for that API.
However, we might have to think more on the multiple operator scenarios.

>
>> Along with that it executes
>> parent selection two times instead of one.
>
>That would also be taken care of with the above combined operator.
>
>> These are clear deviations from the standard approach used so far and would
>> require a fix.
>>
>>
>> *2) Determination of mutation probability*: The newly proposed design of
>> "commons-math4-ga2" determines the probability of mutation at the algorithm
>> level. Same approach was used in math 3.x implementation. However, this
>> approach considers the probability of mutation at the chromosome level not
>> at the allele/gene level. I have found a considerable difference in the
>> quality of optimization between two cases. Determining the mutation
>> probability at the gene/allele level has given a
>> considerably better result.
>
>A runnable test case (that creates a comparison) would be quite useful
>to illustrate the feature.
>
>> Usage of mutation probability at the chromosome
>> level would only ensure mutation of a single allele irrespective of
>> probability
>
>?
>In the basic implementation for the "binary" genotype (in class
>"o.a.c.m.ga2.gene.binary.Mutation"), there is a loop over all the
>alleles.
>
>> or chromosome size. There is no such limitation in case the
>> mutation probability is decided at the allele level and can be easily
>> controlled by users for fine tuning. This has helped to improve the
>> optimization quality thus providing better results. This is only related to
>> mutation not crossover. But we can maintain an uniform approach and let the
>> operator decide on the probability.
>
>I don't understand.
>Please refer to the class mentioned above and describe the required
>modifications.
-- E.g. assume the user has a population of 10 chromosomes and the chromosome length is 10.

mutation probability | no. of alleles modified per chromosome | no. of alleles modified in population
.2                   | 2                                      | 20
.1                   | 1                                      | 10
.05                  | --                                     | 5
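The figures in the table above are expected values when every allele mutates independently with the given probability. A minimal sketch of that arithmetic follows, together with the chromosome-level reading described earlier in this thread (a chromosome is picked with the given probability and then a single allele is changed). That reading is the claim under discussion, not a confirmed description of "commons-math4-ga2", and all class and method names below are hypothetical.

```java
import java.util.Random;

/** Hypothetical helper for illustration only; not part of either GA module. */
final class MutationRateSketch {
    /** Expected number of mutated alleles in the population when each allele mutates independently with probability p. */
    static double expectedAlleleLevel(double p, int chromosomeLength, int populationSize) {
        return p * chromosomeLength * populationSize;
    }

    /** Expected number of mutated alleles when a chromosome is picked with probability p and exactly one allele is changed (assumed reading). */
    static double expectedChromosomeLevel(double p, int populationSize) {
        return p * populationSize;
    }

    /** Counts the alleles that would be flipped in one pass of allele-level mutation. */
    static int simulateAlleleLevel(double p, int chromosomeLength, int populationSize, Random rng) {
        int flipped = 0;
        for (int c = 0; c < populationSize; c++) {
            for (int a = 0; a < chromosomeLength; a++) {
                if (rng.nextDouble() < p) {
                    flipped++;
                }
            }
        }
        return flipped;
    }

    public static void main(String[] args) {
        final int length = 10;
        final int populationSize = 10;
        final Random rng = new Random();
        for (double p : new double[] {0.2, 0.1, 0.05}) {
            System.out.printf("p=%.2f allele-level=%.1f chromosome-level=%.1f simulated=%d%n",
                              p,
                              expectedAlleleLevel(p, length, populationSize),
                              expectedChromosomeLevel(p, populationSize),
                              simulateAlleleLevel(p, length, populationSize, rng));
        }
    }
}
```

Under these assumptions the allele-level count scales with the chromosome length while the chromosome-level count does not, which is the difference the table is meant to show.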
[MATH][GA] Issues in "commons-math4-ga2" design
Hi All

The newly proposed design of "commons-math4-ga2" has two primary issues which I would like to mention here.

*1) GA logic*: The design does not conform to the basic genetic algorithm concepts proposed by John Holland. The pseudocode representing the original algorithm logic is provided below:
--CUT--
while(!converged(population)) {
    Population newPopulation = new Population();
    for(int i = 0; i < size(population)/2; i++) {
        // select parents
        ChromosomePair parents = select(population);
        // do crossover
        ChromosomePair offsprings = crossover(parents);
        // do mutation
        Chromosome chromosome1 = mutate(offsprings[0]);
        Chromosome chromosome2 = mutate(offsprings[1]);
        // add mutated chromosomes to population
        newPopulation.add(chromosome1);
        newPopulation.add(chromosome2);
    }
    // the new generation replaces the old one
    population = newPopulation;
}
--CUT--

However, the implementation proposed in "commons-math4-ga2" can be represented by the pseudocode provided below.
--CUT--
while(!converged(population)) {
    List<GeneticOperator> operators;
    Population newPopulation = new Population();
    for(int i = 0; i < size(population)/2; i++) {
        for(GeneticOperator operator : operators) {
            // select parents
            ChromosomePair parents = select(population);
            // apply operator
            ChromosomePair offsprings = operator.apply(parents);
            // add chromosomes to population
            newPopulation.add(offsprings[0]);
            newPopulation.add(offsprings[1]);
        }
    }
    // the new generation replaces the old one
    population = newPopulation;
}
--CUT--
N.B. The use of probability and elitism has been avoided to keep the logic simplified.

The first one has been used by the engineering community for decades and has proved to be effective. There is also a mathematical model based on schema theorem( https://en.wikipedia.org/wiki/Holland%27s_schema_theorem#:~:text=The%20Schema%20Theorem%20says%20that,the%20power%20of%20genetic%20algorithms.) to support the effectiveness of the algorithm. Same has been followed by me for implementation of "commons-math4-ga" module.

However, the new design proposed as part of "commons-math4-ga2" deviates from the basic logic. It does not distinguish the operators i.e. crossover and mutation and treats them uniformly. The order of operator application is also not considered. Along with that it executes parent selection two times instead of one. These are clear deviations from the standard approach used so far and would require a fix.

*2) Determination of mutation probability*: The newly proposed design of "commons-math4-ga2" determines the probability of mutation at the algorithm level. Same approach was used in math 3.x implementation. However, this approach considers the probability of mutation at the chromosome level not at the allele/gene level. I have found a considerable difference in the quality of optimization between two cases. Determining the mutation probability at the gene/allele level has given a considerably better result. Usage of mutation probability at the chromosome level would only ensure mutation of a single allele irrespective of probability or chromosome size. There is no such limitation in case the mutation probability is decided at the allele level and can be easily controlled by users for fine tuning. This has helped to improve the optimization quality thus providing better results. This is only related to mutation not crossover. But we can maintain a uniform approach and let the operator decide on the probability.

Please share further thoughts.

Thanks & Regards
-- Avijit Basak
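The two loops described in this message can be compared for a single generation with a small, self-contained sketch. It assumes bit-string chromosomes, uniform random parent selection, one-point crossover and bit-flip mutation; none of the names below come from "commons-math4-ga" or "commons-math4-ga2", and the variant loop follows the pseudocode as written here rather than the actual module code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of one generation of each loop; not library code. */
final class GaLoopSketch {
    /** Number of parent selections performed so far. */
    static int selections;

    static boolean[][] select(List<boolean[]> pop, Random rng) {
        selections++;
        return new boolean[][] {pop.get(rng.nextInt(pop.size())), pop.get(rng.nextInt(pop.size()))};
    }

    /** One-point crossover (chromosome length must be at least 2). */
    static boolean[][] crossover(boolean[][] parents, Random rng) {
        final int n = parents[0].length;
        final int cut = 1 + rng.nextInt(n - 1);
        final boolean[] a = new boolean[n];
        final boolean[] b = new boolean[n];
        for (int i = 0; i < n; i++) {
            a[i] = i < cut ? parents[0][i] : parents[1][i];
            b[i] = i < cut ? parents[1][i] : parents[0][i];
        }
        return new boolean[][] {a, b};
    }

    /** Bit-flip mutation applied independently to each allele. */
    static boolean[][] mutate(boolean[][] pair, double p, Random rng) {
        final boolean[][] out = new boolean[2][];
        for (int k = 0; k < 2; k++) {
            out[k] = pair[k].clone();
            for (int i = 0; i < out[k].length; i++) {
                if (rng.nextDouble() < p) {
                    out[k][i] = !out[k][i];
                }
            }
        }
        return out;
    }

    /** Standard loop: one selection, then crossover followed by mutation. */
    static List<boolean[]> standardGeneration(List<boolean[]> pop, double p, Random rng) {
        final List<boolean[]> next = new ArrayList<>();
        for (int i = 0; i < pop.size() / 2; i++) {
            final boolean[][] children = mutate(crossover(select(pop, rng), rng), p, rng);
            next.add(children[0]);
            next.add(children[1]);
        }
        return next;
    }

    /** Variant loop: every operator gets its own selection and contributes its own offspring. */
    static List<boolean[]> variantGeneration(List<boolean[]> pop,
                                             List<UnaryOperator<boolean[][]>> operators,
                                             Random rng) {
        final List<boolean[]> next = new ArrayList<>();
        for (int i = 0; i < pop.size() / 2; i++) {
            for (UnaryOperator<boolean[][]> op : operators) {
                final boolean[][] children = op.apply(select(pop, rng));
                next.add(children[0]);
                next.add(children[1]);
            }
        }
        return next;
    }

    /** Returns {standard selections, standard offspring, variant selections, variant offspring}. */
    static int[] demo(long seed) {
        final Random rng = new Random(seed);
        final List<boolean[]> pop = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final boolean[] c = new boolean[8];
            for (int j = 0; j < c.length; j++) {
                c[j] = rng.nextBoolean();
            }
            pop.add(c);
        }
        selections = 0;
        final int stdOffspring = standardGeneration(pop, 0.1, rng).size();
        final int stdSelections = selections;
        final List<UnaryOperator<boolean[][]>> ops = new ArrayList<>();
        ops.add(pair -> crossover(pair, rng));
        ops.add(pair -> mutate(pair, 0.1, rng));
        selections = 0;
        final int varOffspring = variantGeneration(pop, ops, rng).size();
        return new int[] {stdSelections, stdOffspring, selections, varOffspring};
    }
}
```

With a population of 10 and two operators, the standard loop performs 5 selections and yields 10 offspring, while the variant loop as written performs 10 selections and yields 20 offspring, none of which has gone through both operators.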
Re: [Math] Review of "genetic algorithm" module
st. >> > >> >A class to be used as a key only needs to implement "equals" and >> >"hashCode". >> -- The current chromosome class implements Comparable interface which uses >> chromosome fitness for comparison. Use of both Comparable and equals() >> might introduce inconsistencies. > >An example? -- Inconsistency can appear in case we provide a custom implementation of equals and hashcode following the representation of chromosome or use the default implementation. Since Comparable uses the fitness value to compare and as described above two chromosomes with separate representations can have the same fitness value this might result in inconsistency. But in the new module chromosome does not implement Comparable, so there is no possibility of the same. > >> > >> >> > >> >> >(6) >> >> >o.a.c.m.ga.chromsome.AbstractChromosome >> >> > >> >> >Field "fitness" is not "final", yet it could be: a "FitnessFunction" >> >> >object (used in "evaluate() to compute that field) is passed to the >> >> >constructor. Is there a reason for the "lazy" evaluation? >> >> >Dropping it would make the instance immutable (and "evaluate()" >> >> >should be renamed to "getFitness()"). >> >> > >> >> >Why should the "FitnessFunction" be stored in every chromosome? >> >> > >> >> -- I have modified the fitness as final and initialized the same in the >> >> constructor. >> > >> >Better, but did you check my proposal in MATH-1618, where >> >Chromosome and fitness are decoupled, and their relationship >> >is held within a "Population" instance? >> --Mentioned earlier. > >I still don't know whether you agree that my proposal makes it >simpler to express a GA. -- I think there are few points where we are not aligned and those are mentioned in the summary section. > >> > >> >> [...] > >> > >> >> > >> >> >(9) >> >> >Naming of factory methods should be harmonized to match the convention >> >> >adopted in components like [RNG] and [Numbers]. >> >> >E.g. 
instead of "newChromosome(...)", please use "of(...)" or >> "from(...)" >> >> >for "value object", and "create(...)" otherwise. >> >> > >> >> -- I have renamed the same for Chromosome classes. >> >> What about the nextGeneration() method of ListPopulation class. Renaming >> >> this to create() or from() won't convey the purpose of it. >> > >> >I agree, and that's why the new "Population" class (in MATH-1618) does >> >not provide a factory method (see also the "GeneticAlgorithmFactory" >> >class). >> -- We can avoid the same in the current model if we agree to use a default >> implementation of population and remove the Population interface following >> your new model. > >So, do we adopt that "new model"? >Or do you still have objections? -- Mentioned above. > >> > >> >> >(10) >> >> >o.a.c.m.ga.chromosome.AbstractListChromosome >> >> > >> >> >Constructor is called with an argument that is a previously instantiated >> >> >"representation". If the latter is mutable, the caller will be able to >> >> modify >> >> >the underlying data structure of the newly created chromosome. [The >> >> >doc assumes immutability of the representation but this cannot be >> >> >enforced, and mixed ownership can entail subtle bugs.] >> >> -- I think this applies to both representation as well as generic >> parameter >> >> type T. But I don't see any other option but to rely on the user. >> > >> >The Javadoc (at line 84) is misleading in its mention of "immutable". >> > >> >> If you have any suggestions kindly share. >> > >> >I may not understand all the implications, but I'd suggest that the >> >"representation" be instantiated within the control of the library (e.g. >> >through a "builder"/"factory"). >> -- Currently we have the ChromosomeRepresentationUtils for the same. Its >> methods are designed to generate the representations. > >My suggestion is that this design can be improved (a.o. according to my >above suggestion). -- Sure. > >> > >> &g
Re: [Math] Review of "genetic algorithm" module
r, but did you check my proposal in MATH-1618, where >Chromosome and fitness are decoupled, and their relationship >is held within a "Population" instance? --Mentioned earlier. > >> >(7) >> >Spurious "@since" tags: In the new code (in "commons-math-ga" >> >module), none should refer to a version < 4.0. >> > >> -- Some files are taken unchanged from the previous release. I have kept >> the same @since tag for those files. >> Do you need any change here? > >The old and new files are in different packages; "@since" tags >thus make no sense IMO. --I shall change it. > >> >> >(8) >> >@SuppressWarnings("unchecked") >> > >> >By default, I'm a bit suspicious about having to resort to these >> annotations, >> >especially for the kind of algorithms we are trying to implement. >> >What do you think of the alternative approach outlined in the ZIP file >> >attached in MATH-1618: >> >https://issues.apache.org/jira/browse/MATH-1618 >> >? >> -- This annotation is required because we have kept an option to use >> different types of genotypes including primitive. >> Because of that our base interfaces only declares phenotype not genotype. >> This introduced a kind of hierarchy in all operators and chromosome classes >> which required us to use the mentioned annotation. > >I may again be missing something. >Could you please explain the case that makes these annotations >necessary. -- This has been only used to avoid the warning in the place of typecasting. However, I can work to minimize this following your new model. > >> Even with the proposed new architecture we may not be able to avoid the >> same. > >The classes which I've added do not use the annotation... > >> -- It will be good if you can share some more information about the newly >> proposed architecture. The areas of current design which it can improve as >> well as the underlying intention. 
> >As noted in the comment on the JIRA page, the main intention is >maximal decoupling of functionalities that make up a GA (population, >fitness, selection, operator) and that seems achieved with the provided >classes. > >> > >> >(9) >> >Naming of factory methods should be harmonized to match the convention >> >adopted in components like [RNG] and [Numbers]. >> >E.g. instead of "newChromosome(...)", please use "of(...)" or "from(...)" >> >for "value object", and "create(...)" otherwise. >> > >> -- I have renamed the same for Chromosome classes. >> What about the nextGeneration() method of ListPopulation class. Renaming >> this to create() or from() won't convey the purpose of it. > >I agree, and that's why the new "Population" class (in MATH-1618) does >not provide a factory method (see also the "GeneticAlgorithmFactory" >class). -- We can avoid the same in the current model if we agree to use a default implementation of population and remove the Population interface following your new model. > >> >(10) >> >o.a.c.m.ga.chromosome.AbstractListChromosome >> > >> >Constructor is called with an argument that is a previously instantiated >> >"representation". If the latter is mutable, the caller will be able to >> modify >> >the underlying data structure of the newly created chromosome. [The >> >doc assumes immutability of the representation but this cannot be >> >enforced, and mixed ownership can entail subtle bugs.] >> -- I think this applies to both representation as well as generic parameter >> type T. But I don't see any other option but to rely on the user. > >The Javadoc (at line 84) is misleading in its mention of "immutable". > >> If you have any suggestions kindly share. > >I may not understand all the implications, but I'd suggest that the >"representation" be instantiated within the control of the library (e.g. >through a "builder"/"factory"). -- Currently we have the ChromosomeRepresentationUtils for the same. 
Its methods are designed to generate the representations. > >> > >> >(11) >> >Do we agree that, in a GA, the most time-consuming task is the fitness >> >computation? Hence IMO, it should be the focus of the multithreading >> >tools (i.e. "ExecutorService"), probably keeping the other parts (namely >> >the genetic operators) within a simple sequential loop (as in class >> >"GeneticAlgorithmFactory" in MATH-1618). >> -- Curren
Re: [Math] Review of "genetic algorithm" module
e tag for those files. Do you need any change here? >(8) >@SuppressWarnings("unchecked") > >By default, I'm a bit suspicious about having to resort to these annotations, >especially for the kind of algorithms we are trying to implement. >What do you think of the alternative approach outlined in the ZIP file >attached in MATH-1618: >https://issues.apache.org/jira/browse/MATH-1618 >? -- This annotation is required because we have kept an option to use different types of genotypes including primitive. Because of that our base interfaces only declares phenotype not genotype. This introduced a kind of hierarchy in all operators and chromosome classes which required us to use the mentioned annotation. Even with the proposed new architecture we may not be able to avoid the same. -- It will be good if you can share some more information about the newly proposed architecture. The areas of current design which it can improve as well as the underlying intention. > >(9) >Naming of factory methods should be harmonized to match the convention >adopted in components like [RNG] and [Numbers]. >E.g. instead of "newChromosome(...)", please use "of(...)" or "from(...)" >for "value object", and "create(...)" otherwise. > -- I have renamed the same for Chromosome classes. What about the nextGeneration() method of ListPopulation class. Renaming this to create() or from() won't convey the purpose of it. >(10) >o.a.c.m.ga.chromosome.AbstractListChromosome > >Constructor is called with an argument that is a previously instantiated >"representation". If the latter is mutable, the caller will be able to modify >the underlying data structure of the newly created chromosome. [The >doc assumes immutability of the representation but this cannot be >enforced, and mixed ownership can entail subtle bugs.] -- I think this applies to both representation as well as generic parameter type T. But I don't see any other option but to rely on the user. If you have any suggestions kindly share. 
> >(11) >Do we agree that, in a GA, the most time-consuming task is the fitness >computation? Hence IMO, it should be the focus of the multithreading >tools (i.e. "ExecutorService"), probably keeping the other parts (namely >the genetic operators) within a simple sequential loop (as in class >"GeneticAlgorithmFactory" in MATH-1618). -- Current implementation uses separate threads for applying crossover and mutation operators for each pair of selected chromosomes. I think this ensures better utilization of multi-core processors compared to use of multi-threading only for the fitness calculation. -- Some codes are checked in. But there is a conflict in the pull request. So I shall create a new one and delete the old branch itself. Thanks & Regards --Avijit Basak On Fri, 15 Apr 2022 at 03:03, Gilles Sadowski wrote: > Hello. > > > > > [...] > > (1) > o.a.c.m.ga.GeneticAlgorithmTestPermutations > (under "src/test") > > As per your comment in that class, it is a usage example. > Given that its name does not end with "Test", it is not run by the > test suite. Please move it to the "examples" module. > > (2) > I'm missing a high-level doc that would enable a newbie to figure > out what to implement in order to get going. > E.g. what is the interplay between > * genotype > * allele > * phenotype > * decoder > * fitness function > ? > Several classes do not provide explanations (or links) about the > concept which they represent. For example, there is no doc about > what a "RandomKeyDecoder" is, and the reason for using it (or not). > > (3) > o.a.c.m.ga.utils.ChromosomeRepresentationUtils > > It seems to be a "mixed-bag" kind of class (that is being frowned > upon nowadays). > Its comment refers to "random" but some methods are not using > any randomization. Most methods are only used in unit tests. 
> > (4) > o.a.c.m.ga.RandomProviderManager > > As already discussed, this class should not be part of the public > API, namely because the "getRandomProvider()" method returns > an object that is not thread-safe. > If used internally as "syntactic sugar", it should be located in a > package named "internal"; however I'd tend to remove it > altogether, and call "ThreadLocalRandomSource.current(...)" > explicitly. > > (5) > Why does a "Chromosome" need an "identifier"? > Method "getId()" is only used in "PopulationStatisticalSummaryImpl" > that is an internal class, where it seems that the chromosome itself > (rather than its "id") could serve as the map's key.
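Point (5) can be illustrated with a minimal sketch. It assumes a hypothetical chromosome-like class whose "equals" and "hashCode" follow the representation and whose "compareTo" follows the fitness; it is not the actual "Chromosome" class. Such a class works as a "HashMap" key without any identifier, and the sketch also shows the kind of inconsistency between "Comparable" and "equals()" that is raised elsewhere in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; the nested class is not the library's Chromosome. */
final class KeySketch {
    static final class Chromo implements Comparable<Chromo> {
        private final int[] genes;
        private final double fitness;

        Chromo(double fitness, int... genes) {
            this.genes = genes.clone(); // defensive copy keeps the key immutable
            this.fitness = fitness;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Chromo && Arrays.equals(genes, ((Chromo) o).genes);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(genes);
        }

        /** Ordering by fitness only: not consistent with equals(). */
        @Override
        public int compareTo(Chromo other) {
            return Double.compare(fitness, other.fitness);
        }
    }

    /** Two different representations with the same fitness, stored in a hash-based map. */
    static int hashMapSize() {
        final Map<Chromo, String> map = new HashMap<>();
        map.put(new Chromo(3.0, 1, 1, 1, 0), "a");
        map.put(new Chromo(3.0, 0, 1, 1, 1), "b");
        return map.size();
    }

    /** The same two keys in a sorted map, which relies on compareTo(). */
    static int treeMapSize() {
        final Map<Chromo, String> map = new TreeMap<>();
        map.put(new Chromo(3.0, 1, 1, 1, 0), "a");
        map.put(new Chromo(3.0, 0, 1, 1, 1), "b");
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println("HashMap entries: " + hashMapSize());
        System.out.println("TreeMap entries: " + treeMapSize());
    }
}
```

The hash-based map keeps both chromosomes, whereas the sorted map treats them as the same key because their fitness values compare as equal. If the chromosome does not implement "Comparable", as stated for the new module, only the first case applies.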
Re: [Math] Review of "genetic algorithm" module
pplication layer). [This would allow the removal of > "updateListenerRigistry" method (note: There is a typo in that name).] --This is corrected. >* Are annotations (@SafeVarargs, ...) necessary? Please document. -- This annotation is necessary for any parameterized vararg. This is also used in legacy classes like o.a.c.m.l.a.i.FieldHermiteInterpolator and o.a.c.m.l.o.n.RungeKuttaFieldStepInterpolator. >In "AdaptiveGeneticAlgorithm": >* There should be a single constructor (same remark as above). -- Removed the constructor with default argument. >* Why the use of reflection ("isAssignableFrom")? -- Replaced it by instanceof. -- Created a new PR https://github.com/apache/commons-math/pull/209. Thanks & Regards --Avijit Basak On Sun, 3 Apr 2022 at 19:52, Gilles Sadowski wrote: > Hello. > > Le mar. 29 mars 2022 à 17:08, Avijit Basak a > écrit : > > > > Hi All > > > > Please find my comments below. > > > > [...] > > > > --I have made the changes and created a new PR. Kindly review the same > and > > share your thoughts. > > https://github.com/apache/commons-math/pull/208 > > I've merged PR #208 into the feature branch (please open a > new one for changes entailed by the comments below). > I again had to delete the branch (and recreate it with the merged > changes from PR #208). [I must be missing something about the > correct git workflow...] > > There seems to be something wrong in the "examples-ga-tsp" > application (fitness does not change). > > At the end of the run, one should be able to quickly assess the > goodness of the solution; the new code prints a line with many > "Node [...]" elements while the "--legacy" switch prints the "best" > fitness and a list of indices. In either case, the solution should > consist of the list of visited cities (one per line) and the total > distance. > > I can't seem to find how the logger is configured. Currently, all > "INFO" messages are logged to the "standard error" console; one > should be able to e.g. 
redirect output to a file, or set the log level. > > There is still a mix between library code and application code (but > this is to be discussed in MATH-1643. > > From browsing the library code, I'm tempted to believe that the > dependency towards a logging framework is not necessary (or > underused). I think that such a feature could be left to the application > layer (per the "ConvergenceListener" registry). > Likewise, the "PopulationStatisticsLogger" is not general enough to > be worth being part of the library. > > A few (nit-pick) remarks about code style in general. > Javadoc is incomplete: All methods must be documented. > Please avoid redundant links like e.g. > ---CUT--- > /** > * @param crossoverPolicy The {@link CrossoverPolicy} > * @param mutationPolicy The {@link MutationPolicy} > * @param selectionPolicy The {@link SelectionPolicy} > * @param convergenceListeners An optional collection of > * {@link ConvergenceListener} with > variable arity > */ > @SafeVarargs > protected AbstractGeneticAlgorithm(final CrossoverPolicy > crossoverPolicy, > final MutationPolicy mutationPolicy, > final SelectionPolicy selectionPolicy, > ConvergenceListener... convergenceListeners) { > this.crossoverPolicy = crossoverPolicy; > this.mutationPolicy = mutationPolicy; > this.selectionPolicy = selectionPolicy; > updateListenerRigistry(convergenceListeners); > } > ---CUT--- > Readers of the HTML-generated doc can already click on the various > arguments within the signature; so there is no need to add visual noise > in the source code just to be able to click from within the Javadoc part > just above that signature. > The Javadoc block above should be > ---CUT--- > /** > * @param crossoverPolicy Crossover policy. > * @param mutationPolicy Mutation policy. > * @param selectionPolicy Selection policy. > * @param convergenceListeners Collection of user-defined listeners. > */ > ---CUT--- > [Note the absence of "The" and the presence of a final "period".] 
> > A blank line is welcome to separate ideas ("logical" blocks of code) > However, there should not be an empty line after a closing brace if > it is followed by another closing brace. > Also, in all recent codes, there is no blank line between the instance > fields; the (mandatory) Javadoc is enough to logically (and visually) > separate the fields. > > In "AbstractGeneticAlgorithm": > * There should be a single constructor (handling default values should >be left to the application layer). [This would allow the removal of >"updateListenerRigistry" method (note: There is a typo in that name).] > * Are annotations (@SafeVarargs, ...) necessary? Please document. > > In "AdaptiveGeneticAlgorithm": > * There should be a single constructor (same remark as above). > * Why the use of reflection ("isAssignableFrom")? > > Regards, > Gilles > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >
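The "@SafeVarargs" question and the "single constructor" remark can be illustrated together. The sketch below assumes a hypothetical generic listener interface and algorithm class (neither is the library's "ConvergenceListener" or "AbstractGeneticAlgorithm"). A varargs parameter of a parameterized type triggers an unchecked "heap pollution" warning at compile time unless the constructor is annotated; the annotation is only appropriate when, as here, the constructor just reads the array.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; names do not come from the GA module. */
final class SafeVarargsSketch {
    /** Receives a notification after each generation. */
    interface Listener<P> {
        /**
         * @param generation Generation number.
         * @param population Current population.
         */
        void onGeneration(int generation, P population);
    }

    /** Algorithm with a single constructor; defaults are left to the caller. */
    static class Algorithm<P> {
        /** Registered listeners. */
        private final List<Listener<P>> listeners = new ArrayList<>();

        /**
         * @param listeners User-defined listeners.
         */
        @SafeVarargs // The array is only read, never stored or exposed.
        Algorithm(Listener<P>... listeners) {
            Collections.addAll(this.listeners, listeners);
        }

        /**
         * @param generation Generation number.
         * @param population Current population.
         */
        void fire(int generation, P population) {
            for (Listener<P> l : listeners) {
                l.onGeneration(generation, population);
            }
        }
    }

    /** Registers two listeners, fires one event and returns the number of notifications. */
    static int demo() {
        final List<String> log = new ArrayList<>();
        final Algorithm<String> ga = new Algorithm<String>(
            (g, p) -> log.add("first:" + g + ":" + p),
            (g, p) -> log.add("second:" + g + ":" + p));
        ga.fire(1, "population");
        return log.size();
    }
}
```

Passing the listeners as a plain collection instead of varargs would avoid the annotation altogether; which option suits the module is a design choice for the thread, not something this sketch decides.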
Re: [Math] Review of "genetic algorithm" module
Hi All

Please find my comments below.

[...]

>Just quickly commenting on this point.
>IIUC, your purpose is for users to be able to run (an example
>application of) the old implementation.
>
>This can be achieved by having all the "legacy" codes within
>module
>  commons-math-examples/examples-ga/examples-ga-math-functions
>(note: No "legacy" in the module's name), within a dedicated
>  o.a.c.m.examples.ga.mathfunctions.legacy
>package.
>
>This code is then called by the exact same code/application as
>for the new implementation (with the corresponding command
>line switch):
>  $ java -jar examples-ga-app.jar --legacy ... rest of the args ...
>
>Users can thus perform 2 runs; once with "--legacy" and one
>without it, and reach some conclusions.
>
>The duplicate codes only bring maintenance burden (to ensure
>that the "legacy" and non-"legacy" modules do indeed aim at
>solving the same problem).
>Whenever we then decide that the new code has been thoroughly
>tested, removal of the
>  o.a.c.m.examples.ga.mathfunctions.legacy
>package will be a minimal change (as compared to the removal
>of a module)

--I have made the changes and created a new PR. Kindly review the same and share your thoughts.
https://github.com/apache/commons-math/pull/208

Thanks & Regards
--Avijit Basak

On Mon, 28 Mar 2022 at 18:36, Gilles Sadowski wrote:

> Hello.
>
> On Mon, 28 Mar 2022 at 10:15, Avijit Basak wrote:
> >
> > [...]
> >
> > >The various "Standalone" classes also look quite similar; consolidating the
> > >"examples-ga" module (including full Javadoc) is necessary.
> > -- Could you please elaborate it more. IMHO as StandAlone classes are
> > dedicated to the specific module only, it would remain separate. Since we
> > have used a single domain to show utility of the different
> > types(adaptive/simple) of GA some classes have become similar.
> > > > >I still don't > > >understand why there are "...-legacy" modules in module "examples-ga". > > >If you want to offer the option of running the "old" implementation, you > > >could add a "legacy" flag (as "@Option" in the "Standalone" > application). > > -- There was a discussion on this some time back. The sole purpose of > > keeping the legacy example module is for comparison with the new > > implementation. It will be easier for anyone to visualize the quality > > improvement we achieved here. I don't want to mix(by legacy flag) this > > anyway with the new implementation. > > > > Just quickly commenting on this point. > > IIUC, your purpose is for users to be able to run (an example > application of) the old implementation. > > This can be achieved by having all the "legacy" codes within > module > commons-math-examples/examples-ga/examples-ga-math-functions > (note: No "legacy" in the module's name), within a dedicated > o.a.c.m.examples.ga.mathfunctions.legacy > package. > > This code is then called by the exact same code/application as > for the new implementation (with the corresponding command > line switch): > $ java -jar examples-ga-app.jar --legacy ... rest of the args ... > > Users can thus perform 2 runs; once with "--legacy" and one > without it, and reach some conclusions. > > The duplicate codes only bring maintenance burden (to ensure > that the "legacy" and non-"legacy" modules do indeed aim at > solving the same problem). > Whenever we then decide that the new code has been thoroughly > tested, removal of the > o.a.c.m.examples.ga.mathfunctions.legacy > package will be a minimal change (as compared to the removal > of a module). > > Regards, > Gilles > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
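The "--legacy" arrangement proposed above (one application, one switch, two implementations) can be sketched as follows. Only the "--legacy" flag comes from the thread; the class, the "Runner" interface and the two placeholder methods are hypothetical, and the real application uses annotation-based option parsing ("@Option") rather than the manual loop shown here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a single entry point that dispatches on "--legacy". */
final class LegacySwitchSketch {
    /** Common entry point shared by both implementations. */
    interface Runner {
        String run(List<String> args);
    }

    /** Placeholder for code living in a dedicated "legacy" package. */
    static String runLegacy(List<String> args) {
        return "legacy implementation, " + args.size() + " remaining args";
    }

    /** Placeholder for the new implementation. */
    static String runCurrent(List<String> args) {
        return "new implementation, " + args.size() + " remaining args";
    }

    /** Strips the switch and forwards the remaining arguments unchanged. */
    static String dispatch(String... args) {
        boolean legacy = false;
        final List<String> rest = new ArrayList<>();
        for (String a : args) {
            if ("--legacy".equals(a)) {
                legacy = true;
            } else {
                rest.add(a);
            }
        }
        final Runner runner = legacy ? LegacySwitchSketch::runLegacy : LegacySwitchSketch::runCurrent;
        return runner.run(rest);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(args));
    }
}
```

Because both runs go through the same argument handling, removing the legacy path later amounts to deleting one method (or package) and the switch.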
Re: [Math] Review of "genetic algorithm" module
Hi All

Please find my comments.

[...]

>I don't think that it's the right way to go; instantiating an "ExecutorService"
>belongs to the GA application, not the GA library (whose relevant classes
>need "only" be thread-safe).
>There is some misunderstanding to be clarified in a dedicated discussion
>(please file a new JIRA ticket).
-- I have created a subtask under the same Jira (MATH-1563). Please share your thoughts.
https://issues.apache.org/jira/browse/MATH-1643

>Side note: Conflicts and duplicate commits have accumulated in the
>dedicated "feature__MATH-1563__genetic_algorithm" branch.
>I did not know how to proceed in order to avoid ending up with a messy
>history in "master"; so I created a new branch (with the same name) with
>all the new GA-related files added as a single commit.
>Currently, this branch (based on your PR #205) fails the default goal,
>because of a CheckStyle issue. You should always check locally that
>running "mvn" without arguments does not generate any errors.

>I also noticed that classes in "examples-ga" use "forbidden" library
>classes: "GeneticIllegalArgumentException" is an "internal" class; we
>must not advertize such classes in the example applications.
-- I have replaced "GeneticIllegalArgumentException" with "IllegalArgumentException".

>In general, it seems that "examples-ga" contains several classes and
>methods that do not need to be "public". This is especially true for
>classes like "MathFunction" and "Coordinate". [Having those "private"
>helps users to tell what is part of the library's functionality from what is
>just "dummy" placeholder code.]
-- I have replaced MathFunction and CoordinateDecoder with lambdas. However, the Coordinate class is a domain object (phenotype), so it needs to remain public: it can be used in more than one place across the application.

>Finally (for now), I've just noticed that there exist several classes named
>"MathFunction", with same implementation!
>Code duplication must be avoided, especially where we purport to display >best practices. -- As mentioned above this has been removed. >The various "Standalone" classes also look quite similar; consolidating the >"examples-ga" module (including full Javadoc) is necessary. -- Could you please elaborate it more. IMHO as StandAlone classes are dedicated to the specific module only, it would remain separate. Since we have used a single domain to show utility of the different types(adaptive/simple) of GA some classes have become similar. >I still don't >understand why there are "...-legacy" modules in module "examples-ga". >If you want to offer the option of running the "old" implementation, you >could add a "legacy" flag (as "@Option" in the "Standalone" application). -- There was a discussion on this some time back. The sole purpose of keeping the legacy example module is for comparison with the new implementation. It will be easier for anyone to visualize the quality improvement we achieved here. I don't want to mix(by legacy flag) this anyway with the new implementation. >Please use the new branch for all these ("cleanup") changes, as the basis >a PR (with a *single* commit). -- I have taken the changes and will create a new PR soon with all my changes. Thanks & Regards --Avijit Basak On Sun, 13 Mar 2022 at 06:39, Gilles Sadowski wrote: > Hello. > > Le lun. 28 févr. 2022 à 07:11, Avijit Basak a > écrit : > > > > Hi All > > > > Please see my comments below. > > > > > [...] > > >I just had a very quick look. > > >IIUC, you always provide "convenience" methods (e.g. the various > > >signatures for the "evolve" functionality). > > >Prior to merging into "master", we should simplify and limit the > > >discussion to the core functionality, i.e. not try and make decisions > > >for the user (like default values, ...). Please keep the API as simple > > >as possible > > -- I have removed the mentioned evolve method. 
> > However, I had to catch two checked exceptions (InterruptedException, > > ExecutionException) and rethrow them. As of now I have handled them using > > the GeneticIllegalArgumentException. I think we need to introduce another > > exception class to handle this. Please share your thought regarding this. > > I don't think that it's the right way to go; instantiating an > "ExecutorService" > belongs to the GA application, not the GA library (whose relevant classes > need &quo
Re: [Math] Review of "genetic algorithm" module
Hi All

Please see my comments below.

> [...]
>I just had a very quick look.
>IIUC, you always provide "convenience" methods (e.g. the various
>signatures for the "evolve" functionality).
>Prior to merging into "master", we should simplify and limit the
>discussion to the core functionality, i.e. not try and make decisions
>for the user (like default values, ...). Please keep the API as simple
>as possible
-- I have removed the mentioned evolve method. However, I had to catch two checked exceptions (InterruptedException, ExecutionException) and rethrow them. As of now I have handled them using the GeneticIllegalArgumentException. I think we need to introduce another exception class to handle this. Please share your thoughts regarding this.

Thanks & Regards
--Avijit Basak

On Mon, 21 Feb 2022 at 20:11, Gilles Sadowski wrote:

> Hello.
>
> On Mon, 21 Feb 2022 at 06:56, Avijit Basak wrote:
> >
> > Hi All
> >
> > Please find my comments below:
> >
> > [...]
> >
> > >Another misunderstanding (probably); we must figure out where
> > >the parallelism will be implemented.
> > >IIUC the current state of the code, optimizing multiple populations
> > >in parallel would be the same as launching multiple JVMs; I'd want
> > >to explore low-level parallelism (i.e. at the "Chromosome" level).
> > -- I have implemented both multi-threading and multi-population parallelism.
>
> I just had a very quick look.
> IIUC, you always provide "convenience" methods (e.g. the various
> signatures for the "evolve" functionality).
> Prior to merging into "master", we should simplify and limit the
> discussion to the core functionality, i.e. not try and make decisions
> for the user (like default values, ...). Please keep the API as simple
> as possible.
>
> Thanks,
> Gilles
>
> >>>> [...]

--
Avijit Basak
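To make the question about the two checked exceptions concrete, here is a minimal sketch of what handling them on the application side could look like when the application, not the library, owns the thread pool. It uses only JDK classes; "ApplicationSidePoolSketch", its "evaluate" method and the choice of IllegalStateException are assumptions made for illustration, not the code under review.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.ToDoubleFunction;

/** Hypothetical application code: the pool is created and shut down by the caller. */
final class ApplicationSidePoolSketch {
    /** Evaluates every candidate on the supplied pool; "fitness" must be thread-safe. */
    static double[] evaluate(List<double[]> candidates,
                             ToDoubleFunction<double[]> fitness,
                             ExecutorService pool) {
        List<Future<Double>> futures = new ArrayList<>();
        for (double[] c : candidates) {
            Callable<Double> task = () -> fitness.applyAsDouble(c);
            futures.add(pool.submit(task));
        }
        double[] result = new double[candidates.size()];
        try {
            for (int i = 0; i < result.length; i++) {
                result[i] = futures.get(i).get(); // Blocks until task i is done.
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // Preserve the interrupt status.
            throw new IllegalStateException("Evaluation interrupted", e);
        } catch (ExecutionException e) {
            // Report the failure of the fitness function itself, not the wrapper.
            throw new IllegalStateException("Fitness computation failed", e.getCause());
        }
        return result;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<double[]> population = new ArrayList<>();
            population.add(new double[] {1, 2});
            population.add(new double[] {3, 4});
            double[] f = evaluate(population, x -> x[0] * x[0] + x[1] * x[1], pool);
            System.out.println(f[0] + " " + f[1]);
        } finally {
            pool.shutdown();
        }
    }
}
```

With this split, neither exception is an "illegal argument" condition, and the library itself never has to catch them.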
Re: [Math] Review of "genetic algorithm" module
Hi All

Please find my comments below:

>The build fails because of CheckStyle errors:
>https://app.travis-ci.com/github/apache/commons-math/builds/246683712
--Fixed the issues

>>>> [...]
>>
>> >
>> >> >I did not suggest to remove any Javadoc, only to rephrase it as:
[...]
>> >As hinted by my comment in the previous message, I've still to
>> >clarify my own expectations; but I vaguely sense some lost
>> >opportunity for simpler usage and increased performance
>> >if the caller just needs to specify the number of "worker
>> >threads".
>> -- We should have both options. Users can execute the algorithm by
>> specifying only the number of worker threads with a single population as
>> well as optimize multiple populations in a parallel fashion.
>
>Another misunderstanding (probably); we must figure out where
>the parallelism will be implemented.
>IIUC the current state of the code, optimizing multiple populations
>in parallel would be the same as launching multiple JVMs; I'd want
>to explore low-level parallelism (i.e. at the "Chromosome" level).
-- I have implemented both multi-threading and multi-population parallelism.
>
>> For the multi-population option the common thread pool would be reused for
>> all populations.
>> >
>> >Do we at least agree that
>> >1. Adding/retrieving a "Chromosome" to/from a "Population"
>> >must be thread-safe (and is not trivial)
>> >2. Fitness computation is where most time is usually spent
>> >(so that multi-threading must be achieved at that granularity)
>> >?
>> --The way I am thinking of designing a task is that it should accept the
>> current population and return an instance of ChromosomePair.
>> The chromosomes within this pair would be added to the population by the
>> caller thread. Population won't be updated by multiple threads.
>
>Then, it would not be a multi-threaded library.
>This kind of parallelism does not need support from the library,
>and can be implemented at the application level.
>
>> The code snippet below shows the body of the method which will be executed
>> inside the task.
>> --CUT--
>>
>> // selection
>> ChromosomePair pair = getSelectionPolicy().select(current);
>>
>> // crossover
>> if (randGen.nextDouble() < getCrossoverRate()) {
>>     // apply crossover policy to create two offspring
>>     pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond());
>> }
>>
>> // mutation
>> if (randGen.nextDouble() < getMutationRate()) {
>>     // apply mutation policy to the chromosomes
>>     pair = new ChromosomePair(
>>         getMutationPolicy().mutate(pair.getFirst()),
>>         getMutationPolicy().mutate(pair.getSecond()));
>> }
>>
>> return pair;
>>
>> --CUT--
>
>One of the issues with the above code is, again, that "randGen"
>must be thread-safe (and it is not, usually).
>Also, it doesn't say anything about how to ensure that
>the fitness computation is thread-safe (and if you assume
>that it will be computed outside that "task", then the
>performance gain will be very low).
>
-- The current implementation uses a thread-local version of the random number generator.

>> >
>> >I'd surmise that "multiple instances of AbstractGeneticAlgorithm"
>> >is an application concern; unless I'm missing something, it's
>> >not what I've in mind when talking about multi-threading.
>> >Actually, I was wondering whether we could implement the
>> >analog of what is in the "commons-math-neuralnet" module,
>> >where
>> >* "Neuron" is the counterpart of "Chromosome"
>> >* "Network" is the counterpart of "Population".
>> --"multiple instance of AbstractGeneticAlgorithm" is related to parallel GA
>> with multiple populations not multi-threading.
>
>Yes, as I also mentioned above.
>But I'm interested in where multi-threading can be implemented to
>be used in both cases (single population and multiple populations).
>
>> Users can also implement parallel GA in a synchronous manner although that
>> won't be a recommended way.
>> Multi-threading is only a way to improve performance using a user's multi >> core CPU. >> The threads in the thread pool would only be used to execute the task as >> mentioned in the previous comment. > >That's where I've some doubt. >But be free to implement benchmarks that demonstrate the >expected performance improvement. > >> I think we hav
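On the remark that "randGen" must be thread-safe: a minimal sketch of the per-thread generator idea is given below. It swaps in the JDK's "ThreadLocalRandom" purely for illustration (the library would more likely rely on a Commons RNG generator per thread), and the bit-flip "mutate" operator and the class name are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical illustration of drawing random decisions without a shared generator. */
final class ThreadLocalRandomSketch {
    /** Returns true with the given probability, using the calling thread's own generator. */
    static boolean occurs(double rate) {
        // ThreadLocalRandom.current() returns an instance confined to this thread.
        return ThreadLocalRandom.current().nextDouble() < rate;
    }

    /** Returns a copy of "genes" where each bit is flipped with probability "mutationRate". */
    static boolean[] mutate(boolean[] genes, double mutationRate) {
        boolean[] copy = genes.clone(); // The parent is never modified.
        for (int i = 0; i < copy.length; i++) {
            if (occurs(mutationRate)) {
                copy[i] = !copy[i];
            }
        }
        return copy;
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] parent = new boolean[16]; // All bits initially false.
        Runnable task = () -> {
            boolean[] child = mutate(parent, 0.25);
            int flipped = 0;
            for (boolean b : child) {
                if (b) {
                    flipped++;
                }
            }
            System.out.println(Thread.currentThread().getName() + ": " + flipped + " bits flipped");
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

This only addresses the generator; it says nothing about whether the fitness computation is thread-safe, which is the second concern raised above.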
Re: [Math] Review of "genetic algorithm" module
es the population using multiple threads. >> >> --This needs to be done. However, I would like to address this along >> with >> >> parallel GA i.e. convergence of multiple populations together. >> > >> >The two features (multi-thread vs multiple populations) should >> >be implemented independently: Users that only need the "basic" >> >GA should also be able to take advantage of their machine's >> >multiple CPUs. >> >[This is related to the design issue which I mentioned previously.] >> > >> -- I am thinking to leverage user's multiple CPUs for doing >> multi-population GA. > >OK (sort-of, since "the devil is in the details", and I'm not sure >that we mean the same thing by "multi", see below). > >> It would a global approach where same thread pool >> would be used for both purposes. Another class would be introduced for >> executing parallel genetic algorithm which would accept multiple instances >> of AbstractGeneticAlgorithm class and converge them in parallel. Users who >> does not care for robustness would go for current implementations of the >> algorithm with single population. For a better optimization quality users >> would chose the new class. > >As hinted by my comment is the previous message, I've still to >clarify my own expectations; but I vaguely sense some lost >opportunity for simpler usage simpler and increased performance >through the caller just needs to specify the number of "worker >threads". -- We should have both options. Users can execute the algorithm by specifying only the number of worker threads with a single population as well as optimize multiple populations in a parallel fashion. For the multi-population option the common thread pool would be reused for all populations. > >Do we at least agree that >1. Adding/retrieving a "Chromosome" to/from a "Population" >must be thread-safe (and is not trivial) >2. Fitness computation is where most time is usually spent >(so that multi-threading must be achieved at that granularity) >? 
--The way I am thinking of designing a task is that it should accept the current population and return an instance of ChromosomePair. The chromosomes within this pair would be added to the population by the caller thread. Population won't be updated by multiple threads.
The code snippet below shows the body of the method which will be executed inside the task.
--CUT--

// selection
ChromosomePair pair = getSelectionPolicy().select(current);

// crossover
if (randGen.nextDouble() < getCrossoverRate()) {
    // apply crossover policy to create two offspring
    pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond());
}

// mutation
if (randGen.nextDouble() < getMutationRate()) {
    // apply mutation policy to the chromosomes
    pair = new ChromosomePair(
        getMutationPolicy().mutate(pair.getFirst()),
        getMutationPolicy().mutate(pair.getSecond()));
}

return pair;

--CUT--
>
>I'd surmise that "multiple instances of AbstractGeneticAlgorithm"
>is an application concern; unless I'm missing something, it's
>not what I've in mind when talking about multi-threading.
>Actually, I was wondering whether we could implement the
>analog of what is in the "commons-math-neuralnet" module,
>where
>* "Neuron" is the counterpart of "Chromosome"
>* "Network" is the counterpart of "Population".
--"multiple instances of AbstractGeneticAlgorithm" relates to parallel GA with multiple populations, not to multi-threading. Users can also implement parallel GA in a synchronous manner, although that is not the recommended way. Multi-threading is only a way to improve performance using a user's multi-core CPU. The threads in the thread pool would only be used to execute the task mentioned in the previous comment.
I think we have some misunderstanding here. It is better to do an implementation first and then start the discussion. It would be more productive.

Thanks & Regards
--Avijit Basak

On Thu, 17 Feb 2022 at 01:09, Gilles Sadowski wrote:

> Hello.
>
> Le mer. 16 févr.
2022 à 17:31, Avijit Basak a > écrit : > > > > Hi All > > > > Please find my comments. > > > > >> (2) > > >> >The "GeneticException" class seems to mostly deal with "illegal" > > >> >arguments; hence it should be a subclass of the JDK's standard > > >> >"IllegalArgumentException" (and be renamed accordingly). > > >> >If other condition types are needed, then another internal class > > >> >should be defined with the corresponding standard semantics. > > >> --IMHO if we think of a single excep
Re: [Math] Review of "genetic algorithm" module
o be done. However, I would like to address this along with
>> parallel GA i.e. convergence of multiple populations together.
>
>The two features (multi-thread vs multiple populations) should
>be implemented independently: Users that only need the "basic"
>GA should also be able to take advantage of their machine's
>multiple CPUs.
>[This is related to the design issue which I mentioned previously.]
>
-- I am thinking of leveraging the user's multiple CPUs for multi-population GA. It would be a global approach where the same thread pool would be used for both purposes. Another class would be introduced for executing a parallel genetic algorithm; it would accept multiple instances of the AbstractGeneticAlgorithm class and converge them in parallel. Users who do not care about robustness would use the current single-population implementations of the algorithm. For better optimization quality, users would choose the new class.

[...]

Thanks & Regards
--Avijit Basak

On Mon, 14 Feb 2022 at 15:37, Gilles Sadowski wrote:

> Hello.
>
> On Mon, 14 Feb 2022 at 08:03, Avijit Basak wrote:
> >
> > Hi All
> >
> > Thanks for the review comments. Please find my comments below.
> >
> > (1)
> > [...]
> >
> > (2)
> > >The "GeneticException" class seems to mostly deal with "illegal"
> > >arguments; hence it should be a subclass of the JDK's standard
> > >"IllegalArgumentException" (and be renamed accordingly).
> > >If other condition types are needed, then another internal class
> > >should be defined with the corresponding standard semantics.
> > --IMHO if we think of a single exception class we should extend it only
> > from RuntimeException.
>
> "single exception class" is not a requirement (it cannot be since
> we agreed some time ago that it was better to align with the JDK's
> delineation of error conditions (IAE, NPE, ILSE, AE, ...)).
>
> > If we think of multiple exception classes in one
> > module we may need to think of a base exception class.
Other classes > would > > extend the same. > > Please no. We'd taken that approach in "Commons Math" (cf. > base class now in module "commons-math-legacy-exception"), > as I've mentioned already IIRC: It was a failed experiment IMO. > [For more details, please refer to the archive of the "dev" ML.] > > > The approach mentioned above would mix up these two. > > Please share your opinion regarding this. > > Eventually, all new components ([RNG], [Number], [Geometry], ...) > adopted the simple approach of non-public API (ideally private > or package-private) exception classes only for the developer's > use (and the purpose of which is limited to avoiding duplication). > > > > > >[Exception messages need review for spelling and formatting.] > > -- It will be really helpful if you can point out some specific examples. > > We can fix this when the PR has reached some stability. > > > > > (3) > > >IMO Javadoc should avoid redundant phrases like "This class" as > > >the first words of a class description. > > --Refractored the javadoc comments. Please review and mention if you need > > any further changes. > > I've not looked yet, but thanks for taking it into account. > Similarly to the previous point, these clean-ups can happen later. > > > >A similar remark holds for fields in "GeneticException" class: Since > > >the name of the field is self-documenting, duplication in the Javadoc > > >is visual noise ("Message template" is concise and clear enough). > > --Removal of the javadoc comments produces a checkstyle error. > > I did not suggest to remove any Javadoc, only to rephrase it as: > ---CUT--- > /** "Message template". */ > ---CUT--- > > > [...] > > > > (4) > > >Class "ConvergenceListenerRegistry" is generic but its code > > >contains undocumented "@SuppressWarnings" annotations. > > >Moreover, it is a singleton, and not thread-safe. > > >Why should there be such a global "registry"? 
> > >Since it is only accessed by the "AbstractGeneticAlgorithm" class, > > >it could be defined as a private inner class. > > --Made it a private inner class. > > Thanks. > [We should nevertheless address the other issues mentioned in > the above paragraph.] > > > > > (5) > > >In class "AbstractGeneticAlgorithm", methods "getCrossoverPolicy" > > >"getMutationPolicy", "getElitismRate" are public, yet they are
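The exception approach discussed in point (2) above (a non-public class that extends the JDK's "IllegalArgumentException" and exists only to avoid duplicating message formatting) might look like the sketch below. The names "SketchIllegalArgumentException", "OUT_OF_RANGE" and "RateCheckSketch" are hypothetical and chosen for illustration; this is not the class from the PR.

```java
import java.text.MessageFormat;

/** Hypothetical package-private exception: callers only ever see IllegalArgumentException. */
final class SketchIllegalArgumentException extends IllegalArgumentException {
    /** Message template. */
    static final String OUT_OF_RANGE = "Value {0} is out of range [{1}, {2}]";

    private static final long serialVersionUID = 20220214L;

    /** Formats the template once, so that call sites do not duplicate message code. */
    SketchIllegalArgumentException(String template, Object... args) {
        super(MessageFormat.format(template, args));
    }
}

/** Hypothetical call site showing how the internal class would be used. */
final class RateCheckSketch {
    /** Returns "rate" if it lies in [0, 1], otherwise throws. */
    static double checkRate(double rate) {
        if (rate < 0 || rate > 1) {
            throw new SketchIllegalArgumentException(
                SketchIllegalArgumentException.OUT_OF_RANGE, rate, 0, 1);
        }
        return rate;
    }

    public static void main(String[] args) {
        try {
            checkRate(1.5);
        } catch (IllegalArgumentException e) {
            // User code catches the standard JDK type; the subclass stays internal.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Since the class is not public, it adds no exception hierarchy to the API; other error conditions would get their own internal class derived from the matching JDK exception.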
Re: [Math] Review of "genetic algorithm" module
Hi All

Thanks for the review comments. Please find my comments below.

(1)
>A commit log message should strive to be informative
>for the reviewer; saying the like of "fixed minor bugs" does
>not convey anything.
>Even minor changes, like e.g. formatting cleanup, should be
>designated as such.
>For this PR, the message (which I've amended) was misleading
>because the change was not about bugs, but about removing
>GUI code (and its dependency).
--I have maintained a detailed commit message this time.

(2)
>The "GeneticException" class seems to mostly deal with "illegal"
>arguments; hence it should be a subclass of the JDK's standard
>"IllegalArgumentException" (and be renamed accordingly).
>If other condition types are needed, then another internal class
>should be defined with the corresponding standard semantics.
--IMHO if we think of a single exception class we should extend it only from RuntimeException. If we think of multiple exception classes in one module we may need to think of a base exception class. Other classes would extend the same. The approach mentioned above would mix up these two. Please share your opinion regarding this.

>[Exception messages need review for spelling and formatting.]
-- It will be really helpful if you can point out some specific examples.

(3)
>IMO Javadoc should avoid redundant phrases like "This class" as
>the first words of a class description.
--Refactored the Javadoc comments. Please review and mention if you need any further changes.

>A similar remark holds for fields in "GeneticException" class: Since
>the name of the field is self-documenting, duplication in the Javadoc
>is visual noise ("Message template" is concise and clear enough).
--Removal of the Javadoc comments produces a CheckStyle error.

>Similarly, simple accessors don't need the exact same sentence
>repeated twice (a single "@return ..." tag is sufficient).
--Modified.
(4) >Class "ConvergenceListenerRegistry" is generic but its code >contains undocumented "@SuppressWarnings" annotations. >Moreover, it is a singleton, and not thread-safe. >Why should there be such a global "registry"? >Since it is only accessed by the "AbstractGeneticAlgorithm" class, >it could be defined as a private inner class. --Made it a private inner class. (5) >In class "AbstractGeneticAlgorithm", methods "getCrossoverPolicy" >"getMutationPolicy", "getElitismRate" are public, yet they are only >ever called by a subclass. --Modified the public to protected. (6) >Why support inheritance for "AbstractGeneticAlgorithm"? >Why would users need their own subclass, rather than call those >implemented within the library (currently, "GeneticAlgorithm" and >"AdaptiveGeneticAlgorithm")? >Couldn't we encapsulate the choice of algorithm in an "enum", >similar to "RandomSource" in [RNG]. >Do I understand correctly that the (only?) difference between the >two classes is the ability to adapt crossover and mutation rates? -- The difference between GeneticAlgorithm and AdaptiveGeneticAlgorithm is the ability to adapt crossover and mutation probability. However, as per my understanding enum encapsulation is appropriate with the same set and type of constructor arguments, where the arguments can be provided during enum declaration. In our case the arguments would be provided by the client program and cannot be pre-initialized as part of an enum declaration. (7) >The currently available GA implementations are sequential. >IIUC, the "nextGeneration" methods should provide an option >that processes the population using multiple threads. --This needs to be done. However, I would like to address this along with parallel GA i.e. convergence of multiple populations together. (8) >Do not use explicit "\n" and "\r" characters.[1] --Done Thanks & Regards --Avijit Basak On Mon, 7 Feb 2022 at 07:57, Gilles Sadowski wrote: > Hello. 
> > A few remarks (as of PR #205) and questions: > > (1) > A commit log message should strive to be informative > for the reviewer; saying the like of "fixed minor bugs" does > not convey anything. > Even minor changes, like e.g. formatting cleanup, should be > designated as such. > For this PR, the message (which I've amended) was misleading > because the change was not about bugs, but about removing > GUI code (and its dependency). > > (2) > The "GeneticException" class seems to mostly deal with "illegal" > arguments; hence it should be a subclass of the JDK's standard > "IllegalArgumentException" (and be renamed accordingly). > If other condition t
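Regarding point (6) and the concern that an "enum" cannot receive arguments supplied by the client program: an enum constant can expose a factory method that takes its arguments at call time. The sketch below only illustrates that pattern; "AlgorithmType", "RateSchedule" and the two placeholder behaviours are hypothetical and do not reflect how "GeneticAlgorithm" or "AdaptiveGeneticAlgorithm" actually work.

```java
/** Hypothetical illustration of an enum whose constants act as factories. */
final class AlgorithmEnumSketch {
    /** Dummy stand-in for an algorithm instance: maps a generation index to a rate. */
    interface RateSchedule {
        double rateAt(int generation);
    }

    /** Each constant selects a variant; arguments are supplied by the caller at runtime. */
    enum AlgorithmType {
        SIMPLE {
            @Override
            RateSchedule create(double initialRate) {
                // Placeholder behaviour: the rate never changes.
                return generation -> initialRate;
            }
        },
        ADAPTIVE {
            @Override
            RateSchedule create(double initialRate) {
                // Placeholder behaviour: the rate shrinks as generations pass.
                return generation -> initialRate / (1 + generation);
            }
        };

        abstract RateSchedule create(double initialRate);
    }

    public static void main(String[] args) {
        double rate = Double.parseDouble(args.length > 0 ? args[0] : "0.5");
        for (AlgorithmType type : AlgorithmType.values()) {
            System.out.println(type + " -> " + type.create(rate).rateAt(4));
        }
    }
}
```

The limitation is that all constants must share one factory signature (or a common parameter object), so whether this fits depends on how different the constructor arguments of the two algorithms really are.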
Re: [MATH][GA] Build Failure for PR #204
Hi

Please see my comments below.

[...]

>Please note that I don't suggest that you remove the tracking of
>the optimization process (it is useful to have a trace in order to
>check that evolution proceeds as expected), instead of displaying
>a GUI, you can save snapshots (either in text form or, if the
>check is more easily done graphically, by using the "[Imaging]"
>component[3]).
-- As per the suggestion I have removed the GUI display of the convergence process. Instead, the default log-based tracker has been kept for convergence traceability.
I have created a new PR #205 after rebasing and closed the old one (PR #204).

Thanks & Regards
--Avijit Basak

On Wed, 2 Feb 2022 at 19:58, Gilles Sadowski wrote:

> Hi.
>
> On Wed, 2 Feb 2022 at 09:29, Avijit Basak wrote:
> >
> > Hi All
> >
> > Please see my comments below.
> >
> > [...]
> >
> >
> > And there was this old issue that the "<artifactId>" should contain
> > the name of the top-level package, i.e. "math4", not "math".
> > -- There was a review comment for PR#197 to remove 4 from artifactid.
> > "aherbert <https://github.com/aherbert> on Sep 25, 2021
> > <https://github.com/apache/commons-math/pull/197#discussion_r716075956>
> >
> > Remove the 4 from math4. The version is specified separately from the
> > artifact ID."
>
> Indeed, it seems that there are discrepant expectations or a
> misunderstanding about how to compose the "<artifactId>".
> In "Commons Math", it contains "math4" as (IIUC) a unique
> identifier of the top-level package (that is updated with every
> major version). Because of that latter convention, it is true that
> the "4" is redundant with the (major) version number.
> However, it could also be construed that the redundancy may
> be useful for stressing that artefacts with different major versions
> can be used together (without "JAR hell").
> That view of having the "packageId" as part of the artifact's name
> is used in some other components (e.g. "[Lang]"[1]) but not all
> (e.g. "[IO]"[2])...
> > > > > I've updated the feature branch with those changes. Please rebase. > > > > I've not yet looked at the code, but a question arose from looking at > > the dependencies: What is "jfreechart" used for in the "examples"? > > -- jfreechart is used to do a graphical plot of the optimization process. > > > > I've just updated the "k-means" example, removing the GUI along > > the way. In general, I think that the example applications should > > follow the KISS principle (which here translates to: Only write to the > > console or to files). Since we don't intend to write full-fledged > > applications, building/testing should be as smooth as possible: GUIs > > entail unnecessary hassle for someone working from a remote > > (text) terminal. > > -- I shall remove that and the corresponding part of the code. > > Thanks. > Please note that I don't suggest that you remove the tracking of > the optimization process (it is useful to have a trace in order to > check that evolution proceeds as expected), instead of displaying > a GUI, you can save snapshots (either in text form or, if the > check is more easily done graphically, by using the "[Imaging]" > component[3]). > > Regards, > Gilles > > > [...] > > > > [1] > https://gitbox.apache.org/repos/asf?p=commons-lang.git;a=blob;f=pom.xml;h=4f12fdf537fd56a69d1b94567e22de99761ec775;hb=HEAD#l28 > [2] > https://gitbox.apache.org/repos/asf?p=commons-io.git;a=blob;f=pom.xml;h=8f61ca0177a056a80dda656dbb70a9774adac548;hb=HEAD#l26 > [3] See e.g. the "kmeans/image" module. > > > > > Thanks & Regards > > --Avijit Basak > > > > On Tue, 1 Feb 2022 at 05:24, Gilles Sadowski > wrote: > > > > > Hello. > > > > > > Le lun. 31 janv. 2022 à 06:27, Avijit Basak a > > > écrit : > > > > > > > > Hi All > > > > > > > > Please find my comments below. > > > > > > > > >There is no attachment (I think that the ML manager strips those). > > > > >Please copy/paste the relevant part of the console log (or provide > > > > >a link to it). 
> > > > --The build was done locally with a fresh clone of the feature > branch. > > > > > > Strange that the "pom.xml" in PR #204 still refers to version 1.0 of > > > Commons Numbers, instead of version 1.1-SNAPSHOT. > > > This cre
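The suggestion above to save snapshots in text form rather than display a GUI could be sketched as follows. This is an assumption-laden illustration using only JDK file APIs: "TextTraceSketch" and its "record"/"save" methods are hypothetical names, and the fitness values written in "main" are dummy data.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical convergence trace: one text line per generation, written to a file. */
final class TextTraceSketch {
    private final List<String> lines = new ArrayList<>();

    /** Stores "generation bestFitness" with a locale-independent number format. */
    void record(int generation, double bestFitness) {
        lines.add(String.format(Locale.ROOT, "%d %.6f", generation, bestFitness));
    }

    /** Writes all recorded lines; Files.write supplies the platform line separator. */
    void save(Path file) throws IOException {
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        TextTraceSketch trace = new TextTraceSketch();
        for (int g = 0; g < 5; g++) {
            trace.record(g, 1.0 / (g + 1)); // Dummy fitness values.
        }
        Path out = Files.createTempFile("ga-trace", ".txt");
        trace.save(out);
        System.out.println("Trace written to " + out);
    }
}
```

A file like this can be inspected from a remote text terminal or plotted afterwards with any external tool, which keeps the example application free of GUI dependencies.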
Re: [MATH][GA] Build Failure for PR #204
Hi All

Please see my comments below.

>Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
>Commons Numbers, instead of version 1.1-SNAPSHOT.
>This creates many "NoClassDefFound" errors that were fixed with
>commit 7e2213f2e5a536ad49d549d21f9eed9e71db5638 in branch
>"feature__MATH-1563__genetic_algorithm" branch 6 days ago.
-- I could not work further on PR#204. As there was an issue with the local build, I did not try to merge any further changes.

>Anyways, after fetching your PR and rebasing on that branch, the
>build is successful.
--Thanks for the confirmation.

>Nevertheless, I had to fix/consolidate many POM files that contained
>a slew of duplicate declarations (the "dependency management" is
>done at the highest possible level, to ensure version consistency).
>Also, please use the same formatting rules as in existing files (in
>POM files, the indentation is 2 spaces).
-- I have missed these two points.

>And there was this old issue that the "<artifactId>" should contain
>the name of the top-level package, i.e. "math4", not "math".
-- There was a review comment for PR#197 to remove 4 from artifactid.
"aherbert <https://github.com/aherbert> on Sep 25, 2021
<https://github.com/apache/commons-math/pull/197#discussion_r716075956>

Remove the 4 from math4. The version is specified separately from the artifact ID."

>I've updated the feature branch with those changes. Please rebase.
>
>I've not yet looked at the code, but a question arose from looking at
>the dependencies: What is "jfreechart" used for in the "examples"?
-- jfreechart is used to do a graphical plot of the optimization process.

>I've just updated the "k-means" example, removing the GUI along
>the way. In general, I think that the example applications should
>follow the KISS principle (which here translates to: Only write to the
>console or to files). Since we don't intend to write full-fledged
>applications, building/testing should be as smooth as possible: GUIs
>entail unnecessary hassle for someone working from a remote
>(text) terminal.
-- I shall remove that and the corresponding part of the code.

> Please find the log below. Kindly let me know once the build is successful.
> The command was "*mvn clean verify apache-rat:check checkstyle:check
> pmd:check spotbugs:check javadoc:javadoc*".

>As Alex noted, you should ensure that the build is successful with
>the supported version of the JDK (i.e. Java 8 currently).
>[If you encounter problems with a later version, it's always nice to
>file a JIRA report, but fixing such issues is probably low priority.]
--I have updated my JDK to version 8. The build is successful now. Thanks.

Thanks & Regards
--Avijit Basak

On Tue, 1 Feb 2022 at 05:24, Gilles Sadowski wrote:

> Hello.
>
> On Mon, 31 Jan 2022 at 06:27, Avijit Basak wrote:
> >
> > Hi All
> >
> > Please find my comments below.
> >
> > >There is no attachment (I think that the ML manager strips those).
> > >Please copy/paste the relevant part of the console log (or provide
> > >a link to it).
> > --The build was done locally with a fresh clone of the feature branch.
>
> Strange that the "pom.xml" in PR #204 still refers to version 1.0 of
> Commons Numbers, instead of version 1.1-SNAPSHOT.
> This creates many "NoClassDefFound" errors that were fixed with
> commit 7e2213f2e5a536ad49d549d21f9eed9e71db5638 in branch
> "feature__MATH-1563__genetic_algorithm" branch 6 days ago.
>
> Anyways, after fetching your PR and rebasing on that branch, the
> build is successful.
>
> Nevertheless, I had to fix/consolidate many POM files that contained
> a slew of duplicate declarations (the "dependency management" is
> done at the highest possible level, to ensure version consistency).
> Also, please use the same formatting rules as in existing files (in
> POM files, the indentation is 2 spaces).
> > And there was this old issue that the "" should contain > the name of the top-level package, i.e. "math4", not "math". > > I've updated the feature branch with those changes. Please rebase. > > I've not yet looked at the code, but a question arose from looking at > the dependencies: What is "jfreechart" used for in the "examples"? > I've just updated the "k-means" example, removing the GUI along > the way. In general, I think that the example applications should > follow the KISS principle (which here translates to: Only write to the > console or to files). Since we don't intend to write full-fledged > applications, building/testing should be as smooth as possible: GUIs > entail unnecessary hassle for someon
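To illustrate the "only write to the console or to files" suggestion made in the message above, here is a minimal sketch of how an example application could report convergence as CSV text instead of drawing a chart. The class name, column names and number format are assumptions made for this illustration; none of them come from the PR.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Locale;

/** Hypothetical helper (not part of the PR): reports progress as CSV text. */
final class ConvergenceCsvWriter {
    private final PrintWriter out;

    ConvergenceCsvWriter(PrintWriter out) {
        this.out = out;
        // Header row; the column names are assumptions for this sketch.
        out.print("generation,bestFitness\n");
        out.flush();
    }

    /** Appends one row; Locale.ROOT keeps the decimal separator a dot. */
    void record(int generation, double bestFitness) {
        out.print(String.format(Locale.ROOT, "%d,%.6f\n", generation, bestFitness));
        out.flush();
    }

    /** Renders a series of best-fitness values (one per generation) to a string. */
    static String render(double[] bestFitnessPerGeneration) {
        StringWriter buffer = new StringWriter();
        ConvergenceCsvWriter writer = new ConvergenceCsvWriter(new PrintWriter(buffer));
        for (int g = 0; g < bestFitnessPerGeneration.length; g++) {
            writer.record(g, bestFitnessPerGeneration[g]);
        }
        return buffer.toString();
    }
}
```

Passing "new PrintWriter(System.out)" to the constructor would send the same rows to the console, and a file-backed PrintWriter would send them to a file, so the output stays usable from a remote text terminal.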
Re: [MATH][GA] Build Failure for PR #204
math4\legacy\ode\nonstiff\GraggBulirschStoerIntegrator.java:65: error: attribute not supported in HTML5: cellpadding [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerIntegrator.java:65: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\sampling\FieldStepNormalizer.java:45: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\sampling\StepNormalizer.java:43: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\stat\ranking\NaturalRanking.java:44: error: attribute not supported in HTML5: cellpadding [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\stat\ranking\NaturalRanking.java:44: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\package-info.java:130: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\package-info.java:141: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] 
C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45: error: attribute border for table only accepts "" or "1", use CSS instead: BORDER [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45: error: attribute not supported in HTML5: width [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45: error: attribute not supported in HTML5: cellpadding [ERROR] * [ERROR] ^ [ERROR] C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\src\main\java\org\apache\commons\math4\legacy\ode\nonstiff\GraggBulirschStoerStepInterpolator.java:45: error: attribute not supported in HTML5: summary [ERROR] * [ERROR] ^ [ERROR] [ERROR] Command line was: cmd.exe /X /C ""C:\Program Files\jdk-11.0.12\bin\javadoc.exe" @options @packages" [ERROR] [ERROR] Refer to the generated Javadoc files in 'C:\Personal\Work\opensource\apache-commons-maths\commons-math\commons-math-legacy\target\site\apidocs' dir. [ERROR] [ERROR] -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :commons-math4-legacy >P.S. 
I'll stop trying to "rebase" the feature branch on the current >state of "master" because I stumble on the same conflict every >time (on the "pom.xml" file)... --This will be helpful to accelerate the further development of the GA module. The rebase and merging process is consuming lots of additional time. Thanks & Regards --Avijit Basak On Sun, 30 Jan 2022 at 21:27, Gilles Sadowski wrote: > Hello. > > Le dim. 30 janv. 2022 à 13:43, Avijit Basak a > écrit : > > > > Hi All > > > > I have taken a fresh clone of the feature branch > "feature__MATH-1563__genetic_algorithm" from apache's repository and > executed the build. The build failed without my changes added
Re: [MATH][GA] Build Failure for PR #204
Hi All I have taken a fresh clone of the feature branch "feature__MATH-1563__genetic_algorithm" from apache's repository and executed the build. The build failed without my changes added to it. The summary report is attached herewith. Kindly look into it and do the needful. Thanks & Regards --Avijit Basak On Tue, 25 Jan 2022 at 19:02, Gilles Sadowski wrote: > Hello. > > I just did "git push" (no "force" this time) on the feature branch. > The problem arose from changes applied a few hours ago in > "master" and not merge into the other branch yet. Sorry; please > rebase on the latest update. > > Regards, > Gilles > > Le mar. 25 janv. 2022 à 12:31, Alex Herbert a > écrit : > > > > On Tue, 25 Jan 2022 at 05:43, Avijit Basak > wrote: > > > > > > Hi All > > > > > >I have missed the build report URL in my previous mail. Please > find > > > the same here. > > > > > > > https://app.travis-ci.com/github/apache/commons-math/builds/245277914 > > > > The version of Commons Numbers in the parent pom is the released > > version 1.0. However some of the legacy classes are now using new code > > added to the gamma package in the unreleased version 1.1-SNAPSHOT. > > > > The master branch is correctly using the 1.1-SNAPSHOT. So somewhere > > the feature branch feature__MATH-1563__genetic_algorithm has not been > > kept totally in sync with master. > > > > I can rebase the feature branch on master to correct this. But the > > result would require a force push and anyone else using this branch > > would have to reset their local copy. > > > > Or I can merge master into the feature branch which creates annoying > > merge commits in the history and the commit logs for the 1563 feature > > are interspersed with all the other commits performed on master while > > development was underway (i.e all commits are in date order > > irrespective of the branch they occurred on). > > > > I believe last time Gilles resolved a lot of the repeat commits using > > a force push. 
If this is only being used as a work-in-progress (WIP) by one developer, then I do not see a need to avoid a force push. This would keep all the commits for the feature together in the git history when it is eventually merged to master.
> >
> > Gilles, do you have any opinion on how to manage the WIP feature branch?
> >
> > Alex

-- Avijit Basak

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
Re: [MATH][GA] Build Failure for PR #204
Hi All

I omitted the build report URL from my previous mail. Here it is:

https://app.travis-ci.com/github/apache/commons-math/builds/245277914

Thanks & Regards
--Avijit Basak

On Tue, 25 Jan 2022 at 11:10, Avijit Basak wrote:
> Hi All
>
> I have created a new PR(#204) to check in changes related to
> "commons-math-ga" and "examples-ga" modules. The changes need to be merged
> to feature branch "feature__MATH-1563__genetic_algorithm". Unfortunately
> the build failed in the legacy module which is updated with changes from
> the feature branch "feature__MATH-1563__genetic_algorithm". I have no
> changes in legacy module. I have executed maven checkstyle, PMD and spotbug
> checks in "commons-math-ga" and "examples-ga" modules which passed
> successfully.
> Could anyone help me to resolve this.
>
> Thanks & Regards
> -- Avijit Basak
[MATH][GA] Build Failure for PR #204
Hi All

I have created a new PR (#204) to check in changes related to the "commons-math-ga" and "examples-ga" modules. The changes need to be merged into the feature branch "feature__MATH-1563__genetic_algorithm". Unfortunately, the build failed in the legacy module, which was updated with changes from that feature branch. I have made no changes in the legacy module. I have run the Maven Checkstyle, PMD and SpotBugs checks on the "commons-math-ga" and "examples-ga" modules, and they passed.

Could anyone help me resolve this?

Thanks & Regards
-- Avijit Basak
Re: [Math] Please review GA implementation
Hi All

I have taken all changes from the feature branch and added my code for the commons-math-ga and examples-ga modules. A new PR (#204) has been created and the previous PR (#203) is closed.

Thanks & Regards
--Avijit Basak

On Sat, 22 Jan 2022 at 23:02, Gilles Sadowski wrote:
> On Sat, 22 Jan 2022 at 15:30, Avijit Basak wrote:
> >
> > Hi
> >
> > >Please be sure to use my latest (forced) update.
> > >[I removed the many duplicate commits I had introduced by mistake.]
> >
> > --Is the change only in commons-math-examples/pom.xml file? Please confirm.
>
> No, there can be other commits, as I try to keep this branch
> up-to-date with "master".
>
> > My changes are creating conflict after this push. I shall create a new PR.
>
> Thanks,
> Gilles
Re: [Math] Please review GA implementation
Hi

>Please be sure to use my latest (forced) update.
>[I removed the many duplicate commits I had introduced by mistake.]

--Is the change only in the commons-math-examples/pom.xml file? Please confirm. My changes conflict with the branch after this push. I shall create a new PR.

Thanks & Regards
--Avijit Basak

On Fri, 21 Jan 2022 at 06:15, Gilles Sadowski wrote:
> Hi.
>
> On Thu, 20 Jan 2022 at 17:58, Gilles Sadowski wrote:
> >
> > Hello.
> >
> > On Thu, 20 Jan 2022 at 13:09, Avijit Basak wrote:
> > >
> > > Hi All
> > >
> > > I have restructured the examples-ga module to fix a few issues and
> > > created separate child modules for each type of usages. There was also a
> > > minor bug related to the logger class name in commons-ga module, which I
> > > fixed. A new PR(#203) has been created.
> >
> > Did you ensure that it is up-to-date with the upstream branch?
>
> Please be sure to use my latest (forced) update.
> [I removed the many duplicate commits I had introduced by mistake.]
>
> > Gilles
> >
> > > [...]

-- Avijit Basak
Re: [Math] Please review GA implementation
Hi All I have restructured the examples-ga module to fix a few issues and created separate child modules for each type of usages. There was also a minor bug related to the logger class name in commons-ga module, which I fixed. A new PR(*#203*) has been created. Kindly review the same and let me know if you see any issues with it. https://github.com/apache/commons-math/pull/203 Thanks & Regards --Avijit Basak On Wed, 12 Jan 2022 at 20:55, Avijit Basak wrote: > Hi All > > I have lost track of the jar creation process in the examples > module. Everytime before commiting I have executed examples using Eclipse > IDE which ran successfully. I need to modify the examples module. Sorry for > any inconvenience. Please find my additional responses below. > > >There are issues with the expected functionality of the "examples-ga" > module. > > >Assuming that the following command has been issued > > $ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ mvn package > >and has completed successfully, executable JAR files should have been > >created under the "target" directories. > > >For example, issuing this command (example for the "neuralnet" module): > > $ java -jar > commons-math-examples/examples-sofm/tsp/target/examples-sofm-tsp.jar > >outputs > >---CUT--- > >Missing required option '-o=outputFile' > >Usage: [-hV] [-j=numJobs] [-m=maxTrials] [-n=neuronsPerCity] > > -o=outputFile [-s=numSamples] > >Run the application > > -h, --help Show this help message and exit. > > -j=numJobs Number of concurrent tasks (default: 8). > > -m=maxTrials Maximal number of trials (default: 10). > > -n=neuronsPerCityAverage number of neurons per city (default: 2.2). > > -o=outputFileOutput file name. > > -s=numSamplesNumber of samples for the training (default: 2000). > > -V, --versionPrint version information and exit. > >---CUT--- > > >The above thus shows that the program runs as expected (passing the > missing > >required argument produces the expected output file). 
> > >Doing the equivalent for the new examples, e.g. > > $ java -jar > commons-math-examples/examples-ga/examples-ga-tsp/target/examples-ga-mathfunctions.jar > >results in > >---CUT--- > >Exception in thread "main" java.lang.UnsatisfiedLinkError: Can't load > >library: /usr/lib/jvm/java-11-openjdk-amd64/lib/libawt_xawt.so > >at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2630) > >at java.base/java.lang.Runtime.load0(Runtime.java:768) > >at java.base/java.lang.System.load(System.java:1837) > >at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method) > >at > java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2442) > >at > java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2498) > >at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694) > >at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2648) > >at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830) > >at java.base/java.lang.System.loadLibrary(System.java:1873) > >at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1399) > >at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1397) > >at java.base/java.security.AccessController.doPrivileged(Native > Method) > >at java.desktop/java.awt.Toolkit.loadLibraries(Toolkit.java:1396) > >at java.desktop/java.awt.Toolkit.(Toolkit.java:1429) > >at java.desktop/java.awt.Component.(Component.java:621) > >at > org.apache.commons.math4.examples.ga.tsp.TSPOptimizer.main(TSPOptimizer.java:62) > >---CUT--- > > >[Please note that the name of the JAR also looks wrong (a copy/paste > mistake?).] > > --There is an issue in the jar file name. But in my system the TSP > application executed successfully. 
The commands I have executed are given > below: > $ mvn package > $ java -jar examples-ga-mathfunctions.jar > JDK version in my local system is "1.8.0_301" > > --In the mvn command specified by you JAVA_HOME is assigned as > "/usr/lib/jvm/java-8-openjdk-amd64/" but during execution of jar it is > using java-11 Could you please confirm what is the JDK version used and > ensure same version is used for both. This issue usually comes if the JDK > is not properly installed. > > >Command > > $ java -jar > commons-math-examples/examples-ga/examples-ga-math-functions/target/examples-ga-mathfunctions.jar > >also fails, with the following error > >-
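The question raised above (a build with Java 8 but a run that picks up a Java 11 installation) can be checked from inside the JVM. The following is a minimal sketch of such a diagnostic; the class and method names are assumptions for this illustration, while the system properties it reads ("java.version", "java.home", "java.specification.version") are standard.

```java
/** Hypothetical diagnostic (not part of the PR): reports which Java runtime is in use. */
final class RuntimeInfo {
    private RuntimeInfo() {}

    /**
     * Extracts the major version from a "java.specification.version" value:
     * "1.8" gives 8 (scheme used up to Java 8), "11" gives 11 (Java 9 and later).
     */
    static int majorVersion(String specificationVersion) {
        String v = specificationVersion.startsWith("1.")
            ? specificationVersion.substring(2)
            : specificationVersion;
        int dot = v.indexOf('.');
        return Integer.parseInt(dot < 0 ? v : v.substring(0, dot));
    }

    /** One-line summary of the running JVM. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
            + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describe());
        System.out.println("major="
            + majorVersion(System.getProperty("java.specification.version")));
    }
}
```

Running this with the same "java" command that launches the example JAR shows whether that command resolves to the JDK that was used for "mvn package".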
Re: [Math] Please review GA implementation
Hi All

I lost track of the JAR creation process in the examples module. Every time before committing, I ran the examples from the Eclipse IDE, where they executed successfully. I need to modify the examples module. Sorry for any inconvenience. Please find my additional responses below.

>There are issues with the expected functionality of the "examples-ga" module.
>
>Assuming that the following command has been issued
>  $ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/ mvn package
>and has completed successfully, executable JAR files should have been
>created under the "target" directories.
>
>For example, issuing this command (example for the "neuralnet" module):
>  $ java -jar commons-math-examples/examples-sofm/tsp/target/examples-sofm-tsp.jar
>outputs
>---CUT---
>Missing required option '-o=outputFile'
>Usage: [-hV] [-j=numJobs] [-m=maxTrials] [-n=neuronsPerCity]
>       -o=outputFile [-s=numSamples]
>Run the application
>  -h, --help            Show this help message and exit.
>  -j=numJobs            Number of concurrent tasks (default: 8).
>  -m=maxTrials          Maximal number of trials (default: 10).
>  -n=neuronsPerCity     Average number of neurons per city (default: 2.2).
>  -o=outputFile         Output file name.
>  -s=numSamples         Number of samples for the training (default: 2000).
>  -V, --version         Print version information and exit.
>---CUT---
>
>The above thus shows that the program runs as expected (passing the missing
>required argument produces the expected output file).
>
>Doing the equivalent for the new examples, e.g.
> $ java -jar commons-math-examples/examples-ga/examples-ga-tsp/target/examples-ga-mathfunctions.jar >results in >---CUT--- >Exception in thread "main" java.lang.UnsatisfiedLinkError: Can't load >library: /usr/lib/jvm/java-11-openjdk-amd64/lib/libawt_xawt.so >at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2630) >at java.base/java.lang.Runtime.load0(Runtime.java:768) >at java.base/java.lang.System.load(System.java:1837) >at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method) >at java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2442) >at java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2498) >at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2694) >at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2648) >at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:830) >at java.base/java.lang.System.loadLibrary(System.java:1873) >at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1399) >at java.desktop/java.awt.Toolkit$3.run(Toolkit.java:1397) >at java.base/java.security.AccessController.doPrivileged(Native Method) >at java.desktop/java.awt.Toolkit.loadLibraries(Toolkit.java:1396) >at java.desktop/java.awt.Toolkit.(Toolkit.java:1429) >at java.desktop/java.awt.Component.(Component.java:621) >at org.apache.commons.math4.examples.ga.tsp.TSPOptimizer.main(TSPOptimizer.java:62) >---CUT--- >[Please note that the name of the JAR also looks wrong (a copy/paste mistake?).] --There is an issue in the jar file name. But in my system the TSP application executed successfully. The commands I have executed are given below: $ mvn package $ java -jar examples-ga-mathfunctions.jar JDK version in my local system is "1.8.0_301" --In the mvn command specified by you JAVA_HOME is assigned as "/usr/lib/jvm/java-8-openjdk-amd64/" but during execution of jar it is using java-11 Could you please confirm what is the JDK version used and ensure same version is used for both. 
This issue usually occurs when the JDK is not properly installed.

>Command
>  $ java -jar commons-math-examples/examples-ga/examples-ga-math-functions/target/examples-ga-mathfunctions.jar
>also fails, with the following error
>---CUT---
>Error: Could not find or load main class
>org.apache.commons.math4.examples.ga.mathfunctions.Dimension2FunctionOptimizer
>Caused by: java.lang.ClassNotFoundException:
>org.apache.commons.math4.examples.ga.mathfunctions.Dimension2FunctionOptimizer
>---CUT---

-- This is due to a wrong package name. I introduced a sub-package based on the dimension but forgot to update the POM file accordingly.

>I noticed that there is an example relating to "Dimension2" and another
>to "DimensionN". Isn't the former, in principle, a special case of the latter?

--Yes, the former is a special case of the latter. However, because of the way the executable JAR is generated, I need to keep a single Java file with the main method, and the number of dimensions needs to be passed as a program argument.

--I shall make the changes and create a PR for the new feature branch.

Thanks & Reg
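The idea mentioned above (one main class, with the number of dimensions passed as a program argument) could look roughly like the sketch below. Everything in it is hypothetical: the class name, the positional argument, the default of 2 and the sum-of-squares placeholder objective are assumptions for illustration, not the code of the examples module.

```java
import java.util.Arrays;

/** Hypothetical single entry point covering both the 2-D and the N-D example. */
final class FunctionOptimizerMain {
    private FunctionOptimizerMain() {}

    /** Reads the dimension from the first argument; defaults to 2 when absent. */
    static int parseDimension(String[] args) {
        if (args.length == 0) {
            return 2;
        }
        int dimension = Integer.parseInt(args[0]);
        if (dimension < 1) {
            throw new IllegalArgumentException("dimension must be >= 1: " + dimension);
        }
        return dimension;
    }

    /** Placeholder objective (sum of squares); it works for any dimension. */
    static double sphere(double[] x) {
        double sum = 0;
        for (double xi : x) {
            sum += xi * xi;
        }
        return sum;
    }

    public static void main(String[] args) {
        int dimension = parseDimension(args);
        double[] point = new double[dimension];
        Arrays.fill(point, 1.0);
        // A real example would run the optimizer here; this only evaluates one point.
        System.out.println("dimension=" + dimension + ", f(1,...,1)=" + sphere(point));
    }
}
```

With this layout the "Dimension2" case is simply the run with argument 2, so a single executable JAR with a single main class would be enough.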
Re: [Math] Please review GA implementation
Hi All

I would like to add a few words here. The JIRA issue was initially created as a proposal to accommodate rank-based adaptive probability generation approaches for GA operators such as crossover and mutation, following the referenced article. The article mainly describes the adaptive probability generation strategy and compares the two approaches. It does not describe all the work done in this library: during the design phase, a few more change requirements were identified to make the library more robust and effective, and these are not described in the PDF. For a good overview of the changes, kindly review the sub-tasks created as part of the issue (MATH-1563).

While writing the article I used the legacy model as the simple GA implementation, so the results reported in the PDF for the simple GA differ from those of the current implementation. The primary reason is that the mutation probability is now applied at the allele level instead of at the chromosome level, as it was in the legacy model. This has improved the optimization results considerably, even for the simple GA. I have tried to describe all details and reasons for the changes in the sub-task descriptions. Kindly let me know if any further clarification is required.

Thanks & Regards
--Avijit Basak

On Sat, 8 Jan 2022 at 16:17, Bruno P. Kinoshita wrote:
> Reviewed about 1/4 of the PR, but it was mainly about serialization
> (started from the bottom, comparing using GitHub UI [1]). But that code and
> tests were looking OK.
>
> Will try to go over a few more files, but I also found a PDF in the issue
> that I think I will try to read first, to have a better idea of the change.
> I haven't read/used anything related to genetic algorithms since university.
> > Cheers > Bruno > > > > [1] > https://github.com/apache/commons-math/compare/master..feature__MATH-1563__genetic_algorithm > > On Tuesday, 4 January 2022, 08:57:15 am NZDT, Gilles Sadowski < > gillese...@gmail.com> wrote: > > Hello. > > I've just created a "feature__MATH-1563__genetic_algorithm" branch[1] > in the git repository, with the code provided by Avijit Basak in PR #200, > a proposed replacement of the "o.a.c.math4.legacy.genetics" package.[2] > Reviews welcome. > > Regards, > Gilles > > [1] > https://gitbox.apache.org/repos/asf?p=commons-math.git;a=shortlog;h=refs/heads/feature__MATH-1563__genetic_algorithm > [2] https://issues.apache.org/jira/browse/MATH-1563 > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
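The difference between chromosome-level and allele-level mutation probability, mentioned in the message above, can be sketched on a binary chromosome as follows. This is an illustration under stated assumptions and not the code of either module: the class and method names are hypothetical, and the chromosome-level variant assumes that a selected chromosome has exactly one randomly chosen gene flipped.

```java
import java.util.Random;

/** Hypothetical sketch (not from the PR) contrasting two ways to apply a mutation probability. */
final class MutationSketch {
    private MutationSketch() {}

    /** Chromosome level: with probability p the chromosome is mutated by flipping one random gene. */
    static boolean[] mutateChromosomeLevel(boolean[] genes, double p, Random rng) {
        boolean[] copy = genes.clone();
        if (copy.length > 0 && rng.nextDouble() < p) {
            int i = rng.nextInt(copy.length);
            copy[i] = !copy[i];
        }
        return copy;
    }

    /** Allele level: every gene is flipped independently with probability p. */
    static boolean[] mutateAlleleLevel(boolean[] genes, double p, Random rng) {
        boolean[] copy = genes.clone();
        for (int i = 0; i < copy.length; i++) {
            if (rng.nextDouble() < p) {
                copy[i] = !copy[i];
            }
        }
        return copy;
    }

    /** Number of positions at which two equally long chromosomes differ. */
    static int countDifferences(boolean[] a, boolean[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                count++;
            }
        }
        return count;
    }
}
```

Under these assumptions, the expected number of flipped genes per chromosome is at most p in the first variant and p times the chromosome length in the second, so the same nominal probability explores the search space quite differently.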
Re: [MATH] Build Failure
Hi All

I have identified a few *protected* methods which could be made *private*. They are mostly validation methods for input arguments and are only used internally. Keeping them protected would add little value for future extension, so I would like to make this modification. Could anyone confirm the process for checking in new code now? Should it be a new commit in the same PR (#200)? Should the commit message remain the same as before?

Thanks & Regards
--Avijit Basak

On Sun, 2 Jan 2022 at 19:05, Avijit Basak wrote:
> Hi All
>
> I have created a new *PR*(*#200*) with all changes under a single
> commit message. Kindly review the same and let me know if any further
> change is required.
>
> Thanks & Regards
> --Avijit Basak
>
> On Mon, 27 Dec 2021 at 23:31, Gilles Sadowski wrote:
>
>> Hello.
>>
>> On Mon, 27 Dec 2021 at 16:02, Avijit Basak wrote:
>> >
>> > Hi All
>> >
>> > Please ignore my previous mail. The rebase is done successfully.
>> > Please let me know if there is any issue.
>>
>> Here is the list of commit messages that should not be
>> present (at least not when introducing completely new code):
>>
>>   Merge branch 'feature/MATH-1563-ADAPTIVE' of
>>   https://github.com/avijitbasak/commons-math.git into
>>   feature/MATH-1563-ADAPTIVE
>>   removed 64 by Long.SIZE
>>   Merge branch 'master' of https://github.com/apache/commons-math.git
>>   into feature/MATH-1563-ADAPTIVE
>>   Minor change for UniformRandomProvider
>>   modified as per PMD recommendations
>>   updated for checkstyle formatting
>>   An optimized data structure implementation for binary chromosome
>>   minor modifications
>>   Modifications as per review comments
>>   Developed the new genetic algorithm module following the JIRA MATH-1563.
>>
>> What I suggested is to check out a pristine copy of "master", and copy
>> the new files onto it, and only change whatever needs to be touched for
>> the new contents to be handled correctly (i.e. just the POM files I
>> guess).
>> >> Then generate a _new_ PR (and close #199). >> There should be a _single_ commit with a log message of the form: >> ---CUT--- >> MATH-1563: Introducing new implementation of GA functionality (WIP). >> ---CUT--- >> >> [If you don't want to give more details about all the changes, please >> stick to the above sentence. Note that the convention is that the >> issue identifier be followed by a colon; then, as the commit log >> summary, a single sentence, ending with a period, on the first line.] >> >> Thanks, >> Gilles >> >> > >> > Thanks & Regards >> > --Avijit Basak >> > >> > On Mon, 27 Dec 2021 at 19:21, Avijit Basak >> wrote: >> > >> > > Hi All >> > > >> > > I have tried to rebase. However I found too many conflicts and >> > > most of them are unnecessary. So I aborted the process. Can we avoid >> the >> > > rebase as we have very few commits after the last rebase. Please >> share your >> > > views on this. >> > > >> > > Thanks & Regards >> > > --Avijit Basak >> > > >> > > On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski >> > > wrote: >> > > >> > >> Hello. >> > >> >> > >> I've fetched the current contents of PR 199; locally, the build >> completes >> > >> successfully, so the problem reported by Travis looks strange indeed. >> > >> I would create a branch for further discussion on your GA design but >> > >> please first create a *single* commit that contains all changes wrt >> to >> > >> current "master" with a clear log message (first word *must* be the >> JIRA >> > >> identifier of your proposal (perhaps a new JIRA report would be >> clearer?), >> > >> like: >> > >> ---CUT--- >> > >> MATH-: Refactoring of GA functionality (WIP) >> > >> >> > >> Summary of what has been implemented (with the corresponding JIRA >> > >> reports)... >> > >> >> > >> (optionally) Summary what is under discussion... >> > >> ---CUT--- >> > >> >> > >> Thanks, >> > >> Gilles >> > >> >> > >> >> [...] 
>> >> - >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> > > -- > Avijit Basak > -- Avijit Basak
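The change proposed at the top of this message (turning internal argument-validation helpers from protected into private) might look like the sketch below. The class, field and method names are hypothetical and were chosen only for illustration; they are not taken from PR #200.

```java
/** Hypothetical operator class; not a class from the PR. */
final class ExampleCrossoverPolicy {
    private final double crossoverRate;

    ExampleCrossoverPolicy(double crossoverRate) {
        this.crossoverRate = checkRate(crossoverRate);
    }

    /**
     * Validation helper kept private and static: it is an implementation detail,
     * cannot be overridden, and does not depend on instance state.
     */
    private static double checkRate(double rate) {
        if (!(rate >= 0 && rate <= 1)) { // the negated form also rejects NaN
            throw new IllegalArgumentException("rate out of [0, 1]: " + rate);
        }
        return rate;
    }

    double getCrossoverRate() {
        return crossoverRate;
    }
}
```

Since a private helper is not part of the API, it can later be renamed or removed without breaking user code, which is one argument for not exposing it as protected.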
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
rs, people are welcome to contribute back if > >something they need is missing. > -- I think we have a disconnect here too. If the framework is not > extensible how users can use this in their problem domain. If this is not > extensible then it would never be used. How can we get back the > contribution? >I answered to this above. > > >Your argument of "too much diversity" can be reversed, in that > >it is unlikely that one library would attract everyone that needs a > >genetic algorithm. > -- Even if it cannot attract everyone with out of box features it should be > extensible for those. >I don't agree with making things more complicated for us, now and >in the foreseeable future, in order to satisfy users who don't exist yet >(because the library does not exist yet). -- I don't want to make things complicated for us. GA has a huge amount of usages in diverse fields. Of course we should not try to provide solutions for all. But the only thing I would like to ensure is that this library should be reusable so that anyone can extend it and design solutions for a new domain. We should not put any burden towards this. >Let's focus on making it work within a given scope, and then we can >think of improvements (that will be easy if the design is "structurally" >extensible, even if they are somehow "disabled" in the first release). -- I am against this "disable" option. I have tried to search the list of use cases for GA and found this huge list https://en.wikipedia.org/wiki/List_of_genetic_algorithm_applications My proposal is we should allow extensibility selectively with immutability in place. This won't create any bugs in our code due to extension. > >Better make a design that can handle a fraction of use cases, > >and grow as needed. > --There are already libraries which can solve most common use cases. > Non-extensible nature would block the growth to a considerable extent. >Is there a misunderstanding about what is implied by "extensible"? 
>Question: Are all classes, in your current design, "immutable"? -- Yes, they are mostly. However, there are some classes with protected/public methods which mutate private fields for internal processing e.g. generationsEvolved field in AbstractGeneticAlgorithm class. However the child classes cannot modify those private fields as there are no direct mutation methods. >If so, that's an excellent basis, and we should stop discussing the >meaning of "extensibility". --I think the design first needs a review. Then we can reinitiate this discussion. > > >> >Extending the functionality, if necessary, should be contributed back > here > >> *-- *Sometimes the GA operators are very much specific to the domain and > >> it's hard to generalise. In those scenarios contributing back to the > >> library might not be possible. > > >In such a case, how likely will it also be that whatever general > >framework this library has put in place, will also not be amenable > >to that domain's specifics? > -- Could you please frame this concern w.r.t. the scheduling example > provided above. ? > > >There is always a scope from which design decisions must be taken. > >If "multi-threading" is in the scope, then the design must avoid > >inheritance (in public classes) in order to much more easily > >ensure the correctness of applications. > -- Immutable design can also take care of multi-threading. >My main point in the discussion is that all classes with "public" access >should be immutable, indeed. -- They should be. > > >> However, if a library cannot be extended for > >> a new domain by users it becomes underutilised over time if not useless. > > >Sure but that is a hypothetical for the long-term. > >However, if the library is buggy or slow, it will not be used at all. > -- Is there any benchmark for speed/performance? GA is always infamous for > resource consumption rather than time. >I'm not sure I understand what you mean here. 
Thanks & Regards --Avijit Basak On Thu, 23 Dec 2021 at 20:50, Gilles Sadowski wrote: > Hello. > > Le jeu. 23 déc. 2021 à 14:22, Avijit Basak a > écrit : > > > > Hi All > > > > Please see my comments below. > > > > >As I've already indicated, "ThreadLocalRandomSource" is, IMHO, a > > >sort of workaround for a multi-thread application that does not want > > >to bother managing per-thread RNG instance(s). > > -- I am not clear on this. ThreadLocalRandomSource maintains > > an EnumMap>. What is > meant > > by it "does not want to bother managing per-thread RNG instance(s)" Could > > you please elab
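As a small illustration of the point agreed on above (classes with "public" access should be immutable), the sketch below shows one way to track a generation count without a mutable field: each step returns a new object. It is only a sketch under assumptions; the class name, the use of strings as stand-ins for chromosomes and the method names are hypothetical and do not describe the design in the PR.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical immutable snapshot of an evolving population (not from the PR). */
final class PopulationSnapshot {
    private final List<String> chromosomes;
    private final int generationsEvolved;

    PopulationSnapshot(List<String> chromosomes, int generationsEvolved) {
        // Defensive copy: later changes to the caller's list cannot leak in.
        this.chromosomes = Collections.unmodifiableList(new ArrayList<>(chromosomes));
        this.generationsEvolved = generationsEvolved;
    }

    List<String> getChromosomes() {
        return chromosomes;
    }

    int getGenerationsEvolved() {
        return generationsEvolved;
    }

    /** Returns a new snapshot for the next generation; this instance is left untouched. */
    PopulationSnapshot nextGeneration(List<String> offspring) {
        return new PopulationSnapshot(offspring, generationsEvolved + 1);
    }
}
```

Because no field changes after construction, instances of such a class can be shared between threads without synchronization.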
Re: [MATH] Build Failure
Hi All

I have created a new *PR* (*#200*) with all changes under a single commit message. Kindly review it and let me know if any further change is required.

Thanks & Regards
--Avijit Basak

On Mon, 27 Dec 2021 at 23:31, Gilles Sadowski wrote:
> Hello.
>
> On Mon, 27 Dec 2021 at 16:02, Avijit Basak wrote:
> >
> > Hi All
> >
> > Please ignore my previous mail. The rebase is done successfully.
> > Please let me know if there is any issue.
>
> Here is the list of commit messages that should not be
> present (at least not when introducing completely new code):
>
>   Merge branch 'feature/MATH-1563-ADAPTIVE' of
>     https://github.com/avijitbasak/commons-math.git into
>     feature/MATH-1563-ADAPTIVE
>   removed 64 by Long.SIZE
>   Merge branch 'master' of https://github.com/apache/commons-math.git
>     into feature/MATH-1563-ADAPTIVE
>   Minor change for UniformRandomProvider
>   modified as per PMD recommendations
>   updated for checkstyle formatting
>   An optimized data structure implementation for binary chromosome
>   minor modifications
>   Modifications as per review comments
>   Developed the new genetic algorithm module following the JIRA MATH-1563.
>
> What I suggested is to check out a pristine copy of "master", and copy
> the new files onto it, and only change whatever needs to be touched for
> the new contents to be handled correctly (i.e. just the POM files I guess).
>
> Then generate a _new_ PR (and close #199).
> There should be a _single_ commit with a log message of the form:
> ---CUT---
> MATH-1563: Introducing new implementation of GA functionality (WIP).
> ---CUT---
>
> [If you don't want to give more details about all the changes, please
> stick to the above sentence. Note that the convention is that the
> issue identifier be followed by a colon; then, as the commit log
> summary, a single sentence, ending with a period, on the first line.]
> > Thanks, > Gilles > > > > > Thanks & Regards > > --Avijit Basak > > > > On Mon, 27 Dec 2021 at 19:21, Avijit Basak > wrote: > > > > > Hi All > > > > > > I have tried to rebase. However I found too many conflicts and > > > most of them are unnecessary. So I aborted the process. Can we avoid > the > > > rebase as we have very few commits after the last rebase. Please share > your > > > views on this. > > > > > > Thanks & Regards > > > --Avijit Basak > > > > > > On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski > > > wrote: > > > > > >> Hello. > > >> > > >> I've fetched the current contents of PR 199; locally, the build > completes > > >> successfully, so the problem reported by Travis looks strange indeed. > > >> I would create a branch for further discussion on your GA design but > > >> please first create a *single* commit that contains all changes wrt to > > >> current "master" with a clear log message (first word *must* be the > JIRA > > >> identifier of your proposal (perhaps a new JIRA report would be > clearer?), > > >> like: > > >> ---CUT--- > > >> MATH-: Refactoring of GA functionality (WIP) > > >> > > >> Summary of what has been implemented (with the corresponding JIRA > > >> reports)... > > >> > > >> (optionally) Summary what is under discussion... > > >> ---CUT--- > > >> > > >> Thanks, > > >> Gilles > > >> > > >> >> [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
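The commit-summary convention described in this message (issue identifier, a colon, then a single sentence ending with a period on the first line) can be checked mechanically. The following is only a minimal sketch: the class name `CommitSummaryCheck` and the regular expression are assumptions made for illustration, not part of any Commons tooling, and the check does not verify that the summary is really a single sentence.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (illustration only): checks the first line of a commit message. */
final class CommitSummaryCheck {
    // Issue key such as "MATH-1563", a colon, one space, then text ending with a period.
    private static final Pattern SUMMARY = Pattern.compile("^[A-Z]+-\\d+: \\S.*\\.$");

    private CommitSummaryCheck() {}

    /** Returns true if the first line of the message follows the convention. */
    static boolean isValid(String commitMessage) {
        // Only the summary (first) line is inspected; the body is free-form.
        String firstLine = commitMessage.split("\\R", 2)[0];
        return SUMMARY.matcher(firstLine).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "MATH-1563: Introducing new implementation of GA functionality (WIP)."));
        System.out.println(isValid("updated for checkstyle formatting"));
    }
}
```

With this pattern the suggested summary is accepted, while messages such as "minor modifications" or a "Merge branch ..." line are rejected because they do not start with an issue key.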
Re: [MATH] Build Failure
Hi All Please ignore my previous mail. The rebase is done successfully. Please let me know if there is any issue. Thanks & Regards --Avijit Basak On Mon, 27 Dec 2021 at 19:21, Avijit Basak wrote: > Hi All > > I have tried to rebase. However I found too many conflicts and > most of them are unnecessary. So I aborted the process. Can we avoid the > rebase as we have very few commits after the last rebase. Please share your > views on this. > > Thanks & Regards > --Avijit Basak > > On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski > wrote: > >> Hello. >> >> I've fetched the current contents of PR 199; locally, the build completes >> successfully, so the problem reported by Travis looks strange indeed. >> I would create a branch for further discussion on your GA design but >> please first create a *single* commit that contains all changes wrt to >> current "master" with a clear log message (first word *must* be the JIRA >> identifier of your proposal (perhaps a new JIRA report would be clearer?), >> like: >> ---CUT--- >> MATH-: Refactoring of GA functionality (WIP) >> >> Summary of what has been implemented (with the corresponding JIRA >> reports)... >> >> (optionally) Summary what is under discussion... >> ---CUT--- >> >> Thanks, >> Gilles >> >> >> [...] >> >> --------- >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> >> > > -- > Avijit Basak > -- Avijit Basak
Re: [MATH] Build Failure
Hi All I have tried to rebase. However I found too many conflicts and most of them are unnecessary. So I aborted the process. Can we avoid the rebase as we have very few commits after the last rebase. Please share your views on this. Thanks & Regards --Avijit Basak On Sat, 25 Dec 2021 at 18:45, Gilles Sadowski wrote: > Hello. > > I've fetched the current contents of PR 199; locally, the build completes > successfully, so the problem reported by Travis looks strange indeed. > I would create a branch for further discussion on your GA design but > please first create a *single* commit that contains all changes wrt to > current "master" with a clear log message (first word *must* be the JIRA > identifier of your proposal (perhaps a new JIRA report would be clearer?), > like: > ---CUT--- > MATH-: Refactoring of GA functionality (WIP) > > Summary of what has been implemented (with the corresponding JIRA > reports)... > > (optionally) Summary what is under discussion... > ---CUT--- > > Thanks, > Gilles > > >> [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: [MATH] Build Failure
Hi

I have no access to change anything in the master branch. The artifact id for the module "commons-math-core" is given as "commons-math4-core" in the pom file present in the master branch. Please check the following URL.
https://github.com/apache/commons-math/blob/master/commons-math-core/pom.xml

Also please let me know if you see any other issues.

Thanks & Regards
--Avijit Basak

On Fri, 24 Dec 2021 at 17:27, Gilles Sadowski wrote:
> On Fri, 24 Dec 2021 at 12:23, Avijit Basak wrote:
> >
> > Hi
> >
> > I have initiated the build once again after pulling changes from
> > the master branch. However, the build has failed again. Kindly look into
> > the report.
> > https://app.travis-ci.com/github/apache/commons-math/builds/243963791
> >
> > The artifact id "commons-math4-core" is present in the master
> > branch of the repository
> > https://github.com/apache/commons-math/blob/master/commons-math-core/pom.xml .
> > Am I missing anything?
>
> That
>   commons-math-core
> is not the same as
>   commons-math4-core
>
> [IIRC, you created the latter by mistake.]
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> > >>> [...]

-- Avijit Basak
Re: [MATH] Build Failure
Hi

I have initiated the build once again after pulling changes from the master branch. However, the build has failed again. Kindly look into the report.
https://app.travis-ci.com/github/apache/commons-math/builds/243963791

The artifact id "commons-math4-core" is present in the master branch of the repository
https://github.com/apache/commons-math/blob/master/commons-math-core/pom.xml .
Am I missing anything?

Thanks & Regards
--Avijit Basak

On Thu, 23 Dec 2021 at 18:34, Gilles Sadowski wrote:
> Hi.
>
> On Wed, 22 Dec 2021 at 17:31, Gilles Sadowski wrote:
> >
> > Hello.
> >
> > On Wed, 22 Dec 2021 at 15:05, Avijit Basak wrote:
> > >
> > > Hi All
> > >
> > > I am facing a build issue for PR #199 in commons-math library.
>
> After you fix the build (cf. below), I'll create a branch dedicated
> to GA development, in order to clarify the issues we've been
> discussing.
>
> Regards,
> Gilles
>
> >
> > Next time, please provide a direct link to the build log. Thanks.
> >
> > The last Travis build is here:
> > https://app.travis-ci.com/github/apache/commons-math/builds/240925186
> >
> > AFAICT, the complaints are (starting at line 8044):
> > ---CUT---
> > [WARNING] Rule violated for bundle commons-math4-core
> > ---CUT---
> >
> > However, there is no "commons-math4-core" in the "master" branch:
> > https://github.com/apache/commons-math/
> >
> > Regards,
> > Gilles
> >
> > > The report summary is given below. Can anyone kindly look into the
> > > issue and do the needful.
> > >
> > > [INFO] Apache Commons Math ................... SUCCESS [  8.501 s]
> > > [INFO] Miscellaneous core classes ............ FAILURE [ 24.551 s]
> > > [INFO] Artificial neural networks ............ SKIPPED
> > > [INFO] Transforms ............................ SKIPPED
> > > [INFO] Exception classes (Legacy) ............ SKIPPED
> > > [INFO] Miscellaneous core classes (Legacy) ... SKIPPED
> > > [INFO] Apache Commons Math (Legacy) .......... SKIPPED
> > > [INFO] Example applications .................. SKIPPED
> > > [INFO] SOFM .................................. SKIPPED
> > > [INFO] Chinese Rings ......................... SKIPPED
> > > [INFO] Traveling Salesman Problem ............ SKIPPED
> > > [INFO] genetic algorithm ..................... SKIPPED
> > > [INFO] examples-genetic-algorithm ............ SKIPPED
> > > [INFO] examples-ga-math-functions ............ SKIPPED
> > > [INFO] examples-ga-tsp ....................... SKIPPED
> > >
> > > Thanks & Regards
> > > -- Avijit Basak

-- Avijit Basak
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
, only configuration > >of the range. > *-- *I agree. But the question is should we block the extension. >Please find a valid use case. ;-) -- Recently I did an implementation of scheduling with commons-math 3.6. I have implemented the chromosome representing schedule by extending AbstractListChromosome. The mutation was also customized according to the requirement. However, I was able to use the existing OnePointCrossover operator. Do you think this kind of implementation would be possible if the framework does not support extensibility? > > >> I have initially implemented > >> the Binary chromosome and the corresponding binary mutation following the > >> same pattern. However, restricting extension of concrete classes by > private > >> constructor does not prevent users from extending the abstract parent > >> classes. > > >We should aim at coding the GA logic through (Java) interfaces, and not > >expose the "abstract" classes. > *-- *One of the primary reasons for me to contribute in Apache' GA library > is it's simplicity and extensibility. >"Extensibility" does not necessarily imply "inheritance"-based. -- Can you provide a solution to the above problem without an extensibility feature? >In fact, we do want to *avoid* in order to more easily and more robustly >provide other advantages such as multi-threading. -- IMHO immutable operator design is the best choice for supporting multi-threading. It is much easier to implement even for user extension. Why don't we think of fixing the ThreadLocalRandomSource. >> I would like to have a framework >> which should be always extensible for any problem domain with minor >> changes. >Any problem domain should indeed be amenable to be solved >by the library; I don't see how that should imply a design based >on inheritance. -- Do you have any alter design in mind. Kindly share the same. >> The primary reason behind this is that application domains of GA >> are too diverse. 
It is not possible to implement everything in a library. >> We don't know all possible domain areas too. If we remove the extensibility >> from the framework it would be useless in lots of areas. >When that occurs, people are welcome to contribute back if >something they need is missing. -- I think we have a disconnect here too. If the framework is not extensible how users can use this in their problem domain. If this is not extensible then it would never be used. How can we get back the contribution? >Your argument of "too much diversity" can be reversed, in that >it is unlikely that one library would attract everyone that needs a >genetic algorithm. -- Even if it cannot attract everyone with out of box features it should be extensible for those. >Better make a design that can handle a fraction of use cases, >and grow as needed. --There are already libraries which can solve most common use cases. Non-extensible nature would block the growth to a considerable extent. >> >Extending the functionality, if necessary, should be contributed back here >> *-- *Sometimes the GA operators are very much specific to the domain and >> it's hard to generalise. In those scenarios contributing back to the >> library might not be possible. >In such a case, how likely will it also be that whatever general >framework this library has put in place, will also not be amenable >to that domain's specifics? -- Could you please frame this concern w.r.t. the scheduling example provided above. >There is always a scope from which design decisions must be taken. >If "multi-threading" is in the scope, then the design must avoid >inheritance (in public classes) in order to much more easily >ensure the correctness of applications. -- Immutable design can also take care of multi-threading. >> However, if a library cannot be extended for >> a new domain by users it becomes underutilised over time if not useless. >Sure but that is a hypothetical for the long-term. 
>However, if the library is buggy or slow, it will not be used at all. -- Is there any benchmark for speed/performance? GA is always infamous for resource consumption rather than time. Thanks & Regards --Avijit Basak On Wed, 22 Dec 2021 at 20:32, Gilles Sadowski wrote: > Hello. > > Le mer. 22 déc. 2021 à 14:25, Avijit Basak a > écrit : > > > > Hi All > > > > Please see my comments below. > > > > >> >Several problems with this approach (raised in previous messages > IIRC): > > >> >1. Potential performance loss in sharing the same RNG instance. > > >> -- As per my understanding ThreadLocalRandomSource creates separate > > >> instances of UniformRandomProvider for e
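The exchange above turns on whether a domain-specific case such as the scheduling example needs inheritance at all. As one possible reading of "extensibility does not necessarily imply inheritance", the sketch below supplies the custom mutation as a function while reusing a generic, stateless one-point crossover. All names (`OnePointCrossoverSketch`, `ScheduleMutationSketch`, `swapSlots`) are hypothetical and exist neither in commons-math 3.6 nor in PR #199; this is an illustration under those assumptions, not a proposal for the actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical library-side operator: stateless, so it can be shared freely. */
final class OnePointCrossoverSketch {
    private OnePointCrossoverSketch() {}

    /** Returns two children obtained by swapping the parents' tails at the given index. */
    static <G> List<List<G>> cross(List<G> first, List<G> second, int point) {
        if (first.size() != second.size() || point < 0 || point > first.size()) {
            throw new IllegalArgumentException("Incompatible parents or crossover point");
        }
        List<G> child1 = new ArrayList<>(first.subList(0, point));
        child1.addAll(second.subList(point, second.size()));
        List<G> child2 = new ArrayList<>(second.subList(0, point));
        child2.addAll(first.subList(point, first.size()));
        return Arrays.asList(Collections.unmodifiableList(child1),
                             Collections.unmodifiableList(child2));
    }
}

/** Hypothetical user-side code: the domain-specific mutation is a function, not a subclass. */
final class ScheduleMutationSketch {
    private ScheduleMutationSketch() {}

    /** Builds a mutation that swaps the tasks at two slots and returns a new schedule. */
    static UnaryOperator<List<String>> swapSlots(int i, int j) {
        return schedule -> {
            List<String> copy = new ArrayList<>(schedule); // the input is never modified
            Collections.swap(copy, i, j);
            return Collections.unmodifiableList(copy);
        };
    }
}
```

In this arrangement the user writes only the `swapSlots` function for the schedule representation, and both operators return new lists instead of mutating their inputs, which is the property the thread associates with safe multi-threaded use.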
[MATH] Build Failure
Hi All

I am facing a build issue for PR #199 in commons-math library. The report summary is given below. Can anyone kindly look into the issue and do the needful.

[INFO] Apache Commons Math ................... SUCCESS [  8.501 s]
[INFO] Miscellaneous core classes ............ FAILURE [ 24.551 s]
[INFO] Artificial neural networks ............ SKIPPED
[INFO] Transforms ............................ SKIPPED
[INFO] Exception classes (Legacy) ............ SKIPPED
[INFO] Miscellaneous core classes (Legacy) ... SKIPPED
[INFO] Apache Commons Math (Legacy) .......... SKIPPED
[INFO] Example applications .................. SKIPPED
[INFO] SOFM .................................. SKIPPED
[INFO] Chinese Rings ......................... SKIPPED
[INFO] Traveling Salesman Problem ............ SKIPPED
[INFO] genetic algorithm ..................... SKIPPED
[INFO] examples-genetic-algorithm ............ SKIPPED
[INFO] examples-ga-math-functions ............ SKIPPED
[INFO] examples-ga-tsp ....................... SKIPPED

Thanks & Regards
-- Avijit Basak
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
Hi All

Please see my *changed* comments below.

>> > Mine is against using "ThreadLocalRandomSource"...
>> -- What is the way out other than that? Please suggest.
>I think I did.
>> *--* The factory-based approach would be useful only when we can have separate copies of operators for each set of operations.
*--* The factory-based approach can introduce a *custom* RNG, but it can improve performance only when we can have separate copies of operators for each set of operations, which might lead to *memory issues* as explained in the previous mail.

Thanks & Regards
--Avijit Basak

On Wed, 22 Dec 2021 at 18:54, Avijit Basak wrote:
> Hi All
>
> Please see my comments below.
>
> >> >Several problems with this approach (raised in previous messages IIRC):
> >> >1. Potential performance loss in sharing the same RNG instance.
> >> -- As per my understanding ThreadLocalRandomSource creates separate
> >> instances of UniformRandomProvider for each thread. So I am not sure how a
> >> UniformRandomProvider instance is being shared. Please correct me if I am
> >> wrong.
>
> >Within a given thread there will be *one* RNG instance; that's what I meant
> >by "shared".
> >Of course you are right that that instance is not shared by multiple threads
> >(which would be a bug).
> >The performance loss is because it will be necessary to call
> >    ThreadLocalRandomSource.current(RandomSource source)
> >for each access to the RNG (since it would be a bug to store the returned
> >value in e.g. an operator instance that would be shared among threads (as
> >you suggest below)).
>
> -- I tried to do a small test on it and here are the results. Output times
> are in milliseconds. According to my understanding the performance loss is
> mostly during creation of the per-thread instance of UniformRandomProvider.
> --*CUT*--
> @Test
> void test() {
>     int limit = 1;
>     long start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1000;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 10;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 100;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1000;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 1;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
>
>     limit = 10;
>     start = System.currentTimeMillis();
>     for (int i = 0; i < limit; i++) {
>         ThreadLocalRandomSource.current(RandomSource.JDK);
>     }
>     System.out.println(System.currentTimeMillis() - start);
> }
> --*CUT*--
> --*output*--
> 363
> 1
> 2
> 4
> 6
> 28
> 244
> 2423
> --*output*--
>
> >> >2. Less/no flexibility (no user's choice of random source).
> >> -- Agreed.
> -- Do we really need this much flexibility here?
> >> >3. Error-prone (user can access/reuse the "UniformRandomProvider"
> >> instances).
> >> > >> >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct > but > >> >"light" usage of random number generation in a multi-threaded > app
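The alternative raised in this exchange is a factory that hands out separate operator copies, each owning its own generator, so that no per-call lookup is needed. The sketch below illustrates that idea with the JDK's `SplittableRandom` standing in for `UniformRandomProvider` (an assumption made to keep the example free of external dependencies); `BitFlipMutationSketch` and its `Factory` are hypothetical names, not classes from PR #199.

```java
import java.util.SplittableRandom;

/** Hypothetical operator that owns its RNG; an instance must stay confined to one thread. */
final class BitFlipMutationSketch {
    private final SplittableRandom rng;

    private BitFlipMutationSketch(SplittableRandom rng) {
        this.rng = rng;
    }

    /** Flips each gene independently with the given probability; returns a new array. */
    boolean[] mutate(boolean[] genes, double rate) {
        boolean[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            // Direct field access: no thread-local lookup on the hot path.
            if (rng.nextDouble() < rate) {
                out[i] = !out[i];
            }
        }
        return out;
    }

    /** Factory holding a parent generator; every call yields an operator with its own stream. */
    static final class Factory {
        private final SplittableRandom parent;

        Factory(long seed) {
            parent = new SplittableRandom(seed);
        }

        /** Synchronized because split() must not be called concurrently on a shared parent. */
        synchronized BitFlipMutationSketch create() {
            return new BitFlipMutationSketch(parent.split());
        }
    }
}
```

Each worker thread would call `create()` once and keep the returned operator to itself. The price is one operator (and one generator) per thread or task, which is the memory side of the trade-off mentioned in the message.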
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
uirement also increases with increase of dimension this might lead to a major issue and need a thought. So I think we have a design tradeoff here performance vs memory consumption. I am more worried about memory as that might restrict use of this library beyond a certain number of dimensions in some areas. However, creating deep copy would only be possible when we strictly restrict extension of operators which I want to avoid. >> So even if we provide >> the customization at the operator level we cannot avoid sharing. >We can, and we should. >What we probably can't avoid sharing is the instance that represents the >population of chromosomes. *--* In a multi-threaded optimization the chromosome instances are shared in case the same chromosome is chosen for crossover by the selection process. I missed this point earlier. ... >> > Mine is against using "ThreadLocalRandomSource"... >> -- What is the wayout other than that. Please suggest. >I think I did. *--* The factory based approach would be useful only when we can have separate copies of operators for each set of operations. >Maybe it's time to create a dedicated branch for the GA functionality >so that we can try out the different approaches. > > >> I think first we need to decide on whether we really need this > >> customization and if yes then why. Then we can decide on alternate > >> implementation options. > > > >> >As per the recent updates of the math-related code bases, the > >> >public API should provide factory methods (constructors should > >> >be private). > >> -- private constructors will make public API classes non-extensible. This > >> will severely restrict the extensibility of this framework which I want > to > >> avoid. I am not sure why we need to remove public constructors. It would > be > >> helpful if you could refer me to any relevant discussion thread. > > > Allowing extensibility is a huge burden on library maintainers. 
The > > library must have been designed to support it; hence, you should > > first describe what kind(s) of extensions (with usage examples) you > > have in mind. > --The library should be extensible to support customization. Users should > be able to customise or provide their own implementation of genetic > operators for crossover and mutation. The chromosome classes should also be > open for extension. >I don't get why we should support extensions outside this library. *--* I think we should not block the extension. >Initially we discussed about having a light-weight library, for easier usage >than alternative existing framework(s). *--* We can always think of making the framework lightweight but it should not cost extensibility. >> E.g. any developer should be able to extend the >> IntegralChromosome class and define a child class which explicitly >> specifies the range of integers to be used. >It does not look like this would need an extension, only configuration >of the range. *-- *I agree. But the question is should we block the extension. >> I have initially implemented >> the Binary chromosome and the corresponding binary mutation following the >> same pattern. However, restricting extension of concrete classes by private >> constructor does not prevent users from extending the abstract parent >> classes. >We should aim at coding the GA logic through (Java) interfaces, and not >expose the "abstract" classes. *-- *One of the primary reasons for me to contribute in Apache' GA library is it's simplicity and extensibility. I would like to have a framework which should be always extensible for any problem domain with minor changes. The primary reason behind this is that application domains of GA are too diverse. It is not possible to implement everything in a library. We don't know all possible domain areas too. If we remove the extensibility from the framework it would be useless in lots of areas. 
>Extending the functionality, if necessary, should be contributed back here *-- *Sometimes the GA operators are very much specific to the domain and it's hard to generalise. In those scenarios contributing back to the library might not be possible. However, if a library cannot be extended for a new domain by users it becomes underutilised over time if not useless. Thanks & Regards --Avijit Basak On Tue, 21 Dec 2021 at 22:05, Gilles Sadowski wrote: > Hello. > > Le mar. 21 déc. 2021 à 16:21, Avijit Basak a > écrit : > > > > Hi All > > > > Please see my comments. Sorry for the delayed response. > > > > >Several problems with this approach (raised in previous messages IIRC): > > >1. Potential performance loss in sharing the same RNG
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
Hi All Please see my comments. Sorry for the delayed response. >Several problems with this approach (raised in previous messages IIRC): >1. Potential performance loss in sharing the same RNG instance. -- As per my understanding ThreadLocalRandomSource creates separate instances of UniformRandomProvider for each thread. So I am not sure how a UniformRandomProvider instance is being shared. Please correct me if I am wrong. >2. Less/no flexibility (no user's choice of random source). -- Agreed. >3. Error-prone (user can access/reuse the "UniformRandomProvider" instances). >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct but >"light" usage of random number generation in a multi-threaded application; GAs >make "heavy" use of RNG, thus it is does not seem outlandish that all the RNG >"clients" (e.g. every "operator") creates their own instances. >IMHO, a more important discussion would be about the expectations in a >multithreaded context: E.g. should an operator be shareable by different >threads? And if not, how does the API help application developers to avoid >such pitfalls? -- Once we implement multi-threading in GA, same crossover and mutation operators will be re-used across multiple threads. So even if we provide the customization at the operator level we cannot avoid sharing. >> My original implementation did not allow any customization of RandomSource >> instances. There was a thought in review for customization of RandomSource, >> so these options were considered. I don't think this would make any >> difference to algorithm functionality. > Quite right. But the customization can come at zero cost for the users > who don't need it. Admittedly it's a little more work on the part of the > developer(s) but it's a one off cost (and I'm fine working on that part of > the library once other, more important, things have been settled). >> Even earlier I used Math.random() >> which worked equally well. 
So my *vote* should be *against* this >> customization. > Mine is against using "ThreadLocalRandomSource"... -- What is the wayout other than that. Please suggest. >> I think first we need to decide on whether we really need this >> customization and if yes then why. Then we can decide on alternate >> implementation options. > >> >As per the recent updates of the math-related code bases, the >> >public API should provide factory methods (constructors should >> >be private). >> -- private constructors will make public API classes non-extensible. This >> will severely restrict the extensibility of this framework which I want to >> avoid. I am not sure why we need to remove public constructors. It would be >> helpful if you could refer me to any relevant discussion thread. > Allowing extensibility is a huge burden on library maintainers. The > library must have been designed to support it; hence, you should > first describe what kind(s) of extensions (with usage examples) you > have in mind. --The library should be extensible to support customization. Users should be able to customise or provide their own implementation of genetic operators for crossover and mutation. The chromosome classes should also be open for extension. E.g. any developer should be able to extend the IntegralChromosome class and define a child class which explicitly specifies the range of integers to be used. I have initially implemented the Binary chromosome and the corresponding binary mutation following the same pattern. However, restricting extension of concrete classes by private constructor does not prevent users from extending the abstract parent classes. Thanks & Regards --Avijit Basak On Tue, 30 Nov 2021 at 19:20, Gilles Sadowski wrote: > Hi. > > Le mar. 30 nov. 2021 à 06:40, Avijit Basak a > écrit : > > > > Hi All > > > > Please see my comments: > > > > >The provider returned from ThreadLocalRandomSource.current(...) should > > >only be used within a single method. 
> > -- I missed the context of the thread in my previous mail. Sorry for the > > previous communication. We can only cache the RandomSource's enum value > and > > reuse the same locally in other methods. According to the analysis, the > > current implementation(In PR#199) with pre-configured RandomSource would > > work correctly. > > --CUT-- > > public final class RandomProviderManager { > > /** The default RandomSource for random number generation. **/ > > private static RandomSource randomSource = > > RandomSource.XO_RO_SHI_RO_128_PP; > > /** > > * constructs the singleton instance. > >
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
Hi All

Please see my comments:

>The provider returned from ThreadLocalRandomSource.current(...) should
>only be used within a single method.
-- I missed the context of the thread in my previous mail. Sorry for the previous communication. We can only cache the RandomSource's enum value and reuse the same locally in other methods. According to the analysis, the current implementation (in PR #199) with a pre-configured RandomSource would work correctly.
--CUT--
import org.apache.commons.rng.UniformRandomProvider;
import org.apache.commons.rng.simple.RandomSource;
import org.apache.commons.rng.simple.ThreadLocalRandomSource;

public final class RandomProviderManager {
    /** The default RandomSource for random number generation. */
    private static RandomSource randomSource = RandomSource.XO_RO_SHI_RO_128_PP;

    /** Private constructor: this class only exposes static methods. */
    private RandomProviderManager() {}

    /**
     * Returns the random generator of the calling thread for the configured source.
     * @return the generator used by GA implementation classes in the current thread
     */
    public static UniformRandomProvider getRandomProvider() {
        return ThreadLocalRandomSource.current(RandomProviderManager.randomSource);
    }
}
--CUT--
@Alex Herbert, kindly share if you see any challenge to this.

My original implementation did not allow any customization of RandomSource instances. There was a thought in review for customization of RandomSource, so these options were considered. I don't think this would make any difference to algorithm functionality. Even earlier I used Math.random(), which worked equally well. So my *vote* should be *against* this customization. I think first we need to decide on whether we really need this customization and if yes then why. Then we can decide on alternate implementation options.

>As per the recent updates of the math-related code bases, the
>public API should provide factory methods (constructors should
>be private).
-- Private constructors will make public API classes non-extensible. This will severely restrict the extensibility of this framework, which I want to avoid. I am not sure why we need to remove public constructors.
It would be helpful if you could refer me to any relevant discussion thread. Thanks & Regards --Avijit Basak On Mon, 29 Nov 2021 at 23:47, Gilles Sadowski wrote: > Le lun. 29 nov. 2021 à 19:07, Alex Herbert a > écrit : > > > > Note that your examples have incorrect usage of ThreadLocalRandomSource: > > The detailed explanation confirms what I hinted at previously: We > should not use "ThreadLocalRandomSource" from within the library > because we can easily do otherwise (and just as transparently for > the user). > > Gilles > > > [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
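The rule quoted in this message (use the provider returned by `ThreadLocalRandomSource.current(...)` only within a single method) has a direct counterpart in the JDK, where `ThreadLocalRandom.current()` should likewise not be stored in a field of an object shared among threads. The sketch below uses the JDK class as an analogue; `SharedMutationSketch` is a hypothetical name, and the code is meant to show the usage pattern only, not the design of PR #199.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical operator that may be shared among threads: it keeps no generator in a field. */
final class SharedMutationSketch {
    private final int lower;
    private final int upper;

    SharedMutationSketch(int lower, int upper) {
        if (lower >= upper) {
            throw new IllegalArgumentException("lower must be smaller than upper");
        }
        this.lower = lower;
        this.upper = upper;
    }

    /** Replaces one randomly chosen gene with a value in [lower, upper); returns a new array. */
    int[] mutate(int[] genes) {
        // Fetch the calling thread's generator on every call and keep it in a local variable.
        // Caching it in a field would expose one thread's generator to other threads.
        ThreadLocalRandom rng = ThreadLocalRandom.current();
        int[] out = genes.clone();
        if (out.length > 0) {
            out[rng.nextInt(out.length)] = rng.nextInt(lower, upper);
        }
        return out;
    }
}
```

This pattern keeps the operator immutable and shareable, at the cost of one lookup per `mutate` call, which is the overhead discussed elsewhere in the thread.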
Re: [MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
Hi All

Here is a sample use of the two options.

*Option1*: Declaring a factory interface in MutationPolicy, CrossoverPolicy and SelectionPolicy. A sample implementation is shown here for MutationPolicy. Something similar would be required for all other relevant interfaces and implementing classes.

--CUT--
public interface MutationPolicy {
    Chromosome mutate(Chromosome original, double mutationRate);

    interface Factory {
        /**
         * Creates an instance with a dedicated source of randomness.
         *
         * @param rng RNG algorithm.
         * @param args Additional arguments (e.g. a seed).
         * @return an instance that must not be shared among threads.
         */
        MutationPolicy create(RandomSource rng, Object... args);

        default MutationPolicy create(RandomSource rng) {
            // Explicit empty array: a bare null passed to a varargs parameter is ambiguous.
            return create(rng, new Object[0]);
        }

        default MutationPolicy create() {
            return create(RandomSource.SPLIT_MIX_64);
        }
    }
}

//Implementation Class
public class IntegralValuedMutation implements MutationPolicy {
    private final UniformRandomProvider provider;

    private IntegralValuedMutation(RandomSource rng) {
        provider = ThreadLocalRandomSource.current(rng);
    }
    ...
    ...
    public static class MutationFactory implements Factory {
        private static final MutationFactory instance = new MutationFactory();

        private MutationFactory() {}

        @Override
        public MutationPolicy create(RandomSource rng, Object... args) {
            // The only constructor takes the RandomSource; args are unused in this sample.
            return new IntegralValuedMutation(rng);
        }

        public static MutationFactory getInstance() {
            return instance;
        }
    }

//Usage
MutationPolicy policy = IntegralValuedMutation.MutationFactory.getInstance().create();
--CUT--

Option2: An optional constructor argument can also be used as an alternative solution.

--CUT--
public class IntegralValuedMutation implements MutationPolicy {
    private final UniformRandomProvider provider;

    public IntegralValuedMutation() {
        provider = ThreadLocalRandomSource.current(RandomSource.DEFAULT); //DEFAULT is a chosen source.
    }

    public IntegralValuedMutation(RandomSource rng) {
        provider = ThreadLocalRandomSource.current(rng);
    }
    ...
} //Usages MutationPolicy policy = new IntegralValuedMutation(rng); --CUT-- Option2 looks to be much simpler regarding implementation and I would vote for the same if we decide to allow customization of RandomSource. Thanks & Regards --Avijit Basak On Mon, 22 Nov 2021 at 19:28, Gilles Sadowski wrote: > Hello. > > Le lun. 22 nov. 2021 à 13:49, Avijit Basak a > écrit : > > > > Hi All > > > > I would like to request everyone to share their opinion regarding > > use and customization of RNG functionality in the Genetic Algorithm > > library. > > In current design RNG functionality has been used internally by > the > > RandomProviderManager class. This class encapsulates a predefined > instance > > of RandomSource and utilizes the same for all random number generation > > requirements. This makes the API cleaner and easy to use for users. > > However, during the review an alternate thought has been proposed > > related to customization of RandomSource by users. According to the new > > proposal the users will be able to provide a RandomSource instance of > their > > choice to the crossover and mutation operators and other places like > > ChromosomeRepresentationUtils. The drawback of this customization could > be > > increased complexity of the API. > > Please provide an usage example of both (showing that the alternative > would actually increase the API complexity). > > Thanks, > Gilles > > > We need to decide here whether we really need this kind of > > customization by users and if yes the method of doing so. Here two > options > > have been proposed. > > *Option1:* > > ---CUT--- > > public interface MutationPolicy { > > Chromosome mutate(Chromosome original, double mutationRate); > > > > interface Factory { > > /** > > * Creates an instance with a dedicated source of randomness. > > * > > * @param rng RNG algorithm. > > * @param seed Seed. > > * @return an instance that must not be shared among > > threads. 
> > */ > > MutationPolicy create(RandomSource rng, Object seed); > > > > default MutationPolicy create(RandomSource rng) { > > return create(rng, null); > > } > > default MutationPolicy create() { > > return create(RandomSource.SPLIT_MIX_64); > > } > >
Re: [MATH][GENETICS][PR#199] Design Decision of Chromosome hierarchy
Hi All I have uploaded the image(*chromosome hierarchy.png*) in JIRA. Here is the link. Let me know if anyone faces any issues. https://issues.apache.org/jira/projects/MATH/issues/MATH-1563?filter=allopenissues Thanks & Regards --Avijit Basak On Mon, 22 Nov 2021 at 19:39, Gilles Sadowski wrote: > Hello. > > Le lun. 22 nov. 2021 à 13:47, Avijit Basak a > écrit : > > > > Hi All > > > > We need to make a decision on the chromosome hierarchy, proposed > for commons-math-ga module. > > Currently the hierarchy is designed as shown in the diagram below. > > Image has probably been stripped from your original message. > Please upload it to JIRA and post a link here. > > Thanks, > Gilles > > > > > > > Brief description: > > 1) The chromosome hierarchy is based on it's internal representation of > Genotype. > > 2) The phenotype of chromosomes is kept as a Generic parameter . > > 3) Decoder is introduced to convert Genotype to Phenotype. > > 4) FitnessFunction is introduced to calculate Fitness of chromosomes. > > 5) AbstractChromosome represents the chromosome abstraction for all > genotypes. > > 6) AbstractListChromosome has been introduced to represent the > abstraction for List based Genotype. > > 7) Any chromosome representing list based genotypes should extend > AbstractListBasedChromosome. > > 8) All other chromosomes should extend the AbstractChromosome class. > > 9) BinaryChromosome(not committed) is introduced to represent binary > genotypes and extends AbstractChromosome. > > > > Pros: > > 1) This hierarchy maintains a separation of Genotype and Phenotype. > > 2) Chromosome class with the same genotype can represent different > phenotypes with different implementations of Decoders. > > 3) Users will be able to use primitive types for higher dimensions by > extending the AbstractChromosome class. > > 4) Unlike the legacy model all concrete chromosomes are reusable with > proper implementation of FitnessFunction and Decoder. 
> > 5) Any custom list based genotypes can be implemented by extending > AbstractListChromosome class. > > 6) Internal genotype representations have been exposed which enabled the > reuse of crossover and mutation operators. > > > > I would like to request everyone to review the design and reply > in case of any concerns. > > > > > > Thanks & Regards > > -- Avijit Basak > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
[MATH][GENETICS][PR-199] Decision on the use of Logging functionality
Hi All We need to make a decision on usage of a logging framework. The previous release does not have any implementation of a logging framework. However, in the current implementation a slf4j logger has been introduced. Please share if anyone has any concerns related to this. Thanks & Regards -- Avijit Basak
[MATH][GENETICS][PR-199] Decision on use and customization of RNG functionality for randomization
Hi All

I would like to request everyone to share their opinion regarding the use and customization of RNG functionality in the Genetic Algorithm library.

In the current design, RNG functionality is used internally through the RandomProviderManager class. This class encapsulates a predefined instance of RandomSource and uses it for all random number generation. This keeps the API clean and easy to use.

However, during the review an alternative was proposed: customization of RandomSource by users. According to the new proposal, users would be able to provide a RandomSource instance of their choice to the crossover and mutation operators and to other places like ChromosomeRepresentationUtils. The drawback of this customization could be increased complexity of the API.

We need to decide here whether we really need this kind of customization by users and, if yes, how to provide it. Two options have been proposed.

*Option1:*
---CUT---
public interface MutationPolicy {
    Chromosome mutate(Chromosome original, double mutationRate);

    interface Factory {
        /**
         * Creates an instance with a dedicated source of randomness.
         *
         * @param rng RNG algorithm.
         * @param seed Seed.
         * @return an instance that must not be shared among threads.
         */
        MutationPolicy create(RandomSource rng, Object seed);

        default MutationPolicy create(RandomSource rng) {
            return create(rng, null);
        }

        default MutationPolicy create() {
            return create(RandomSource.SPLIT_MIX_64);
        }
    }
}
---CUT---

*Option 2:*
Use of an optional constructor argument for all crossover and mutation operators. When instantiating an operator, users either provide a RandomSource instance of their choice or rely on the configured default.

Thanks & Regards
-- Avijit Basak
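Option 2 could look roughly like the sketch below. It is written against JDK classes only, so `java.util.Random` stands in for a Commons RNG provider, and `SimpleMutationPolicy` and `IntegerListMutation` are hypothetical names with a chromosome reduced to a plain list of integers. It only illustrates the API shape: one constructor hides the RNG behind a default, the other accepts a generator chosen by the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Simplified stand-in for MutationPolicy; a chromosome is a plain list of ints here. */
interface SimpleMutationPolicy {
    List<Integer> mutate(List<Integer> original, double mutationRate);
}

/** Option 2: the source of randomness is an optional constructor argument. */
final class IntegerListMutation implements SimpleMutationPolicy {
    private final Random rng;
    private final int min;
    private final int max; // exclusive upper bound

    /** Uses a default generator, so callers need not know anything about RNGs. */
    IntegerListMutation(int min, int max) {
        this(min, max, new Random());
    }

    /** Lets the caller supply the generator, e.g. a seeded one for reproducible runs. */
    IntegerListMutation(int min, int max, Random rng) {
        if (min >= max) {
            throw new IllegalArgumentException("min must be smaller than max");
        }
        this.min = min;
        this.max = max;
        this.rng = rng;
    }

    @Override
    public List<Integer> mutate(List<Integer> original, double mutationRate) {
        // Work on a copy so the parent chromosome is left untouched.
        List<Integer> mutated = new ArrayList<>(original);
        for (int i = 0; i < mutated.size(); i++) {
            if (rng.nextDouble() < mutationRate) {
                mutated.set(i, min + rng.nextInt(max - min));
            }
        }
        return mutated;
    }
}
```

A user who does not care about the RNG writes `new IntegerListMutation(0, 10)`; a user who wants repeatable results passes a seeded generator. No factory type is needed, which is the simplicity argument made for Option 2.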
[MATH][GENETICS][PR#199] Decision on retention of ASCII Art in Javadoc section
Hi All

I would like to inform everyone that there is some ASCII art in the Javadoc of some classes, such as OnePointCrossover. It is taken unaltered from the previous release of the math library. We need to decide whether to keep it in the next release or remove it. I would like to request everyone to share their opinion.

Thanks & Regards
-- Avijit Basak
[MATH][GENETICS][PR#199] Design Decision of Chromosome hierarchy
Hi All

We need to make a decision on the chromosome hierarchy proposed for the *commons-math-ga* module. Currently the hierarchy is designed as shown in the diagram below.

[image: image.png]

*Brief description:*
1) The chromosome hierarchy is based on its internal representation of Genotype.
2) The phenotype of chromosomes is kept as a generic parameter <P>.
3) Decoder is introduced to convert Genotype to Phenotype.
4) FitnessFunction is introduced to calculate Fitness of chromosomes.
5) AbstractChromosome represents the chromosome abstraction for all genotypes.
6) AbstractListChromosome has been introduced to represent the abstraction for List based Genotype.
7) Any chromosome representing list based genotypes should extend AbstractListChromosome.
8) All other chromosomes should extend the AbstractChromosome class.
9) BinaryChromosome (not committed) is introduced to represent binary genotypes and extends AbstractChromosome.

*Pros:*
1) This hierarchy maintains a separation of Genotype and Phenotype.
2) A chromosome class with the same genotype can represent different phenotypes with different implementations of Decoder.
3) Users will be able to use primitive types for higher dimensions by extending the AbstractChromosome class.
4) Unlike the legacy model, all concrete chromosomes are reusable with a proper implementation of FitnessFunction and Decoder.
5) Any custom list based genotype can be implemented by extending the AbstractListChromosome class.
6) Internal genotype representations have been exposed, which enables the reuse of crossover and mutation operators.

I would like to request everyone to review the design and reply in case of any concerns.

Thanks & Regards
-- Avijit Basak
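The description above can be sketched roughly as follows. This is a simplified illustration, not the code in the PR: the signatures are assumptions (for instance, `Decoder` here takes the list genotype directly), and `BitListChromosome` is a hypothetical list based example, not the proposed BinaryChromosome. It shows the genotype/phenotype separation and how one genotype can be decoded into different phenotypes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Scores a phenotype; higher means fitter in this sketch. */
interface FitnessFunction<P> {
    double compute(P phenotype);
}

/** Converts a list based genotype into a phenotype (simplified, assumed signature). */
interface Decoder<T, P> {
    P decode(List<T> genotype);
}

/** Abstraction for all genotypes; only the phenotype type P is visible here. */
abstract class AbstractChromosome<P> {
    private final FitnessFunction<P> fitnessFunction;

    protected AbstractChromosome(FitnessFunction<P> fitnessFunction) {
        this.fitnessFunction = fitnessFunction;
    }

    /** Genotype to phenotype conversion, supplied by subclasses. */
    protected abstract P decode();

    public double evaluate() {
        return fitnessFunction.compute(decode());
    }
}

/** Abstraction for list based genotypes; operators can work on getRepresentation(). */
abstract class AbstractListChromosome<T, P> extends AbstractChromosome<P> {
    private final List<T> representation;
    private final Decoder<T, P> decoder;

    protected AbstractListChromosome(List<T> representation,
                                     Decoder<T, P> decoder,
                                     FitnessFunction<P> fitnessFunction) {
        super(fitnessFunction);
        // Defensive, read-only copy of the genotype.
        this.representation = Collections.unmodifiableList(new ArrayList<>(representation));
        this.decoder = decoder;
    }

    public List<T> getRepresentation() {
        return representation;
    }

    @Override
    protected P decode() {
        return decoder.decode(representation);
    }
}

/** Hypothetical concrete chromosome whose genes are the integers 0 and 1. */
final class BitListChromosome<P> extends AbstractListChromosome<Integer, P> {
    BitListChromosome(List<Integer> bits,
                      Decoder<Integer, P> decoder,
                      FitnessFunction<P> fitnessFunction) {
        super(checkBits(bits), decoder, fitnessFunction);
    }

    private static List<Integer> checkBits(List<Integer> bits) {
        for (int bit : bits) {
            if (bit != 0 && bit != 1) {
                throw new IllegalArgumentException("not a bit: " + bit);
            }
        }
        return bits;
    }
}
```

Because crossover and mutation operators would only need `getRepresentation()`, they can be reused by any subclass of the list abstraction, while the decoder and fitness function stay problem-specific.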
Re: [MATH][GENETICS] Review of PR #197
Hi All >Depending on released code (e.g. version 3.6.1 of Commons Math) >is fine too for the "main" codes of the "examples" module. >Setting up all the "policies" (mutation, crossover, ...) must be done >anyways, by the user; the above factories just add an argument to >be passed at instantiation. >> Separate RandomSource for each place may be >> redundant. >I'm not sure what you mean by "redundant". >Since the RNG instances are not thread-safe, either separate sources >*must* be used or the synchronization must be handled separately (as >done for the "ThreadLocalRandomSource" possibly with a performance >loss). >> I also believe the system should provide a default option for >> RandomSource instead of completely depending on the choice of users. >Why? >This is a library, and the RNG should be viewed as an input (user's choice >to be made at the application level). >There are multiple problems (already noted previously): >1. It hardcodes a specific "default" source (whereas the GA functionality >is "agnostic" about which source is actually used). >2. The RNG instance is shared among all the classes that need it (which >makes access unnecessarily slower). >3. The static field "randomSource" is mutable; this is looking for trouble >(race condition). --The way I tried to design this is that GA functionality should likely be agnostic to the rng feature. If we accept the randomsource as a parameter to GA operators it would likely mandate the users having knowledge about rng. Given the vast amount of RandomSources I am not sure how many options will be really considered by users and mostly might go for JDK source. This will increase the learning curve of users as well as a bit of complexity of API. Customization of any operators will mandate the implementation of the factory as well. [IMHO] The users do not need to choose the RandomSource for each operator separately. That is the reason I proposed the use of customization option in the RandomNumberGenerator class itself. 
But this is also true that it cannot mandate the configuration only once. In case that is configured inside any custom operator then that might result in race conditions once we go for parallel implementation. [IMHO] We really don't need users to customize RandomSource for each and every operator or for the application. Can we stick to the previous implementation and remove the configure(RandomSource rng) method of RandomNumberGenerator class? Kindly share your thoughts. Thanks & Regards --Avijit Basak On Tue, 9 Nov 2021 at 00:11, Gilles Sadowski wrote: > Hello. > > Side-note: Please try to not remove quotes from previous emails, > as it leads to a confusing sequence (e.g. things I wrote previously > now appear below as quotes from your last message). > > Le dim. 7 nov. 2021 à 10:34, Avijit Basak a > écrit : > > > > Hi All > > > > Please find my comments below: > > > > > > [...] > > > > > > > > > > (B) > > > > I'm confused by your defining "legacy" packages in new modules... > > > > What kind of comparisons are you considering? > > > > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?) > > > > regression testing; but please note that when your proposal is > > > > merged, it will imply that the "legacy" codes *must* be removed. > > > > [We don't want to keep duplicate functionality.] > > > > -- The new implementation has improved the quality of optimization > over > > > the > > > > legacy model. > > > > > > "Improved" in what sense? > > > If you mean enhanced performance, such checks should be done > > > using JMH (producing data to be published on the web site). > > > > > > --Along with performance and memory utilization, stochastic algorithms > > have > > > another comparison parameter "quality of result". In stochastic > > algorithms, > > > global optimum is not guaranteed, We have to compare the quality of the > > > result along with performance and memory consumption to compare two > > > algorithm implementations. 
I have kept the legacy example just for > > > comparison between the new and the legacy implementations. > > > > Great that you take care of checking improvements on the quality > > measures. Just make sure that the new code do not depend on > > anything in module "commons-math-legacy". As noted earlier, a > > dependency on a previous release of CM with > > test > > is fine. > > > > --test scope can only
Re: [MATH][GENETICS] Review of PR #197
houghts regarding this. --CUT-- /** The default RandomSource for random number generation. **/ private static RandomSource randomSource = RandomSource.XO_RO_SHI_RO_128_PP; /** * Sets the random source for this random generator. * @param randomSource */ public static void configure(RandomSource randomSource) { RandomNumberGenerator.randomSource = randomSource; } /** * constructs the singleton instance. */ private RandomNumberGenerator() { } /** * Returns the (static) random generator. * @return the static random generator shared by GA implementation classes */ public static UniformRandomProvider getRandomGenerator() { return ThreadLocalRandomSource.current(RandomNumberGenerator.randomSource); } --CUT-- > > > > * Class "ValidationUtils" > > -> should not be public (or should be defined in an "internal" package). > > --Changed > > The class actually provides no added value. > > --I was thinking of having a validation utility which can be reused > everywhere. Otherwise I have to duplicate the code in all places. Do you > think that is a good way of doing this? Reuse is good, sure; but in this case, the very small gain (one line) is not worth the loss of clarity (method call vs direct conditional test). --Made changes >> [...] > > > > (E) Unit tests > > * src/test > > 1. New tests should use Junit 5. > > 2. "Example usages" probably belong in the "examples-ga" module. > > --JUnit version is inherited from commons-math module. > > Not sure what you mean. > It's possible to use Junit5 in new modules/test classes even if > other classes use older Junit. > --Do you think it is fine to have two separate versions of JUnit library in > CM. [IMHO] we should keep only one version only. In the mid-term, yes. But the point is that we must start somewhere. And as you are creating a new tests suite, it's worth using the up-to-date framework. [This will also reduce the burden on the people who'll take on updating all the other tests.] 
--Done Thanks & Regards --Avijit Basak On Mon, 1 Nov 2021 at 21:09, Gilles Sadowski wrote: > Hello. > > Le lun. 1 nov. 2021 à 08:56, Avijit Basak a > écrit : > > > > Hi All > > > > Please find my comments below: > > > > > > > > Hi All > > > > > > I have fixed most of the review comments. The changes have been > > > committed to PR#199. > > > > > > (A) > > > Please "rebase" on "master". > > > Please "squash" intermediate commits: For a new feature, a single > commit > > > should exist (that corresponds to the JIRA report describing it). > > > --Will be done once all changes are finalized and committed. > > > > What is the rationale for not doing it right now? > > The PR should always be "rebased" on the latest "master". > > > > --Done both rebase and squash. > > Thanks. > > The convention for the log summary (first line of the log message) is to > put the issue number in front; thus, instead of > ---CUT--- > Developed the new genetic algorithm module following the JIRA MATH-1563. > ---CUT--- > it should be something like > ---CUT--- > MATH-1563: Introducing new genetic algorithm module. > ---CUT--- > > > > > > > (B) > > > I'm confused by your defining "legacy" packages in new modules... > > > What kind of comparisons are you considering? > > > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?) > > > regression testing; but please note that when your proposal is > > > merged, it will imply that the "legacy" codes *must* be removed. > > > [We don't want to keep duplicate functionality.] > > > -- The new implementation has improved the quality of optimization over > > the > > > legacy model. > > > > "Improved" in what sense? > > If you mean enhanced performance, such checks should be done > > using JMH (producing data to be published on the web site). > > > > --Along with performance and memory utilization, stochastic algorithms > have > > another comparison parameter "quality of result". 
In stochastic > algorithms, > > global optimum is not guaranteed, We have to compare the quality of the > > result along with performance and memory consumption to compare two > > algorithm implementations. I have kept the legacy example just for > > comparison between the new and the legacy implementations. > > Great that you take care of
Re: [MATH][GENETICS] Review of PR #197
"NullPointerException") are necessary, > there could be a factory for creating the appropriate instance. > However, for "null" checks, please use the JDK utilities[2]. > --Moved to an internal package. Null checks have been modified too. > > > > > * Class "ConvergenceListenerRegistry" > > Shouldn't it be thread-safe? > > -- Yes. We need this to be thread-safe for parallel multi-population > > parallel genetic algorithms. > --No change for the time being. > > (E) Unit tests > * src/test > 1. New tests should use Junit 5. > 2. "Example usages" probably belong in the "examples-ga" module. > --JUnit version is inherited from commons-math module. Not sure what you mean. It's possible to use Junit5 in new modules/test classes even if other classes use older Junit. --Do you think it is fine to have two separate versions of JUnit library in CM. [IMHO] we should keep only one version only. > --I could not understand what is meant by "Example usages" here. Which > component is being referred to here. I'm referring to a comment such as ---CUT--- // to test a stochastic algorithm is hard, so this will rather be an usage // example ---CUT--- (at line 72 in "GeneticAlgorithmTestBinary.java"). --Removed. This was an existing comment from previous release. > > (F) Code readability > * Please write one argument per line. > * Write one condition check per line. > * Avoid comments with no added value (like "constructor" for a constructor). > * Avoid "ASCII art" (see e.g. "OnePointCrossover"); a link[2] is often > preferable. > * Do no duplicate documentation (see e.g. "OnePointCrossover"). > --I have formatted the method declaration to have one parameter in one line. > --Most of the if conditions are having a single condition except very few > pre existing ones. I could not see any way to format the if statement in > eclipse like the suggestion. I cannot introduce any formatting rule which > cannot be handled in eclipse as that will be very hard to manage. ? 
[I can't imagine that Eclipse won't let you add a newline.] --I searched a little bit but could not find anything relevant. However, I found the following reference https://stackoverflow.com/questions/31808237/formatting-if-else-in-eclipse Putting parenthesis is not an option. Let me know if you find anything relevant to this. > --ASCII art and other crossover classes are untouched for this release. What do you mean by "this release"? In this instance, it is easy to make the docs clearer by using a link rather than ASCII figures. If you want to argue that the latter should be kept, please start a new thread in order to collect other opinions. --I can start making the changes. Thanks & Regards --Avijit Basak On Sat, 30 Oct 2021 at 07:11, Gilles Sadowski wrote: > Le ven. 29 oct. 2021 à 17:00, Avijit Basak a > écrit : > > > > Hi All > > > > I have fixed most of the review comments. The changes have been > > committed to PR#199. > > > > (A) > > Please "rebase" on "master". > > Please "squash" intermediate commits: For a new feature, a single commit > > should exist (that corresponds to the JIRA report describing it). > > --Will be done once all changes are finalized and committed. > > What is the rationale for not doing it right now? > The PR should always be "rebased" on the latest "master". > > > > > (B) > > I'm confused by your defining "legacy" packages in new modules... > > What kind of comparisons are you considering? > > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?) > > regression testing; but please note that when your proposal is > > merged, it will imply that the "legacy" codes *must* be removed. > > [We don't want to keep duplicate functionality.] > > -- The new implementation has improved the quality of optimization over > the > > legacy model. > > "Improved" in what sense? > If you mean enhanced performance, such checks should be done > using JMH (producing data to be published on the web site). 
> > > I have added the legacy packages to demonstrate the same. > > Once we remove the genetics packages in the legacy module, the same will > be > > deleted from examples. > > I'm probably missing what exactly those "legacy" examples aim to > demonstrate... > In passing, what's the purpose of > Thread.sleep(5000) > (at line 55 in file "TSPOptimizerLegacy")? > > > > > (C) > > File > > > "commons-math-examples/examples-ga/src/main/resou
Re: [MATH][GENETICS] Review of PR #197
" vs "NullPointerException") are necessary, there could be a factory for creating the appropriate instance. However, for "null" checks, please use the JDK utilities[2]. --Moved to an internal package. Null checks have been modified too. > > * Class "ConvergenceListenerRegistry" > Shouldn't it be thread-safe? > -- Yes. We need this to be thread-safe for parallel multi-population > parallel genetic algorithms. --No change for the time being. (E) Unit tests * src/test 1. New tests should use Junit 5. 2. "Example usages" probably belong in the "examples-ga" module. --JUnit version is inherited from commons-math module. --I could not understand what is meant by "Example usages" here. Which component is being referred to here. (F) Code readability * Please write one argument per line. * Write one condition check per line. * Avoid comments with no added value (like "constructor" for a constructor). * Avoid "ASCII art" (see e.g. "OnePointCrossover"); a link[2] is often preferable. * Do no duplicate documentation (see e.g. "OnePointCrossover"). --I have formatted the method declaration to have one parameter in one line. --Most of the if conditions are having a single condition except very few pre existing ones. I could not see any way to format the if statement in eclipse like the suggestion. I cannot introduce any formatting rule which cannot be handled in eclipse as that will be very hard to manage. --ASCII art and other crossover classes are untouched for this release. (G) Some files contain "tab" characters (e.g. "pom.xml"). --Removed tab characters. Thanks & Regards --Avijit Basak On Thu, 21 Oct 2021 at 21:32, Gilles Sadowski wrote: > Le mer. 20 oct. 2021 à 08:47, Avijit Basak a > écrit : > > > > Hi > > > > Thanks for the review comments. I have started making the > changes. > > However, I have some queries regarding some of comments as noted below: > > Some (partial) answers below. > > > > > (B) > > I'm confused by your defining "legacy" packages in new modules... 
> > --This is kept for comparison purposes between the legacy and the new > > implementation of GA. > > What kind of comparisons are you considering? > It is fine to depend (with scope "test") on CM v3.6.1 to perhaps (?) > regression testing; but please note that when your proposal is > merged, it will imply that the "legacy" codes *must* be removed. > [We don't want to keep duplicate functionality.] > > > (D) General design > > Class "ConsoleLogger" > > -> We should not reinvent the wheel. We should consider whether logging > > is necessary, and in the affirmative, depend on the de facto standard: > > "slf4j". > > -- I don't see any use of a logging framework in the math library. > > There is a long history of not wanting any kind of dependency. > But this ship has sailed. > > > That is > > the reason I introduced ConsoleLogger. If we introduce a logging > framework > > we won't need this class at all. I think we should include the logger in > > the root(commons-math\) pom.xml file so that all modules should be able > to > > use this. > > Starting with the upcoming release, we can decide on a per-module > basis. Please make the case (in a new ML thread) for introducing > such a dependency in the GA module. > > > > > Class "Constants" > > -> Any data should be declared where its purpose is obvious. > > -- We can declare the constants where it belongs but this might introduce > > duplicate constants across different classes and hence reduce > reusability. > > The class does not mention where the data is used, nor why it is > necessary that it be "public". > By default, the leaner API (i.e. no unnecessary "public" components), > the better (even if sometimes that would entail duplicating "private" > data). > [TBD on a case-by-case basis.] > > > > > * Class "AbstractListChromosome" (and subclasses) > > Didn't we conclude that this was a very wasteful implementation of the > > "chromosome" concept? > > -- I have some concerns regarding this. 
I am not much aware of any > > discussion regarding this conclusion. > > Please search the ML archive; I seem to recall a detailed discussion > where Alex gave hints on how a binary chromosome should be > implemented. > > > Chromosomes are always conceptualized as collections of allele/genes. So > We > > need a collection of the *genotypes* anyway. Here List has been used as a > > collection. > > We need an abstraction for r
Re: [MATH][GENETICS] Review of PR #197
Hi Thanks for the review comments. I have started making the changes. However, I have some queries regarding some of comments as noted below: (B) I'm confused by your defining "legacy" packages in new modules... --This is kept for comparison purposes between the legacy and the new implementation of GA. (D) General design Class "ConsoleLogger" -> We should not reinvent the wheel. We should consider whether logging is necessary, and in the affirmative, depend on the de facto standard: "slf4j". -- I don't see any use of a logging framework in the math library. That is the reason I introduced ConsoleLogger. If we introduce a logging framework we won't need this class at all. I think we should include the logger in the root(commons-math\) pom.xml file so that all modules should be able to use this. Class "Constants" -> Any data should be declared where its purpose is obvious. -- We can declare the constants where it belongs but this might introduce duplicate constants across different classes and hence reduce reusability. * Class "AbstractListChromosome" (and subclasses) Didn't we conclude that this was a very wasteful implementation of the "chromosome" concept? -- I have some concerns regarding this. I am not much aware of any discussion regarding this conclusion. Chromosomes are always conceptualized as collections of allele/genes. So We need a collection of the *genotypes* anyway. Here List has been used as a collection. We need an abstraction for representing the collection of Genotype. All crossover and mutation operators are based on this abstraction. This enabled reuse of crossover and mutation operators for all chromosome types which extend the abstraction. I am not sure how to achieve this reusability without an abstraction. Any domain specific new chromosome implementation extending the AbstractListChromosome class can reuse all crossover and mutation operators. 
For our proposed improvement of BinaryChromosome we should be able to extend the AbstractChromosome (*not* AbstractListChromosome) for the new class and provide the dedicated crossover and mutation operators for the corresponding Genotype. Without an *explicit* abstraction, management of crossover and mutation operators would be difficult. Please share further thoughts regarding this. * Class "GeneticException" 1. Should not be public (or should be defined in an "internal" package"[1]). 2. If various types (that map to different JDK subclasses of RuntimeException, e.g. "IllegalArgumentException" vs "NullPointerException") are necessary, there could be a factory for creating the appropriate instance. However, for "null" checks, please use the JDK utilities[2]. -- As of now we are managing all exception types by single GeneticException class. So there is no factory. -- Using JDK utilities for NullPointer would repeat this code in all places. Is it fine? Objects.requireNonNull(object, Message.format(GeneticException.NULL_ARGUMENT, args)); * Class "ConvergenceListenerRegistry" Shouldn't it be thread-safe? -- Yes. We need this to be thread-safe for parallel multi-population parallel genetic algorithms. Thanks & Regards --Avijit Basak On Mon, 18 Oct 2021 at 23:13, Gilles Sadowski wrote: > Hello. > > Sorry for the delay in reviewing. > > Le lun. 18 oct. 2021 à 09:35, Avijit Basak a > écrit : > > > > Hi All > > > > I have created PR#197 as mentioned earlier. Kindly let me know if > > there is any concern or comments. > > I have created another *PR#199* consisting of the changes with > > adaptive probability generations. > > Please find below my first remarks. > > (A) > Please "rebase" on "master". > Please "squash" intermediate commits: For a new feature, a single commit > should exist (that corresponds to the JIRA report describing it). > > (B) > I'm confused by your defining "legacy" packages in new modules... 
> > (C) > File > "commons-math-examples/examples-ga/src/main/resources/spotbugs/spotbugs-exclude-filter.xml" > does not belong there. > > (D) General design > Class "ConsoleLogger" > -> We should not reinvent the wheel. We should consider whether logging > is necessary, and in the affirmative, depend on the de facto standard: > "slf4j". > > Class "Constants" > -> Any data should be declared where its purpose is obvious. > > * Class "RandomGenerator" > 1. Duplicates functionality (storage of thread-local instances) > readily available in "Commons RNG". > 2. (IMHO) Thread-local instances should not be used for "heavy" usage > (like in GA). > > * Class "ValidationUtils" > -> should not
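The two review points above about "null" checks and thread safety of the listener registry can be illustrated together. What follows is only a minimal sketch under stated assumptions, not the code of the PR: the names GenerationListenerSketch and ListenerRegistrySketch are hypothetical, and the registry is shown as a plain instance rather than a singleton. It relies only on JDK classes (Objects.requireNonNull and CopyOnWriteArrayList).

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical listener type; the interface in the PR may look different. */
interface GenerationListenerSketch {
    void onGeneration(int generation, double bestFitness);
}

/** Registry sketch that can be shared between threads without external locking. */
final class ListenerRegistrySketch {
    // CopyOnWriteArrayList lets publish() iterate over a stable snapshot
    // while other threads register listeners concurrently.
    private final List<GenerationListenerSketch> listeners = new CopyOnWriteArrayList<>();

    /** Registers a listener; a null argument is rejected with a NullPointerException. */
    void addListener(GenerationListenerSketch listener) {
        // JDK null check; the message text here is illustrative only.
        listeners.add(Objects.requireNonNull(listener, "listener must not be null"));
    }

    /** Notifies every registered listener about the current generation. */
    void publish(int generation, double bestFitness) {
        for (GenerationListenerSketch listener : listeners) {
            listener.onGeneration(generation, bestFitness);
        }
    }
}
```

A copy-on-write list suits this case if listeners are registered rarely and notified often; if registration were frequent, a different concurrent collection would probably be a better fit.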
Re: [MATH][GENETICS] Review of PR #197
Hi All I have created PR#197 as mentioned earlier. Kindly let me know if there are any concerns or comments. I have created another *PR#199* consisting of the changes for adaptive probability generation. Kindly review the same. The build has failed due to a SpotBugs issue, as shown below. [ERROR] Medium: Public static org.apache.commons.math4.ga.listener.ConvergenceListenerRegistry.getInstance() may expose internal representation by returning ConvergenceListenerRegistry.INSTANCE [org.apache.commons.math4.ga.listener.ConvergenceListenerRegistry] At ConvergenceListenerRegistry.java:[line 89] MS_EXPOSE_REP However, the same code in my previous PR (#197) built successfully. I am not sure how to resolve this issue. Any help will be appreciated. Thanks & Regards --Avijit Basak On Mon, 27 Sept 2021 at 18:21, Avijit Basak wrote: > Hi All > > I have created the *PR #197* consisting of changes for JIRA > MATH-1563, Task MATH-1618. This is the primary work to standardize the > design of GA module. The build has passed. I would like to request a review > of the PR. Once the primary design is standardized I can check in further > changes like introduction of adaptive model and data structure change for > Binary chromosomes. > Kindly let me know for any concerns or queries. > > Thanks & Regards > -- Avijit Basak > -- Avijit Basak
[MATH][GENETICS] Review of PR #197
Hi All I have created *PR #197*, consisting of changes for JIRA MATH-1563, task MATH-1618. This is the primary work to standardize the design of the GA module. The build has passed. I would like to request a review of the PR. Once the primary design is standardized I can check in further changes, such as the introduction of an adaptive model and the data structure change for binary chromosomes. Kindly let me know of any concerns or queries. Thanks & Regards -- Avijit Basak
Re: [MATH][GENETICS] Build Issue with PR #197
Hi All I have made all the changes. Thanks & Regards --Avijit Basak On Sat, 25 Sept 2021 at 17:14, Gilles Sadowski wrote: > Hello. > > Le sam. 25 sept. 2021 à 08:06, Avijit Basak a > écrit : > > > > Hi All > > > > I have created a PR (#197) to merge my changes for Genetic > Algorithm. Although the build has passed locally for my components > (commons-math4-genetics and example-genetics) the PR build failed with unit > test case errors for legacy(commons-math4-legacy) modules. > > Please note that maven modules should named "commons-math-Xxx"; > as Alex mentioned previously, you should remove the spurious "4" in > the "genetics" module (and I'd personally favour "ga" over "genetics", as > per a previous discussion). Thus: >commons-math-ga > with top-level package > org.apache.commons.math4.ga > > > I have not changed anything in the legacy module. So not sure what is > causing those issues. > > It is caused by a randomized test ("SimplexOptimizerTest", see the log[1]). > > > Could anyone kindly guide me how to pass the PR build in this scenario. > > You might try to resubmit the PR. > [Anyways the build will be performed after you commit the changes > indicated above.] > > Best regards, > Gilles > > > The local build logs are attached herewith for reference. > > > > Local build command: mvn clean verify apache-rat:check checkstyle:check > pmd:check spotbugs:check javadoc:javadoc > > PR Link: https://github.com/apache/commons-math/pull/197 > > > > Thanks & Regards > > -- Avijit Basak > > > > [1] https://app.travis-ci.com/github/apache/commons-math/builds/238471866 > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
[MATH][GENETICS] Build Issue with PR #197
Hi All I have created a PR (#197) to merge my changes for Genetic Algorithm. Although the build has passed locally for my components (commons-math4-genetics and example-genetics), the PR build failed with unit test errors in the legacy (commons-math4-legacy) module. I have not changed anything in the legacy module, so I am not sure what is causing those issues. Could anyone kindly guide me on how to pass the PR build in this scenario? The local build logs are attached herewith for reference. *Local build command:* mvn clean verify apache-rat:check checkstyle:check pmd:check spotbugs:check javadoc:javadoc *PR Link*: https://github.com/apache/commons-math/pull/197 Thanks & Regards -- Avijit Basak
Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library
Hi All I have created a pull request for task *MATH-1618* belonging to Jira *MATH-1563*. Kindly initiate the review. Please send a note if you have any questions or concerns. Development is in progress for the next task *MATH-1619*. *URL:* https://github.com/apache/commons-math/pull/197 Thanks & Regards --Avijit Basak On Sun, 15 Aug 2021 at 23:17, Gilles Sadowski wrote: > On Sun, 15 Aug 2021 at 15:48, Avijit Basak wrote: > > > > Hi > > > > As mentioned earlier I need to use descriptive statistics in > > *genetics* module as part of *math4* release. This will be required for > > checking convergence status, probability generation. This can also be > used > > for streaming current population conditions to interested listeners. > > Currently, we have a DescriptiveStatistics class as part of math4.legacy > > module. Is there any plan to develop a new statistics module like > neuralnet > > and genetics? > > Not exactly: Refactored statistics utilities should find a home in the > new "Commons Statistics" component.[1] > > > If not what is the way to proceed forward. Kindly guide me in > > this regard. > > There are several ways forward: > 1. You contribute to start work on a "commons-statistics-descriptive" > maven module in the component mentioned above. ["Commons Math" > can depend on that component's modules.] > 2. You make modifications to the GA functionality inside the current > "o.a.c.m.legacy.genetics" package. [I'd still advise that we define > interfaces to whatever functionality (like descriptive statistics) should > ultimately be implemented somewhere else.] > 3. You create a new "commons-math-ga" module that does not depend > on the "commons-math-legacy" module. [That would imply creating an > "internal" package (where you can copy anything you need) whose > contents will not be part of the official API (i.e. users must not rely on > it being stable across even minor releases).] 
> > > Regards, > Gilles > > [1] https://commons.apache.org/proper/commons-statistics
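The advice in option 2 above, to define an interface for descriptive statistics so that the implementation can later live elsewhere, could look roughly like the sketch below. It is an illustration under assumptions only: FitnessSummarySketch and SimpleFitnessSummary are hypothetical names, the set of statistics is a guess at what a convergence check might need, and the variance is the population variance (divide by n), which is a simplification chosen here and not necessarily what an existing statistics class computes.

```java
/** Hypothetical view of the statistics a GA might need from one generation. */
interface FitnessSummarySketch {
    double mean();
    double min();
    double max();
    double variance();
}

/** JDK-only implementation that could be swapped for a statistics library later. */
final class SimpleFitnessSummary implements FitnessSummarySketch {
    private final double mean;
    private final double min;
    private final double max;
    private final double variance;

    SimpleFitnessSummary(double[] fitness) {
        if (fitness == null || fitness.length == 0) {
            throw new IllegalArgumentException("fitness values must not be empty");
        }
        double sum = 0;
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        for (double f : fitness) {
            sum += f;
            lo = Math.min(lo, f);
            hi = Math.max(hi, f);
        }
        final double m = sum / fitness.length;
        double squares = 0;
        for (double f : fitness) {
            squares += (f - m) * (f - m);
        }
        this.mean = m;
        this.min = lo;
        this.max = hi;
        // Population variance (divide by n); an assumption made for this sketch.
        this.variance = squares / fitness.length;
    }

    @Override public double mean() { return mean; }
    @Override public double min() { return min; }
    @Override public double max() { return max; }
    @Override public double variance() { return variance; }
}
```

With such an interface, the GA code would depend only on FitnessSummarySketch, and the implementation could be replaced once a suitable statistics module is available.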
Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library
Hi As mentioned earlier I need to use descriptive statistics in *genetics* module as part of *math4* release. This will be required for checking convergence status, probability generation. This can also be used for streaming current population conditions to interested listeners. Currently, we have a DescriptiveStatistics class as part of math4.legacy module. Is there any plan to develop a new statistics module like neuralnet and genetics? If not what is the way to proceed forward. Kindly guide me in this regard. Thanks & Regards --Avijit Basak On Sun, 8 Aug 2021 at 19:14, Gilles Sadowski wrote: > Hello. > > Le dim. 8 août 2021 à 07:22, Avijit Basak a > écrit : > > > > Hi All > > > > I have started to work in genetic module. > > Great! > > > I want to push the new > > module as part of a new feature branch "*feature/MATH-1563*". Changes > > include mostly the existing code and modfication due to the new Exception > > class. I have encountered the following error which indicates my Github > Id " > > *avijitbasak*" is not permitted to check-in code in the repository. > > Indeed, not all GitHub users are allowed to modify an ASF's project > repository. ;-) > > > Could > > anyone kindly grant me access to the repository. Let me know if I need to > > do anything else regarding this. > > Only ASF committers[1] are given write access. > For a contributor who is not (yet) a committer, the (nowadays[2]) usual way > to suggest changes is through GitHub pull requests (i.e. you have to > "clone" > the repository into your projects' GH space and modify there). > > Regards, > Gilles > > [1] https://www.apache.org/foundation/how-it-works.html#roles > [2] The alternative is uploading patches to the issue-tracking system: > https://commons.apache.org/patches.html > > > [...] > > --------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library
Hi All I have started to work on the genetics module. I want to push the new module as part of a new feature branch "feature/MATH-1563". Changes include mostly the existing code and modifications due to the new Exception class. I have encountered the following error, which indicates that my GitHub ID "avijitbasak" is not permitted to check in code to the repository. Could anyone kindly grant me access to the repository? Let me know if I need to do anything else regarding this. git.exe push --progress "origin" feature/MATH-1563 remote: Permission to apache/commons-math.git denied to avijitbasak. fatal: unable to access 'https://github.com/apache/commons-math.git/': The requested URL returned error: 403 Thanks & Regards --Avijit Basak On Sun, 8 Aug 2021 at 10:49, Avijit Basak wrote: > Hi > > I have created two new subtasks for Jira *MATH-1563* to explain > the requirement of changes and a new JIRA MATH-1618 > <https://issues.apache.org/jira/browse/MATH-1618>. > Let me know if that helps. We can continue the discussion here > in case of any queries. > > Thanks & Regards > --Avijit Basak > > On Wed, 28 Jul 2021 at 23:22, Gilles Sadowski > wrote: > >> Hello. >> >> On Wed, 28 Jul 2021 at 10:23, Avijit Basak wrote: >> > >> > Hi >> > >> > I shall try to describe my proposed changes with proper context >> in >> > my next communication. Regarding the stats, I need a library that can be >> > used for any statistical calculation needed. >> >> Are the calculations needed for the GA to work (e.g. as part of a stopping >> criterion), or are they only meant to inform the user (e.g. for computing >> current average fitness and the like)? >> >> In the latter case, (IIUC) I don't think that we need to introduce such a >> dependency: Couldn't "out-of-band" functionality be defined through a >> plugin infrastructure? 
>> >> > I don't want to use the one >> > from math3 legacy component as that will include all other legacy >> > components too. >> >> If you intend to improve the "genetics" package within the current >> "commons-math-legacy" module, you can use all the code in there, >> (including the "o.a.c.math4.stat" package, although that will make it >> more difficult to create a new module free of those dependencies. >> >> Please clarify what goal you are pursuing. >> >> > If the commons-statistics component is an isolated one then >> > that can be re-used once released. >> >> I don't understand what you mean. >> >> > It will be nice to have a library for plotting graph. Earlier I >> > used jFreeChart (Lesser GNU Public License), which works fine for this >> kind >> > of requirement. Any suggestion regarding this? >> >> If you suggest that a Commons component should depend on >> a plotting library, it's likely "no go". >> Would a GA implementation need this? >> Again, if the purpose is to follow progress of the computation, we >> should define appropriate interfaces to allow data collection in >> real time. How those are processed (e.g. plotting statistics of the >> current population) is probably out-of-scope. >> >> Regards, >> Gilles >> >> > >> > Thanks & Regards >> > --Avijit Basak >> > >> > >> > On Tue, 27 Jul 2021 at 19:33, Gilles Sadowski >> wrote: >> > >> > > Hello. >> > > >> > > Le mar. 27 juil. 2021 à 09:15, Avijit Basak >> a >> > > écrit : >> > > > >> > > > Hi All >> > > > >> > > > Please find the proposed changes for the Genetic Algorithm >> > > library in commons.maths. >> > > > Changes in Model: >> > > > 1) GeneticAlgorithm class is broken into a hierarchy to accommodate >> > > commons implementation in an Abstract class AbstractGeneticAlgorithm. >> New >> > > AdaptiveGeneticAlgorithm class has also been introduced. >> > > > 2) Introduced Elitism interface which is implemented by >> > > ElitisticListPopulation. >> > > > 3) Interface Fitness has been removed. 
>> > > > 4) Interface FitnessCalculator has been introduced. >> > > > 5) Chromosome has been updated with FitnessCalculator interface and >> > > accessor. >> > > > 6) Operations in AbstractChromosome has be
Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library
Hi I have created two new subtasks for Jira *MATH-1563* to explain the requirement of changes and a new JIRA MATH-1618 <https://issues.apache.org/jira/browse/MATH-1618>. Let me know if that helps. We can continue the discussion here in case of any queries. Thanks & Regards --Avijit Basak On Wed, 28 Jul 2021 at 23:22, Gilles Sadowski wrote: > Hello. > > Le mer. 28 juil. 2021 à 10:23, Avijit Basak a > écrit : > > > > Hi > > > > I shall try to describe my proposed changes with proper context > in > > my next communication. Regarding the stats, I need a library that can be > > used for any statistical calculation needed. > > Are the calculations needed for the GA to work (e.g. as part of a stopping > criterion), or are they only meant to inform the user (e.g. for computing > current average fitness and the like)? > > In the latter case, (IIUC) I don't think that we need to introduce such a > dependency: Couldn't "out-of-band" functionality be defined through a > plugin infrastructure? > > > I don't want to use the one > > from math3 legacy component as that will include all other legacy > > components too. > > If you intend to improve the "genetics" package within the current > "commons-math-legacy" module, you can use all the code in there, > (including the "o.a.c.math4.stat" package, although that will make it > more difficult to create a new module free of those dependencies. > > Please clarify what goal you are pursuing. > > > If the commons-statistics component is an isolated one then > > that can be re-used once released. > > I don't understand what you mean. > > > It will be nice to have a library for plotting graph. Earlier I > > used jFreeChart (Lesser GNU Public License), which works fine for this > kind > > of requirement. Any suggestion regarding this? > > If you suggest that a Commons component should depend on > a plotting library, it's likely "no go". > Would a GA implementation need this? 
> Again, if the purpose is to follow progress of the computation, we > should define appropriate interfaces to allow data collection in > real time. How those are processed (e.g. plotting statistics of the > current population) is probably out-of-scope. > > Regards, > Gilles > > > > > Thanks & Regards > > --Avijit Basak > > > > > > On Tue, 27 Jul 2021 at 19:33, Gilles Sadowski > wrote: > > > > > Hello. > > > > > > Le mar. 27 juil. 2021 à 09:15, Avijit Basak a > > > écrit : > > > > > > > > Hi All > > > > > > > > Please find the proposed changes for the Genetic Algorithm > > > library in commons.maths. > > > > Changes in Model: > > > > 1) GeneticAlgorithm class is broken into a hierarchy to accommodate > > > commons implementation in an Abstract class AbstractGeneticAlgorithm. > New > > > AdaptiveGeneticAlgorithm class has also been introduced. > > > > 2) Introduced Elitism interface which is implemented by > > > ElitisticListPopulation. > > > > 3) Interface Fitness has been removed. > > > > 4) Interface FitnessCalculator has been introduced. > > > > 5) Chromosome has been updated with FitnessCalculator interface and > > > accessor. > > > > 6) Operations in AbstractChromosome has been updated with > > > FitnessCalculator as interface. > > > > 7) New BinaryChromosome class has been added. > > > > 8) Interface PermutationChromosome has been replaced by > > > IndirectlyEncodedChromosome as the interface primarily represents > > > chromosomes with indirect encoding. A more appropriate name can be > > > suggested. > > > > 9) RandomKey class operations have been updated with > FitnessCalculator. > > > > 10) I would like to include a new class PermutationChromosome as we > have > > > corresponding crossover operators like OrderedCrossover. > > > > 11) crossover method in CrossoverPolicy interface has been updated > with > > > additional argument probability to support dynamic probability > generation. > > > This would impact all implementation classes. 
> > > > 12) mutate method in MutationPolicy has been added another argument > > > probability to support dynamic probability generation. This would > impact > > > all implementation classes. > > > > 13) Two new evolution StoppingCondition has been added > > > UnchangedAvgFitness and UnchangedBestFitness.
Re: [MATH][DESIGN] Design Discussion for Genetic Algorithm Library
Hi I shall try to describe my proposed changes with proper context in my next communication. Regarding the stats, I need a library that can be used for any statistical calculation needed. I don't want to use the one from math3 legacy component as that will include all other legacy components too. If the commons-statistics component is an isolated one then that can be re-used once released. It will be nice to have a library for plotting graph. Earlier I used jFreeChart (Lesser GNU Public License), which works fine for this kind of requirement. Any suggestion regarding this? Thanks & Regards --Avijit Basak On Tue, 27 Jul 2021 at 19:33, Gilles Sadowski wrote: > Hello. > > Le mar. 27 juil. 2021 à 09:15, Avijit Basak a > écrit : > > > > Hi All > > > > Please find the proposed changes for the Genetic Algorithm > library in commons.maths. > > Changes in Model: > > 1) GeneticAlgorithm class is broken into a hierarchy to accommodate > commons implementation in an Abstract class AbstractGeneticAlgorithm. New > AdaptiveGeneticAlgorithm class has also been introduced. > > 2) Introduced Elitism interface which is implemented by > ElitisticListPopulation. > > 3) Interface Fitness has been removed. > > 4) Interface FitnessCalculator has been introduced. > > 5) Chromosome has been updated with FitnessCalculator interface and > accessor. > > 6) Operations in AbstractChromosome has been updated with > FitnessCalculator as interface. > > 7) New BinaryChromosome class has been added. > > 8) Interface PermutationChromosome has been replaced by > IndirectlyEncodedChromosome as the interface primarily represents > chromosomes with indirect encoding. A more appropriate name can be > suggested. > > 9) RandomKey class operations have been updated with FitnessCalculator. > > 10) I would like to include a new class PermutationChromosome as we have > corresponding crossover operators like OrderedCrossover. 
> > 11) crossover method in CrossoverPolicy interface has been updated with > additional argument probability to support dynamic probability generation. > This would impact all implementation classes. > > 12) mutate method in MutationPolicy has been added another argument > probability to support dynamic probability generation. This would impact > all implementation classes. > > 13) Two new evolution StoppingCondition has been added > UnchangedAvgFitness and UnchangedBestFitness. > > 14) An interface ProbabilityGenerator has been introduced with few > selective implementations to be used by AdaptiveGeneticAlgorithm class. The > signature of the probability generation method has been kept generic to > keep strategies interchangeable. > > I'd have a hard time commenting as we mostly miss the context: AFAIK, > nobody here has ever used CM's GA implementation and nobody knows > how its design structure should be changed in order to improve its > * usability, > * performance, > * robustness, > * extensibility, or > * maintenance; > hence the listing of changes is not very useful without some hint as to why > things are to be modified, removed or added (e.g. pointing to shortcomings, > missing features, performance bottlenecks, and so on; and create a JIRA > report for each of them). > Actually, I understand that it might be a tedious task, and probably not > worth > the modest feedback which you may expect in return. So the best course of > action is perhaps to implement the new design as you see fit, and then show > (through applications in "examples" module) how it solves selected > problems. > > Doing so, you could keep us informed of your progress through commenting > in the appropriate JIRA report(s) and a link to an up-to-date PR. > > > I have few more queries related to repository structure. > > 1) Do we need to keep package name as math4 and not math. Using a > version-independent name would ease version migration for developers for > future releases. 
> > Commons has a strict policy of backwards compatibility of minor releases. > Changing the top-level package's name is done in every major release in > order to avoid JAR hell. > > > 2) Can we have the stat module out of legacy component. > > Are you on to fix all the reported issues? > > > This can be useful to calculate population statistics if required. > > You are certainly welcome to refactor the parts of the "o.a.c.m.stat" > package which would be of interest for that purpose. > Please note that redesign statistical functionalities should be ported > to the "Commons Statistics" component.[1] > > Regards, > Gilles > > [1] https://commons.apache.org/proper/commons-statistics/ > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
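Item 13 of the quoted proposal adds the stopping conditions UnchangedAvgFitness and UnchangedBestFitness without describing them further. One plausible reading, offered here only as a sketch and not as the proposed implementation, is "stop once the best fitness has stayed the same for a given number of consecutive generations". The class name, the constructor parameter and the isSatisfied(double) signature below are all hypothetical; an actual stopping condition would more likely receive the whole population.

```java
/** Hypothetical stopping condition: satisfied once the best fitness stops changing. */
final class UnchangedBestFitnessSketch {
    private final int maxUnchangedGenerations;
    private double lastBest = Double.NaN;
    private int unchanged;

    UnchangedBestFitnessSketch(int maxUnchangedGenerations) {
        if (maxUnchangedGenerations < 1) {
            throw new IllegalArgumentException("maxUnchangedGenerations must be positive");
        }
        this.maxUnchangedGenerations = maxUnchangedGenerations;
    }

    /**
     * Call once per generation with the best fitness of the current population.
     * Returns true when the value has repeated for the configured number of
     * consecutive generations after it was first observed.
     */
    boolean isSatisfied(double bestFitness) {
        if (Double.compare(bestFitness, lastBest) == 0) {
            unchanged++;
        } else {
            // Any change (improvement or not) restarts the count.
            lastBest = bestFitness;
            unchanged = 0;
        }
        return unchanged >= maxUnchangedGenerations;
    }
}
```

An average-fitness variant would follow the same pattern with the population mean passed in; in practice a small tolerance might be preferable to exact equality of floating-point values.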
[MATH][DESIGN] Design Discussion for Genetic Algorithm Library
Hi All Please find the proposed changes for the Genetic Algorithm library in Commons Math. Changes in Model: 1) The GeneticAlgorithm class is broken into a hierarchy, with the common implementation placed in an abstract class AbstractGeneticAlgorithm. A new AdaptiveGeneticAlgorithm class has also been introduced. 2) Introduced an Elitism interface, which is implemented by ElitisticListPopulation. 3) Interface Fitness has been removed. 4) Interface FitnessCalculator has been introduced. 5) Chromosome has been updated with a FitnessCalculator field and accessor. 6) Operations in AbstractChromosome have been updated to use the FitnessCalculator interface. 7) A new BinaryChromosome class has been added. 8) Interface PermutationChromosome has been replaced by IndirectlyEncodedChromosome, as the interface primarily represents chromosomes with indirect encoding. A more appropriate name can be suggested. 9) RandomKey class operations have been updated with FitnessCalculator. 10) I would like to include a new class PermutationChromosome, as we have corresponding crossover operators like OrderedCrossover. 11) The crossover method in the CrossoverPolicy interface has been given an additional argument, probability, to support dynamic probability generation. This would impact all implementation classes. 12) The mutate method in MutationPolicy has been given an additional argument, probability, to support dynamic probability generation. This would impact all implementation classes. 13) Two new evolution StoppingCondition implementations have been added: UnchangedAvgFitness and UnchangedBestFitness. 14) An interface ProbabilityGenerator has been introduced, with a few selected implementations to be used by the AdaptiveGeneticAlgorithm class. The signature of the probability generation method has been kept generic to keep strategies interchangeable. I have a few more queries related to repository structure. 1) Do we need to keep the package name as *math4* rather than *math*? 
Using a version-independent name would ease version migration for developers in future releases. 2) Can we move the stat module out of the legacy component? This can be useful to calculate population statistics if required. Kindly share your thoughts. Thanks & Regards --Avijit Basak
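Items 11, 12 and 14 of the proposal describe passing a probability into the crossover and mutation operators and generating that probability dynamically. The message does not say which generation schemes were implemented, so the following is purely an illustrative sketch under assumptions: the interface and class names are hypothetical, higher fitness is assumed to be better, and the linear rule (fitter chromosomes receive a lower operator probability) is just one common adaptive scheme, not necessarily one used in the PR.

```java
/** Hypothetical strategy that supplies an operator probability for one chromosome. */
interface ProbabilityGeneratorSketch {
    double generate(double fitness, double minFitness, double maxFitness);
}

/**
 * Linear scheme (an assumption for this sketch): the least fit chromosome
 * gets maxProbability, the fittest gets minProbability.
 */
final class LinearProbabilityGeneratorSketch implements ProbabilityGeneratorSketch {
    private final double minProbability;
    private final double maxProbability;

    LinearProbabilityGeneratorSketch(double minProbability, double maxProbability) {
        if (minProbability < 0 || maxProbability > 1 || minProbability > maxProbability) {
            throw new IllegalArgumentException("need 0 <= min <= max <= 1");
        }
        this.minProbability = minProbability;
        this.maxProbability = maxProbability;
    }

    @Override
    public double generate(double fitness, double minFitness, double maxFitness) {
        if (maxFitness == minFitness) {
            // Degenerate population: every chromosome is treated alike.
            return maxProbability;
        }
        // Relative position of this chromosome between the worst (0) and best (1).
        final double rank = (fitness - minFitness) / (maxFitness - minFitness);
        return maxProbability - rank * (maxProbability - minProbability);
    }
}
```

An operator with the extra argument would then be called along the lines of mutate(chromosome, generator.generate(fitness, min, max)), where mutate stands in for whatever signature the policy interfaces finally adopt.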
Re: The case for a Commons component
Hi All This has been a long mail thread. It would be really helpful if someone could summarize the decisions. Is the proposal to develop the new machine learning component approved? If a team repository is not provided, is there any way to go ahead? Waiting for a response. Thanks & Regards --Avijit Basak On Fri, 7 May 2021 at 02:26, sebb wrote: > On Thu, 6 May 2021 at 21:13, Gary Gregory wrote: > > > > It is true that there is much less friction these days to get a repository > > going with GitHub, GitLab, and BitBucket, but, for now, the Commons > Sandbox > > is still available. If we want to do away with the sandbox, then let's > > talk about that separately. > > > > There is no need for a Sandbox component to use SVN, and it's easy to > create a new Commons git repo. > > A non-ASF code repo would require code to be checked for license > compliance etc before it could become a Commons component. > A Sandbox component does not require that. > > > Gary > > > > On Thu, May 6, 2021, 11:26 Ralph Goers > wrote: > > > > > > > > > > > > On May 6, 2021, at 8:06 AM, Gary Gregory > wrote: > > > > > > > > What about the Commons Sandbox? Would that be a good place to start? > > > > > > > > > > Emmanuel just sort of proposed doing away with it. As he put it, anyone > > > can create a > > > GitHub repo so why does it need to be under the apache user. He hasn’t > > > formally > > > made a proposal for that and I’m not sure how I would vote on it if he > > > did. He does > > > have a point. At the same time I’m not sure I’d close off doing > > > experimental or > > > early development within the ASF space. > > > > > > Ralph
Re: [Vote] Create repository for "machine learning" algorithms.
Hi Gilles Thanks for voting on behalf of me. Regards --Avijit Basak On Mon, 3 May 2021 at 18:14, Gilles Sadowski wrote: > Recording a vote in the proper thread on behalf of Avijit Basak (who > inadequately posted his vote in two other threads). > > Le mer. 21 avr. 2021 à 19:05, Gilles Sadowski a > écrit : > > > > [...] > > > > Name of component: "Commons Machine Learning" > > Name of "git" repository: "commons-machinelearning" > > Top-level package name: "org.apache.commons.machinelearning" > > > > [...] > > > > > > Please vote: > [X] Yes. > > [ ] No, because ... > > Gilles (on behalf on Avijit Basak) > > ----- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: The case for a Commons component
Hi I would like to vote for *commons-ml*. Thanks & Regards --Avijit Basak On Mon, 3 May 2021 at 04:29, Gilles Sadowski wrote: > Hi. > > > [... Discussion about GA data-structures...] > > I'd suggest that we finalize the [Vote] before getting into the > details... > > Currently, there have been votes by: > Emmanuel Bourg (-1) > Sebastian Bazley (-0) > Ralph Goers (+0) > Paul King (+1) > > So currently, the discussion should be focused on settling to the > issues put forward by the opponents to having this new component: > * Problem 1: Functionality should go somewhere else (Emmanuel, Sebb) > * Problem 2: Who will contribute? (Ralph) > > Partial answers have been given. > We need more opinions (and votes). > > Regards, > Gilles > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: The case for a Commons component
Hi >> Note: You cannot easily just use java.util.BitSet as you wish to have access to the underlying long[] to store the chromosome to enable efficient crossover. -- Thanks for pointing this out. However, I considered a few constraints while doing the implementation. 1) I extended the existing class AbstractListChromosome, which requires a generic type. This is the reason for using a list of Long. However, I can extend Chromosome and use an array of primitive long. BitSet also uses a similar data structure. 2) One problem of BitSet is the use of the MSB to retain bits. As a result, we won't be able to use the static utility methods of the wrapper class (Long) for conversion between primitive type and string. We will have to write custom code for conversion between string and integral types. This is the only reason I have used a BLOCKSIZE of 63 instead of 64. >> // This is not actually required... // int bit = cross & 63; // i.e. cross % 64 -- Do you mean the bit index does not need to be calculated? How can we handle crossover indexes which are not multiples of 64? >> Do you think that allele sets other than binary would be useful to implement? [IIUC your document above, it seems not (?).] -- The document only describes the data structure related to the binary genotype. We already have an implementation of the RandomKey genotype in Commons. We can think of adding other genotypes gradually. Thanks & Regards --Avijit Basak On Sat, 1 May 2021 at 22:18, Gilles Sadowski wrote: > On Fri, 30 Apr 2021 at 17:40, Avijit Basak wrote: > > > > Hi > > > > >>lot of spurious references to "Commons Numbers" > > --I have only created the basic project structure. Changes > > need to be made. Can anyone from the existing commons team help in doing > > this. > > Well, you should "search and replace": > "Numbers" -> "Machine Learning" > commons-numbers -> commons-machinelearning > > Other things (repository URL, JIRA project name and URL) require that > a component be created (vote is pending). 
> [As long as those files are not part of a PR, it is not urgent to fix > them.] > > > >> For sure, populate it with the code extracted from CM's > > "genetics" > > package and proceed with the enhancements. > > At first, I'd suggest to refactor the layout of the package (i.e. create > > a "subpackage" for each component of a genetic algorithm). > > -- I am working on it. > > Great! > > > Did not commit the code till now. > > OK. When you do, please ask for review on the "dev" ML. > > > >> Then some examination of the data-structures is required (a > > binary chromosome is currently stored as a "List"). > > -- I have recently done some work on this. Could you please > > check this article and share your thought. > > "*https://arxiv.org/abs/2103.04751 > > <https://arxiv.org/abs/2103.04751>*" > > Alex already provided a thorough response. > It's a pity that JDK's BitSet is missing a few methods (e.g. "append") > for a readily usable implementation of a "binary chromosome". > > Do you think that allele sets other than binary would be useful to > implement? [IIUC your document above, it seems not (?).] > > > Are we thinking to use Spark for our parallelism > > No, if the code is to reside in Commons. > > > or a simple > > multi-threading of Java. > > Yes, we'd depend only on JDK classes. > > > I would prefer to use java multi-threading and > > avoid any other framework. > > In java we don't have any library which can be used for AI/ML > > programming with a very minimal learning curve. Can we think of > fulfilling > > this need? > > That would be nice. Don't hesitate to enlist fellow programmers. :-) > > Regards, > Gilles > > > This will be helpful for many java developers to venture into > > AI/ML without learning a new language like Python. > > > > > >>> [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
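The question raised above, how to handle a crossover index that is not a multiple of 64 when the chromosome is packed into a long[], can be answered with a bit mask on the single word that contains the crossover point. The sketch below is an illustration under assumptions and not the code discussed in the thread: PackedCrossoverSketch is a hypothetical name, bit i of the chromosome is assumed to be stored in word i / 64 at position i % 64 (least significant bit first), and all 64 bits of each word are used, unlike the 63-bit blocks mentioned in the message.

```java
/** One-point crossover on bit-packed chromosomes; layout assumptions are stated above. */
final class PackedCrossoverSketch {
    private PackedCrossoverSketch() { }

    /**
     * Returns two children: the first takes bits [0, cross) from p1 and the rest
     * from p2; the second takes bits [0, cross) from p2 and the rest from p1.
     */
    static long[][] onePoint(long[] p1, long[] p2, int cross) {
        if (p1.length != p2.length) {
            throw new IllegalArgumentException("parents must have the same length");
        }
        if (cross < 0 || cross > p1.length * 64) {
            throw new IllegalArgumentException("crossover index out of range");
        }
        final long[] c1 = p1.clone();
        final long[] c2 = p2.clone();
        final int word = cross >>> 6; // index of the word holding the crossover point
        final int bit = cross & 63;   // offset inside that word, i.e. cross % 64
        if (word < p1.length) {
            // Mask with ones at positions [bit, 64): these bits come from the other parent.
            final long mask = -1L << bit;
            c1[word] = (p1[word] & ~mask) | (p2[word] & mask);
            c2[word] = (p2[word] & ~mask) | (p1[word] & mask);
            // Every whole word after the boundary word is swapped.
            final int from = word + 1;
            System.arraycopy(p2, from, c1, from, p1.length - from);
            System.arraycopy(p1, from, c2, from, p1.length - from);
        }
        return new long[][] {c1, c2};
    }
}
```

Only the boundary word needs bit-level work; all other words are copied whole, which is the efficiency argument for packing bits into primitive longs rather than storing one list element per allele.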
Re: [Vote] Create a "machine learning" component
Hi

I would like to vote for *commons-ml*.

Thanks & Regards
--Avijit Basak

On Sat, 24 Apr 2021 at 08:12, Paul King wrote: > I added some more comments relevant to if the proposed algorithm > belongs somewhere in the commons "math" area back in the Jira: > > https://issues.apache.org/jira/browse/MATH-1563 > > Cheers, Paul. > > On Wed, Apr 21, 2021 at 7:26 PM Gilles Sadowski > wrote: > > > > On Wed, 21 Apr 2021 at 08:56, Paul King wrote: > > > > > > On Wed, Apr 21, 2021 at 4:12 PM Ralph Goers < > ralph.go...@dslextreme.com> wrote: > > > > > > > > Why are y'all having a long discussion on Vote thread? > > > > Paul King's comments are interesting information that could > > bear on people's decision on the proposal (especially the > > licence's issue). > > As for the question of whether the purported functionality would > > find a better home elsewhere with the ASF, I'm not sure what would > > be the conclusion (apart from Avijit Basak's plain preference (?) to > > develop a standalone component, as per Commons' requirement). > > > > > > > > Fair enough. I am +1 (non-binding). > > > > So currently, IIRC the tally (on creating a dedicated component) is > > Gilles Sadowski +1 > > Avijit Basak +1 > > Paul King +1 > > And several -1 on the initially suggested name; but the proposed > > name has been changed early on to "commons-machinelearning" > > (in order to comply with Commons' tradition of full words and > > descriptive names). > > [Please correct if it doesn't reflect what has been expressed.] > > > > Where does that lead us? > > > > Regards, > > Gilles > > > > >>> [...] > > -- Avijit Basak
Re: The case for a Commons component
Hi

>> lot of spurious references to "Commons Numbers"
-- I have only created the basic project structure. Changes need to be made. Can anyone from the existing Commons team help with this?

>> For sure, populate it with the code extracted from CM's "genetics" package and proceed with the enhancements. At first, I'd suggest to refactor the layout of the package (i.e. create a "subpackage" for each component of a genetic algorithm).
-- I am working on it. I have not committed the code yet.

>> Then some examination of the data-structures is required (a binary chromosome is currently stored as a "List").
-- I have recently done some work on this. Could you please check this article and share your thoughts: https://arxiv.org/abs/2103.04751

Are we thinking of using Spark for our parallelism, or plain Java multi-threading? I would prefer to use Java multi-threading and avoid any other framework.

In Java we don't have any library which can be used for AI/ML programming with a very minimal learning curve. Can we think of fulfilling this need? This would help many Java developers to venture into AI/ML without learning a new language like Python.

Thanks & Regards
--Avijit Basak

On Wed, 28 Apr 2021 at 18:48, Gilles Sadowski wrote: > On Mon, 26 Apr 2021 at 16:18, Avijit Basak wrote: > > > > Hi > > > > As per previous discussions, I have created a temporary > repository > > in GitHub under my personal GitHub Id(avijitbasak). The artifacts have > been > > copied from commons-numbers. A preliminary structure has been created for > > the proposed component. > > Please let me know if we want to proceed with this format. > > There is no source code (and a lot of spurious references to > "Commons Numbers"). > For sure, populate it with the code extracted from CM's "genetics" > package and proceed with the enhancements. > At first, I'd suggest to refactor the layout of the package (i.e. 
create > a "subpackage" for each component of a genetic algorithm). > Then some examination of the data-structures is required (a binary > chromosome is currently stored as a "List"). > Shouldn't the whole design be revised (based on interfaces and > streams)? > > > We can copy the > > same to any other team repository if required. > > That would be a repository on an ASF server, once the pending vote > process is completed. [By the way: You didn't vote...] > > Regards, > Gilles > > >> [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: The case for a Commons component
Hi As per previous discussions, I have created a temporary repository in GitHub under my personal GitHub Id(avijitbasak). The artifacts have been copied from commons-numbers. A preliminary structure has been created for the proposed component. Please let me know if we want to proceed with this format. We can copy the same to any other team repository if required. Repo URL: https://github.com/avijitbasak/commons-machinelearning Thanks & Regards --Avijit Basak On Mon, 26 Apr 2021 at 04:49, Paul King wrote: > On Mon, Apr 26, 2021 at 12:27 AM sebb wrote: > > > > I assume this thread is about the possible ML component. > > > > If the code was developed by Commons, I assume it could be used as > > part of Spark. > > However Commons does not currently have many developers who are > > familiar with the field. > > So it would seem to me better to have development done by a project > > which does have relevant experience. > > > > You say that Spark etc have lots of jars. > > Surely that allows for it to be implemented as a separate jar which > > can either be used as part of the Spark platform, or used > > independently? > > The stats I gave were for the current minimal use of those algorithms. > Most algorithms are written in Scala, use RDD "dataframes" rather than > say double arrays, and assume you're running on "the platform" which > handles how you might get your data and return results and do logging > etc. in a potentially concurrent world. Some of those design choices > are key to scaling up but don't align with the goal of making the > algorithms runnable "independently". > > > The only other option I see is for Commons to persuade some developers > > who are familiar with the field to join Commons to assist with the > > algorithms. > > I agree that is the crux of the issue here. The "commons doesn't have > the bandwidth to absorb another algorithm" part of the discussion > seems perfectly legit to me. 
The "and there is an obvious home > elsewhere" part of the discussion seemed a little more dubious to me, > though obviously that is something which should be considered. > > > Existing Commons developers can help manage the logistics of packaging > > and releasing the code, as this does not require in depth knowledge of > > the design. > > However this only makes sense if the developers skilled in the area are > > prepared to assist long-term. > > > > > > On Sat, 24 Apr 2021 at 23:32, Paul King > wrote: > > > > > > Thanks Gilles, > > > > > > I can provide the same sort of stats across a clustering example > > > across commons-math (KMeans) vs Apache Ignite, Apache Spark and > > > Rheem/Apache Wayang (incubating) if anyone would find that useful. It > > > would no doubt lead to similar conclusions. > > > > > > Cheers, Paul. > > > > > > On Sun, Apr 25, 2021 at 8:15 AM Gilles Sadowski > wrote: > > > > > > > > Hello Paul. > > > > > > > > On Sat, 24 Apr 2021 at 04:42, Paul King wrote: > > > > > > > > > > I added some more comments relevant to if the proposed algorithm > > > > > belongs somewhere in the commons "math" area back in the Jira: > > > > > > > > > > https://issues.apache.org/jira/browse/MATH-1563 > > > > > > > > Thanks for a "real" user's testimony. > > > > > > > > As the ML is still the official forum for such a discussion, I'm > quoting > > > > part of your post on JIRA: > > > > ---CUT--- > > > > For linear regression, taking just one example dataset, commons-math > > > > is a couple of library calls for a single 2M library and solves the > > > > problem in 240ms. Both Ignite and Spark involve "firing up the > > > > platform" and the code is more complex for simple scenarios. Spark > has > > > > a 181M footprint across 210 jars and solves the problem in about 20s. > > > > Ignite has a 87M footprint across 85 jars and solves the problem in > > > > 
But I can also find more complex scenarios which need to scale > > > > where Ignite and Spark really come into their own. > > > > ---CUT--- > > > > > > > > A similar rationale was behind my developing/using the SOFM > > > > functionality in the "o.a.c.m.ml.neuralnet" package: I needed a > > > > proof of concept, and taking the "lightweight" pa
Re: [Vote] Create a "machine learning" component
Hi

> Did you ask "Spark" people about their opinion about it?
-- Not yet. I am not sure what the right channel for this communication would be. It would be good if you could approach them.

> where it can be used in real-life (performance-wise) applications, then you should demonstrate it
-- Do we have any kind of performance benchmark or use case for this? Once that is decided, I can proceed.

Thanks & Regards
--Avijit Basak

On Mon, 19 Apr 2021 at 18:51, Gilles Sadowski wrote: > Hello. > > On Mon, 19 Apr 2021 at 08:35, Avijit Basak wrote: > > > > Hi > > > > >Isn't a GA inherently parallel? > > >If so, why not take advantage of the concurrency tools provided by the > JDK? > > -- Are we planning to implement multi-threading for GA operations even > as > > part of a single population > > This seems an obvious improvement to our current implementation > (in case a chromosome's evaluation is not population-dependent). > > > or only for multi-population parallel GA. > > -- We can implement different types of co-evolution as part of parallel > > GA. Need to decide on the corresponding strategies we are going to > > incorporate. > > The discussion is still about the "administrative" question of whether > any of this should be implemented in the "Commons" project... > > Did you ask "Spark" people about their opinion about it? > > As I said, if you are confident that you can bring our implementation to > a state where it can be used in real-life (performance-wise) applications, > then you should demonstrate it (in order to convince other people from > the Commons PMC that it is worth engaging in long-term maintenance). > AFAICT, a way to do it would be to create a GitHub project (aimed at > becoming a new "machine learning" component, or a maven/JPMS > module within Commons Math). > > Best regards, > Gilles > > -- Avijit Basak
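The point quoted in this message, that multi-threading within a single population is straightforward when a chromosome's evaluation does not depend on the rest of the population, could look roughly like the sketch below. It uses only JDK classes (parallel streams). The class name, the method name and the generic chromosome type C are hypothetical and do not correspond to any existing Commons Math API; the fitness function is assumed to be thread-safe and free of side effects.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Illustrative only: evaluates a whole population concurrently with JDK parallel streams. */
final class ParallelEvaluation {

    private ParallelEvaluation() {}

    /**
     * Returns the fitness of each chromosome, in the same order as the input list.
     * The fitness function must be thread-safe and must not depend on other chromosomes.
     */
    static <C> double[] evaluate(List<C> population, ToDoubleFunction<? super C> fitness) {
        return population.parallelStream()
                         .mapToDouble(fitness)
                         .toArray();
    }
}
```

For example, with chromosomes represented as Long values and a "count the set bits" fitness, `ParallelEvaluation.evaluate(population, c -> Long.bitCount(c))` yields one fitness value per chromosome. An ExecutorService would give more control over the thread pool; the stream version is shown only because it is the shortest JDK-only form.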
Re: [Vote] Create a "machine learning" component
Hi

> Isn't a GA inherently parallel?
> If so, why not take advantage of the concurrency tools provided by the JDK?
-- Are we planning to implement multi-threading for GA operations even within a single population, or only for a multi-population parallel GA?
-- We can implement different types of co-evolution as part of a parallel GA. We need to decide on the corresponding strategies we are going to incorporate.

Thanks & Regards
--Avijit Basak

On Wed, 14 Apr 2021 at 05:53, Gilles Sadowski wrote: > On Tue, 13 Apr 2021 at 18:21, Avijit Basak wrote: > > > > Hi > > > > Please find my comments below. > > > > >> I don't follow the distinction "prod" vs "non-prod". > > -- Actually in Prod we really need a very high performing system. So > > use of implicit parallelism in spark would help us to achieve it. But for > > other types of work like POC or R&D we may not need such performance. > > Isn't a GA inherently parallel? > If so, why not take advantage of the concurrency tools provided by the JDK? > > > >> the question was actually whether you are willing to modularize CM > > -- I am not much aware of other ml components in commons. I would > look > > into it. > > I've mentioned them in earlier messages: > * Self-organizing feature map (artificial neural net) > * Clustering > > The former is multi-threaded; the latter should be refactored to > take advantage of multi-threading. > > > >>You did not expand about the usability/performance (e.g. the issue of > > multi-threading) > > -- Are we planning to incorporate parallel GA. > > Aren't you? > > > Then multi-threading > > would be a more appropriate option. > > IMHO, a necessary one. > > > >> So, as a way forward, I would suggest that you create a project on > > GitHub (copying all the settings from a *Commons modular* component, > such as > > "Commons Numbers") > > -- Could you kindly share the GitHub repository URL for any Commons > > modular component. 
> > https://github.com/apache/commons-rng > https://github.com/apache/commons-numbers > https://github.com/apache/commons-geometry > https://github.com/apache/commons-statistics > > > > > Thanks & Regards > > --Avijit Basak > > > > > > On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski > wrote: > > > > > Hello. > > > > > > Le lun. 12 avr. 2021 à 17:21, Avijit Basak a > > > écrit : > > > > > > > > Hi > > > > > > > > Sorry for the delayed response. Thanks for your patience. > Please > > > > find my comments below: > > > > > > > > (1) Why not Spark? [At least post over there (?).] > > > > --We can move to Spark. But it will be very much useful if the > > > things > > > > can also run without Spark. The use of Spark would make more sense > in a > > > > production environment. But the portability of the library will be > more > > > > useful for the non-prod environment. > > > > > > I don't follow the distinction "prod" vs "non-prod". > > > > > > > Definitely, we can reach the Spark > > > > team and query. > > > > > > That would be a good idea... > > > > > > > (2) Further develop a monolithic CM? [Who will do it?] > > > >--I can help with the upgrade of the existing library related > to > > > GA > > > > functionality. > > > > > > Sure, but nobody is currently working on (2). > > > > > > > (3) Modularize CM? [Who will do it?] > > > >--I can help with the upgrade of the existing library related > to > > > GA > > > > functionality. > > > > > > I don't doubt it; but the question was actually whether you are willing > > > to modularize CM (that is: in addition to, and before, contributing to > > > the GA functionality). > > > > > > > (4) New component (with another name) with the proposed contents? > > > >--This is the best option if permitted. > > > > > > Currently, only the two of us are in favour of this alternative. > > > > > > Nobody, by their action, is really in favour of any of the other > > > alternatives. 
> > > So, as a way forward, I would suggest that you create a project on > GitHub > > > (copying all the settings from a Commons modular component, such as > > > "Commons Numbers"), to be eventually integrated here, once its > potential > > > has been demonstrated. > > > > > > > The code which I have written can be reused with minor > > > modifications. > > > > So it won't take too much effort for this activity. > > > > > > You did not expand about the usability/performance (e.g. the issue of > > > multi-threading)... > > > > > > Regards, > > > Gilles > > > > > > >> [...] > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: [Vote] Create a "machine learning" component
Hi

Please find my comments below.

>> I don't follow the distinction "prod" vs "non-prod".
-- In prod we really need a very high-performing system, so the implicit parallelism in Spark would help us to achieve it. But for other types of work, like POC or R&D, we may not need such performance.

>> the question was actually whether you are willing to modularize CM
-- I am not very familiar with the other ML components in Commons. I will look into it.

>> You did not expand about the usability/performance (e.g. the issue of multi-threading)
-- Are we planning to incorporate a parallel GA? Then multi-threading would be a more appropriate option.

>> So, as a way forward, I would suggest that you create a project on GitHub (copying all the settings from a *Commons modular* component, such as "Commons Numbers")
-- Could you kindly share the GitHub repository URL of any Commons modular component?

Thanks & Regards
--Avijit Basak

On Tue, 13 Apr 2021 at 18:29, Gilles Sadowski wrote: > Hello. > > On Mon, 12 Apr 2021 at 17:21, Avijit Basak wrote: > > > > Hi > > > > Sorry for the delayed response. Thanks for your patience. Please > > find my comments below: > > > > (1) Why not Spark? [At least post over there (?).] > > --We can move to Spark. But it will be very much useful if the > things > > can also run without Spark. The use of Spark would make more sense in a > > production environment. But the portability of the library will be more > > useful for the non-prod environment. > > I don't follow the distinction "prod" vs "non-prod". > > > Definitely, we can reach the Spark > > team and query. > > That would be a good idea... > > > (2) Further develop a monolithic CM? [Who will do it?] > >--I can help with the upgrade of the existing library related to > GA > > functionality. > > Sure, but nobody is currently working on (2). > > > (3) Modularize CM? [Who will do it?] > >--I can help with the upgrade of the existing library related to > GA > > functionality. 
> > I don't doubt it; but the question was actually whether you are willing > to modularize CM (that is: in addition to, and before, contributing to > the GA functionality). > > > (4) New component (with another name) with the proposed contents? > >--This is the best option if permitted. > > Currently, only the two of us are in favour of this alternative. > > Nobody, by their action, is really in favour of any of the other > alternatives. > So, as a way forward, I would suggest that you create a project on GitHub > (copying all the settings from a Commons modular component, such as > "Commons Numbers"), to be eventually integrated here, once its potential > has been demonstrated. > > > The code which I have written can be reused with minor > modifications. > > So it won't take too much effort for this activity. > > You did not expand about the usability/performance (e.g. the issue of > multi-threading)... > > Regards, > Gilles > > >> [...] > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: [Vote] Create a "machine learning" component
Hi

Sorry for the delayed response. Thanks for your patience. Please find my comments below:

(1) Why not Spark? [At least post over there (?).]
-- We can move to Spark, but it would be very useful if things could also run without Spark. Using Spark makes more sense in a production environment, whereas the portability of the library matters more in a non-prod environment. We can certainly reach out to the Spark team and ask.

(2) Further develop a monolithic CM? [Who will do it?]
-- I can help with the upgrade of the existing library related to GA functionality.

(3) Modularize CM? [Who will do it?]
-- I can help with the upgrade of the existing library related to GA functionality.

(4) New component (with another name) with the proposed contents?
-- This is the best option if permitted. The code which I have written can be reused with minor modifications, so this activity would not take much effort.

Kindly share further thoughts.

Thanks & Regards
--Avijit Basak

On Sun, 14 Feb 2021 at 19:56, Gilles Sadowski wrote: > On Sun, 14 Feb 2021 at 09:06, Avijit Basak wrote: > > > > Hi > > > >I would like to mention a few points here. Genetic Algorithm has a > > vast range of applications in optimization and search problems. Machine > > learning is only one of those. > >If we couple the new GA library with any specific domain like ml > it > > would be meaningless for people working in other domains. > > Isn't "meaningless" a slight overstatement? > We might have an issue of terminology: There is no necessary "coupling" > but maybe "acquaintance" (for lack of a better word), as a set of tools > that > might come in handy for solving certain types of problems. [For example, > the Traveling Salesman Problem can be tackled by GA and SOFM, both > of which are candidate for inclusion in the new component, although they > don't share any code.] 
> > If the name "machine learning" is not the most appropriate one to convey > the intended scope, do you have another idea? > ["AI" would perhaps be more correct if we consider a strict hierarchy, but > would obviously be far too presumptuous.] > > > They have to > > incorporate the entire ml library > > No, they won't. Given the stated goal of "modularity": the "ga" module > will be available as a dedicated JAR (possibly with a dependency to > codes that can be reused in other modules provided by the component). > > > which may be completely unrelated to > > their project. Coupling it with any technology like spark might also > limit > > it's usability. > > You may be right; I have no idea about the "restrictions" imposed by > Spark. [It seems that in this case, one would have to indeed depend > on Spark's "mllib" (?). This would be one reason, as I already stated, > for having something in "Commons".] > > Could you elaborate on a concrete use-case where one would be > starting to develop an application with the specific requirement that > Spark could not be used? > In particular, IIRC Spark has multi-threading built in. Don't you see > it as a huge problem that CM would not provide such a feature? > > >If a separate component is not approved for this change then we > can > > incorporate the changes as part of *commons.math* library. > > Of course, if somebody wants to do that, he's welcome. > [That will not be me, for all the reasons which I've explained. In the > last > 5 years I've been pretty much alone in handling bug reports about CM; > I'm unwilling to assume implicit support for even more codes.] > > Also, with this solution, you'd now be willing to accept what you weren't > above: Anyone wanting to use the GA functionality would indeed have to > "incorporate" the whole of "Commons Math" (CM). 
> Of course, the latter could be modularized, but this will only mitigate the > issue, as any release of the GA functionality will potentially be then held > off by potential issues in other parts of CM (which nobody has been able > to consistently support for more than 5 years now). > > >The same library can be reused in ml or neural network libraries > as > > a dependency. > > It is the other way around: The development version of CM currently > depends on "lower-level" components. > Furthermore, right now its (embryonic) "machine learning" functionality > hasn't any substantial dependency on codes outside the "o.a.c.math4.ml" > package. > > >Kindly share further views on this. > > In summary, to be cla
Re: [Vote] Create a "machine learning" component
Hi

I would like to mention a few points here. Genetic algorithms have a vast range of applications in optimization and search problems; machine learning is only one of those. If we couple the new GA library with a specific domain like ML, it would be meaningless for people working in other domains: they would have to incorporate the entire ML library, which may be completely unrelated to their project. Coupling it with a technology like Spark might also limit its usability.

If a separate component is not approved for this change, then we can incorporate the changes as part of the *commons.math* library. The same library can be reused in ML or neural network libraries as a dependency.

Kindly share further views on this.

Thanks & Regards
--Avijit Basak

On Wed, 10 Feb 2021 at 19:49, Gilles Sadowski wrote: > On Wed, 10 Feb 2021 at 13:19, sebb wrote: > > > > Likewise, commons-ml is too cryptic. > > > > Also, the Spark project has a machine-learning library: > > > > https://spark.apache.org/mllib/ > > Thanks for the pointer. > > > > > Maybe that would be better home? > > On the face of it, probably. > [For sure, Avijit should comment on the suggestion.] > > On the other hand, "Commons" is the place where one can pick "bare > bone" implementations, and add the functionality to one's application > without necessarily complying with an overarching framework. > [I don't mean that framework compliance is bad; quite the contrary, it is > hopefully the result of a thorough reflection by experts. But ... cf. the > numerous "no-dependency" discussions ...] > > Actually, concerning Avijit's proposed contribution, didn't I say:[1] > ---CUT--- > Thus, I think that we must assess whether the "genetic algorithms" > functionality has a reasonable future within "Apache Commons" (i.e. > potential users and contributors) while there exist other libraries that > seem much more advanced for any serious usage. 
> ---CUT--- > > > I'm also a bit concerned as to whether there are sufficient developers > > here with knowledge of the ML domain to be able to support the code in > > the future. > > An interesting point; by all means not a new one (see e.g. [2]). > > Isn't it the same point I've been making about "Commons Math" (CM)? > There has been no releases because nobody here is able (or is willing > to) support it. > > Concerning the support of the purported "machinelearning" component: > 1. Package > org.apache.commons.math4.ml.neuralnet > * I've written it entirely and I have applications that depend on it > (and I > cannot assume that I could easily switch to, or port it to, Spark), > so I > can reasonably ensure that it would be supported. > 2. Package > org.apache.commons.math4.ml.clustering > * Functionality is mentioned in Spark's "mllib" user guide. > * When a new feature was last contributed[3], it was noticed[4][5][6] > that improvement were needed (but there was no follow-up). > * I've an application that depend on it (from CM v3.6.1) but I wouldn't > support it if shipped in CM v4.0. > 3. Package > org.apache.commons.math4.genetics > * Part of my "end-of-study" project consisted in a GA implementation. > I've never used the CM implementation, and I don't deny that there > could be perfectly fine uses of it but, just looking at the code, it > seems > obvious that it cannot compete feature-wise with other libraries > out there. > * I've suggested long ago that, without anyone supporting it actively > (and > no known user community), it should be dropped from CM. > * Avijit expressed a willingness to improve the functionality: Is > this enough > for the PMC to create a new component? From the experience with the > "clustering" package mentioned above, I'd tend to think > (unfortunately) > that it isn't. He should first explore whether the Spark community > is > interested, that the GA functionality be moved over there. 
> > Gilles > > [1] https://issues.apache.org/jira/browse/MATH-1563 > [2] https://markmail.org/message/26yxj5vhysdsoety > [3] https://issues.apache.org/jira/projects/MATH/issues/MATH-1509 > [4] https://issues.apache.org/jira/projects/MATH/issues/MATH-1524 > [5] https://issues.apache.org/jira/projects/MATH/issues/MATH-1528 > [6] https://issues.apache.org/jira/projects/MATH/issues/MATH-1526 > > > > > On Wed, 10 Feb 2021 at 08:27, Emmanuel Bourg wrote: > > > > > > -1 for commons-ml for the same reasons. > > > > > > What a
Re: [All][Math] New GA component
Hello Gilles Thanks for your reply. Actually I am not very comfortable with the porting process. It will be really nice if I can have an initial repository. Thanks & Regards --Avijit Basak On Wed, 20 Jan 2021 at 17:50, Gilles Sadowski wrote: > Hello. > > Le mer. 20 janv. 2021 à 11:11, Avijit Basak a > écrit : > > > > Hello Gilles Sadowski > > > > Thanks for your reply. Yes I intend to contribute to enhancement > > of the GA functionality as per the JIRA (MATH-1563) proposal. > > My proposal was to first create a new component (and, thus, implement > the enhancement over there). > Do you agree to perform the port? As said in the previous message, this > should be relatively easy, but will require populating a new "git" > repository, > using a recent and similar project's (e.g. "Commons Numbers") files as > templates. > > > If I find any > > other changes suitable I would also propose the same. Could you kindly > look > > into the approval process for this JIRA. > > There is no "process" other than the discussions taking place here, on > the "dev" ML. > > Regards, > Gilles > > > > > Thanks & Regards > > --Avijit Basak > > > > On Wed, 20 Jan 2021 at 04:11, Gilles Sadowski > wrote: > > > > > Hi Avijit. > > > > > > [I've changed the "Subject:" line.] > > > > > > Le mar. 19 janv. 2021 à 08:31, Avijit Basak a > > > écrit : > > > > > > > > Hello Gilles Sadowski > > > > > > > > I have extended the current implementation of Genetic > Algorithm > > > in a.c.m package and made the probability generation process adaptive. > A > > > significant improvement of performance was observed because of this. > The > > > current version of implementation in a.c.m.GA incorporates simple > genetic > > > algorithm which is not much efficient and useful. However I have > extended > > > the same framework to incorporate the enhancement as part of my work. > > > However the library can also be extended to incorporate other advanced > > > concepts of Genetic Programming. 
> > > > > > Do you intend to do, or otherwise further contribute to the enhancement > > > of the GA functionality? > > > > > > > To compare with other libraries I have chosen a.c.m because > of > > > it's flexible and extensible design. > > > > > > That's good news, despite we never had much feedback about that code > > > base... > > > > > > > This is to be decided if we need a new component or extend > the > > > same component. > > > > > > The functionality in package "o.a.c.m.genetics" does not depend on > > > functionality > > > in other packages (except for exceptions). Setting up a new component > > > would > > > thus be very easy. > > > Doing so will bring the same maintenance advantage as we have witnessed > > > with > > > the other Commons Math spin-offs. > > > > > > Regards, > > > Gilles > > > > > > >> [...] > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Avijit Basak
Re: [All][Math] New GA component
Hello Gilles Sadowski

Thanks for your reply. Yes, I intend to contribute to the enhancement of the GA functionality as per the JIRA (MATH-1563) proposal. If I find any other suitable changes, I will propose those as well. Could you kindly look into the approval process for this JIRA?

Thanks & Regards
--Avijit Basak

On Wed, 20 Jan 2021 at 04:11, Gilles Sadowski wrote: > Hi Avijit. > > [I've changed the "Subject:" line.] > > On Tue, 19 Jan 2021 at 08:31, Avijit Basak wrote: > > > > Hello Gilles Sadowski > > > > I have extended the current implementation of Genetic Algorithm > in a.c.m package and made the probability generation process adaptive. A > significant improvement of performance was observed because of this. The > current version of implementation in a.c.m.GA incorporates simple genetic > algorithm which is not much efficient and useful. However I have extended > the same framework to incorporate the enhancement as part of my work. > However the library can also be extended to incorporate other advanced > concepts of Genetic Programming. > > Do you intend to do, or otherwise further contribute to the enhancement > of the GA functionality? > > > To compare with other libraries I have chosen a.c.m because of > its flexible and extensible design. > > That's good news, despite we never had much feedback about that code > base... > > > This is to be decided if we need a new component or extend the > same component. > > The functionality in package "o.a.c.m.genetics" does not depend on > functionality > in other packages (except for exceptions). Setting up a new component > would > thus be very easy. > Doing so will bring the same maintenance advantage as we have witnessed > with > the other Commons Math spin-offs. > > Regards, > Gilles > > >> [...] > > -- Avijit Basak
Re: Contributor License Agreement
Hello Gilles Sadowski,

I have extended the current Genetic Algorithm implementation in the a.c.m package and made the probability generation process adaptive. A significant performance improvement was observed as a result. The current implementation in a.c.m.GA is a simple genetic algorithm, which is not very efficient or useful. I extended the same framework to incorporate the enhancement as part of my work. The library could also be extended to incorporate other advanced concepts of Genetic Programming.

To compare with other libraries, I chose a.c.m because of its flexible and extensible design.

Whether we need a new component or should extend the existing one is still to be decided. Kindly share your thoughts on this.

Thanks & Regards
--Avijit Basak

On Mon, 18 Jan 2021 at 23:21, Gilles Sadowski wrote:

> Hi.
>
> On Mon, 18 Jan 2021 at 17:56, Avijit Basak wrote:
> >
> > Hello
> >
> > I would like to inform you that I am interested in contributing to
> > the Apache Commons Math project. A JIRA (*MATH-1563*) was created with the
> > respective proposal. Kindly grant me the required access for the same. I
> > would like to use my github Id *'avijitbasak'* for this contribution.
> > Kindly let me know if any further information is required.
>
> I tried to get some discussion started on the "dev" ML:
> https://markmail.org/message/p7gkatll4dvdlcdd
>
> Your opinion is certainly welcome...
>
> Regards,
> Gilles
>
> >
> > Thanks & Regards
> > --Avijit Basak
> >
> > -- Forwarded message -
> > From: Matt Sicker
> > Date: Mon, 4 Jan 2021 at 21:28
> > Subject: Re: Contributor License Agreement
> > To: Avijit Basak
> > Cc:
> >
> > Dear Avijit Basak,
> >
> > This message acknowledges receipt of your ICLA, which has been filed in the
> > Apache Software Foundation records.
> >
> > With this message, the Commons PMC has been notified that your ICLA has
> > been filed.
> >
> > ** Please contact the Apache Commons PMC with any further questions, not
> > the Secretary. Thanks. **
> >
> > If you have been invited as a committer, please provide the Apache Commons
> > PMC (copied) with your preferred Apache id.
> >
> > The id must not already be in use. See
> > https://people.apache.org/committer-index.html
> > Note that some existing ids include '-' and '_'. These characters are no
> > longer permitted in ids.
> >
> > The id must consist of lowercase alphanumeric characters only, starting
> > with an alphabetic character.
> > Minimum length 3 characters. No special characters.
> >
> > Warm Regards,
> >
> > --
> > Matt Sicker
> > Secretary, Apache Software Foundation
> >
> > --
> > Avijit Basak

--
Avijit Basak
[MATH] A Proposal for Implementation of Adaptive Probability Generation Strategy for Genetic Algorithm
Hi All,

I would like to propose incorporating an adaptive probability generation strategy into the Genetic Algorithm implementation of the Apache Commons Math library. Currently, the library's API uses a constant probability strategy. I have done some work on the adaptive approach and published it in this article:
https://www.ijcaonline.org/archives/volume175/number10/basak-2020-ijca-920572.pdf

I have created a JIRA issue, "MATH-1563", to describe the proposal. Kindly let me know your views.

Thanks & Regards
--
Avijit Basak
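To make the contrast between a constant and an adaptive probability strategy concrete, here is a minimal Java sketch. It assumes one simple fitness-based rule: chromosomes at or below the population's average fitness get the maximum operator probability, and the probability shrinks linearly toward a minimum as fitness approaches the population's best. This is only an illustration of the general idea and is not necessarily the scheme described in the article or proposed in MATH-1563. The class name, method name, and parameter values are hypothetical and are not part of the Commons Math API.

```java
import java.util.Arrays;

/** Hypothetical illustration; not part of Commons Math. */
public final class AdaptiveProbabilitySketch {

    private AdaptiveProbabilitySketch() {
    }

    /**
     * Returns an operator probability (e.g. for mutation) that depends on
     * how fit a chromosome is relative to the population.
     * Assumption: higher fitness is better.
     */
    public static double adaptiveProbability(double fitness,
                                             double avgFitness,
                                             double maxFitness,
                                             double minProbability,
                                             double maxProbability) {
        // Weak chromosomes, or a population with no spread in fitness,
        // get the maximum probability.
        if (fitness <= avgFitness || maxFitness <= avgFitness) {
            return maxProbability;
        }
        // ratio is 1 just above the average and 0 at the best fitness;
        // it is clamped in case fitness exceeds the supplied maximum.
        double ratio = Math.max(0.0, (maxFitness - fitness) / (maxFitness - avgFitness));
        return minProbability + (maxProbability - minProbability) * ratio;
    }

    public static void main(String[] args) {
        double[] fitness = {1.0, 2.0, 3.0, 6.0};
        double avg = Arrays.stream(fitness).average().orElse(0.0);
        double max = Arrays.stream(fitness).max().orElse(0.0);

        // A constant strategy would use the same value for every chromosome;
        // here each chromosome gets its own probability.
        for (double f : fitness) {
            double p = adaptiveProbability(f, avg, max, 0.01, 0.10);
            System.out.printf("fitness=%.1f -> mutation probability=%.3f%n", f, p);
        }
    }
}
```

With a rule of this kind, the best chromosomes are disturbed less often while weaker ones are explored more aggressively, whereas a constant strategy applies the same probability to all of them.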