Hello.

On Thu, 23 Dec 2021 at 14:22, Avijit Basak <avijit.ba...@gmail.com> wrote:
>
> Hi All
>
>          Please see my comments below.
>
> >As I've already indicated, "ThreadLocalRandomSource" is, IMHO, a
> >sort of workaround for a multi-thread application that does not want
> >to bother managing per-thread RNG instance(s).
> -- I am not clear on this. ThreadLocalRandomSource maintains
> an EnumMap<RandomSource, ThreadLocal<UniformRandomProvider>>. What is meant
> by "does not want to bother managing per-thread RNG instance(s)"? Could
> you please elaborate on this? If this is an issue in RNG, why don't we
> think of fixing it or providing a different internal implementation?

There is no issue in "Commons RNG"; it provides a tool.
I think that it is not the right tool for a multi-threaded GA library.

>
> >The library should not make that decision for the application since we
> >can care for both usages: Every piece of the GA that needs a RNG can
> >provide factory methods that either take a "RandomSource" argument
> >or create a default one.
> -- Library can always use a default option or provide an option for
> customization at a global level but it need not be at the operator
> level(IMHO).

How can a GA operator work without a RNG?
It can't; it is one of the main settings of such an operator, and the
reason it should be customizable.

> I don't see much use of it.

That's OK; that's why I proposed, for this kind of use, a way to
generate a default instance, without any burden for the caller.
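
To make the "factory methods" idea concrete, here is a minimal sketch
(an illustration only, not code from the project). The class name
"BitFlipMutation" and its methods are hypothetical, and "java.util.Random"
merely stands in for a Commons RNG "UniformRandomProvider" so that the
example compiles with the JDK alone.

```java
import java.util.Random;

/** Hypothetical operator: flips each bit with a given probability. */
final class BitFlipMutation {
    private final double rate;
    // Stand-in for a Commons RNG generator (assumption for this sketch).
    private final Random rng;

    private BitFlipMutation(double rate, Random rng) {
        this.rate = rate;
        this.rng = rng;
    }

    /** The caller chooses the generator. */
    static BitFlipMutation of(double rate, Random rng) {
        return new BitFlipMutation(rate, rng);
    }

    /** No burden for the caller: a default generator is created. */
    static BitFlipMutation withDefaultRng(double rate) {
        return new BitFlipMutation(rate, new Random());
    }

    /** Returns a mutated copy; the argument is left untouched. */
    boolean[] mutate(boolean[] genes) {
        boolean[] out = genes.clone();
        for (int i = 0; i < out.length; i++) {
            if (rng.nextDouble() < rate) {
                out[i] = !out[i];
            }
        }
        return out;
    }
}
```

Both usages coexist: an application that cares about the random source
calls "of(rate, rng)", one that does not calls "withDefaultRng(rate)".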

> >
> > >> >2. Less/no flexibility (no user's choice of random source).
> > >> -- Agreed.
> > -- Do we really need this much flexibility here?
>
> >My main concern is that IMO the RNG is a prominent part of a GA
> >and it is not a good design to use "ThreadLocalRandomSource".
> -- RNG is definitely a prominent part. However, if we have a sharing issue
> with ThreadLocalRandomSource we need to think of its alternate
> implementation.

There is a misunderstanding; there is no sharing issue, there is
a design issue.

> >How many is "too many instances"?
> >The memory used by an operator is tiny compared to a chromosome,
> >even less to a population of chromosomes, or two populations of them
> >(parents and offspring).
> --My concern is we are trying to provide a fix for a performance problem in
> another library and that is going to consume additional memory.

Nothing (at all) that we should be worried about (or discuss further):
Most RNG implementations are quite lean (a few hundred bytes to
a few KB).  You multiply this by the number of threads (a few tens
at most), and you are well below 1 MB.  What is this amount when
compared to the average Java application nowadays?

> >     So I think we have a design tradeoff here performance vs memory
> > consumption. I am more worried about memory as that might restrict use of
> > this library beyond a certain number of dimensions in some areas.
>
> >I'm referring to separate copies for each thread.
> >How many threads/virtual CPUs are common nowadays?
> >> However,
> >> creating deep copy would only be possible when we strictly restrict
> >> extension of operators which I want to avoid.
>
> >How to avoid deep copies in a multi-thread library?
> >Through synchronization?
> -- The operator interfaces are designed like a functional interface.
> Accordingly, the current implementation of all operators are read only. The
> implementation does not maintain any mutable properties during computations
> too. So they are perfectly suitable for multi-threaded operation.

Great!

> If you
> see any deviation to it please notify me.

Sure.
Sorry I did not have the time to look into the code yet.

> >
> > >> So even if we provide
> > >> the customization at the operator level we cannot avoid sharing.
> >
> > >We can, and we should.
> > >What we probably can't avoid sharing is the instance that represents the
> > >population of chromosomes.
> > *--* In a multi-threaded optimization the chromosome instances are shared
> > in case the same chromosome is chosen for crossover by the selection
> > process. I missed this point earlier.
> > ...
>
> Chromosomes can be shared (if they are read-only).
> --They are read-only.

And immutable?

> >
> > >> >  Mine is against using "ThreadLocalRandomSource"...
> > >> -- What is the wayout other than that. Please suggest.
> >
> > >I think I did.
> > *--* The factory based approach would be useful only when we can have
> > separate copies of operators for each set of operations.
>
> If we don't have separate copies in each thread, then the operator
> will not be multithreaded...
> -- If operators do not contain any mutable property then they are perfectly
> usable in a multi-threaded environment.

The problem is that they do: by necessity, the RNG instance is mutable.
You want to hide this fact through using "ThreadLocalRandomSource",
and I think that it should not be hidden.
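
A minimal sketch of the alternative, under stated assumptions: each unit
of work creates and owns its generator explicitly, so the mutable state
is visible in the code and never shared between threads. The names
"PerThreadRngDemo", "task" and "runAll" are hypothetical, and
"java.util.Random" again stands in for a "UniformRandomProvider".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerThreadRngDemo {
    private PerThreadRngDemo() {}

    /** Hypothetical task: it owns its generator (no sharing, no thread-local). */
    static Callable<Integer> task(long seed, int draws) {
        return () -> {
            Random rng = new Random(seed); // created and used by this task only
            int heads = 0;
            for (int i = 0; i < draws; i++) {
                if (rng.nextBoolean()) {
                    heads++;
                }
            }
            return heads;
        };
    }

    /** Runs the tasks on a small pool and collects results in submission order. */
    static List<Integer> runAll(int nTasks, int draws) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < nTasks; t++) {
                futures.add(pool.submit(task(1000L + t, draws)));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because every task is seeded independently of the thread that runs it,
two runs give the same results whatever the scheduling; that is harder
to guarantee when the generator is looked up from a thread-local.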

> > *--* I think we should not block the extension.
>
> >This would be going backwards to many things that have been done
> >to improve the robustness and reduce the bug counts of the Commons
> >Math codes.
> -- GA is different from other math functions. We may not impose the same
> principle on everything.

The principle stems from putting (actually needed) robustness above
(hypothetically needed) extensibility.
IIRC every usage of "protected" in Commons Math, on the expectation
that it might be useful for some (indefinite) use, was reverted to "private".

We should develop first with as few "public" API as possible; then if
the need arises, and your design is indeed extensible by construction,
it will just be a matter of changing "private" to "protected" in a later
release.

> > >Initially we discussed about having a light-weight library, for easier
> > usage
> > >than alternative existing framework(s).
> > *--* We can always think of making the framework lightweight but it should
> > not cost extensibility.
>
> >There is no cost: We'll gladly merge every worthy extension into
> >the Commons component.
> -- I think we have a disconnect here. If the framework is not extensible
> how anyone would be able to use it in any new domain. Do you mean first the
> framework should be changed for any new domain and users should only use it
> out of box.

This is an open-source project.
Anyone can take the code, make whatever extension, use it for whatever
purpose (e.g. proving by a working example that it is needed), and submit
a patch so that everyone benefits in the next release.

> >
> > >> E.g. any developer should be able to extend the
> > >> IntegralChromosome class and define a child class which explicitly
> > >> specifies the range of integers to be used.
> >
> > >It does not look like this would need an extension, only configuration
> > >of the range.
> > *-- *I agree. But the question is should we block the extension.
>
> >Please find a valid use case. ;-)
> -- Recently I did an implementation of scheduling with commons-math 3.6. I
> have implemented the chromosome representing schedule by extending
> AbstractListChromosome. The mutation was also customized according to the
> requirement. However, I was able to use the existing OnePointCrossover
> operator. Do you think this kind of implementation would be possible if the
> framework does not support extensibility?

I'm lacking information, namely to understand why you could use
the "crossover" but not the "mutation".
Also isn't the chromosome in principle an abstract representation
of any solution independently of the domain (how good a solution
for the problem at hand being obtained through computing the
fitness of its associated phenotype)?

In the end, we should first wonder whether there is a design issue
that could be solved without resorting to using "protected" fields.
[My first impression about what you had to do, is that it points to a
shortcoming of the GA functionality in previous versions of CM
and the new design is an opportunity to fix that.]
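
As a purely illustrative sketch of extension without inheritance (not the
design of the library, and not tried on the actual scheduling code): the
domain-specific mutation is supplied as an implementation of a small
interface, while a library-side crossover is reused as is. All names
below are hypothetical; the crossover takes the cut point as an argument
only to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical interface: a mutation is a function of (genes, rng). */
interface Mutation<G> {
    List<G> mutate(List<G> genes, Random rng);
}

/** Hypothetical library-side operator, closed to inheritance. */
final class OnePointCrossoverSketch {
    private OnePointCrossoverSketch() {}

    /** Child takes genes [0, point) from "a" and [point, size) from "b". */
    static <G> List<G> cross(List<G> a, List<G> b, int point) {
        List<G> child = new ArrayList<>(a.subList(0, point));
        child.addAll(b.subList(point, b.size()));
        return Collections.unmodifiableList(child);
    }
}

/** User-side code: a domain-specific mutation, no subclassing involved. */
final class SchedulingMutations {
    private SchedulingMutations() {}

    /** Swaps two randomly chosen slots and returns a new list. */
    static <G> Mutation<G> swapTwoSlots() {
        return (genes, rng) -> {
            List<G> copy = new ArrayList<>(genes);
            if (copy.size() > 1) {
                int i = rng.nextInt(copy.size());
                int j = rng.nextInt(copy.size());
                Collections.swap(copy, i, j);
            }
            return Collections.unmodifiableList(copy);
        };
    }
}
```

Whether such an interface is enough for the scheduling use case is an
open question; the sketch only shows that a custom operator does not, by
itself, require "protected" members.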

> > >> I have initially implemented
> > >> the Binary chromosome and the corresponding binary mutation following
> the
> > >> same pattern. However, restricting extension of concrete classes by
> > private
> > >> constructor does not prevent users from extending the abstract parent
> > >> classes.
> >
> > >We should aim at coding the GA logic through (Java) interfaces, and not
> > >expose the "abstract" classes.
> > *-- *One of the primary reasons for me to contribute to Apache's GA library
> > is its simplicity and extensibility.
>
> >"Extensibility" does not necessarily imply "inheritance"-based.
> -- Can you provide a solution to the above problem without an extensibility
> feature?

It depends on the scope of the library.
I'm pretty sure that whatever the new implementation which you are
working on, there are some problems which it won't be able to
solve even if it's inheritance-based.
Moreover it can be construed that if some user has to develop
an extension, he might rather turn to another software with that
functionality already built in.

>
> >In fact, we do want to *avoid* it in order to more easily and more robustly
> >provide other advantages such as multi-threading.
> -- IMHO immutable operator design is the best choice for supporting
> multi-threading.

Agreed.
Immutability implies that all fields are "final" (hence "protected"
fields would be useless).
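
For illustration, a minimal sketch of what "immutable" (as opposed to
merely "read-only") could look like for a chromosome whose integer range
is set by configuration rather than by a subclass. The class
"IntRangeChromosome" and its methods are hypothetical and are not the
project's "IntegralChromosome".

```java
/** Hypothetical immutable chromosome: integer genes within a configured range. */
final class IntRangeChromosome {
    private final int min;
    private final int max;
    private final int[] genes;

    private IntRangeChromosome(int min, int max, int[] genes) {
        this.min = min;
        this.max = max;
        this.genes = genes;
    }

    /** The range is a configuration parameter, not a reason to subclass. */
    static IntRangeChromosome of(int min, int max, int... genes) {
        if (min > max) {
            throw new IllegalArgumentException("min > max");
        }
        for (int g : genes) {
            if (g < min || g > max) {
                throw new IllegalArgumentException("gene out of range: " + g);
            }
        }
        // Defensive copy: later changes to the caller's array have no effect.
        return new IntRangeChromosome(min, max, genes.clone());
    }

    int length() {
        return genes.length;
    }

    int gene(int i) {
        return genes[i];
    }

    /** Returns a copy, so the internal array never escapes. */
    int[] toArray() {
        return genes.clone();
    }

    /** "Modification" creates a new instance; this one is left untouched. */
    IntRangeChromosome withGene(int index, int value) {
        int[] copy = genes.clone();
        copy[index] = value;
        return of(min, max, copy);
    }
}
```

With "final" fields, defensive copies and no setters, an instance can be
shared between threads without synchronization.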

> It is much easier to implement even for user extension.

Agreed.
Whether we allow some classes to be non-"final" is a much
easier discussion.  No problem in doing that if it imposes no
maintenance burden.

> Why don't we think of fixing the ThreadLocalRandomSource.

As said above, nothing to fix there.

>
> >> I would like to have a framework
> >> which should be always extensible for any problem domain with minor
> >> changes.
>
> >Any problem domain should indeed be amenable to be solved
> >by the library; I don't see how that should imply a design based
> >on inheritance.
> -- Do you have any alternative design in mind? Kindly share it.

I gave some hints in previous messages; I can't promise that it
would fly without actually trying it. ;-)
But I will do it once the code is in a branch which I can modify.

>
> >> The primary reason behind this is that application domains of GA
> >> are too diverse. It is not possible to implement everything in a library.
> >> We don't know all possible domain areas too. If we remove the
> extensibility
> >> from the framework it would be useless in lots of areas.
>
> >When that occurs, people are welcome to contribute back if
> >something they need is missing.
> -- I think we have a disconnect here too. If the framework is not
> extensible how users can use this in their problem domain. If this is not
> extensible then it would never be used. How can we get back the
> contribution?

I answered this above.

>
> >Your argument of "too much diversity" can be reversed, in that
> >it is unlikely that one library would attract everyone that needs a
> >genetic algorithm.
> -- Even if it cannot attract everyone with out of box features it should be
> extensible for those.

I don't agree with making things more complicated for us, now and
in the foreseeable future, in order to satisfy users who don't exist yet
(because the library does not exist yet).

Let's focus on making it work within a given scope, and then we can
think of improvements (that will be easy if the design is "structurally"
extensible, even if they are somehow "disabled" in the first release).

> >Better make a design that can handle a fraction of use cases,
> >and grow as needed.
> --There are already libraries which can solve most common use cases.
> Non-extensible nature would block the growth to a considerable extent.

Is there a misunderstanding about what is implied by "extensible"?
Question: Are all classes, in your current design, "immutable"?
If so, that's an excellent basis, and we should stop discussing the
meaning of "extensibility".

>
> >> >Extending the functionality, if necessary, should be contributed back
> here
> >> *-- *Sometimes the GA operators are very much specific to the domain and
> >> it's hard to generalise. In those scenarios contributing back to the
> >> library might not be possible.
>
> >In such a case, how likely will it also be that whatever general
> >framework this library has put in place, will also not be amenable
> >to that domain's specifics?
> -- Could you please frame this concern w.r.t. the scheduling example
> provided above.

?

>
> >There is always a scope from which design decisions must be taken.
> >If "multi-threading" is in the scope, then the design must avoid
> >inheritance (in public classes) in order to much more easily
> >ensure the correctness of applications.
> -- Immutable design can also take care of multi-threading.

My main point in the discussion is that all classes with "public" access
should be immutable, indeed.

>
> >> However, if a library cannot be extended for
> >> a new domain by users it becomes underutilised over time if not useless.
>
> >Sure but that is a hypothetical for the long-term.
> >However, if the library is buggy or slow, it will not be used at all.
> -- Is there any benchmark for speed/performance? GA is always infamous for
> resource consumption rather than time.

I'm not sure I understand what you mean here.

Gilles

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
