Hi All

        Please see my *changed* comments below.

>> >  Mine is against using "ThreadLocalRandomSource"...
>> -- What is the way out other than that? Please suggest.

>I think I did.
>>*--* The factory based approach would be useful only when we can have
>>separate copies of operators for each set of operations.
*--* The factory based approach can introduce a *custom* RNG, but it can
improve performance only when we have separate copies of the operators for
each set of operations, which might lead to *memory issues* as explained in
my previous mail.
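
For illustration, a minimal sketch of what a factory based approach with
separate operator copies could look like. It uses only JDK classes:
java.util.SplittableRandom is a stand-in for UniformRandomProvider, and all
names (FactorySketch, MutationOperator, BitFlipMutation, newFactory) are
hypothetical, not part of any proposed API.

```java
import java.util.SplittableRandom;
import java.util.function.Supplier;

/** Sketch only: hypothetical names; a JDK RNG stands in for UniformRandomProvider. */
public class FactorySketch {
    /** Hypothetical operator interface. */
    interface MutationOperator {
        boolean[] mutate(boolean[] genes);
    }

    /** Hypothetical operator that owns its RNG; an instance must not be shared among threads. */
    static final class BitFlipMutation implements MutationOperator {
        private final SplittableRandom rng;
        private final double rate;

        BitFlipMutation(SplittableRandom rng, double rate) {
            this.rng = rng;
            this.rate = rate;
        }

        @Override
        public boolean[] mutate(boolean[] genes) {
            boolean[] out = genes.clone();
            for (int i = 0; i < out.length; i++) {
                if (rng.nextDouble() < rate) {
                    out[i] = !out[i]; // flip this gene with probability "rate"
                }
            }
            return out;
        }
    }

    /** Factory: same configuration, but a new operator with its own RNG on every call. */
    static Supplier<MutationOperator> newFactory(long seed, double rate) {
        SplittableRandom root = new SplittableRandom(seed);
        return () -> {
            synchronized (root) { // SplittableRandom.split() is not thread-safe
                return new BitFlipMutation(root.split(), rate);
            }
        };
    }

    public static void main(String[] args) throws InterruptedException {
        Supplier<MutationOperator> factory = newFactory(42L, 0.25);
        Runnable task = () -> {
            MutationOperator op = factory.get(); // private copy for this thread
            boolean[] child = op.mutate(new boolean[16]);
            System.out.println(Thread.currentThread().getName() + ": " + child.length + " genes");
        };
        Thread t1 = new Thread(task, "worker-1");
        Thread t2 = new Thread(task, "worker-2");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
    }
}
```

Each copy carries its own RNG state in addition to the operator's
configuration, which is where the extra memory per set of operations would
come from.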


Thanks & Regards
--Avijit Basak

On Wed, 22 Dec 2021 at 18:54, Avijit Basak <avijit.ba...@gmail.com> wrote:

> Hi All
>
>         Please see my comments below.
>
> >> >Several problems with this approach (raised in previous messages IIRC):
> >> >1. Potential performance loss in sharing the same RNG instance.
> >> -- As per my understanding ThreadLocalRandomSource creates separate
> >> instances of UniformRandomProvider for each thread. So I am not sure how a
> >> UniformRandomProvider instance is being shared. Please correct me if I am
> >> wrong.
>
> >Within a given thread there will be *one* RNG instance; that's what I meant
> >by "shared".
> >Of course you are right that that instance is not shared by multiple threads
> >(which would be a bug).
> >The performance loss is because it will be necessary to call
> >  ThreadLocalRandomSource.current(RandomSource source)
> >for each access to the RNG (since it would be a bug to store the returned
> >value in e.g. an operator instance that would be shared among threads, as
> >you suggest below).
>
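
The "one instance per thread" behaviour described above can be illustrated
with JDK classes only. In this sketch a plain ThreadLocal<SplittableRandom>
stands in for ThreadLocalRandomSource; current() and SharedOperator are
hypothetical names, and this is not Commons RNG code.

```java
import java.util.SplittableRandom;

/** Sketch only: a JDK ThreadLocal stands in for ThreadLocalRandomSource. */
public class PerThreadRngSketch {
    private static final ThreadLocal<SplittableRandom> RNG =
        ThreadLocal.withInitial(SplittableRandom::new);

    /** Hypothetical stand-in for ThreadLocalRandomSource.current(source). */
    static SplittableRandom current() {
        return RNG.get();
    }

    /** An operator that may be shared among threads: it looks the RNG up on every access. */
    static final class SharedOperator {
        int pick(int bound) {
            // Storing the result of current() in a field would leak one
            // thread's RNG to all other threads using this operator.
            return current().nextInt(bound);
        }
    }

    /** Returns the RNG instance seen by a freshly started thread. */
    static SplittableRandom rngSeenByNewThread() {
        SplittableRandom[] seen = new SplittableRandom[1];
        Thread t = new Thread(() -> seen[0] = current());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(current() == current());            // true: same thread, same instance
        System.out.println(current() == rngSeenByNewThread()); // false: each thread has its own
    }
}
```

The lookup in pick() is the per-access cost under discussion; the
alternative is an operator instance per thread that keeps its own RNG in a
field.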
> -- I ran a small test on it and here are the results. Output times are in
> milliseconds. According to my understanding, most of the performance loss
> occurs during creation of the per-thread instance of UniformRandomProvider.
> --*CUT*--
>     // Imports assumed for this snippet (Commons RNG "simple" module, JUnit 5):
>     // import org.apache.commons.rng.simple.RandomSource;
>     // import org.apache.commons.rng.simple.ThreadLocalRandomSource;
>     // import org.junit.jupiter.api.Test;
>
>     @Test
>     void test() {
>         // Iteration counts, one timing per entry. The first entry (a single
>         // call) also pays for creating the per-thread RNG instance.
>         final int[] limits = {
>             1, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000
>         };
>         for (int limit : limits) {
>             long start = System.currentTimeMillis();
>             for (int i = 0; i < limit; i++) {
>                 ThreadLocalRandomSource.current(RandomSource.JDK);
>             }
>             // Elapsed time in milliseconds for this iteration count.
>             System.out.println(System.currentTimeMillis() - start);
>         }
>     }
> --*CUT*--
> --*output*--
> 363
> 1
> 2
> 4
> 6
> 28
> 244
> 2423
> --*output*--
>
> >> >2. Less/no flexibility (no user's choice of random source).
> >> -- Agreed.
> -- Do we really need this much flexibility here?
> >> >3. Error-prone (user can access/reuse the "UniformRandomProvider"
> >> >instances).
> >>
> >> >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct but
> >> >"light" usage of random number generation in a multi-threaded application;
> >> >GAs make "heavy" use of RNG, thus it does not seem outlandish that all the
> >> >RNG "clients" (e.g. every "operator") create their own instances.
> >
> >
> >> >IMHO, a more important discussion would be about the expectations in a
> >> >multithreaded context: E.g. should an operator be shareable by different
> >> >threads?  And if not, how does the API help application developers to avoid
> >> >such pitfalls?
> >> -- Once we implement multi-threading in GA, the same crossover and mutation
> >> operators will be re-used across multiple threads.
>
> >I would be wary to go on that path; better consider making (deep) copies.
> >We can have multiple instances of an operator, all being configured in the
> >same way but being different instances with no risk of a multithreading bug.
>
> -- I don't think this would be a good design choice just to support
> customization of the RNG functionality. It will lead to too many instances
> of the same operators, resulting in a lot of unnecessary memory consumption.
> I think we might face memory issues for higher-dimensional problems. As the
> required population size also increases with the dimension, this might
> become a major issue and needs some thought.
>     So I think we have a design tradeoff here: performance vs. memory
> consumption. I am more worried about memory, as that might restrict use of
> this library beyond a certain number of dimensions in some areas. However,
> creating deep copies would only be possible if we strictly restrict
> extension of the operators, which I want to avoid.
>
> >> So even if we provide
> >> the customization at the operator level we cannot avoid sharing.
>
> >We can, and we should.
> >What we probably can't avoid sharing is the instance that represents the
> >population of chromosomes.
> *--* In a multi-threaded optimization, the chromosome instances are shared
> whenever the same chromosome is chosen for crossover by the selection
> process. I missed this point earlier.
> ...
>
> >> >  Mine is against using "ThreadLocalRandomSource"...
> >> -- What is the way out other than that? Please suggest.
>
> >I think I did.
> *--* The factory based approach would be useful only when we can have
> separate copies of operators for each set of operations.
>
> >Maybe it's time to create a dedicated branch for the GA functionality
> >so that we can try out the different approaches.
>
>
> >
> > >> I think first we need to decide on whether we really need this
> > >> customization and if yes then why. Then we can decide on alternate
> > >> implementation options.
> > >
> > >> >As per the recent updates of the math-related code bases, the
> > >> >public API should provide factory methods (constructors should
> > >> >be private).
> > >> -- private constructors will make public API classes non-extensible. This
> > >> will severely restrict the extensibility of this framework which I want to
> > >> avoid. I am not sure why we need to remove public constructors. It would be
> > >> helpful if you could refer me to any relevant discussion thread.
> >
> > >  Allowing extensibility is a huge burden on library maintainers.  The
> > >  library must have been designed to support it; hence, you should
> > >  first describe what kind(s) of extensions (with usage examples) you
> > >  have in mind.
> > --The library should be extensible to support customization. Users should
> > be able to customise or provide their own implementation of genetic
> > operators for crossover and mutation. The chromosome classes should also be
> > open for extension.
>
> >I don't get why we should support extensions outside this library.
> *--* I think we should not block the extension.
>
> >Initially we discussed about having a light-weight library, for easier usage
> >than alternative existing framework(s).
> *--* We can always think of making the framework lightweight, but that
> should not come at the cost of extensibility.
>
> >> E.g. any developer should be able to extend the
> >> IntegralChromosome class and define a child class which explicitly
> >> specifies the range of integers to be used.
>
> >It does not look like this would need an extension, only configuration
> >of the range.
> *--* I agree. But the question is whether we should block the extension.
>
> >> I have initially implemented
> >> the Binary chromosome and the corresponding binary mutation following the
> >> same pattern. However, restricting extension of concrete classes by private
> >> constructor does not prevent users from extending the abstract parent
> >> classes.
>
> >We should aim at coding the GA logic through (Java) interfaces, and not
> >expose the "abstract" classes.
> *--* One of my primary reasons for contributing to Apache's GA library is
> its simplicity and extensibility. I would like to have a framework that can
> always be extended to any problem domain with minor changes. The primary
> reason behind this is that the application domains of GA are too diverse. It
> is not possible to implement everything in a library. We don't know all
> possible domain areas either. If we remove the extensibility from the
> framework, it would be useless in lots of areas.
>
> >Extending the functionality, if necessary, should be contributed back here
> *--* Sometimes the GA operators are very specific to the domain and hard
> to generalise. In those scenarios contributing back to the library might
> not be possible. However, if users cannot extend a library for a new domain,
> it becomes underutilised over time, if not useless.
>
>
> Thanks & Regards
> --Avijit Basak
>
> On Tue, 21 Dec 2021 at 22:05, Gilles Sadowski <gillese...@gmail.com>
> wrote:
>
>> Hello.
>>
>> On Tue, 21 Dec 2021 at 16:21, Avijit Basak <avijit.ba...@gmail.com> wrote:
>> >
>> > Hi All
>> >
>> >         Please see my comments. Sorry for the delayed response.
>> >
>> > >Several problems with this approach (raised in previous messages IIRC):
>> > >1. Potential performance loss in sharing the same RNG instance.
>> > -- As per my understanding ThreadLocalRandomSource creates separate
>> > instances of UniformRandomProvider for each thread. So I am not sure how a
>> > UniformRandomProvider instance is being shared. Please correct me if I am
>> > wrong.
>>
>> Within a given thread there will be *one* RNG instance; that's what I meant
>> by "shared".
>> Of course you are right that that instance is not shared by multiple threads
>> (which would be a bug).
>> The performance loss is because it will be necessary to call
>>   ThreadLocalRandomSource.current(RandomSource source)
>> for each access to the RNG (since it would be a bug to store the returned
>> value in e.g. an operator instance that would be shared among threads, as
>> you suggest below).
>>
>> > >2. Less/no flexibility (no user's choice of random source).
>> > -- Agreed.
>> > >3. Error-prone (user can access/reuse the "UniformRandomProvider"
>> > >instances).
>> >
>> > >Again: "ThreadLocalRandomSource" is an ad-hoc workaround for correct but
>> > >"light" usage of random number generation in a multi-threaded application;
>> > >GAs make "heavy" use of RNG, thus it does not seem outlandish that all the
>> > >RNG "clients" (e.g. every "operator") create their own instances.
>> >
>> >
>> > >IMHO, a more important discussion would be about the expectations in a
>> > >multithreaded context: E.g. should an operator be shareable by different
>> > >threads?  And if not, how does the API help application developers to avoid
>> > >such pitfalls?
>> > -- Once we implement multi-threading in GA, the same crossover and mutation
>> > operators will be re-used across multiple threads.
>>
>> I would be wary to go on that path; better consider making (deep) copies.
>> We can have multiple instances of an operator, all being configured in the
>> same way but being different instances with no risk of a multithreading bug.
>>
>> > So even if we provide
>> > the customization at the operator level we cannot avoid sharing.
>>
>> We can, and we should.
>> What we probably can't avoid sharing is the instance that represents the
>> population of chromosomes.
>>
>> >
>> > >> My original implementation did not allow any customization of RandomSource
>> > >> instances. There was a thought in review for customization of RandomSource,
>> > >> so these options were considered. I don't think this would make any
>> > >> difference to algorithm functionality.
>> >
>> > >  Quite right.  But the customization can come at zero cost for the users
>> > >  who don't need it. Admittedly it's a little more work on the part of the
>> > >  developer(s) but it's a one off cost (and I'm fine working on that part of
>> > >  the library once other, more important, things have been settled).
>> >
>> > >> Even earlier I used Math.random()
>> > >> which worked equally well. So my *vote* should be *against* this
>> > >> customization.
>> >
>> > >  Mine is against using "ThreadLocalRandomSource"...
>> > -- What is the way out other than that? Please suggest.
>>
>> I think I did.
>> Maybe it's time to create a dedicated branch for the GA functionality
>> so that we can try out the different approaches.
>>
>> >
>> > >> I think first we need to decide on whether we really need this
>> > >> customization and if yes then why. Then we can decide on alternate
>> > >> implementation options.
>> > >
>> > >> >As per the recent updates of the math-related code bases, the
>> > >> >public API should provide factory methods (constructors should
>> > >> >be private).
>> > >> -- private constructors will make public API classes non-extensible. This
>> > >> will severely restrict the extensibility of this framework which I want to
>> > >> avoid. I am not sure why we need to remove public constructors. It would be
>> > >> helpful if you could refer me to any relevant discussion thread.
>> >
>> > >  Allowing extensibility is a huge burden on library maintainers.  The
>> > >  library must have been designed to support it; hence, you should
>> > >  first describe what kind(s) of extensions (with usage examples) you
>> > >  have in mind.
>> > --The library should be extensible to support customization. Users should
>> > be able to customise or provide their own implementation of genetic
>> > operators for crossover and mutation. The chromosome classes should also be
>> > open for extension.
>>
>> I don't get why we should support extensions outside this library.
>> Initially we discussed about having a light-weight library, for easier usage
>> than alternative existing framework(s).
>>
>> > E.g. any developer should be able to extend the
>> > IntegralChromosome class and define a child class which explicitly
>> > specifies the range of integers to be used.
>>
>> It does not look like this would need an extension, only configuration
>> of the range.
>>
>> > I have initially implemented
>> > the Binary chromosome and the corresponding binary mutation following the
>> > same pattern. However, restricting extension of concrete classes by private
>> > constructor does not prevent users from extending the abstract parent
>> > classes.
>>
>> We should aim at coding the GA logic through (Java) interfaces, and not
>> expose the "abstract" classes.
>> Extending the functionality, if necessary, should be contributed back here.
>>
>> Regards,
>> Gilles
>>
>> >>> [...]
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>
>>
>
> --
> Avijit Basak
>


-- 
Avijit Basak
