On 14/02/2019 12:42, Gilles Sadowski wrote:
Hello Herbert.

Le jeu. 14 févr. 2019 à 12:58, Alex Herbert <alex.d.herb...@gmail.com> a écrit :
On 13/02/2019 22:28, Gilles Sadowski wrote:
Hi.

Le mer. 13 févr. 2019 à 21:21, Alex Herbert <alex.d.herb...@gmail.com> a écrit :

On 13 Feb 2019, at 17:52, Gilles Sadowski <gillese...@gmail.com> wrote:

Le mer. 13 févr. 2019 à 18:47, Alex Herbert <alex.d.herb...@gmail.com> a écrit :

On 13/02/2019 17:35, Gilles Sadowski wrote:
Hello.

Le mer. 13 févr. 2019 à 17:20, <aherb...@apache.org> a écrit :
This is an automated email from the ASF dual-hosted git repository.

aherbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-rng.git

commit 367f022a88dedf2b79778f18ad1ed3ec655fffe8
Author: aherbert <aherb...@apache.org>
AuthorDate: Wed Feb 13 16:20:21 2019 +0000

      The commons-math distributions can use a null random generator
Not a good move, I 'd think: this constructor has disappeared from
the development version of CM.[1]
The constructor is in the unit tests. These depend on commons-math3 in
the test scope.

If the tests are upgraded to commons-math4 then this will just be a
compile error. The tests would have to be changed anyway to reference
the updated component package name for maths.
I know; the point is that this constructor was a bad design
decision (there are old and long discussions about it).
So, why present wrong usage in [RNG] (that would a.o. lead
to wondering why the constructor was deleted)?

I thought I was changing to the recommended usage.

The javadoc for the distributions in CM acknowledge that a generator is created 
by default unless you supply one, and that it is used for sampling. Given the 
tests are not sampling then null is recommended.

The Junit test code can be updated with a better comment to reflect the 
recommended usage. Currently it states: “The commons-math distributions are not 
used for sampling so use a null random generator”. I can change to the CM 
javadoc of: “In case no sampling is needed for the created distribution, it is 
advised to pass null as random generator via the appropriate constructors to 
avoid the additional initialisation overhead.”

I can rename the RandomGenerator to ‘unusedRng’. Anyone who reads the code 
should not be mislead by this.
I'm sorry for the perpetuation of the misunderstanding.
The CM classes were wrong in initializing a RNG instance even though
it might never be used.  Hence the new API there[1] passes the RNG only
when the user wants to sample.
Passing null to the (deprecated) constructor is a workaround to avoid the
unnecessary creation, as you note but it thus perpetuates bad usage; in
the case of a unit test, documenting proper use is IMO more important than
having a slight performance improvement.
Note that if/when we release [Statisitics], the classes from there (where the
API has been corrected) will be used in the test.

HTH,
Gilles

[1] 
http://commons.apache.org/proper/commons-math/apidocs/src-html/org/apache/commons/math4/distribution/NormalDistribution.html#line.240
We seem to have some crossed wires here. I am advocating using the CM3
distributions in the recommended way. In that library these constructors
are not deprecated. So this is a change for 'proper use' of that
library. You are advocating using the constructor as it appears in the
improved, but unreleased, CM4 distributions. This is to promote a better
designed library that accept only parameters related to the distribution
and also avoid a reader wondering why the CM3 constructor was deleted in
CM4. I can see the merit in this. However as advocates of commons I
believe we should employ the recommended way for the library we are
using irrespective of what will happen in the future.

Given the scenario that a user reads the test code to understand how the
test is checking the samplers do sample correctly from the distribution.
Can we change the test code to state that the reference distribution is
constructed using CM3. Here the distributions have a dual functionality
of computing the PMF/PDF and also performing sampling. The test does not
require the sampling functionality and so passes an 'unusedRng' to the
constructor. This should not lead any reader astray, promotes correct
usage of CM3 and part-way explains why CM4 does not have the constructor
(since the distributions now have a separate sampling functionality).

In a flip scenario if the original code is restored then a reader would
have an example of how to use CM3 for PMF/PDF computations that has a
performance hit (that they may not know about). It would be up to them
to discover this and then how to avoid this performance hit by reading
the documentation. I think this is worse, hence the change.

So the lesser of two evils: (1) show a user how to avoid poor
performance during usage of CM3 but prompt them to question the library
design; (2) show a user how to use a distribution with only the
parameters that control the distribution, but leave them with a
performance hit to figure out if they copy the code.

Thank you for making me clarify my position. If you still disagree then
we can restore the old code and put a note that the distributions in
future should be from commons statistics/CM4 (or create a Jira ticket to
record the intention). I'm happy with either. But in a reversion I would
add a note that for clarity the CM3 implementation here is used with
only the parameters that control the distribution. Other parameters that
affect performance are not employed.
I can understand what you say; it is a rehash of all the discussions
we had back then. ;-)
The (so-called "recommended") null argument was to be a workaround
for users that could not bear the performance hit.

There is no real need to revert the code as long as it won't entail
questioning that the design fix, as implemented in [Statisitics], is
fine, and that the test in [RNG] will use that in due time.

What would be great is to review [Statistics] and make a release
of it[1], so that we can drop the dependency towards CM v3.6.1 and
stop the indirect advocacy for an outdated version just because
CM 4.0 (with tens of bugfixes and enhancements) could never be
released.

[1] Even if the version number would "0.x" because its scope is much
     larger than its current contents.

Sorry this went off list in my last reply.

So the current code using the null RandomGenerator stays. I can update the test with a better explanation quoting the current Javadoc for the distribution, e.g.:

            // This test uses reference distributions from commons-math3 to compute the expected             // PMF. These distributions have a dual functionality to compute the PMF and perform             // sampling. In case no sampling is needed for the created distribution, it is advised             // to pass null as the random generator via the appropriate constructors to avoid the
            // additional initialisation overhead.
            org.apache.commons.math3.random.RandomGenerator unusedRng = null;

I will put it on my todo list to have a look at [Statistics].

Alex



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to