[Numbers] Benchmark GSoC project (Was: Google Summer of Code)

Gilles Fri, 13 Apr 2018 17:19:49 -0700

Hi.

[As the mailing list is shared by many projects, don't forget
to prefix posts with a component's "identifier".]


On Tue, 10 Apr 2018 19:00:14 -0400, Brian Driscoll wrote:

Greg,

I'm sending this email to help explain Gilles's response to your GSoC
project and what you should send in response.

Gilles:  There is no structure for benchmarks in Commons Math (there
are home-made codes used there for "FastMath", which have shown that
"FastMath" is not always fast...).  Here the purpose is to use JMH.
[There are examples in "Commons RNG".]

Explanation: In your GSoC project you said that you would use
commons-math as a guideline to create benchmarks for commons-numbers.
Gilles is saying that the benchmarks in commons-math are not a good
place to start, because those benchmarks don't use a test framework to
run the benchmarks.  Your GSoC proposal is to do the work that's
documented in the NUMBERS-70 Jira ticket.  That ticket indicates
that the JMH test framework (openjdk.java.net/projects/code-tools/jmh)
should be used.  What Gilles is saying is to use commons-rng as the
example starting point for creating the commons-numbers benchmarks.
This is because commons-rng has benchmarks which are done in jmh.

I checked out commons-rng.  It's a library to generate random
numbers, which is a very important thing for encryption.  You can find
it at commons.apache.org/proper/commons-rng.  The link "Source
Repository (current)" is an easy, rudimentary way to look at the
source.

commons-rng-examples/examples-jmh/src/main/java/org/apache/commons/rng/examples/jmh
contains the code which benchmarks commons-rng using jmh to run the
tests.

Your response:  Thanks for your insights on the benchmarks.  I'll
change my project to use the benchmarks in commons-rng as the template
for commons-numbers benchmarks.  I found jmh benchmarks in
commons-rng/examples-jmh/src/main/java/org/apache/commons/rng/examples/jmh.
I'm assuming those are the jmh benchmarks you were talking about.

Your project doc:  Update the Background section of your doc to
indicate the benchmarks in commons-rng will be used as the template
for the benchmarks for commons-numbers.  At the end of the doc add a
section titled CHANGE LOG.  Below that put "04/10 - Changed Background
section to say that benchmarks will be based on commons-rng rather
than commons-math."


I did not mean that the actual benchmarking code should be
modeled after what exists in "Commons RNG": there, the number
of core methods is relatively small and the purpose was to
compare their relative performances.

Here, the reference (to compare with) will rather be similar
functionality in other languages (e.g. Python or C++).
Given the number of methods, we should perhaps explore how to
generate benchmark codes.
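
One way to explore generating benchmark codes is a small template
script.  The following is only a sketch under assumptions: the class
name "MathBenchmark" is made up, and java.lang.Math methods (sqrt,
cbrt) stand in for the library methods that would actually be
benchmarked.  The JMH annotations used in the template (@Benchmark,
@State) are the standard ones from org.openjdk.jmh.annotations.

```python
# Sketch: generate the Java source of a JMH benchmark class from a list
# of method names.  Assumption: every listed method is a static method
# of "owner" that takes one double and returns a double.

METHOD_TEMPLATE = """\
    @Benchmark
    public double {name}() {{
        return {owner}.{name}(x);
    }}
"""

CLASS_TEMPLATE = """\
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class {class_name} {{
    // Non-final field, so the JIT cannot treat the input as a constant.
    private double x = 1.5;

{methods}}}
"""


def generate_benchmark(class_name, owner, method_names):
    """Return Java source with one @Benchmark method per name."""
    methods = "\n".join(
        METHOD_TEMPLATE.format(name=name, owner=owner)
        for name in method_names
    )
    return CLASS_TEMPLATE.format(class_name=class_name, methods=methods)


if __name__ == "__main__":
    # "MathBenchmark" is a hypothetical name chosen for illustration.
    print(generate_benchmark("MathBenchmark", "Math", ["sqrt", "cbrt"]))
```

Each generated method returns its result so that JMH consumes it and
the call is not optimized away.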

Gilles:  I'd suggest "apt" for the documentation format since it is
somewhat easier than "xdoc" for tables (as the likely output of the
benchmark project).

Explanation:  "xdoc" and "apt" are different documentation formats
for Doxia.  See maven.apache.org/doxia/index.html for more info about
Doxia.  Doxia is a tool for generating web documentation.  The way it
works is you write documentation in a format that Doxia understands,
then run Doxia to process those files to generate web pages to display
the documentation.  Doxia supports a bunch of different formats;
"xdoc" and "apt" are two of them.  See
maven.apache.org/doxia/references/index.html for a complete list of
the formats supported.  From what I can tell, the "apt" format seems
simpler and easier to use, while "xdoc" is a richer but more
complicated format.


Note that Doxia is part of the Apache Maven project.  Maven is a tool
to build (compile, etc) a project from its source code and dependent
libraries.  Apache uses Maven to build many of their open source
projects.  For projects that have documentation in a Doxia format,
Maven runs the Doxia tool on the documentation files to generate the
finished documentation files that can be viewed via the web.

Your response:   I don't really know either the xdoc or apt formats
well.  Apt seems simpler & easier to use than xdoc.  xdoc looks like
it has more features but would be harder to use.  So using apt seems
like it would be easier, as long as it supports all the documentation
features that are needed.  I was originally thinking the documentation
would be in xdoc because commons-numbers/src/site/xdoc/userguide
contains the doc from commons-math and is in xdoc format.  I thought
this was done because people wanted the commons-numbers doc to use
xdoc and be similar to the commons-math doc.  Do you have any good
examples of apt doc that I could use as a starting point?


I don't know whether it's a good example, but the "Commons
RNG" userguide is written in APT format.
A section of "Commons Math" is also written in APT.
Actually, any format supported by Maven should be fine, if
you have another preference, since they are combined into
the generated HTML documents.


Gilles: Don't hesitate to open JIRA reports for each task that may
need interaction on the details.

Explanation:  Jira is the issue tracking system used by the Apache
organization.  It's a very common system and used by many
organizations.  Ullink uses it for the same thing.  Jira
tickets/issues are created for new features that need to be added,
bugs that need to be fixed, etc.  People put in the details of what
they are requesting.  Using Jira, people can track the status of the
issue, see what's going on with it, what release it's fixed in, etc.
It's quite common that there is not enough information in the ticket
to implement the request.  It's common for people to ask questions to
clarify the details of things.  They can either be asked on the
existing ticket, which is NUMBERS-70 in your case, or on a new ticket
linked to the original ticket.

Your response:   Okay.  I'm just getting familiar with Jira.  I'll
start with updating NUMBERS-70 and adding a comment with a link to my
GSoC project document.  When I need to get details worked out or have
questions, how should I do it in Jira?  Should I put them as comments
on NUMBERS-70?  Or should I create a new Jira issue linked to
NUMBERS-70 and if so what type, i.e. Task?


Yes; creating sub-tasks of the original issue would be fine.

Gilles:  At first sight, script(s) to convert from JMH's output to
"apt" would be welcome.

Explanation:  He's suggesting that a simple program be created which
reads the jmh benchmark test output and creates a doc in apt format
with the test results.  Then those results could be displayed on the
commons-numbers web site.  A simple program like this would typically
be written in a scripting language, like Bourne Shell (which I know),
which is the command line language available on most Linux machines.
Python is another example of a scripting language, but it is more
complicated (I don't know it).  Perl is another scripting language
(which I know).  Typically scripting language programs don't need to
be compiled.  You run them by passing them to the interpreter for the
language, which parses and executes the commands in your program file.
Languages like Java, Haskell, etc. need to be compiled before they can
be run.
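
To give an idea of how small such a script could be, here is a rough
sketch in Python.  It assumes the benchmarks were run with JMH's JSON
result format (for example with the "-rf json" option) and that each
entry has "benchmark", "mode" and "primaryMetric" fields.  The sample
data and the benchmark name "org.example.FractionBench.add" are made
up, and the table layout follows my understanding of the APT table
syntax, so it should be checked against the Doxia APT reference.

```python
import json

# Column justification markers, as I understand the APT table syntax:
# '+' left-aligns the column that precedes it, ':' right-aligns it.
ALIGN = ["+", "+", ":", ":", "+"]


def jmh_json_to_apt(json_text):
    """Render JMH JSON results as an APT table, one row per benchmark."""
    rows = [["Benchmark", "Mode", "Score", "Error", "Unit"]]
    for result in json.loads(json_text):
        metric = result["primaryMetric"]
        rows.append([
            result["benchmark"],
            result["mode"],
            "%.3f" % float(metric["score"]),
            "%.3f" % float(metric["scoreError"]),
            metric["scoreUnit"],
        ])
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(ALIGN))]
    rule = "*" + "".join("-" * (w + 2) + a for w, a in zip(widths, ALIGN))
    lines = [rule]
    for row in rows:
        cells = [" " + cell.ljust(w) + " " for cell, w in zip(row, widths)]
        lines.append("|" + "|".join(cells) + "|")
        lines.append(rule)
    return "\n".join(lines)


# Made-up sample in the shape of a JMH JSON result file.
SAMPLE = """[
  {"benchmark": "org.example.FractionBench.add", "mode": "avgt",
   "primaryMetric": {"score": 12.5, "scoreError": 0.25,
                     "scoreUnit": "ns/op"}}
]"""

if __name__ == "__main__":
    print(jmh_json_to_apt(SAMPLE))
```

In real use the JSON text would be read from the result file written
by JMH, and the printed table pasted (or written) into an ".apt" file
under the site sources.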

Your response:  I've got experience with Java and Haskell, but don't
have much experience with scripting languages.  What scripting
language would you suggest for something like this, i.e. Bourne Shell,
Perl, Python?  I'll give it a try.  I'd have to keep it really simple.
I'd do it after I finish the benchmarks.  It would be one of the last
things I would do.  But I may not have enough time to complete it, if
learning the scripting language and writing the script take me a
while.


JMH can generate several output formats.
The idea is to explore which is more suited to give a clear
picture (a table, I guess) of the benchmark results wrt some
expectation (to be determined, e.g. by running similar tests
on another language/platform).
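
As an illustration of comparing results with an expectation, the
sketch below reads JMH's CSV result format and computes, for each
benchmark, the ratio of its score to a reference value.  The column
names ("Benchmark", "Score") are what I recall JMH writing in its CSV
header and should be verified; the reference number and the benchmark
name are invented for the example.

```python
import csv
import io


def compare_with_reference(csv_text, reference):
    """Return (name, score, reference, ratio) for each known benchmark.

    "reference" maps a benchmark name to the expected score, e.g. a
    timing measured for similar functionality on another platform.
    Benchmarks without a reference value are skipped.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        name = record["Benchmark"]
        if name not in reference:
            continue
        score = float(record["Score"])
        rows.append((name, score, reference[name], score / reference[name]))
    return rows


# Made-up sample in the shape of a JMH CSV result file.
SAMPLE_CSV = (
    '"Benchmark","Mode","Threads","Samples","Score",'
    '"Score Error (99.9%)","Unit"\n'
    '"org.example.FractionBench.add","avgt",1,5,12.500000,0.250000,"ns/op"\n'
)

# Hypothetical reference timing, in the same unit as the JMH score.
REFERENCE = {"org.example.FractionBench.add": 10.0}

if __name__ == "__main__":
    for name, score, ref, ratio in compare_with_reference(SAMPLE_CSV, REFERENCE):
        print("%s: %.3f vs %.3f (ratio %.2f)" % (name, score, ref, ratio))
```

A ratio above 1 would mean the Java code is slower than the reference
for an "average time" benchmark; the ratios could then be added as an
extra column in the generated table.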

Regards,
Gilles

Hi.

On Fri, 6 Apr 2018 21:09:56 -0400, Greg Driscoll wrote:
Hello all,
I'm a computer science student that's really interested in doing a
Google Summer of Code project working on the commons-numbers User
Guide and benchmarks.  In Jira it's
https://issues.apache.org/jira/browse/NUMBERS-70.
Thanks for your interest, and welcome.
The link to my proposal is here:
https://docs.google.com/document/d/1i6yy2cW0x9MYbDOuLPdZrV0XA0eKO5q5N0SNg99mJfA/edit?usp=sharing
Looks good.
A few remarks:
 * There is no structure for benchmarks in "Commons Math" (there are
   home-made codes used there for "FastMath", which have shown that
   "FastMath" is not always fast...).
   Here the purpose is to use JMH. [There are examples in "Commons
   RNG".]
 * I'd suggest "apt" for the documentation format since it is
   somewhat easier than "xdoc" for tables (as the likely output
   of the benchmark project).
 * Don't hesitate to open JIRA reports for each task that may need
   interaction on the details.
 * At first sight, script(s) to convert from JMH's output to "apt"
   would be welcome.
Please let me know what you think about it.  You can reply to this
mailing list, comment on the doc, or email me directly.
Let's keep discussion on this list so that everyone interested
can participate.

Best,
Gilles
Thanks.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
