Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-10 Thread Daniel Watson
After some thought, this wrapper class might be better named something like
BigMeasurement (or just Measurement?). Significant figures and precision
are very closely tied to measurements, since the act of measuring is really
what causes the uncertainty to begin with.

I think a static method to count sigfigs is worth adding to commons-lang
math utils, so I'll propose it there. As for a wrapper class, I'm not so
sure. Measurement calculation seems closer to something like
commons-numbers, but if it's not quite close enough to fit then I'll just
retain it in my personal commons.

Thanks for the discussion!



Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
I believe the convention is to take the *least* precise term and apply that
precision (here "precision" != "sigfigs" - I've been using both terms to
mean sigfigs, but for these purposes precision is actually defined as how
small a fraction the measurement is able to convey - e.g. 0.01 is more
precise than 1.1, despite the latter having more sigfigs).

The results should be...

12345 + 10.0 = 12355
12345 + 10 = 12355
12345 + 1 = 12346
12345 + 1.0 = 12346
12345 + 1.00 = 12346

None of these will have decimal places because the left term was not
precise enough to have them. When adding/subtracting you can end up with
more significant figures in your result than you had in one of your terms;
you just can't end up with a more "precise" result than your least precise
term, e.g.

999.0 + 9.41 = 1008.4
4 sigfigs + 3 sigfigs = 5 sigfigs - It's perfectly fine that we ended up
with more here, as long as we didn't increase the "precision".

So in this case I think the correct logic is to add the two terms together
in the normal way, reduce the precision to that of the limiting term, and
then recalculate the number of significant figures on the result.

I believe that, conveniently, the BigDecimal class already tracks this as
scale(). So the information is available to determine the new precision. It
would just be a matter of retaining it within the wrapper class and
applying it when producing the final output string. I'd need to play around
with a few more examples, but I think that's the logic at a high level.
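
A minimal sketch of the addition logic described above, assuming each term is held as a BigDecimal parsed from text so that scale() reflects the decimal places written. The class name SigFigAddition and the choice of RoundingMode.HALF_UP are assumptions for illustration, not anything settled in this thread.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class SigFigAddition {
    /**
     * Adds two terms exactly, then rounds the sum to the scale of the
     * less precise term (the one with fewer decimal places).
     */
    static BigDecimal add(BigDecimal a, BigDecimal b) {
        int limitingScale = Math.min(a.scale(), b.scale());
        // Exact sum first, then reduce to the limiting precision.
        return a.add(b).setScale(limitingScale, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        BigDecimal sum = add(new BigDecimal("999.0"), new BigDecimal("9.41"));
        // precision() recounts the significant digits of the rounded result.
        System.out.println(sum + " has " + sum.precision() + " sigfigs");
        System.out.println(add(new BigDecimal("12345"), new BigDecimal("1.00")));
        System.out.println(add(new BigDecimal("12345"), new BigDecimal("0.0001")));
    }
}
```

Note that new BigDecimal("10") has scale 0, so this sketch treats an integer's trailing zeros as written digits, which matches the results listed above.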

Dan



Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Alex Herbert
On Wed, 9 Aug 2023 at 17:13, Daniel Watson  wrote:

> BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"))

Multiply is easy as you take the minimum significant figures. What
about addition?

12345 + 0.0001

Here the significant figures should remain at 5.

And for this:

12345 + 10.0
12345 + 10
12345 + 1
12345 + 1.0
12345 + 1.00

You have to track the overlap of significant digits somehow.

Alex

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
Ah I see what you were asking. Yes it is up to the human entering data to
understand that 10000 has exactly one sigfig according to standard
convention. If you need it to have more then you must write it in full
scientific notation. Obviously, if a specific precision is required due to
some flaw in the dataset then the user could manually override the detected
sigfig count. But the assumption of the parsing logic is that the input
abides by the standard convention, which is well defined. I don't see it as
being much different than any other Number class expecting the input to
abide by a specific format. Conventions for SigFig counting are well
defined. It just so happens that most people don't often need them (but the
same could be said for o.a.c.numbers.Complex).

As far as exact calculations, if the user did:

BigSigFig result = new BigSigFig("1.1").multiply(new BigDecimal("2.54"))

I would expect the BigSigFig class to understand that BigDecimal has no
sigfig limit, and to retain its current minimum of 2. It would only
apply a new minimum in the case of operating against another BigSigFig...

BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"))

The result of that should be a BigSigFig with an internal value of exactly
2.2 but would output as "2" to respect the new sigfig count. I think
something like that should be possible. In the end this is more of a
parsing / formatting exercise. The wrinkle is the tracking aspect, where we
need to dynamically reduce the sigfigs based on other operations. That's
where a wrapper class I think comes in handy.
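
A rough sketch of how such a wrapper could behave for multiplication. The BigSigFig class here is hypothetical, as in the examples above. It assumes the sigfig count can be taken from BigDecimal.precision() at construction, which only agrees with the sigfig convention for inputs without ambiguous trailing zeros (e.g. "1.1" or "2", not "11000"). MathContext(int) rounds HALF_UP.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class BigSigFig {
    private final BigDecimal value; // exact internal value, never rounded
    private final int sigFigs;      // significant figures to report on output

    BigSigFig(String text) {
        // Assumption: precision() stands in for a real sigfig parser.
        this(new BigDecimal(text), new BigDecimal(text).precision());
    }

    private BigSigFig(BigDecimal value, int sigFigs) {
        this.value = value;
        this.sigFigs = sigFigs;
    }

    /** Multiplying two measurements keeps the smaller sigfig count. */
    BigSigFig multiply(BigSigFig other) {
        return new BigSigFig(value.multiply(other.value),
                             Math.min(sigFigs, other.sigFigs));
    }

    /** A plain BigDecimal is treated as exact and does not limit sigfigs. */
    BigSigFig multiply(BigDecimal exact) {
        return new BigSigFig(value.multiply(exact), sigFigs);
    }

    /** Rounding is applied only when formatting the final output. */
    @Override
    public String toString() {
        return value.round(new MathContext(sigFigs)).toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(new BigSigFig("1.1").multiply(new BigSigFig("2")));
        System.out.println(new BigSigFig("1.1").multiply(new BigDecimal("2.54")));
    }
}
```

Keeping the exact value internally and rounding only in toString() means intermediate results do not accumulate rounding error.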

Dan




Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Alex Herbert
On Wed, 9 Aug 2023 at 15:43, Daniel Watson  wrote:
>
> Hope that answers more questions than it creates!

It does not address the issue of the last significant zero, e.g.:

10000 (4 sf)
10000 (3 sf)
10000 (2 sf)

One way to solve this with standard parsing would be to use scientific notation:

1.000e4
1.00e4
1.0e4

Note that for the example of inch to cm conversions the value 2.54
cm/inch is exact. This leads to the issue that there should be a way
to exclude some input from limiting the detection of the lowest
significant figure (i.e. mark numbers as exact). This puts some
responsibility on the provider of the data to follow a format; and
some on the parser to know what fields to analyse.
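
As an illustration of the scientific notation point, the JDK's BigDecimal string constructor already keeps the written digits of the significand, so precision() differs between these inputs. The class and method names below are made up for the example.

```java
import java.math.BigDecimal;

final class ScientificNotationDemo {
    /** Number of digits BigDecimal stores for the given decimal text. */
    static int storedDigits(String text) {
        return new BigDecimal(text).precision();
    }

    public static void main(String[] args) {
        for (String text : new String[] {"1.000e4", "1.00e4", "1.0e4", "10000"}) {
            // All four inputs have the same numeric value but different precision.
            System.out.println(text + " -> " + storedDigits(text) + " digits");
        }
    }
}
```

The plain "10000" reports 5 digits here, so BigDecimal alone cannot express "10000 with 2 sf" without the scientific form.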

Alex




Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
Before I answer your questions - I'll say that looking at the commons-math
codebase it is apparent that it's focused on specific functional
computation, rather than util-like features. So I agree this probably
doesn't fit well there. I honestly did not know commons-numbers existed.
I'll check there and then either move this discussion there or commons-lang.

(I'll respond to your questions anyway just in case this ever comes up
again or anyone is curious)

The use case is reading of text data (e.g. CSV) where significant figures
are implied according to the standard rules. Data that is already typed to
a standard java Number would have no inherent significant figure tracking
and it cannot be reliably determined (for the reasons you mentioned). If
the data is represented in that fashion then sigfigs must be
provided/applied separately.

The significant figures of the input data are inherently "verified" because
scientific calculations of this nature are provided by humans (obviously this
can't account for some forms of human error), and humans will know
the precision of their apparatus and can communicate it using the standard
rules of sigfigs. If that's not the case then the user should not be using
this API. Because the input data is verified, the output data is also
"verified" as long as this logic is correct.

I don't believe there is a need for repeating special characters when a
number of significant figures is known. In the case of infinite precision,
the BigDecimal class already handles that. When significant figures are
known then something like 1000/3 can and should be reported as 300 (or 3e2 in
scientific notation) because there is only a single significant figure in
that calculation. A repeating 3 would imply precision that does not exist.
(Admittedly I need to double check this. I know that for pure mathematical
values e.g. conversion from feet to inches, the conversion has infinite
precision. However as long as the initial measurement has a precision then
the output will also necessarily have that same precision). Intermediate
calculations can use infinite precision, which could be handled internally
via BigDecimal. But final results should be reported with proper sigfig
rules applied.

You are correct that "1" would not be the same as "1.000" and for clinical
/ scientific data this is known to be important. "1" implies 1 sigfig,
"1.000" implies 4. This is why the data most likely will be represented as
text.

Determining if the String is a number is simpler in this case, I think.
Assuming decimal base (and potentially scientific notation) there is a
limited set of characters and syntax. isCreatable() attempts to handle
different bases as well as type qualifiers whereas this logic would be
restricted to decimal base and syntax. (Theoretically I suppose you could
use a different base, but scientific calculations are rarely, if ever,
carried out in anything other than decimal. It seems natural that they would
be out of scope.)

As for a wrapped class, my initial thought (though I haven't worked out the
details) would be to extend BigDecimal and use its arithmetic logic.
Relevant methods would be overridden to ensure the sigfig subclass is
returned. There may be issues with that, I haven't fleshed it out.

Ultimately the initial goal would be to simply count the number of sigfigs
through some text util/parse method. The fact that sigfigs are normally
conveyed via textual representation means that many of the issues you might
encounter trying to derive them from pure numbers don't apply.
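
A sketch of what such a text-based counting method might look like, restricted to decimal syntax with an optional exponent. The class name, the regular expression, and the choice to report 1 for a zero value are assumptions made for the example, not an existing Commons API.

```java
import java.util.regex.Pattern;

final class SigFigCounter {
    // Optional sign, digits with an optional decimal point, optional exponent.
    private static final Pattern DECIMAL =
        Pattern.compile("[+-]?(\\d+\\.?\\d*|\\.\\d+)([eE][+-]?\\d+)?");

    /** Counts significant figures using the standard textual convention. */
    static int countSigFigs(String text) {
        if (!DECIMAL.matcher(text).matches()) {
            throw new NumberFormatException("Not a decimal number: " + text);
        }
        // Only the significand carries sigfigs; drop the exponent and sign.
        String mantissa = text.split("[eE]")[0].replaceFirst("^[+-]", "");
        boolean hasPoint = mantissa.indexOf('.') >= 0;
        // Leading zeros are never significant.
        String digits = mantissa.replace(".", "").replaceFirst("^0+", "");
        if (!hasPoint) {
            // Without a decimal point, trailing zeros are not significant.
            digits = digits.replaceFirst("0+$", "");
        }
        // Arbitrary choice for this sketch: a zero value reports 1.
        return digits.isEmpty() ? 1 : digits.length();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"11000", "11000.", ".11000", "11000.0"}) {
            System.out.println(s + " has " + countSigFigs(s));
        }
    }
}
```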

Hope that answers more questions than it creates!

Dan


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Alex Herbert
Hi,

On Wed, 9 Aug 2023 at 12:27, Daniel Watson  wrote:

> This feature is necessary when working with scientific/clinical data which
> was reported with significant figures in mind, and for which calculation
> results must respect the sigfig count. As far as I could tell there is no
> Number implementation which correctly respects this. e.g.
>
> "11000" has 2 significant figures,
> "11000." has 5
> ".11000" has 5
> "11000.0" has 6

This functionality is not in Commons AFAIK. Is the counting to accept
a String input?

Q. What is the use case that you would read data in text format and
have to compute the significant figures? Or are you reading data in
numeric format and computing the decimal significant figures of the
base-2 data representation? Note: Differences between base-10 and
base-2 representations can lead to an implementation that satisfies
one use case and not others due to rounding conversions (see
NUMBERS-199 [1]). I would advise against this and only support text
input when referring to decimal significant figures.
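
To illustrate why text input is the safer source, here is a small sketch showing that a double round trip discards the written trailing zeros, while BigDecimal parsed from text keeps them. The class and method names are made up for the example.

```java
import java.math.BigDecimal;

final class BinaryVersusText {
    /** Parses the text as a double and formats it back with the JDK. */
    static String viaDouble(String text) {
        return Double.toString(Double.parseDouble(text));
    }

    /** Digits retained when the same text is parsed as a BigDecimal. */
    static int textPrecision(String text) {
        return new BigDecimal(text).precision();
    }

    public static void main(String[] args) {
        // Both inputs become the same double, so the written digits are lost.
        System.out.println(viaDouble("1") + " vs " + viaDouble("1.000"));
        System.out.println(textPrecision("1") + " vs " + textPrecision("1.000"));
        // The exact base-2 value of the double 0.1 is not the decimal 0.1.
        System.out.println(new BigDecimal(0.1));
    }
}
```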

I presume you have input data of unknown precision, are computing
something with it, then outputting a result and wish to use the
minimum significant figures from the input data. If the output
significant figures are critical then this is a case where I would
expect the reported significant figures to be manually verified; thus
automation only partly increases efficiency by providing a first pass.

Note: I do not think there is an easy way to handle trailing zeros
when not all the zeros are significant [2]. This is in part due to the
different formats used for this such as an overline/underline on the
last significant digit. I do not think we wish to support parsing of
non-ASCII characters and multiple options to identify significant
zeros. Thoughts on this?

Secondly, you are reliant on the text input being correctly formatted.
For example if I include the number 1, it would have to be represented
as e.g. 1.000 to not limit the significant figures of the rest of the
input data.

Thirdly, if accepting string input then you have the issue of first
identifying if the string is a number. This is non-trivial. For
example there is a function to do it in o.a.c.lang3.math in
NumberUtils.isCreatable.

Finding a home in commons will elicit different opinions from the
various contributors. The math and numbers projects are more related
to numeric computations. They output results in a canonical format,
typically the JDK toString representation of the number(s).
Conversions to different formats are not in scope, and parsing is
typically from the canonical format using JDK parse methods. There is
a o.a.c.text.numbers package in the text component that currently has
formatting for floating-point numbers to various text representations.
But parsing of Strings is not supported there. And lang has the
NumberUtils.

As for a class that tracks significant figures through a computation,
that will require some consideration. Do we create the class to wrap
Number and track significant digits of a configurable base? This would
allow BigDecimal with base 10, or int and double with base 2. Since
Number does not specify arithmetic, this has to be captured
somehow. It may be able to use the Numbers implementation of Field [3]
in o.a.c.numbers.field. Or simplify to only using a BigDecimal
wrapper.

In summary this may be simpler with an ideal use case. For example the
input is text, it must be parsable to a BigDecimal and the number of
significant figures is identified using the text input. Support of
significant zeros is limited to the full length of the trailing zeros,
or the first zero if no decimal point is provided. This could be
handled with a parse method that returns both the BigDecimal and the
significant figures.
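
One possible shape for such a parse method, as a sketch only. MeasurementParser and Parsed are hypothetical names. Validation is delegated to the BigDecimal string constructor. Treating trailing zeros as not significant when no decimal point is written is an assumption, taken from the "11000" example quoted at the top of this message.

```java
import java.math.BigDecimal;

final class MeasurementParser {
    /** Simple holder for the parsed value and its significant figures. */
    static final class Parsed {
        final BigDecimal value;
        final int sigFigs;

        Parsed(BigDecimal value, int sigFigs) {
            this.value = value;
            this.sigFigs = sigFigs;
        }
    }

    /** Throws NumberFormatException if the text is not parsable to a BigDecimal. */
    static Parsed parse(String text) {
        String trimmed = text.trim();
        BigDecimal value = new BigDecimal(trimmed);
        boolean hasPoint = trimmed.indexOf('.') >= 0;
        // With a decimal point every written digit after leading zeros counts;
        // without one, trailing zeros are dropped before counting.
        int sigFigs = hasPoint
            ? value.precision()
            : value.stripTrailingZeros().precision();
        return new Parsed(value, sigFigs);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"11000", "11000.", ".11000", "11000.0"}) {
            Parsed p = parse(s);
            System.out.println(s + " -> " + p.value.toPlainString()
                + " with " + p.sigFigs + " sigfigs");
        }
    }
}
```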

Alex

[1] https://issues.apache.org/jira/browse/NUMBERS-199
[2] https://en.wikipedia.org/wiki/Significant_figures#Ways_to_denote_significant_figures_in_an_integer_with_trailing_zeros
[3] http://mathworld.wolfram.com/Field.html
