Re: Re: [LANG] Support single quotes in DurationFormatUtils methods' formats

2024-05-31 Thread Daniel Watson
Honestly I think not supporting empty literals is just as big a limitation
as not supporting single quotes, so IMO we'd just be trading one limitation
for another. i.e. if someone were to need empty literals, the things they
would have to do to use them are the same things they'd have to do to
support single quotes, and I can imagine use cases for both.
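
For reference, here is how the java.text.SimpleDateFormat convention that comes up further down this thread treats adjacent single quotes. This is only a JDK-level sketch: it exercises SimpleDateFormat, not DurationFormatUtils, and the class and method names (QuoteConventionDemo, format) are made up for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

final class QuoteConventionDemo {

    // Formats a fixed instant; the patterns below contain only literals,
    // so the date itself never shows up in the output.
    static String format(String pattern) {
        return new SimpleDateFormat(pattern).format(new Date(0L));
    }

    public static void main(String[] args) {
        // Doubled quote inside a literal is an escaped quote, not close + open.
        System.out.println(format("'hello''world'"));  // hello'world
        // Two separate literals need something between them.
        System.out.println(format("'hello' 'world'")); // hello world
        // Doubled quote outside a literal is a lone quote, not an empty literal.
        System.out.println(format("''"));              // '
    }
}
```

Under this convention there is no way to write an empty literal at all, which is the trade-off being weighed above.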

On Thu, May 30, 2024 at 2:48 PM Gary D. Gregory  wrote:

> I'm OK with Sebb's solution [1]
>
> Any further thoughts here?
>
> Gary
> [1] https://github.com/apache/commons-lang/pull/1227
>
> On 2024/05/29 13:37:40 Mike Drob wrote:
> > On Wed, May 29, 2024 at 8:17 AM Gary Gregory 
> wrote:
> >
> > > (Sorry for the top post, phone)
> > >
> > > A case I can imagine an empty '' occurring is when the format string
> itself
> > > is built programmatically for example a '%s' or using string
> concatenate of
> > > a variable that holds a string where that string can be empty or an
> "s" to
> > > mark a plural or a quote for example.
> > >
> > > "He said '" + bla + "' to me and I waited mm minutes!"
> > >
> > > Far fetched? Sure 
> > >
> > > Gary
> > >
> > > On Wed, May 29, 2024, 8:43 AM sebb  wrote:
> > >
> > > > On Sun, 26 May 2024 at 23:37, sebb  wrote:
> > > > >
> > > > > On Sun, 26 May 2024 at 08:25, Laertes Moustakas <
> lmous...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > Hello Gary,
> > > > > >
> > > > > > Thank you for your response. Some of the new assertions indeed
> fail
> > > > when interpreting the duplicate single quote as an escaped quote
> instead
> > > of
> > > > a closing and opening quote. In particular, "y' ''years' M 'months'"
> is
> > > > interpreted as "4 'years 0 months" while the expected text lacks the
> > > quote
> > > > before "years". Same for "hello''world": it's interpreted as
> > > "hello'world"
> > > > instead of "helloworld".
> > > > >
> > > > > Please see https://github.com/apache/commons-lang/pull/1227 for an
> > > > > alternate solution.
> > > > > This does not cause issues with any existing tests.
> > > > >
> > > > > However, it does change the behaviour of a duplicate single quote
> > > > > which is found outside an existing opening and closing quote.
> > > > > Instead of the empty string, it generates a lone single quote.
> > > > >
> > > > > Whilst this is a change in behaviour, it seems to me that there
> should
> > > > > be no need for anyone to use a format that uses a pair of adjacent
> > > > > single quotes to generate an empty string in the output, so it
> seems
> > > > > unlikely that this will cause any breakages.
> > > >
> > > > I've since realised that this argument could also apply to the
> > > > existing test cases: is there really a use-case for adjacent constant
> > > > strings?
> > > > Why would anyone want to split a constant string in this way?
> >
> >
> > This is similar to allowable string concatenation in Python, which static
> > analysis flags as a warning and probable bug.
> >
> https://pylint.pycqa.org/en/latest/user_guide/messages/warning/implicit-str-concat.html
> >
> >
> > > > AFAICT it just makes it harder to read the string.
> > > >
> > > > i.e. do the test cases represent a real-world use case?
> > > >
> > > > > > I understand this brings forth a breaking change in formats that
> use
> > > > two single quotes to close and open new literals (or even add an
> empty
> > > > string), but this is consistent with what java.text.SimpleDateFormat
> > > > expects. And I believe that most developers would favor consistency
> > > between
> > > > format strings in equivalent classes. Thus, I think the cases
> described
> > > > above where the two single quotes terminate and begin a literal
> should no
> > > > longer be supported.
> > > >
> > > > I agree.
> > > >
> > > > My alternate solution avoids breaking the test cases, but the
> downside
> > > > is that the syntax is not in agreement with
> > > > java.text.SimpleDateFormat, and is more verbose where a single-quote
> > > > is to be inserted in an existing constant string (it requires 4
> single
> > > > quotes, rather than 2).
> > > >
> > > > > >
> > > > > > Should this change go forward, I expect it to be part of a major
> > > > release (e.g. version 4.0.0, 5.0.0, etc.) instead of 3.x.x, as it
> does
> > > > contain a breaking change.
> > > > > >
> > > > > > If you have more questions, please don't hesitate to contact me.
> > > > > >
> > > > > > Best regards,
> > > > > > Laertes
> > > > > >
> > > > > > On 2024/05/25 13:47:23 Gary Gregory wrote:
> > > > > > > Hello Laertes,
> > > > > > >
> > > > > > > Thank you for your interest in improving Apache Commons Lang
> :-)
> > > > > > >
> > > > > > > Do you foresee any compatibility issues for existing call
> sites and
> > > > > > > format strings?
> > > > > > >
> > > > > > > For example, can you make your use cases work and still
> support:
> > > > > > >
> > > > > > >
> > > >
> > >
> 

Re: commons.text.CaseUtils

2024-04-11 Thread Daniel Watson
Stephan,

On that note, I worked on the initial implementation for PR 450 and had to
pause for work, but I am planning to resume soon. Please do contribute
and/or provide any thoughts, as it's definitely a big change from the
current CaseUtils.

Regards,
Dan
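
For anyone following along, the "normalize the input first" behaviour described in the quoted message below can be approximated with the JDK alone. This is a rough sketch under my own assumptions, not the code from either PR: the class and method names are hypothetical, accents are removed via NFD decomposition, apostrophes (straight and U+2019) are dropped, and anything that is not an ASCII letter or digit is treated as a word separator.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class NormalizeFirstSketch {

    // Title-cases each word and joins the words with underscores.
    static String toTitleSnake(String input) {
        // Decompose accented characters, then drop the combining marks.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        String unaccented = decomposed.replaceAll("\\p{M}+", "");
        // Drop apostrophes so "That's" becomes "Thats" rather than two words.
        String noApostrophes = unaccented.replaceAll("['\u2019]", "");
        List<String> words = new ArrayList<>();
        for (String w : noApostrophes.split("[^A-Za-z0-9]+")) {
            if (w.isEmpty()) {
                continue; // leading separators produce one empty element
            }
            words.add(Character.toUpperCase(w.charAt(0))
                    + w.substring(1).toLowerCase(Locale.ROOT));
        }
        return String.join("_", words);
    }

    // Same as toTitleSnake, but with the very first character lower-cased.
    static String toCamelSnake(String input) {
        String title = toTitleSnake(input);
        return title.isEmpty()
                ? title
                : Character.toLowerCase(title.charAt(0)) + title.substring(1);
    }

    public static void main(String[] args) {
        // "The café’s piñata gave me déjà vu." written with escapes
        String s = "The caf\u00e9\u2019s pi\u00f1ata gave me d\u00e9j\u00e0 vu.";
        System.out.println(toCamelSnake(s)); // the_Cafes_Pinata_Gave_Me_Deja_Vu
        System.out.println(toTitleSnake("That's good!")); // Thats_Good
    }
}
```

Stripping apostrophes before splitting is only one possible policy; whether that is the right rule for the library is exactly the kind of thing the PR discussion should settle.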

On Wed, Apr 10, 2024, 9:20 PM Stephan Peters
 wrote:

> Gary, thank you for your response.
>
> I initiated the pull request (#528) and already received some very
> constructive feedback from user mbenson.
> I am modifying the code to contain fewer methods that may be externally
> modified by a user, if something as simple as .toLowerCase() is required.
>
> I also noticed some recent discussion of this which you commented on in
> pull 450  Cases API + 4 implementations (Pascal, Camel, Kebab, Snake) #450
>
> When I am done with the edits and new tests and pushed them to my fork, I
> may join this conversation #450.
>
> My Jira account has been approved (after an initial disapproval.) I haven't
> looked at it yet, I will look for similar topics there.
>
> I also uncovered an issue with my code when I devised some tests I
> specifically designed to break it if possible, and I need to fix this.
>
> assertThat(CaseUtils.toTitleCase(" ' \u2019 Titl'e Case \u2019 '
> ")).isEqualTo("Title_Case");  // todo fix this failure.
>
> org.opentest4j.AssertionFailedError:
> expected: "Title_Case"
>  but was: "Title_Case_’_'"
> Expected :"Title_Case"
> Actual   :"Title_Case_’_'"
>
> This is because of the way I handle apostrophes so "That's good!" will
> return "Thats_Good"
>
> Again, thank you for your response.
>
> Stephan Peters
>
>
> On Tue, Apr 9, 2024 at 5:56 PM Stephan Peters <
> stephan.pet...@csuglobal.edu>
> wrote:
>
> > OK, I will initiate a PR.
> > Some of the added methods will be more useful than others.
> > The PR will come from speters33w.
> >
> > Thank you,
> > Stephan Peters
> >
> > On Tue, Apr 9, 2024 at 5:31 PM Gary Gregory 
> > wrote:
> >
> >> Hello Stephan,
> >>
> >> The best way to see what you are proposing is a PR, it's a bit painful
> to
> >> see differences otherwise, at least for me.
> >>
> >> That said anything new should solve a real world use case, not merely
> >> something that might be useful (or not) 
> >>
> >> I think seeing tests in a PR will help clarify what it is you are
> >> proposing
> >> that the current code doesn't do.
> >>
> >> See also https://github.com/apache/commons-text/pull/450
> >>
> >> TY,
> >> Gary
> >>
> >> On Tue, Apr 9, 2024, 4:37 PM Stephan Peters
> >>  wrote:
> >>
> >> > I added several methods to the org.apache.commons.text.CaseUtils class I
> >> think
> >> > would be very useful, for example to use for normalized naming
> >> conventions
> >> > for file paths, file names, URLs, etc.
> >> >
> >> > I'm planning on initiating a pull request.
> >> >
> >> > I would like to discuss it here.
> >> >
> >> > I've posted it in a fork here:
> >> >
> >> >
> >>
> https://github.com/speters33w/commons-text/blob/master/src/main/java/org/apache/commons/text/CaseUtils.java
> >> >
> >> > and written new tests for all the methods that pass here:
> >> >
> >> >
> >>
> https://github.com/speters33w/commons-text/blob/master/src/test/java/org/apache/commons/text/CaseUtilsTest.java
> >> >
> >> > There is an example of the method return values at the top of the
> >> revised
> >> > CaseUtils.java.
> >> >
> >> > The methods have a little different behavior than the existing
> >> > toCamelCase(String, boolean, char[]) (which I left intact) in that
> they
> >> > normalize the input first before processing, so toCamelSnakeCase("The
> >> > café’s piñata gave me déjà vu.") will return
> >> > "the_Cafes_Pinata_Gave_Me_Deja_Vu"
> >> >
> >> > The main driver engine is in the toTitleCase() method and the rest of
> >> the
> >> > methods piggyback on that engine and perform minor changes to the
> return
> >> > value.
> >> >
> >> > If anyone feels like taking a look, I'd appreciate any feedback.
> >> >
> >> > Thank you.
> >> >
> >> > Stephan Peters
> >> >
> >>
> >
>


Re: [LANG] EqualsBuilder#reflectionEquals feature brainstorming

2024-03-07 Thread Daniel Watson
One comment about the collection comparison...

For any collection that actually implements List, equality should *include*
order, not attempt to ignore it, right? The contract of a list is that the
order is consistent, and two lists with the same items in different order,
should not be considered equal.

e.g. for List:

Best case scenario - The lengths are different and you don't have to check
any equalities
Worst case scenario - Every item is equal and you have to iterate both
lists completely
Remaining cases - Item [i] is different and you don't have to check
anything past that

(For pretty much any collection I think the best case is the same)

For sets, maybe it could be optimized by creating a new collection and
removing items as you find them, so each step necessarily gets smaller? Not
sure about this though.
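
To make the two ideas above concrete, here is a minimal sketch. It is not a patch against EqualsBuilder: the class and method names are hypothetical, and the BiPredicate stands in for whatever element comparison (for example a reflection-based one) would actually be used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiPredicate;

final class CollectionCompareSketch {

    // Order-sensitive: returns early on a size mismatch or at the first
    // index whose elements differ.
    static <T> boolean listEquals(List<? extends T> lhs, List<? extends T> rhs,
                                  BiPredicate<T, T> eq) {
        if (lhs.size() != rhs.size()) {
            return false;
        }
        Iterator<? extends T> l = lhs.iterator();
        Iterator<? extends T> r = rhs.iterator();
        while (l.hasNext() && r.hasNext()) {
            if (!eq.test(l.next(), r.next())) {
                return false;
            }
        }
        return true;
    }

    // Order-insensitive: every match is removed from a working copy, so each
    // later search scans fewer candidates. Still quadratic in the worst case.
    static <T> boolean unorderedEquals(Collection<? extends T> lhs,
                                       Collection<? extends T> rhs,
                                       BiPredicate<T, T> eq) {
        if (lhs.size() != rhs.size()) {
            return false;
        }
        List<T> remaining = new ArrayList<>(rhs);
        for (T item : lhs) {
            boolean found = false;
            for (Iterator<T> it = remaining.iterator(); it.hasNext();) {
                if (eq.test(item, it.next())) {
                    it.remove();
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Custom equality that does not rely on the elements' equals().
        BiPredicate<String, String> eq = String::equalsIgnoreCase;
        System.out.println(listEquals(Arrays.asList("a", "B"), Arrays.asList("A", "b"), eq)); // true
        System.out.println(listEquals(Arrays.asList("a", "b"), Arrays.asList("b", "a"), eq)); // false
        System.out.println(unorderedEquals(new HashSet<>(Arrays.asList("a", "b")),
                new HashSet<>(Arrays.asList("B", "A")), eq)); // true
    }
}
```

The unordered variant never calls hashCode() or compareTo() on the elements, which is the point when they implement neither, but it pays for that with the nested loop.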



On Thu, Mar 7, 2024 at 8:55 AM Gary D. Gregory  wrote:

> On 2024/03/07 06:58:30 Mark Struberg wrote:
> > The question to me is how we can make it more robust.
> > In a Collection (but actually also in most lists) the order in which you
> get the values (Iterator or get(i)) is not deterministic. It can be
> different in one list than in another - even if they contain the exact same
> items.
>
> Hm, so to iterate through Lists in parallel would work but not with Sets.
>
> >
> > Not yet sure how to work around this. We can probably try to sort it
> first, but then again, if they do not implement Comparable it won't help
> much. Or do a containsElement based on reflection as well - but that would
> be rather slow.
>
> This is one of those: If you want support for the feature, it'll work, but
> it'll be slow because there is no other way to do it (for now if ever).
>
> Gary
>
> >
> > Same with Maps. There is a good reason why the key at least should
> implement equals/hashCode. If this is not the case, then we are not able to
> implement this properly I fear.
> >
> > Any ideas?
> >
> > LieGrue,
> > strub
> >
> > > Am 06.03.2024 um 15:53 schrieb Gary Gregory :
> > >
> > > Ah, right, custom "non-equalable" _inside_ Collections and Maps...
> > >
> > > For the diff, I'd suggest you test and iterate over a Collection
> > > instead of a List.
> > >
> > > Then you'd need a separate test and traversal for Map instances.
> > >
> > > (Still no common super-interface in Java 21 for Collections and
> Maps...)
> > >
> > > Gary
> > >
> > > On Wed, Mar 6, 2024 at 7:40 AM Mark Struberg 
> wrote:
> > >>
> > >> Hi Gregory!
> > >>
> > >> I did try this out and figured that I didn't think it through. Maybe I
> need to go a few steps back and explain the problem:
> > >>
> > >> I have the following constellation
> > >>
> > >> public class SomeInnerDTO {int field..} // NOT implements equals!
> > >> public class TheOuterDTO{ List innerList;..}
> > >>
> > >> My problem is that reflectionEquals (which I use in a unit test)
> tries to introspect fields even in java.util.classes. And for getting the
> values of those classes it tries to call a setAccessible(true);
> > >> And obviously here it fails. There is a ticket already open [1] which
suggests to use trySetAccessible. But I fear that will still do nothing
> and you won't get access to those inner knowledge inside
> java.util.LinkedList etc.
> > >>
> > >> And using equals() on the List sadly won't help either, as the
> SomeInnerDTO would also get compared with equals(), but that will obviously
> use identity comparison only :(
> > >>
> > >>
> > >> What I did for now (running all tests with a few projects right now,
> but looks promising):
> > >>
> > >> diff --git
> a/src/main/java/org/apache/commons/lang3/builder/EqualsBuilder.java
> b/src/main/java/org/apache/commons/lang3/builder/EqualsBuilder.java
> > >> index ff5276b04..cf878da40 100644
> > >> ---
> a/src/main/java/org/apache/commons/lang3/builder/EqualsBuilder.java
> > >> +++
> b/src/main/java/org/apache/commons/lang3/builder/EqualsBuilder.java
> > >> @@ -978,6 +978,16 @@ public EqualsBuilder reflectionAppend(final
> Object lhs, final Object rhs) {
> > >> if (bypassReflectionClasses != null
> > >> && (bypassReflectionClasses.contains(lhsClass) ||
> bypassReflectionClasses.contains(rhsClass))) {
> > >> isEquals = lhs.equals(rhs);
> > >> +} else if (testClass.isAssignableFrom(List.class)) {
> > >> +List lList = (List) lhs;
> > >> +List rList = (List) rhs;
> > >> +if (lList.size() != rList.size()) {
> > >> +isEquals = false;
> > >> +return this;
> > >> +}
> > >> +for (int i = 0; i < lList.size(); i++) {
> > >> +reflectionAppend(lList.get(i), rList.get(i));
> > >> +}
> > >> } else {
> > >>
> > >> I'm rather sure this is still not enough and there are plenty other
> cases to consider. Like e.g. handling Maps etc.
> > >> But at least that's the direction I try to approach it right now. And
> of 

[Commons-Lang] EventListenerSupport fireQuietly method

2023-08-16 Thread Daniel Watson
Does it make sense for the EventListenerSupport class to have a separate
method to fire an event "quietly" i.e. without throwing exceptions to the
caller?

I've needed to implement it locally to guarantee that all listeners receive
the event, whereas the standard fire() terminates on the first exception.
Both methods seem like they have valid use cases. You can simulate the
"quiet" call by catching all exceptions on all listeners, but IMO that's a
less than ideal solution. Is this something that can/should go in the commons
class?

- Dan
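
To illustrate the difference between the two behaviours, here is a stripped-down sketch. It is deliberately not EventListenerSupport (which hands out a proxy from fire()); QuietFirer and its methods are hypothetical names, and only RuntimeExceptions are caught, which is an assumption on my part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class QuietFirer<L> {

    private final List<L> listeners = new CopyOnWriteArrayList<>();

    void addListener(L listener) {
        listeners.add(listener);
    }

    // Standard behaviour: the first listener that throws aborts delivery,
    // so later listeners never see the event.
    void fire(Consumer<? super L> event) {
        for (L listener : listeners) {
            event.accept(listener);
        }
    }

    // "Quiet" behaviour: every listener is called; exceptions are collected
    // and returned instead of being thrown to the caller.
    List<RuntimeException> fireQuietly(Consumer<? super L> event) {
        List<RuntimeException> failures = new ArrayList<>();
        for (L listener : listeners) {
            try {
                event.accept(listener);
            } catch (RuntimeException e) {
                failures.add(e);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        QuietFirer<Runnable> firer = new QuietFirer<>();
        firer.addListener(() -> { throw new IllegalStateException("boom"); });
        firer.addListener(() -> System.out.println("second listener reached"));
        List<RuntimeException> failures = firer.fireQuietly(Runnable::run);
        System.out.println(failures.size() + " listener(s) failed"); // 1 listener(s) failed
    }
}
```

Returning the collected exceptions rather than discarding them keeps the call quiet without hiding failures completely; whether to return, log, or drop them is an open design question.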


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-15 Thread Daniel Watson
I've incorporated some of the recommendations from this thread into the
Case api. Have a couple additional things to ask for thoughts on...

1) The current CaseUtils is just in org.apache.commons.text, but given that
this is multiple classes, I put the code into
"org.apache.commons.text.cases" ("case" is reserved). Thoughts?
2) Although I don't propose removing/deprecating the existing CaseUtils,
does it make sense to try and test against the use cases that overlap i.e.
validating that they produce the same output? Or is that just an
unnecessary link?

- Dan

On Fri, Aug 11, 2023 at 11:31 AM Daniel Watson  wrote:

> If no instance of Thing1Case can be reconfigured, then that holds true,
> right? The fact that it extends something like DelimitedCase doesn't break
> the spec I wouldn't think?
>
>
>
> On Fri, Aug 11, 2023, 11:23 AM Gary Gregory 
> wrote:
>
>> Hm, I too, would expect Thing1Case to mean one thing and one thing only...
>> hence this specification exercise 
>>
>> Gary
>>
>> On Wed, Aug 9, 2023, 9:52 PM Daniel Watson  wrote:
>>
>> > I would think it's possible to hide that "configuration" from the user
>> such
>> > that the implementation can only be reconfigured via extension. But I'm
>> not
>> > in love with the configurable base class either way. It was convenient
>> to
>> > have the common functionality in one place, but it's not a big deal to
>> > handle that differently.
>> >
>> > The tradeoff in having the Cases be pure functions is that it makes it
>> more
>> > difficult for a user to extend them with additional functionality. And
>> to
>> > me the need for extension is apparent even when just looking at the 4
>> basic
>> > cases. Two of them are character delimited, and 2 of them are uppercase
>> > delimited. There's two bits of shared functionality just in the 4 most
>> > basic cases.
>> >
>> > Back to the exception topic, I don't think the tokens "my" "component"
>> and
>> > "1" can be formatted in PascalCase in a way that they could be parsed
>> back
>> > out into 3 tokens. So the question is less about whether it's possible
>> to
>> > format them and more about whether the API should format output that
>> cannot
>> > be parsed back into the same input. I think it makes sense to enforce
>> that
>> > consistency, or at the very least allow the user to enable it?
>> >
>> >
>> >
>> >
>> > On Wed, Aug 9, 2023, 9:14 PM Elliotte Rusty Harold 
>> > wrote:
>> >
>> > > On Wed, Aug 9, 2023 at 11:36 PM Daniel Watson 
>> > > wrote:
>> > > >
>> > > > Meant to add...
>> > > >
>> > > > The reason I would favor exceptions is that the underlying
>> > implementation
>> > > > can be easily customized. If the user needs to allow non
>> alphanumeric
>> > > > characters there is a boolean flag in the underlying abstract class
>> > > > (AbstractConfigurableCase) that will simply turn that validation
>> off.
>> > >
>> > > This is another point, but customizability is a bug, not a feature. I
>> > > don't want to guess what the method might be doing based on what flag
>> > > was set where. I want camel case to mean one thing and one thing only.
>> > > Ditto snake case, pascal case, and any other formats. Possibly there's
>> > > a reason to add additional subclasses, but the
>> > > CamelCase/SnakeCase/KebabCase classes should not emit different
>> > > strings depending on how they're configured. The public API should be
>> > > a pure function, not an object.
>> > >
>> > > --
>> > > Elliotte Rusty Harold
>> > > elh...@ibiblio.org
>> > >
>> > > -
>> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> > > For additional commands, e-mail: dev-h...@commons.apache.org
>> > >
>> > >
>> >
>>
>


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-11 Thread Daniel Watson
If no instance of Thing1Case can be reconfigured, then that holds true,
right? The fact that it extends something like DelimitedCase doesn't break
the spec I wouldn't think?



On Fri, Aug 11, 2023, 11:23 AM Gary Gregory  wrote:

> Hm, I too, would expect Thing1Case to mean one thing and one thing only...
> hence this specification exercise 
>
> Gary
>
> On Wed, Aug 9, 2023, 9:52 PM Daniel Watson  wrote:
>
> > I would think it's possible to hide that "configuration" from the user
> such
> > that the implementation can only be reconfigured via extension. But I'm
> not
> > in love with the configurable base class either way. It was convenient to
> > have the common functionality in one place, but it's not a big deal to
> > handle that differently.
> >
> > The tradeoff in having the Cases be pure functions is that it makes it
> more
> > difficult for a user to extend them with additional functionality. And to
> > me the need for extension is apparent even when just looking at the 4
> basic
> > cases. Two of them are character delimited, and 2 of them are uppercase
> > delimited. There's two bits of shared functionality just in the 4 most
> > basic cases.
> >
> > Back to the exception topic, I don't think the tokens "my" "component"
> and
> > "1" can be formatted in PascalCase in a way that they could be parsed
> back
> > out into 3 tokens. So the question is less about whether it's possible to
> > format them and more about whether the API should format output that
> cannot
> > be parsed back into the same input. I think it makes sense to enforce
> that
> > consistency, or at the very least allow the user to enable it?
> >
> >
> >
> >
> > On Wed, Aug 9, 2023, 9:14 PM Elliotte Rusty Harold 
> > wrote:
> >
> > > On Wed, Aug 9, 2023 at 11:36 PM Daniel Watson 
> > > wrote:
> > > >
> > > > Meant to add...
> > > >
> > > > The reason I would favor exceptions is that the underlying
> > implementation
> > > > can be easily customized. If the user needs to allow non alphanumeric
> > > > characters there is a boolean flag in the underlying abstract class
> > > > (AbstractConfigurableCase) that will simply turn that validation off.
> > >
> > > This is another point, but customizability is a bug, not a feature. I
> > > don't want to guess what the method might be doing based on what flag
> > > was set where. I want camel case to mean one thing and one thing only.
> > > Ditto snake case, pascal case, and any other formats. Possibly there's
> > > a reason to add additional subclasses, but the
> > > CamelCase/SnakeCase/KebabCase classes should not emit different
> > > strings depending on how they're configured. The public API should be
> > > a pure function, not an object.
> > >
> > > --
> > > Elliotte Rusty Harold
> > > elh...@ibiblio.org
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> >
>


Re: [commons-lang] Util function in NumberUtils to count significant figures in a numeric string

2023-08-11 Thread Daniel Watson
I am currently using that API for unit conversion. Don't remember seeing
anything related to uncertainty or precision. I'll double check, but IIRC
it's focused very much on just unit conversion.

On Thu, Aug 10, 2023, 9:40 PM Gary Gregory  wrote:

> See also JSR-363 https://jcp.org/en/jsr/detail?id=363
>
> Gary
>
> On Thu, Aug 10, 2023, 10:56 AM Daniel Watson  wrote:
>
> > I brought this up in commons-math and it was determined that that
> probably
> > wasn't a good place for it, as that lib focuses on computational
> functions.
> > It was also mentioned that commons-numbers was not a great place for the
> > static util method either.
> >
> > Essentially the need for this relates to scientific measurements.
> > Measurements are often reported with implied precision and uncertainty
> > (e.g. 0.0015 has 2 significant figures, 1.10 has 3, etc). Currently there
> > are no Number classes that retain or respect this information. There are
> > widely accepted conventions for how to retain and adjust both precision
> and
> > uncertainty during mathematical operations. But the first step is simply
> > knowing what those two values are. I propose a util method (already
> > written) in NumberUtils that can do this. The conventions are widely
> > documented but would be spelled out very specifically in the javadoc.
> > Although NumberUtils mainly focuses on pure math transformation, it does
> > also include some parsing, so this doesn't seem to be *completely* out of
> > scope.
> >
> > Is NumberUtils a possible home for this?
> >
> > On a separate, but related note, I honestly think this sort of math
> > actually deserves a full blown java Number implementation (similar to
> > BigFraction and Complex classes in commons-numbers). Possibly called
> > BigMeasurement? Which can interact with other Number implementations as
> > well as other BigMeasurements and retain/report the correct uncertainty
> and
> > precision throughout the computation. I haven't ironed that out - but a
> > necessary intermediate step is just being able to get the sigfig count.
> >
> > Dan
> >
>


[commons-lang] Util function in NumberUtils to count significant figures in a numeric string

2023-08-10 Thread Daniel Watson
I brought this up in commons-math and it was determined that that probably
wasn't a good place for it, as that lib focuses on computational functions.
It was also mentioned that commons-numbers was not a great place for the
static util method either.

Essentially the need for this relates to scientific measurements.
Measurements are often reported with implied precision and uncertainty
(e.g. 0.0015 has 2 significant figures, 1.10 has 3, etc). Currently there
are no Number classes that retain or respect this information. There are
widely accepted conventions for how to retain and adjust both precision and
uncertainty during mathematical operations. But the first step is simply
knowing what those two values are. I propose a util method (already
written) in NumberUtils that can do this. The conventions are widely
documented but would be spelled out very specifically in the javadoc.
Although NumberUtils mainly focuses on pure math transformation, it does
also include some parsing, so this doesn't seem to be *completely* out of
scope.

Is NumberUtils a possible home for this?

On a separate, but related note, I honestly think this sort of math
actually deserves a full blown java Number implementation (similar to
BigFraction and Complex classes in commons-numbers). Possibly called
BigMeasurement? Which can interact with other Number implementations as
well as other BigMeasurements and retain/report the correct uncertainty and
precision throughout the computation. I haven't ironed that out - but a
necessary intermediate step is just being able to get the sigfig count.

Dan
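
As a rough idea of what such a util method could look like (this is a sketch, not the method I have written, and the names are hypothetical), assuming the common conventions: leading zeros never count, trailing zeros count only when a decimal point is present, and an exponent suffix is ignored. A value made up only of zeros reports 0 here, which is another assumption.

```java
final class SigFigSketch {

    // Counts significant figures in a plain or scientific-notation decimal string.
    static int countSignificantFigures(String number) {
        String s = number.trim();
        if (s.startsWith("+") || s.startsWith("-")) {
            s = s.substring(1);
        }
        // Drop any exponent: it scales the value but carries no sigfigs.
        s = s.split("[eE]", 2)[0];
        if (!s.matches("\\d+(\\.\\d*)?|\\.\\d+")) {
            throw new NumberFormatException("Not a decimal number: " + number);
        }
        boolean hasPoint = s.indexOf('.') >= 0;
        // Leading zeros are placeholders only.
        String digits = s.replace(".", "").replaceFirst("^0+", "");
        if (!hasPoint) {
            // Without a decimal point, trailing zeros are treated as placeholders too.
            digits = digits.replaceFirst("0+$", "");
        }
        return digits.length();
    }

    public static void main(String[] args) {
        System.out.println(countSignificantFigures("0.0015")); // 2
        System.out.println(countSignificantFigures("1.10"));   // 3
        System.out.println(countSignificantFigures("1200"));   // 2
        System.out.println(countSignificantFigures("1200."));  // 4
    }
}
```

The "1200" case is the ambiguous one in practice, which is why the conventions would need to be spelled out in the javadoc.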


Re: [Meta] gitlab error responses to mailing list

2023-08-10 Thread Daniel Watson
I had my pitchfork ready, but I suppose a ban is more civil.

Thanks!

On Thu, Aug 10, 2023 at 9:41 AM Mark Thomas  wrote:

> Got them.
>
> The idiot concerned has won themselves a lifetime subscription to the
> deny list for commons-dev and the handful of other ASF lists they are
> subscribed to.
>
> Sorry it took a while to sort this out.
>
> Some of you won't have received the test message yet due to my mail
> server being rate limited. You should receive it in the next few hours.
>
> Mark
>
>
> On 10/08/2023 09:48, Mark Thomas wrote:
> > Hi all,
> >
> > In an effort to trace the idiot that set up whatever process is
> > triggering these messages directly to anyone who posts to the dev list I
> > will be sending out some test messages later today. Each subscriber to
> > this list should only receive one test message. Apologies in advance for
> > the noise.
> >
> > Mark
> >
> >
> > On 07/08/2023 15:40, Gilles Sadowski wrote:
> >> Le lun. 7 août 2023 à 16:38, Gilles Sadowski  a
> >> écrit :
> >>>
> >>> Le lun. 7 août 2023 à 10:46, Mark Thomas  a écrit :
> 
>  Got the error message. To help me play hunt the subscriber, can anyone
>  provide information on when this behaviour started?
> >>>
> >>> I got one on Saturday at 11:17, in a thread with
> >>> [commons-math] Three Concerns
> >>> as subject line.  Content was:
> >>> ---CUT---
> >>> Unfortunately, your email message to GitLab could not be processed.
> >>>
> >>> We couldn't figure out what the email is for. Please create your issue
> >>> or comment through the web interface.
> >>> ---CUT---
> >>
> >> And again, just now, in reply to the above message...
> >>
> >>>
> >>> Regards,
> >>> Gilles
> >>>
> > [...]
> >>
> >>
> >
> >
>
>
>


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-10 Thread Daniel Watson
After some thought, this wrapper class might be better named something like
BigMeasurement (or just Measurement?). Significant figures and precision
are very closely tied to measurements, since the act of measuring is really
what causes the uncertainty to begin with.

I think a static method to count sigfigs is worth adding to commons-lang
math utils, so I'll propose it there. As for a wrapper class, I'm not so
sure. Measurement calculation seems closer to something like
commons-numbers, but if it's not quite close enough to fit then I'll just
retain it in my personal commons.

Thanks for the discussion!
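
For the record, the addition rule from my earlier message (quoted below) can be sketched with BigDecimal alone. Assumptions: the names are hypothetical, BigDecimal.scale() is used as the measure of each term's precision, and HALF_EVEN is an arbitrary choice of rounding mode.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class MeasurementAddSketch {

    // Adds two decimal strings and limits the result to the precision
    // (number of decimal places) of the least precise term.
    static BigDecimal add(String a, String b) {
        BigDecimal x = new BigDecimal(a);
        BigDecimal y = new BigDecimal(b);
        int limitingScale = Math.min(x.scale(), y.scale());
        return x.add(y).setScale(limitingScale, RoundingMode.HALF_EVEN);
    }

    public static void main(String[] args) {
        BigDecimal sum = add("999.0", "9.41");
        System.out.println(sum.toPlainString()); // 1008.4
        // precision() counts the digits of the unscaled value.
        System.out.println(sum.precision());     // 5
        System.out.println(add("12345", "10.0").toPlainString());   // 12355
        System.out.println(add("12345", "0.0001").toPlainString()); // 12345
    }
}
```

Note that BigDecimal gives "1200" a scale of 0, i.e. it treats the value as precise to the units digit, so integer inputs with placeholder zeros would need separate handling.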

On Wed, Aug 9, 2023 at 3:00 PM Daniel Watson  wrote:

> I believe the convention is to take the *least* precise term and apply
> that precision (here "precision" != "sigfigs" - I've been using both terms
> to mean sigfigs, but for these purposes precision is actually defined as
> how small a fraction the measurement is able to convey - e.g. 0.01 is more
> precise than 1.1, despite the latter having more sigfigs).
>
> The results should be...
>
> 12345 + 10.0 = 12355
> 12345 + 10 = 12355
> 12345 + 1 = 12346
> 12345 + 1.0 = 12346
> 12345 + 1.00 = 12346
>
> None of these will have decimal places because the left term was not
> precise enough to have them. When adding/subtracting you can end up with
> more significant figures in your result than you had in one of your terms,
> you just can't end up with a more "precise" result than either of your
> terms, e.g.
>
> 999.0 + 9.41 = 1008.4
> 4 sigfigs + 3 sigfigs = 5 sigfigs - It's perfectly fine that we ended up
> with more here, as long as we didn't increase the "precision".
>
> So in this case I think the correct logic is to add the two terms together
> in the normal way, reduce the precision to that of the limiting term, and
> then recalculate the number of significant figures on the result.
>
> I believe that, conveniently, the BigDecimal class already tracks this as
> scale(). So the information is available to determine the new precision. It
> would just be a matter of retaining it within the wrapper class and
> applying it when producing the final output string. I'd need to play around
> with a few more examples, but I think that's the logic at a high level.
>
> Dan
>
> On Wed, Aug 9, 2023 at 2:08 PM Alex Herbert 
> wrote:
>
>> On Wed, 9 Aug 2023 at 17:13, Daniel Watson  wrote:
>>
>> > BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"))
>>
>> Multiply is easy as you take the minimum significant figures. What
>> about addition?
>>
>> 12345 + 0.0001
>>
>> Here the significant figures should remain at 5.
>>
>> And for this:
>>
>> 12345 + 10.0
>> 12345 + 10
>> 12345 + 1
>> 12345 + 1.0
>> 12345 + 1.00
>>
>> You have to track the overlap of significant digits somehow.
>>
>> Alex
>>
>>
>>


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
I would think it's possible to hide that "configuration" from the user such
that the implementation can only be reconfigured via extension. But I'm not
in love with the configurable base class either way. It was convenient to
have the common functionality in one place, but it's not a big deal to
handle that differently.

The tradeoff in having the Cases be pure functions is that it makes it more
difficult for a user to extend them with additional functionality. And to
me the need for extension is apparent even when just looking at the 4 basic
cases. Two of them are character delimited, and 2 of them are uppercase
delimited. There's two bits of shared functionality just in the 4 most
basic cases.

Back to the exception topic, I don't think the tokens "my" "component" and
"1" can be formatted in PascalCase in a way that they could be parsed back
out into 3 tokens. So the question is less about whether it's possible to
format them and more about whether the API should format output that cannot
be parsed back into the same input. I think it makes sense to enforce that
consistency, or at the very least allow the user to enable it?
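
Here is a small sketch of the round-trip concern. It is not the API from the PR: the class and method names are hypothetical, tokens are assumed to be non-empty and lower case, and PascalCase parsing is assumed to split only before upper-case letters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class CaseRoundTripSketch {

    // Formats tokens as PascalCase without any validation.
    static String toPascal(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String t : tokens) {
            sb.append(Character.toUpperCase(t.charAt(0)))
              .append(t.substring(1).toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }

    // Splits before each upper-case letter and lower-cases the tokens.
    static List<String> parsePascal(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (Character.isUpperCase(c) && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            current.append(Character.toLowerCase(c));
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Formats only if parsing the output gives back the same tokens.
    static String toPascalStrict(List<String> tokens) {
        String formatted = toPascal(tokens);
        if (!parsePascal(formatted).equals(tokens)) {
            throw new IllegalArgumentException(
                    "Tokens " + tokens + " do not round-trip through PascalCase: " + formatted);
        }
        return formatted;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("my", "component", "1");
        System.out.println(toPascal(tokens));              // MyComponent1
        System.out.println(parsePascal(toPascal(tokens))); // [my, component1]
        System.out.println(toPascalStrict(Arrays.asList("my", "component"))); // MyComponent
    }
}
```

Keeping both toPascal and toPascalStrict in the sketch mirrors the "at the very least allow the user to enable it" option: the lenient variant always formats, the strict one refuses lossy output.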




On Wed, Aug 9, 2023, 9:14 PM Elliotte Rusty Harold 
wrote:

> On Wed, Aug 9, 2023 at 11:36 PM Daniel Watson 
> wrote:
> >
> > Meant to add...
> >
> > The reason I would favor exceptions is that the underlying implementation
> > can be easily customized. If the user needs to allow non alphanumeric
> > characters there is a boolean flag in the underlying abstract class
> > (AbstractConfigurableCase) that will simply turn that validation off.
>
> This is another point, but customizability is a bug, not a feature. I
> don't want to guess what the method might be doing based on what flag
> was set where. I want camel case to mean one thing and one thing only.
> Ditto snake case, pascal case, and any other formats. Possibly there's
> a reason to add additional subclasses, but the
> CamelCase/SnakeCase/KebabCase classes should not emit different
> strings depending on how they're configured. The public API should be
> a pure function, not an object.
>
> --
> Elliotte Rusty Harold
> elh...@ibiblio.org
>
>
>


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Currently those exceptions do capture token and character index
information, but I think I'm just using it to create the message. I get what
you're saying but without them testing becomes less accurate. If IAE is
being thrown all over the place then asserting a failure can't actually
guarantee that it failed in the expected way.


In regards to what Elliotte said...


Not every set of tokens can actually be represented deterministically in
every case. Which is why I think exceptions are needed.

my-component-1

Is a valid kebab cased string, with tokens my,component,1

However this cannot be formatted in camel case or Pascal case, because they
are delimited by alpha characters.

If those tokens were passed to those cases I would expect an exception to
be thrown, otherwise the result is not reciprocal, e.g. MyComponent1 is
only two PascalCase tokens.
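
As a rough illustration of why the round trip breaks, here is a minimal sketch. It is not the code under discussion: the PascalRoundTrip class and its two static methods are hypothetical, and it assumes non-empty tokens and the "split on every uppercase character" rule from the spec.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, for illustration only; not the proposed commons-text API.
final class PascalRoundTrip {
    private PascalRoundTrip() {}

    // Capitalize the first character of each token and lowercase the rest.
    // Assumes every token is non-empty.
    static String format(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            sb.append(token.substring(0, 1).toUpperCase(Locale.ROOT))
              .append(token.substring(1).toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }

    // Start a new token at every uppercase character after the first position.
    static List<String> parse(String s) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < s.length(); i++) {
            if (Character.isUpperCase(s.charAt(i))) {
                tokens.add(s.substring(start, i));
                start = i;
            }
        }
        if (!s.isEmpty()) {
            tokens.add(s.substring(start));
        }
        return tokens;
    }
}
```

With this sketch, formatting the three tokens my, component, 1 gives MyComponent1, and parsing that string gives only My and Component1, because the digit never starts a new token.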

On Wed, Aug 9, 2023, 7:36 PM Daniel Watson  wrote:

> Meant to add...
>
> The reason I would favor exceptions is that the underlying implementation
> can be easily customized. If the user needs to allow non alphanumeric
> characters there is a boolean flag in the underlying abstract class
> (AbstractConfigurableCase) that will simply turn that validation off. I
> don't think we need to make any specific implementation be significantly
> error tolerant.
>
> An extension of snake case to allow all characters should look like..
>
>
> class MySnakeCase extends SnakeCase {
>     MySnakeCase() {
>         super();
>         // Turn off the alphanumeric-only validation so any character is allowed.
>         this.alphanumeric = false;
>     }
> }
>
>
> On Wed, Aug 9, 2023, 7:29 PM Daniel Watson  wrote:
>
>> Currently I'm planning a set of exceptions that are thrown for various
>> reasons. I created multiple classes to allow for clearer testing.
>>
>> ReservedCharacterException (extends InvalidCharacterException below) -
>> thrown specifically when a reserved character is encountered within a token.
>>
>> InvalidCharacterException (extends IllegalArgumentException) thrown
>> directly any time an illegal character is encountered.
>>
>> ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
>> length token is encountered and Case does not support it.
>>
>> There are a few other error cases I believe. I'm not looking at the code
>> right this moment but I'm fairly certain about the need for the above 3.
>>
>>
>> On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
>> wrote:
>>
>>> What happens when a token contains an unpermitted character?
>>>
>>> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson 
>>> wrote:
>>> >
>>> > Here's my stab at a spec. Wanted to clarify some parts of the Case
>>> > interface first before jumping into the implementations. Wondering
>>> what a
>>> > good package name for this stuff is, given that "case" is a reserved
>>> word?
>>> >
>>> > Case (interface)
>>> > The Case interface defines two methods:
>>> > * String format(Iterable tokens)
>>> > The format method accepts an Iterable of String tokens and returns a
>>> single
>>> > String formatted according to the implementation. The format method is
>>> > intended to handle transforming between cases, thus tokens passed to
>>> the
>>> > format() method need not be properly formatted for the given Case
>>> instance,
>>> > though they must still respect any reserve character restrictions.
>>> > * List parse(String string)
>>> > The parse method accepts a single string and returns a List of string
>>> > tokens that abide by the Case implementation.
>>> > Note: format() and parse() methods must be fully reciprocal. ie. On a
>>> > single Case instance, when calling parse() with a valid string, and
>>> passing
>>> > the resulting tokens into format(), a matching string should be
>>> returned.
>>> >
>>> > DelimitedCase (base class for kebab and snake)
>>> > Defines a Case where all tokens are separated by a single character
>>> > delimiter. The delimiter is considered a reserved character and is not
>>> > allowed to appear within tokens when formatting. No further
>>> restrictions
>>> > are placed on token contents by this base implementation. Tokens can
>>> > contain any valid Java String character. DelimitedCases can support
>>> > zero-length tokens, which can occur if there are no characters between
>>> two
>>> > instances of the delimiter or if the parsed string begins or ends with
>>> the
>>> > delimiter.
>>> > Note: Other C

Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Meant to add...

The reason I would favor exceptions is that the underlying implementation
can be easily customized. If the user needs to allow non alphanumeric
characters there is a boolean flag in the underlying abstract class
(AbstractConfigurableCase) that will simply turn that validation off. I
don't think we need to make any specific implementation be significantly
error tolerant.

An extension of snake case to allow all characters should look like..


class MySnakeCase extends SnakeCase {
    MySnakeCase() {
        super();
        // Turn off the alphanumeric-only validation so any character is allowed.
        this.alphanumeric = false;
    }
}


On Wed, Aug 9, 2023, 7:29 PM Daniel Watson  wrote:

> Currently I'm planning a set of exceptions that are thrown for various
> reasons. I created multiple classes to allow for clearer testing.
>
> ReservedCharacterException (extends InvalidCharacterException below) -
> thrown specifically when a reserved character is encountered within a token.
>
> InvalidCharacterException (extends IllegalArgumentException) thrown
> directly any time an illegal character is encountered.
>
> ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
> length token is encountered and Case does not support it.
>
> There are a few other error cases I believe. I'm not looking at the code
> right this moment but I'm fairly certain about the need for the above 3.
>
>
> On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
> wrote:
>
>> What happens when a token contains an unpermitted character?
>>
>> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson 
>> wrote:
>> >
>> > Here's my stab at a spec. Wanted to clarify some parts of the Case
>> > interface first before jumping into the implementations. Wondering what
>> a
>> > good package name for this stuff is, given that "case" is a reserved
>> word?
>> >
>> > Case (interface)
>> > The Case interface defines two methods:
>> > * String format(Iterable tokens)
>> > The format method accepts an Iterable of String tokens and returns a
>> single
>> > String formatted according to the implementation. The format method is
>> > intended to handle transforming between cases, thus tokens passed to the
>> > format() method need not be properly formatted for the given Case
>> instance,
>> > though they must still respect any reserve character restrictions.
>> > * List parse(String string)
>> > The parse method accepts a single string and returns a List of string
>> > tokens that abide by the Case implementation.
>> > Note: format() and parse() methods must be fully reciprocal. ie. On a
>> > single Case instance, when calling parse() with a valid string, and
>> passing
>> > the resulting tokens into format(), a matching string should be
>> returned.
>> >
>> > DelimitedCase (base class for kebab and snake)
>> > Defines a Case where all tokens are separated by a single character
>> > delimiter. The delimiter is considered a reserved character and is not
>> > allowed to appear within tokens when formatting. No further restrictions
>> > are placed on token contents by this base implementation. Tokens can
>> > contain any valid Java String character. DelimitedCases can support
>> > zero-length tokens, which can occur if there are no characters between
>> two
>> > instances of the delimiter or if the parsed string begins or ends with
>> the
>> > delimiter.
>> > Note: Other Case implementations may not support zero-length tokens, and
>> > attempts to call format(...) with empty tokens may fail.
>> >
>> > KebabCase
>> > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
>> > character. This case allows only alphanumeric characters within tokens.
>> >
>> > SnakeCase
>> > Extends DelimitedCase and initializes the delimiter as the underscore
>> '_'
>> > character. This case allows only alphanumeric characters within tokens.
>> >
>> > PascalCase
>> > Defines a Case where tokens begin with an uppercase alpha character. All
>> > subsequent token characters must be lowercase alpha or numeric
>> characters.
>> > Whenever an uppercase alpha character is encountered, the previous
>> token is
>> > considered complete and a new token begins, with the uppercase character
>> > being the first character of the new token. PascalCase does not allow
>> > zero-length tokens when formatting, as it would violate the reciprocal
>> > contract of format() and parse().
>> >
>> > CamelCase
>> > Extends PascalCase and sets one additional re

Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Currently I'm planning a set of exceptions that are thrown for various
reasons. I created multiple classes to allow for clearer testing.

ReservedCharacterException (extends InvalidCharacterException below) -
thrown specifically when a reserved character is encountered within a token.

InvalidCharacterException (extends IllegalArgumentException) thrown
directly any time an illegal character is encountered.

ZeroLengthTokenException (extends Illegal arg excep) - thrown when a zero
length token is encountered and Case does not support it.

There are a few other error cases I believe. I'm not looking at the code
right this moment but I'm fairly certain about the need for the above 3.
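
A minimal sketch of how those three classes could relate, assuming each one records where the problem occurred. The constructor arguments, field names and message text are assumptions for illustration, not the actual code.

```java
// Sketch only: constructor signatures and getters are hypothetical.
class InvalidCharacterException extends IllegalArgumentException {
    private final int tokenIndex;
    private final int charIndex;

    InvalidCharacterException(String message, int tokenIndex, int charIndex) {
        super(message + " (token " + tokenIndex + ", character " + charIndex + ")");
        this.tokenIndex = tokenIndex;
        this.charIndex = charIndex;
    }

    int getTokenIndex() {
        return tokenIndex;
    }

    int getCharIndex() {
        return charIndex;
    }
}

// Thrown when a reserved character (e.g. the delimiter) appears inside a token.
class ReservedCharacterException extends InvalidCharacterException {
    ReservedCharacterException(char reserved, int tokenIndex, int charIndex) {
        super("Reserved character '" + reserved + "'", tokenIndex, charIndex);
    }
}

// Thrown when a zero-length token is passed to a Case that does not support it.
class ZeroLengthTokenException extends IllegalArgumentException {
    private final int tokenIndex;

    ZeroLengthTokenException(int tokenIndex) {
        super("Zero-length token at index " + tokenIndex);
        this.tokenIndex = tokenIndex;
    }

    int getTokenIndex() {
        return tokenIndex;
    }
}
```

Because every class extends IllegalArgumentException, callers that only catch IAE keep working, while tests can assert on the specific subtype.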


On Wed, Aug 9, 2023, 6:08 PM Elliotte Rusty Harold 
wrote:

> What happens when a token contains an unpermitted character?
>
> On Wed, Aug 9, 2023 at 8:30 PM Daniel Watson  wrote:
> >
> > Here's my stab at a spec. Wanted to clarify some parts of the Case
> > interface first before jumping into the implementations. Wondering what a
> > good package name for this stuff is, given that "case" is a reserved
> word?
> >
> > Case (interface)
> > The Case interface defines two methods:
> > * String format(Iterable tokens)
> > The format method accepts an Iterable of String tokens and returns a
> single
> > String formatted according to the implementation. The format method is
> > intended to handle transforming between cases, thus tokens passed to the
> > format() method need not be properly formatted for the given Case
> instance,
> > though they must still respect any reserve character restrictions.
> > * List parse(String string)
> > The parse method accepts a single string and returns a List of string
> > tokens that abide by the Case implementation.
> > Note: format() and parse() methods must be fully reciprocal. ie. On a
> > single Case instance, when calling parse() with a valid string, and
> passing
> > the resulting tokens into format(), a matching string should be returned.
> >
> > DelimitedCase (base class for kebab and snake)
> > Defines a Case where all tokens are separated by a single character
> > delimiter. The delimiter is considered a reserved character and is not
> > allowed to appear within tokens when formatting. No further restrictions
> > are placed on token contents by this base implementation. Tokens can
> > contain any valid Java String character. DelimitedCases can support
> > zero-length tokens, which can occur if there are no characters between
> two
> > instances of the delimiter or if the parsed string begins or ends with
> the
> > delimiter.
> > Note: Other Case implementations may not support zero-length tokens, and
> > attempts to call format(...) with empty tokens may fail.
> >
> > KebabCase
> > Extends DelimitedCase and initializes the delimiter as the hyphen '-'
> > character. This case allows only alphanumeric characters within tokens.
> >
> > SnakeCase
> > Extends DelimitedCase and initializes the delimiter as the underscore '_'
> > character. This case allows only alphanumeric characters within tokens.
> >
> > PascalCase
> > Defines a Case where tokens begin with an uppercase alpha character. All
> > subsequent token characters must be lowercase alpha or numeric
> characters.
> > Whenever an uppercase alpha character is encountered, the previous token
> is
> > considered complete and a new token begins, with the uppercase character
> > being the first character of the new token. PascalCase does not allow
> > zero-length tokens when formatting, as it would violate the reciprocal
> > contract of format() and parse().
> >
> > CamelCase
> > Extends PascalCase and sets one additional restriction - that the first
> > character of the first token (ie the first character of the full string)
> > must be a lowercase alpha character (rather than the uppercase
> requirement
> > of PascalCase). All other restrictions of PascalCase apply.
> >
> >
> > On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson 
> wrote:
> >
> > > Kebab case is extremely common for web identifiers, eg html element
> ids,
> > > classes, attributes, etc.
> > >
> > > In regards to PascalCase, i agree that most people won't understand the
> > > reasoning behind the name, but it is nevertheless a widely accepted
> term
> > > for that case style. If an alternative is deemed necessary then
> > > "ProperCase" might work - since that is also how English proper nouns
> are
> > > cased. Understanding that name just depends on your knowledge of
> English
> > >

Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-09 Thread Daniel Watson
Here's my stab at a spec. Wanted to clarify some parts of the Case
interface first before jumping into the implementations. Wondering what a
good package name for this stuff is, given that "case" is a reserved word?

Case (interface)
The Case interface defines two methods:
* String format(Iterable tokens)
The format method accepts an Iterable of String tokens and returns a single
String formatted according to the implementation. The format method is
intended to handle transforming between cases, thus tokens passed to the
format() method need not be properly formatted for the given Case instance,
though they must still respect any reserved character restrictions.
* List parse(String string)
The parse method accepts a single string and returns a List of string
tokens that abide by the Case implementation.
Note: format() and parse() methods must be fully reciprocal, i.e. on a
single Case instance, when calling parse() with a valid string, and passing
the resulting tokens into format(), a matching string should be returned.

DelimitedCase (base class for kebab and snake)
Defines a Case where all tokens are separated by a single character
delimiter. The delimiter is considered a reserved character and is not
allowed to appear within tokens when formatting. No further restrictions
are placed on token contents by this base implementation. Tokens can
contain any valid Java String character. DelimitedCases can support
zero-length tokens, which can occur if there are no characters between two
instances of the delimiter or if the parsed string begins or ends with the
delimiter.
Note: Other Case implementations may not support zero-length tokens, and
attempts to call format(...) with empty tokens may fail.

KebabCase
Extends DelimitedCase and initializes the delimiter as the hyphen '-'
character. This case allows only alphanumeric characters within tokens.

SnakeCase
Extends DelimitedCase and initializes the delimiter as the underscore '_'
character. This case allows only alphanumeric characters within tokens.

PascalCase
Defines a Case where tokens begin with an uppercase alpha character. All
subsequent token characters must be lowercase alpha or numeric characters.
Whenever an uppercase alpha character is encountered, the previous token is
considered complete and a new token begins, with the uppercase character
being the first character of the new token. PascalCase does not allow
zero-length tokens when formatting, as it would violate the reciprocal
contract of format() and parse().

CamelCase
Extends PascalCase and sets one additional restriction - that the first
character of the first token (i.e. the first character of the full string)
must be a lowercase alpha character (rather than the uppercase requirement
of PascalCase). All other restrictions of PascalCase apply.
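
To make the interface and the DelimitedCase rules concrete, here is a minimal sketch. It assumes the generic types Iterable<String> and List<String> and a single-char delimiter passed to the constructor. It leaves out the alphanumeric check that KebabCase and SnakeCase add, and it uses a plain IllegalArgumentException where the real code may use something more specific.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the interface described above; generic parameters are assumed.
interface Case {
    String format(Iterable<String> tokens);

    List<String> parse(String string);
}

// Sketch of a delimiter-based Case. The delimiter is the only reserved character.
class DelimitedCase implements Case {
    private final char delimiter;

    DelimitedCase(char delimiter) {
        this.delimiter = delimiter;
    }

    @Override
    public String format(Iterable<String> tokens) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String token : tokens) {
            if (token.indexOf(delimiter) >= 0) {
                throw new IllegalArgumentException(
                        "Reserved character '" + delimiter + "' in token: " + token);
            }
            if (!first) {
                sb.append(delimiter);
            }
            sb.append(token);
            first = false;
        }
        return sb.toString();
    }

    @Override
    public List<String> parse(String string) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < string.length(); i++) {
            if (string.charAt(i) == delimiter) {
                // Adjacent, leading or trailing delimiters yield zero-length tokens.
                tokens.add(string.substring(start, i));
                start = i + 1;
            }
        }
        tokens.add(string.substring(start));
        return tokens;
    }
}
```

With this sketch, new DelimitedCase('-').parse("my-component-1") yields the tokens my, component, 1, and passing them to new DelimitedCase('_').format(...) yields my_component_1. Parsing and re-formatting a string such as "a--b-" returns the same string, which is the reciprocity the note asks for.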


On Tue, Aug 8, 2023 at 8:55 PM Daniel Watson  wrote:

> Kebab case is extremely common for web identifiers, e.g. HTML element ids,
> classes, attributes, etc.
>
> In regards to PascalCase, I agree that most people won't understand the
> reasoning behind the name, but it is nevertheless a widely accepted term
> for that case style. If an alternative is deemed necessary then
> "ProperCase" might work - since that is also how English proper nouns are
> cased. Understanding that name just depends on your knowledge of English
> grammar.
>
> A spec can definitely be written for the 4 provided concrete
> implementations. And... I may eat these words but... the spec should not be
> all that complex. I will take a stab at it.
>
> Thanks for the feedback!
> Any other thoughts or comments are welcome!
>
> Dan
>
>
> On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold 
> wrote:
>
>> This is a good idea and seems like useful functionality. In order to
>> accept it into commons, it needs solid documentation and excellent
>> test coverage. I've worked on code like this in another language (not
>> Java) and the production bugs were bad. E.g. what happens when a
>> string contains numbers as well as letters?
>>
>> I'd like to see a full spec that unambiguously defines how every
>> Unicode string is converted into camel/snake/kebab case. The spec
>> should be independent of the code. That's not easy to write but it's
>> essential.
>>
>> I don't want any loose/strict modes. It should all be strict according to
>> spec.
>>
>> I've never heard of kebab cases before. Is that a common name? I'd
>> also like to rename Pascal case. How many programmers under 40 have
>> even heard of Pascal, much less are familiar with its case
>> conventions?
>>
>> Long story short - a PR is premature until there's an agreed upon spec.
>>
>> On Tue, Aug 8, 2023 at 8:04 PM Daniel Watson 
>> wrote:
>> >
>> > I have a bit of code that adds the ability to pa

Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
I believe the convention is to take the *least* precise term and apply that
precision (here "precision" != "sigfigs" - I've been using both terms to
mean sigfigs, but for these purposes precision is actually defined as how
small a fraction the measurement is able to convey - e.g. 0.01 is more
precise than 1.1, despite the latter having more sigfigs).

The results should be...

12345 + 10.0 = 12355
12345 + 10 = 12355
12345 + 1 = 12346
12345 + 1.0 = 12346
12345 + 1.00 = 12346

None of these will have decimal places because the left term was not
precise enough to have them. When adding/subtracting you can end up with
more significant figures in your result than you had in one of your terms,
you just can't end up with a more "precise" result than the least precise
of your terms, e.g.

999.0 + 9.41 = 1008.4
4 sigfigs + 3 sigfigs = 5 sigfigs - It's perfectly fine that we ended up
with more here, as long as we didn't increase the "precision".

So in this case I think the correct logic is to add the two terms together
in the normal way, reduce the precision to that of the limiting term, and
then recalculate the number of significant figures on the result.

I believe that, conveniently, the BigDecimal class already tracks this as
scale(). So the information is available to determine the new precision. It
would just be a matter of retaining it within the wrapper class and
applying it when producing the final output string. I'd need to play around
with a few more examples, but I think that's the logic at a high level.
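
The addition rule above can be sketched with plain BigDecimal calls. This is only an illustration: the SigFigAddition class is hypothetical, HALF_UP rounding is an assumption, and it takes scale() at face value, so an input like "11000" is treated as precise to the units place.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical helper, for illustration only.
final class SigFigAddition {
    private SigFigAddition() {}

    // Add exactly, then cut the result back to the scale of the least precise term.
    static BigDecimal add(BigDecimal a, BigDecimal b) {
        int limitingScale = Math.min(a.scale(), b.scale());
        return a.add(b).setScale(limitingScale, RoundingMode.HALF_UP);
    }
}
```

For example, adding new BigDecimal("999.0") and new BigDecimal("9.41") gives 1008.4 (scale 1, precision 5), and adding 12345 and 1.00 gives 12346. The sigfig count of the result can then be recalculated from the rounded value.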

Dan

On Wed, Aug 9, 2023 at 2:08 PM Alex Herbert 
wrote:

> On Wed, 9 Aug 2023 at 17:13, Daniel Watson  wrote:
>
> > BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"))
>
> Multiply is easy as you take the minimum significant figures. What
> about addition?
>
> 12345 + 0.0001
>
> Here the significant figures should remain at 5.
>
> And for this:
>
> 12345 + 10.0
> 12345 + 10
> 12345 + 1
> 12345 + 1.0
> 12345 + 1.00
>
> You have to track the overlap of significant digits somehow.
>
> Alex
>
>
>


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
Ah I see what you were asking. Yes it is up to the human entering data to
understand that 1 has exactly one sigfig according to standard
convention. If you need it to have more then you must write it in full
scientific notation. Obviously, if a specific precision is required due to
some flaw in the dataset then the user could manually override the detected
sigfig count. But the assumption of the parsing logic is that the input
abides by the standard convention, which is well defined. I don't see it as
being much different than any other Number class expecting the input to
abide by a specific format. Conventions for SigFig counting are well
defined. It just so happens that most people don't often need them (but the
same could be said for o.a.c.numbers.Complex).

As far as exact calculations, if the user did:

BigSigFig result = new BigSigFig("1.1").multiply(new BigDecimal("2.54"));

I would expect the BigSigFig class should understand that BigDecimal has no
sigfig limit, and would retain its current minimum of 2. It would only
apply a new minimum in the case of operating against another BigSigFig...

BigSigFig result = new BigSigFig("1.1").multiply(new BigSigFig("2"));

The result of that should be a BigSigFig with an internal value of exactly
2.2 but would output as "2" to respect the new sigfig count. I think
something like that should be possible. In the end this is more of a
parsing / formatting exercise. The wrinkle is the tracking aspect, where we
need to dynamically reduce the sigfigs based on other operations. That's
where a wrapper class I think comes in handy.
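
A minimal sketch of such a wrapper, to show the tracking idea only. BigSigFig does not exist yet, so everything here is an assumption. In particular this version takes the sigfig count as an explicit constructor argument instead of parsing it from a String, and it rounds HALF_UP on output.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

// Hypothetical wrapper: keeps the exact value and the current sigfig limit side by side.
final class BigSigFig {
    private final BigDecimal value;
    private final int sigFigs;

    BigSigFig(BigDecimal value, int sigFigs) {
        if (sigFigs < 1) {
            throw new IllegalArgumentException("sigFigs must be at least 1");
        }
        this.value = value;
        this.sigFigs = sigFigs;
    }

    // Multiplying by another measured value keeps the smaller sigfig count.
    BigSigFig multiply(BigSigFig other) {
        return new BigSigFig(value.multiply(other.value), Math.min(sigFigs, other.sigFigs));
    }

    // A plain BigDecimal is treated as exact, so the sigfig count is unchanged.
    BigSigFig multiply(BigDecimal exact) {
        return new BigSigFig(value.multiply(exact), sigFigs);
    }

    BigDecimal exactValue() {
        return value;
    }

    int sigFigs() {
        return sigFigs;
    }

    // Rounding happens only on output; intermediate values stay exact.
    @Override
    public String toString() {
        return value.round(new MathContext(sigFigs, RoundingMode.HALF_UP)).toPlainString();
    }
}
```

With this sketch, 1.1 (2 sigfigs) times 2 (1 sigfig) holds 2.2 internally and prints as 2, while 1.1 (2 sigfigs) times the exact BigDecimal 2.54 holds 2.794 and prints as 2.8.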

Dan


On Wed, Aug 9, 2023 at 11:23 AM Alex Herbert 
wrote:

> On Wed, 9 Aug 2023 at 15:43, Daniel Watson  wrote:
> >
> > Hope that answers more questions than it creates!
>
> It does not address the issue of the last significant zero, e.g:
>
> 1 (4 sf)
> 1 (3 sf)
> 1 (2 sf)
>
> One way to solve this with standard parsing would be to use scientific
> notation:
>
> 1.000e4
> 1.00e4
> 1.0e4
>
> Note that for the example of inch to cm conversions the value 2.54
> cm/inch is exact. This leads to the issue that there should be a way
> to exclude some input from limiting the detection of the lowest
> significant figure (i.e. mark numbers as exact). This puts some
> responsibility on the provider of the data to follow a format; and
> some on the parser to know what fields to analyse.
>
> Alex
>
>
>


Re: [commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
Before I answer your questions - I'll say that looking at the commons-math
codebase it is apparent that it's focused on specific functional
computation, rather than util-like features. So I agree this probably
doesn't fit well there. I honestly did not know commons-numbers existed.
I'll check there and then either move this discussion there or commons-lang.

(I'll respond to your questions anyway just in case this ever comes up
again or anyone is curious)

The use case is reading of text data (e.g. CSV) where significant figures
are implied according to the standard rules. Data that is already typed to
a standard java Number would have no inherent significant figure tracking
and it cannot be reliably determined (for the reasons you mentioned). If
the data is represented in that fashion then sigfigs must be
provided/applied separately.

The significant figures of the input data are inherently "verified" because
scientific calculations of this nature are provided by humans (obviously
can't account for some forms of human error) and humans will know
the precision of their apparatus, and can communicate it using the standard
rules of sigfigs - If that's not the case then the user should not be using
this API. Because the input data is verified, the output data is also
"verified" as long as this logic is correct.

I don't believe there is a need for repeating special characters when a
number of significant figures is known. In the case of infinite precision,
the BigDecimal class already handles that. When significant figures are
known then something like 1000/3 can and should be reported as 300 (or in
scientific notation) because there is only a single significant figure in
that calculation. A repeating 3 would imply precision that does not exist.
(Admittedly I need to double check this. I know that for pure mathematical
values e.g. conversion from feet to inches, the conversion has infinite
precision. However as long as the initial measurement has a precision then
the output will also necessarily have that same precision). Intermediate
calculations can use infinite precision, which could be handled internally
via BigDecimal. But final results should be reported with proper sigfig
rules applied.

You are correct that "1" would not be the same as "1.000" and for clinical
/ scientific data this is known to be important. "1" implies 1 sigfig,
"1.000" implies 4. This is why the data most likely will be represented as
text.

Determining if the String is a number is simpler in this case I think?
Assuming decimal base (and potentially scientific notation) there are a
limited number of characters and syntax. isCreateable() attempts to handle
different bases as well as type qualifiers whereas this logic would be
restricted to decimal base and syntax. (theoretically I suppose you could
use a different bases, but scientific calculations are rarely, if ever,
carried out in anything other than decimal. Seems natural that they would
be out of scope).

As for a wrapped class, my initial thought (though I haven't worked out the
details) would be to extend BigDecimal and use its arithmetic logic.
Relevant methods would be overridden to ensure the sigfig subclass is
returned. There may be issues with that, I haven't fleshed it out.

Ultimately the initial goal would be to simply count the number of sigfigs
through some text util/parse method. The fact that sigfigs are normally
conveyed via textual representation means that many of the issues you might
encounter trying to derive them from pure numbers doesn't apply.

Hope that answers more questions than it creates!

Dan

On Wed, Aug 9, 2023 at 8:48 AM Alex Herbert 
wrote:

> Hi,
>
> On Wed, 9 Aug 2023 at 12:27, Daniel Watson  wrote:
>
> > This feature is necessary when working with scientific/clinical data
> which
> > was reported with significant figures in mind, and for which calculation
> > results must respect the sigfig count. As far as I could tell there is no
> > Number implementation which correctly respects this. e.g.
> >
> > "11000" has 2 significant figures,
> > "11000." has 5
> > ".11000" has 5
> > "11000.0" has 6
>
> This functionality is not in Commons AFAIK. Is the counting to accept
> a String input?
>
> Q. What is the use case that you would read data in text format and
> have to compute the significant figures? Or are you reading data in
> numeric format and computing the decimal significant figures of the
> base-2 data representation? Note: Differences between base-10 and
> base-2 representations can lead to an implementation that satisfies
> one use case and not others due to rounding conversions (see
> NUMBERS-199 [1]). I would advise against this and only support text
> input when referring to decimal significant figures.
>
> I presume you have i

[commons-math] function or Number class to count/track number of significant figures

2023-08-09 Thread Daniel Watson
I noticed there is not (or I could not find) a function within commons-math
to count the number of significant figures in a number string. I wrote a
function to do it and want to make sure I'm not missing something within
commons-math before submitting a PR.

This feature is necessary when working with scientific/clinical data which
was reported with significant figures in mind, and for which calculation
results must respect the sigfig count. As far as I could tell there is no
Number implementation which correctly respects this. e.g.

"11000" has 2 significant figures,
"11000." has 5
".11000" has 5
"11000.0" has 6

Other points:
* BigDecimal.precision() is not a substitute because it treats trailing whole
zeros as significant
* Floats, which can report scientific notation, are not a substitute when
calculations must be exact
* I've also considered extending BigDecimal to support tracking and
enforcing sigfigs. This would still require the function to initially count
them.
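
For reference, a minimal sketch of the counting rule behind the four examples above. It is not the function mentioned in this mail: the SigFigs class name is hypothetical, it only accepts plain decimal strings (no exponent), and it reports 0 for inputs such as "0" or "0.0", which a real implementation would need to decide on.

```java
// Hypothetical helper, for illustration only.
final class SigFigs {
    private SigFigs() {}

    // Counts significant figures in a plain decimal string such as "11000." or "0.0120".
    static int count(String text) {
        String s = text.trim();
        if (s.startsWith("+") || s.startsWith("-")) {
            s = s.substring(1);
        }
        if (!s.matches("\\d+\\.?\\d*|\\.\\d+")) {
            throw new NumberFormatException("Not a plain decimal number: " + text);
        }
        boolean hasPoint = s.indexOf('.') >= 0;
        String digits = s.replace(".", "");

        // Leading zeros are never significant.
        int start = 0;
        while (start < digits.length() && digits.charAt(start) == '0') {
            start++;
        }

        // Trailing zeros are significant only when a decimal point is written.
        int end = digits.length();
        if (!hasPoint) {
            while (end > start && digits.charAt(end - 1) == '0') {
                end--;
            }
        }
        return end - start;
    }
}
```

By comparison, new BigDecimal("11000").precision() returns 5, which is why precision() alone does not follow the convention.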

Is this appropriate for a PR? Or have I missed an existing feature?

Dan


Re: [commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-08 Thread Daniel Watson
Kebab case is extremely common for web identifiers, e.g. HTML element ids,
classes, attributes, etc.

In regards to PascalCase, I agree that most people won't understand the
reasoning behind the name, but it is nevertheless a widely accepted term
for that case style. If an alternative is deemed necessary then
"ProperCase" might work - since that is also how English proper nouns are
cased. Understanding that name just depends on your knowledge of English
grammar.

A spec can definitely be written for the 4 provided concrete
implementations. And... I may eat these words but... the spec should not be
all that complex. I will take a stab at it.

Thanks for the feedback!
Any other thoughts or comments are welcome!

Dan


On Tue, Aug 8, 2023, 7:45 PM Elliotte Rusty Harold 
wrote:

> This is a good idea and seems like useful functionality. In order to
> accept it into commons, it needs solid documentation and excellent
> test coverage. I've worked on code like this in another language (not
> Java) and the production bugs were bad. E.g. what happens when a
> string contains numbers as well as letters?
>
> I'd like to see a full spec that unambiguously defines how every
> Unicode string is converted into camel/snake/kebab case. The spec
> should be independent of the code. That's not easy to write but it's
> essential.
>
> I don't want any loose/strict modes. It should all be strict according to
> spec.
>
> I've never heard of kebab cases before. Is that a common name? I'd
> also like to rename Pascal case. How many programmers under 40 have
> even heard of Pascal, much less are familiar with its case
> conventions?
>
> Long story short - a PR is premature until there's an agreed upon spec.
>
> On Tue, Aug 8, 2023 at 8:04 PM Daniel Watson  wrote:
> >
> > I have a bit of code that adds the ability to parse and format strings
> into
> > various case patterns. Wanted to check if it's of worth and in-scope for
> > commons-text...
> >
> > It's a bit broader than the existing CaseUtils.toCamelCase(...) method.
> > Rather than simply formatting tokens into the case, this API adds the
> > additional goal of being able to transform one case to another. e.g.
> >
> > SnakeCase.format(PascalCase.parse("MyPascalString")); // returns My_Pascal_String
> > CamelCase.format(SnakeCase.parse("my_snake_string")); // returns mySnakeString
> > KebabCase.format(CamelCase.parse("myCamelString")); // returns my-Camel-String
> > // Note that kebab and snake do not alter the alphabetic case of the tokens,
> > // as they are essentially case agnostic joining, according to this
> > // implementation. Though this can be overridden by end users.
> >
> > The API has one core interface: Case, which has format and parse methods.
> > There is a single abstract implementation of it -
> AbstractConfigurableCase
> > - which is a configuration driven way to create a case pattern. It has
> > enough options to accommodate the 4 popular cases, and thus the
> subclasses
> > just have to configure these options rather than implement them directly.
> > Any further extensions can override or extend the api as necessary.
> >
> > There are five core concrete implementations:
> >
> > PascalCase
> > CamelCase (extends PascalCase)
> > DelimitedCase
> > KebabCase (extends DelimitedCase)
> > SnakeCase (extends DelimitedCase)
> >
> > Each has a static INSTANCE field to avoid redundant instantiation.
> >
> > Some of my reasoning / concerns...
> >
> > * I considered bundling all of this logic into static methods, similar to
> > CaseUtils, but that prevents the user from truly customizing or extending
> > the code for odd cases. This approach is, in my opinion, far easier to
> > understand, extend, and debug.
> > * I believe the parsing side should potentially have a loose / strict
> mode,
> > in that the logic can ignore non-critical rules on the parsing side. e.g.
> > the command CamelCase.parse("MyString") should work, even though the
> input
> > is not strictly camel case. Strict parsing would ensure (if possible)
> that
> > the input abides by all elements of the format.
> > * I'm still unsure about how best to handle reserved characters when
> > translating. e.g. How should
> > KebabCase.format(PascalCase.parse("MyPascal-String")) handle the hyphen?
> > Should the kebab case strip the reserved character from the token values?
> >
> > Long story short - is this worth pursuing in the form of a pull request
> for
> > review? Or is it out of scope for commons-text?
> >
> > Dan
>
>
>
> --
> Elliotte Rusty Harold
> elh...@ibiblio.org
>


[commons-text] Additional CaseUtils type functionality that can handle snake, kebab, camel, pascal, and others

2023-08-08 Thread Daniel Watson
I have a bit of code that adds the ability to parse and format strings into
various case patterns. Wanted to check if it's of worth and in-scope for
commons-text...

It's a bit broader than the existing CaseUtils.toCamelCase(...) method.
Rather than simply formatting tokens into the case, this API adds the
additional goal of being able to transform one case to another. e.g.

SnakeCase.format(PascalCase.parse("MyPascalString")); // returns My_Pascal_String
CamelCase.format(SnakeCase.parse("my_snake_string")); // returns mySnakeString
KebabCase.format(CamelCase.parse("myCamelString")); // returns my-Camel-String
// Note that kebab and snake do not alter the alphabetic case of the tokens,
// as they are essentially case agnostic joining, according to this
// implementation. Though this can be overridden by end users.

The API has one core interface: Case, which has format and parse methods.
There is a single abstract implementation of it - AbstractConfigurableCase
- which is a configuration driven way to create a case pattern. It has
enough options to accommodate the 4 popular cases, and thus the subclasses
just have to configure these options rather than implement them directly.
Any further extensions can override or extend the api as necessary.

There are five core concrete implementations:

PascalCase
CamelCase (extends PascalCase)
DelimitedCase
KebabCase (extends DelimitedCase)
SnakeCase (extends DelimitedCase)

Each has a static INSTANCE field to avoid redundant instantiation.
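
As a rough illustration of the shape described above, here is a heavily simplified sketch. It is not the proposed code: it omits AbstractConfigurableCase and DelimitedCase, the token type (a List of String) and the splitting rules are assumptions, and only ASCII-style upper/lower boundaries are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical core interface: parse text into tokens, format tokens into text.
interface Case {
    List<String> parse(String text);
    String format(List<String> tokens);
}

class PascalCase implements Case {
    static final PascalCase INSTANCE = new PascalCase();

    @Override
    public List<String> parse(String text) {
        // Assumption: a new token starts at every upper-case character.
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 1; i < text.length(); i++) {
            if (Character.isUpperCase(text.charAt(i))) {
                tokens.add(text.substring(start, i));
                start = i;
            }
        }
        if (!text.isEmpty()) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }

    @Override
    public String format(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            if (!token.isEmpty()) {
                sb.append(Character.toUpperCase(token.charAt(0)))
                  .append(token.substring(1).toLowerCase());
            }
        }
        return sb.toString();
    }
}

class CamelCase extends PascalCase {
    static final CamelCase INSTANCE = new CamelCase();

    @Override
    public String format(List<String> tokens) {
        // Same as Pascal case, except the very first character is lower case.
        String pascal = super.format(tokens);
        return pascal.isEmpty()
                ? pascal
                : Character.toLowerCase(pascal.charAt(0)) + pascal.substring(1);
    }
}

class SnakeCase implements Case {
    static final SnakeCase INSTANCE = new SnakeCase();

    @Override
    public List<String> parse(String text) {
        return new ArrayList<>(Arrays.asList(text.split("_")));
    }

    @Override
    public String format(List<String> tokens) {
        // Joins tokens as-is; the alphabetic case of each token is not altered.
        return String.join("_", tokens);
    }
}
```

With these definitions, SnakeCase.INSTANCE.format(PascalCase.INSTANCE.parse("MyPascalString")) yields My_Pascal_String and CamelCase.INSTANCE.format(SnakeCase.INSTANCE.parse("my_snake_string")) yields mySnakeString, matching the examples earlier in this message.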

Some of my reasoning / concerns...

* I considered bundling all of this logic into static methods, similar to
CaseUtils, but that prevents the user from truly customizing or extending
the code for odd cases. This approach is, in my opinion, far easier to
understand, extend, and debug.
* I believe the parsing side should potentially have a loose / strict mode,
in that the logic can ignore non-critical rules on the parsing side. e.g.
the command CamelCase.parse("MyString") should work, even though the input
is not strictly camel case. Strict parsing would ensure (if possible) that
the input abides by all elements of the format.
* I'm still unsure about how best to handle reserved characters when
translating. e.g. How should
KebabCase.format(PascalCase.parse("MyPascal-String")) handle the hyphen?
Should the kebab case strip the reserved character from the token values?

Long story short - is this worth pursuing in the form of a pull request for
review? Or is it out of scope for commons-text?

Dan


[Meta] gitlab error responses to mailing list

2023-08-06 Thread Daniel Watson
Does anyone else get GitLab error messages in response to emails sent to
this list (coming from supp...@cons3rt.com)? The messages have no
information as to the cause or resolution. I can't find any documentation
about it on the mailing list page.


Re: [commons-lang] Comments on new FunctionUtils / nested lambda feature

2023-08-06 Thread Daniel Watson
Yep that's correct. You can't get strong typing with varargs. Overloading
(yes, lazy) is how I handle it right now. I believe there are really only 2
methods that do anything significant to accomplish the goal, one that calls
a single nested Function, and one that calls a single nested BiConsumer.
The rest essentially just chain on top of those.

More thought could certainly be given to it. There may be other related use
cases I haven't encountered. And I've yet to need nesting beyond 3 levels,
but would probably offer methods that go a few levels beyond that, if only
because the cost is very little.
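
To illustrate the overloading approach, here is a minimal sketch in which each deeper overload delegates to the one below it. The class name NestedOverloads and the demo beans are hypothetical, and the null handling (return the default as soon as any intermediate value is null) is my assumption about the intended behaviour.

```java
import java.util.function.Function;

final class NestedOverloads {

    // Base case: one intermediate step. The default is returned when the
    // input or the intermediate value is null; a null final value is passed through.
    static <T, U, R> Function<T, R> nested(
            Function<T, U> first, Function<U, R> second, R defaultValue) {
        return t -> {
            U mid = (t == null) ? null : first.apply(t);
            return (mid == null) ? defaultValue : second.apply(mid);
        };
    }

    // Three functions: delegate to the two-function overload.
    static <T, U, V, R> Function<T, R> nested(
            Function<T, U> first, Function<U, V> second, Function<V, R> third,
            R defaultValue) {
        return nested(first, nested(second, third, defaultValue), defaultValue);
    }

    // Four functions: delegate to the three-function overload.
    static <T, U, V, W, R> Function<T, R> nested(
            Function<T, U> first, Function<U, V> second, Function<V, W> third,
            Function<W, R> fourth, R defaultValue) {
        return nested(first, nested(second, third, fourth, defaultValue), defaultValue);
    }

    // Hypothetical beans, only for the demo.
    static final class GrandChild {
        private final String name;
        GrandChild(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class Child {
        private final GrandChild grandChild;
        Child(GrandChild grandChild) { this.grandChild = grandChild; }
        GrandChild getGrandChild() { return grandChild; }
    }

    static final class Parent {
        private final Child child;
        Parent(Child child) { this.child = child; }
        Child getChild() { return child; }
    }

    public static void main(String[] args) {
        Function<Parent, String> name = nested(
                Parent::getChild, Child::getGrandChild, GrandChild::getName, "n/a");
        System.out.println(name.apply(new Parent(new Child(new GrandChild("Ann"))))); // Ann
        System.out.println(name.apply(new Parent(new Child(null)))); // n/a
        System.out.println(name.apply(new Parent(null)));            // n/a
    }
}
```

Each extra level costs one short overload, which is why going a few levels deeper is cheap.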

On Sun, Aug 6, 2023, 5:49 AM Rob Spoor  wrote:

> I don't think that function chaining with varargs works, except with
> UnaryOperator. After all, the output type of the first must be
> compatible with the input type of the second, the output type of the
> second must be compatible with the input type of the third, etc.
>
> If you want to continue this way, the best option would be to have some
> overloads:
>
>   Function nested(
>  Function first,
>  Function second)
>   Function nested(
>  Function first,
>  Function second,
>  R defaultValue)
>   Function nested(
>  Function first,
>  Function second,
>  Function third)
>   Function nested(
>  Function first,
>  Function second,
>  Function third,
>  R defaultValue)
>  ...
>
> If you're lazy you can delegate the overload with N functions to the
> overload with N-1 functions:
>
>   Function nested(
>  Function first,
>  Function second,
>  Function third,
>  Function fourth,
>  R defaultValue) {
>
>  return nested(first, nested(second, third, fourth),
>  defaultValue);
>  }
>
>
> Rob
>
>
> On 06/08/2023 01:28, Gary Gregory wrote:
> > I'm not sure the "nested" example API is quite what it should be, because
> > the last argument is the default value, you cannot make the input
> functions
> > a vararg, which seems very limiting. I should be able to use the same API
> > whether I need to go 1, 2, or N functions deep. I'm saying the above
> > independently of whether this type of code should be in Lang.
> >
> > Gary
> >
> > On Sat, Aug 5, 2023, 9:27 AM Daniel Watson  wrote:
> >
> >> Nice.
> >>
> >> Sounds like everyone is leaning towards "no". Would it be worth
> submitting
> >> a PR to include more usage examples - which I assume could also serve
> as a
> >> place to collect more feedback? Or just keep it within this thread given
> >> the way it's leaning? (or unless that consensus changes)
> >>
> >> Ultimately in my web/UI project the reduction (after using
> function(...))
> >> is something like...
> >>
> >> Failable.asFunction(Parent::getChild)
> >> .andThen(Optional::ofNullable)
> >> .andThen(o -> o.map(Child::getGrandChild))
> >> .andThen(o-> o.map(GrandChild::getName).orElse(defaultValue));
> >>
> >> vs my util method
> >>
> >> FunctionUtils.nested(Parent::getChild, Child::getGrandChild,
> >> GrandChild::getName, defaultValue);
> >>
> >> So it's still a big difference in clarity for me, given how often it's
> >> used.
> >> FWIW - My project is using Vaadin, and this util function is used to
> bind
> >> nested bean properties to Vaadin input fields. On that note - In
> addition
> >> to the bean "getter" binding, it also uses a similar util method to bind
> >> bean "setter" methods - because input fields obviously need access to
> both.
> >> The setter util call looks similar, with the last argument being
> >> a BiConsumer...
> >>
> >> FunctionUtils.nested(Parent::getChild, Child::getGrandChild,
> >> GrandChild::setName);
> >>
> >> Although in general this code does not reference any Vaadin specific
> >> functionality, the overall use case may be quite specific to those
> needs,
> >> so all of these utilities may be better suited to a utils class within a
> >> vaadin specific library.
> >>
> >> Dan
> >>
> >> On Fri, Aug 4, 2023 at 9:11 PM Gary Gregory 
> >> wrote:
> >>
> >>> The function() method is a great technique, it's now in Functions and
> >>> FailableFunction (git master).
> >>>
> >>> I'll see later if it can be used within La

Re: [commons-lang] Comments on new FunctionUtils / nested lambda feature

2023-08-05 Thread Daniel Watson
Nice.

Sounds like everyone is leaning towards "no". Would it be worth submitting
a PR to include more usage examples - which I assume could also serve as a
place to collect more feedback? Or just keep it within this thread given
the way it's leaning? (or unless that consensus changes)

Ultimately in my web/UI project the reduction (after using function(...))
is something like...

Failable.asFunction(Parent::getChild)
.andThen(Optional::ofNullable)
.andThen(o -> o.map(Child::getGrandChild))
.andThen(o -> o.map(GrandChild::getName).orElse(defaultValue));

vs my util method

FunctionUtils.nested(Parent::getChild, Child::getGrandChild,
GrandChild::getName, defaultValue);

So it's still a big difference in clarity for me, given how often it's used.
FWIW - My project is using Vaadin, and this util function is used to bind
nested bean properties to Vaadin input fields. On that note - In addition
to the bean "getter" binding, it also uses a similar util method to bind
bean "setter" methods - because input fields obviously need access to both.
The setter util call looks similar, with the last argument being
a BiConsumer...

FunctionUtils.nested(Parent::getChild, Child::getGrandChild,
GrandChild::setName);

Although in general this code does not reference any Vaadin specific
functionality, the overall use case may be quite specific to those needs,
so all of these utilities may be better suited to a utils class within a
vaadin specific library.
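
For the setter side, a minimal sketch of what such a util could look like follows. The names are hypothetical, and the choice to silently skip the write when an intermediate bean is null is an assumption; throwing or creating the missing bean would be equally defensible.

```java
import java.util.function.BiConsumer;
import java.util.function.Function;

final class NestedSetters {

    // One getter followed by a setter on the nested target.
    // Assumption: a null parent or null intermediate makes the call a no-op.
    static <T, U, V> BiConsumer<T, V> nested(
            Function<T, U> getter, BiConsumer<U, V> setter) {
        return (t, value) -> {
            U target = (t == null) ? null : getter.apply(t);
            if (target != null) {
                setter.accept(target, value);
            }
        };
    }

    // Two getters followed by a setter: delegate to the overload above.
    static <T, U, V, W> BiConsumer<T, W> nested(
            Function<T, U> first, Function<U, V> second, BiConsumer<V, W> setter) {
        return nested(first, nested(second, setter));
    }

    // Hypothetical beans, only for the demo.
    static final class GrandChild {
        private String name;
        String getName() { return name; }
        void setName(String name) { this.name = name; }
    }

    static final class Child {
        private final GrandChild grandChild;
        Child(GrandChild grandChild) { this.grandChild = grandChild; }
        GrandChild getGrandChild() { return grandChild; }
    }

    static final class Parent {
        private final Child child;
        Parent(Child child) { this.child = child; }
        Child getChild() { return child; }
    }

    public static void main(String[] args) {
        BiConsumer<Parent, String> setName = nested(
                Parent::getChild, Child::getGrandChild, GrandChild::setName);

        Parent full = new Parent(new Child(new GrandChild()));
        setName.accept(full, "Ann");
        System.out.println(full.getChild().getGrandChild().getName()); // Ann

        // No exception: the missing child is skipped.
        setName.accept(new Parent(null), "ignored");
    }
}
```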

Dan

On Fri, Aug 4, 2023 at 9:11 PM Gary Gregory  wrote:

> The function() method is a great technique, it's now in Functions and
> FailableFunction (git master).
>
> I'll see later if it can be used within Lang. I know I can use it in other
> projects.
>
> Wrt an API for a vararg of functions that implements chaining internally,
> I'm not so sure. I've thought I needed something like that in the past, but I've
> always ended up with other coding patterns I found better at the time for
> whatever reason.
>
> Gary
>
> Gary
>
> On Fri, Aug 4, 2023, 3:24 PM Gary Gregory  wrote:
>
> > Worth adding function(Function)? Seems low cost to add it to
> > FailableFunction.
> >
> > Gary
> >
> > On Fri, Aug 4, 2023, 2:04 PM Rob Spoor  wrote:
> >
> >> With just one simple utility method you can get all the chaining you
> want:
> >>
> >>  public static <T, R> Function<T, R> function(Function<T, R> func) {
> >>  return func;
> >>  }
> >>
> >> This doesn't look very useful, but it allows you to turn a method
> >> reference or lambda into a typed Function without needing a cast. After
> >> that it's really simple using what's provided in the Java API:
> >>
> >>  Function func = function(MyBean::getChild)
> >>  .andThen(Child::getName);
> >>
> >> You want a default value? Almost just as easy:
> >>
> >>  someFrameworkThing.setProperty(function(ParentBean::getChild)
> >>  .andThen(ChildBean::getName)
> >>  .andThen(Optional::ofNullable)
> >>  .andThen(o -> o.orElse("defaultName")));
> >>
> >>
> >> On 04/08/2023 16:04, Daniel Watson wrote:
> >> > Asking for comments and thoughts on a potential new feature. Already
> >> > developed in a commons-like style, but don't want to submit PR without
> >> > discussion as it may be considered out of scope or too use case
> >> specific.
> >> >
> >> > Justification and details...
> >> >
> >> > I've run into a scenario a few times where nested lambda functions
> would
> >> be
> >> > incredibly useful. e.g.
> >> >
> >> > MyBean::getChild::getName
> >> >
> >> > Obviously this is not a language feature, but can be simulated in a
> >> useful
> >> > way. So far my use has mostly been related to code that works with
> POJO
> >> > beans, and frameworks that use function references to understand those
> >> > beans and properties. Specifically useful where the context of the
> code
> >> > block is the parent entity, but you need to reference a child, and
> >> without
> >> > nested lambdas you end up with things like the below...
> >> >
> >> > ParentBean parentBean = new ParentBean();
> >> > parentBean.setChild(new ChildBean("name"));
> >> > //imagine that FrameworkThing is a generic class, and thus the generic
> >> type
> >> > is ParentBean
> >> > FrameworkThing someFrameworkThing = new FrameworkThing
> >> (ParentBean.class)
> &

Re: [commons-lang] Comments on new FunctionUtils / nested lambda feature

2023-08-04 Thread Daniel Watson
Appreciate the feedback. That's a great point. I missed the potential of
the andThen(...) method.

One minor thing to point out - My proposed purpose of the default value
parameter was not to substitute the final value if it is null, but to
substitute the final value if it cannot be obtained, due to the *parent* being
null. So in my example, it is the return value of getChild() that would be
null, and your code would fail with a NPE. To handle this using the
chaining approach I think would look something like:

function(ParentBean::getChild)

.andThen(Optional::ofNullable)
.andThen(o -> {

return o.map(ChildBean::getName).orElse("defaultName");

});

So overall it's similar to yours, you just need the .map() call to change
the optional type to match the final return type.
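
A small self-contained sketch of the difference, for anyone who wants to try it. The bean classes are hypothetical stand-ins, and function(...) is the identity helper quoted below, redeclared locally so the example compiles on its own.

```java
import java.util.Optional;
import java.util.function.Function;

final class OptionalChainDemo {

    // Hypothetical beans, only for the demo.
    static final class ChildBean {
        private final String name;
        ChildBean(String name) { this.name = name; }
        String getName() { return name; }
    }

    static final class ParentBean {
        private final ChildBean child;
        ParentBean(ChildBean child) { this.child = child; }
        ChildBean getChild() { return child; }
    }

    // Identity helper: gives a method reference a Function type so it can be chained.
    static <T, R> Function<T, R> function(Function<T, R> func) {
        return func;
    }

    // Calls getName() before wrapping, so a null child fails with an NPE.
    static final Function<ParentBean, String> EAGER = function(ParentBean::getChild)
            .andThen(ChildBean::getName)
            .andThen(Optional::ofNullable)
            .andThen(o -> o.orElse("defaultName"));

    // Wraps the child first, then maps inside the Optional, so a null child is safe.
    static final Function<ParentBean, String> NULL_SAFE = function(ParentBean::getChild)
            .andThen(Optional::ofNullable)
            .andThen(o -> o.map(ChildBean::getName).orElse("defaultName"));

    public static void main(String[] args) {
        ParentBean withChild = new ParentBean(new ChildBean("name"));
        ParentBean withoutChild = new ParentBean(null);

        System.out.println(NULL_SAFE.apply(withChild));    // name
        System.out.println(NULL_SAFE.apply(withoutChild)); // defaultName
        try {
            EAGER.apply(withoutChild);
        } catch (NullPointerException e) {
            System.out.println("NPE from the eager chain");
        }
    }
}
```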

That probably covers a lot of scenarios, however I still consider it a bit
tedious, and it becomes even more tedious if we nest it one level further
because the handling of null is now always an inline function. (I realize
that level of nesting might be rare. I personally have needed it, but I
understand that alone is not justification enough)

For my usage of it, it's still much clearer to see a util method call, with
method references, rather than chaining via andThen, because most uses need
to handle null, which means I'd still be stuck with inline functions
everywhere. In the end the biggest benefit of the util call is the clarity
of quickly knowing that the purpose is to retrieve a simple nested
property, which I don't think you can realistically get when having to
decipher a chain of functions and optionals.

Dan


On Fri, Aug 4, 2023 at 2:04 PM Rob Spoor  wrote:

> With just one simple utility method you can get all the chaining you want:
>
>  public static <T, R> Function<T, R> function(Function<T, R> func) {
>  return func;
>  }
>
> This doesn't look very useful, but it allows you to turn a method
> reference or lambda into a typed Function without needing a cast. After
> that it's really simple using what's provided in the Java API:
>
>  Function func = function(MyBean::getChild)
>  .andThen(Child::getName);
>
> You want a default value? Almost just as easy:
>
>  someFrameworkThing.setProperty(function(ParentBean::getChild)
>  .andThen(ChildBean::getName)
>  .andThen(Optional::ofNullable)
>  .andThen(o -> o.orElse("defaultName")));
>
>
> On 04/08/2023 16:04, Daniel Watson wrote:
> > Asking for comments and thoughts on a potential new feature. Already
> > developed in a commons-like style, but don't want to submit PR without
> > discussion as it may be considered out of scope or too use case specific.
> >
> > Justification and details...
> >
> > I've run into a scenario a few times where nested lambda functions would
> be
> > incredibly useful. e.g.
> >
> > MyBean::getChild::getName
> >
> > Obviously this is not a language feature, but can be simulated in a
> useful
> > way. So far my use has mostly been related to code that works with POJO
> > beans, and frameworks that use function references to understand those
> > beans and properties. Specifically useful where the context of the code
> > block is the parent entity, but you need to reference a child, and
> without
> > nested lambdas you end up with things like the below...
> >
> > ParentBean parentBean = new ParentBean();
> > parentBean.setChild(new ChildBean("name"));
> > //imagine that FrameworkThing is a generic class, and thus the generic
> type
> > is ParentBean
> > FrameworkThing someFrameworkThing = new FrameworkThing (ParentBean.class)
> > //but we need to get to a property of a child bean
> > someFrameworkThing.setProperty((parentBean) ->  {
> >
> > return parentBean.getChild().getName();
> >
> > });
> >
> > Obviously this could be handled with a getChildName() method on the
> parent
> > bean, but that has pitfalls as well (e.g. bean class cannot be changed,
> or
> > adding of properties interferes with other usage of the class e.g. JPA,
> > JAX).  However with a util class the second call can be reduced to
> > something like below, leaving the bean API untouched.
> >
> >
> someFrameworkThing.setProperty(FunctionUtils.nested(ParentBean::getChild,ChildBean::getName));
> >
> > Taken alone, that single reduction may seem trivial, but in a scenario
> > where these nested references are commonly needed, the reduction makes
> the
> > code clearer (In my opinion), as it is immediately apparent on a single
> > line of code that the reference is a simple nested property, rather than
> > having to interpret an 

[commons-lang] Comments on new FunctionUtils / nested lambda feature

2023-08-04 Thread Daniel Watson
Asking for comments and thoughts on a potential new feature. Already
developed in a commons-like style, but don't want to submit PR without
discussion as it may be considered out of scope or too use case specific.

Justification and details...

I've run into a scenario a few times where nested lambda functions would be
incredibly useful. e.g.

MyBean::getChild::getName

Obviously this is not a language feature, but can be simulated in a useful
way. So far my use has mostly been related to code that works with POJO
beans, and frameworks that use function references to understand those
beans and properties. Specifically useful where the context of the code
block is the parent entity, but you need to reference a child, and without
nested lambdas you end up with things like the below...

ParentBean parentBean = new ParentBean();
parentBean.setChild(new ChildBean("name"));
// imagine that FrameworkThing is a generic class, and thus the generic type
// is ParentBean
FrameworkThing<ParentBean> someFrameworkThing = new FrameworkThing<>(ParentBean.class);
// but we need to get to a property of a child bean
// (the lambda parameter must not reuse the name of the local variable above)
someFrameworkThing.setProperty((parent) ->  {

return parent.getChild().getName();

});

Obviously this could be handled with a getChildName() method on the parent
bean, but that has pitfalls as well (e.g. bean class cannot be changed, or
adding of properties interferes with other usage of the class e.g. JPA,
JAX).  However with a util class the second call can be reduced to
something like below, leaving the bean API untouched.

someFrameworkThing.setProperty(FunctionUtils.nested(ParentBean::getChild,ChildBean::getName));

Taken alone, that single reduction may seem trivial, but in a scenario
where these nested references are commonly needed, the reduction makes the
code clearer (In my opinion), as it is immediately apparent on a single
line of code that the reference is a simple nested property, rather than
having to interpret an inline lambda function. It also discourages errant
placement of code by avoiding the inline function (since the only purpose
of the lambda was to retrieve a single nested value). In addition, if
intermediate nulls need to be handled then the reduction becomes more
apparent, as the null checks can be handled in the util class rather than
cluttering the app code. e.g.

someFrameworkThing.setProperty(FunctionUtils.nested(ParentBean::getChild,ChildBean::getName,"defaultName"));
//or...
someFrameworkThing.setProperty(FunctionUtils.nested(ParentBean::getChild,ChildBean::getName,null));

The third parameter here is a String (typed generically based on the return
type of getName) and indicates the default value to be returned if the
first call to getChild() returns null. e.g. it replaces something like...

someFrameworkThing.setProperty((parentBean) ->  {

ChildBean cb = parentBean.getChild();
if(cb == null) return null; //or other default value
else return cb.getName();

});
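
For reference, a minimal sketch of how such a two-level util method could be written. This is an illustration under assumptions, not the code I would submit: the class name NestedSketch and the bean classes are hypothetical, and exception handling is left out entirely.

```java
import java.util.function.Function;

// Hypothetical beans, only for the demo.
class ChildBean {
    private final String name;
    ChildBean(String name) { this.name = name; }
    String getName() { return name; }
}

class ParentBean {
    private ChildBean child;
    ChildBean getChild() { return child; }
    void setChild(ChildBean child) { this.child = child; }
}

final class NestedSketch {

    // Returns defaultValue when the parent or the intermediate child is null.
    // A null result from the final getter is returned as-is.
    static <T, U, R> Function<T, R> nested(
            Function<T, U> first, Function<U, R> second, R defaultValue) {
        return parent -> {
            U child = (parent == null) ? null : first.apply(parent);
            return (child == null) ? defaultValue : second.apply(child);
        };
    }

    // Convenience overload: null is the default.
    static <T, U, R> Function<T, R> nested(
            Function<T, U> first, Function<U, R> second) {
        return nested(first, second, null);
    }

    public static void main(String[] args) {
        Function<ParentBean, String> childName =
                nested(ParentBean::getChild, ChildBean::getName, "defaultName");

        ParentBean parentBean = new ParentBean();
        System.out.println(childName.apply(parentBean)); // defaultName

        parentBean.setChild(new ChildBean("name"));
        System.out.println(childName.apply(parentBean)); // name
    }
}
```

The null check lives in one place, so call sites stay a single line of method references.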

Given that commons-lang aims to extend existing language features, this
seemed like a reasonable place for a nested lambda util class. So far my
concerns are...

   1. Does this feel too specific to an application to warrant inclusion in
   commons? (For me it has been useful enough to place into a common library,
   but commons-lang has a broader scope)
   2. If not commons-lang, is there some other commons library that this is
   more suited to?
   3. There are still wrinkles that may prove complex and potentially
   overly specific e.g. exception handling. Does that potential complexity
   make it not worth adding?
   4. Assuming the features discussed here *are* valuable, is handling only
   java.util.function.Function a complete-enough feature? Or is it useless unless it
   also attempts to handle BiFunctions - which become increasingly complex
   (potentially unfeasible) to implement - i.e. is it too big a feature to
   consider including?

If folks feel like this is a solid "no" let me know. If the devil is in the
details and we need to see the PR first I can do that as well.

Dan