> I also would be hesitant to cast everything to xs:integer, since our
> implementation backs that with java.math.BigInteger. I would guess there's
> a performance hit from switching from primitive types to BigInteger. Not
> sure if that would be enough to notice though, especially since DPath
> expressions aren't usually that common.

I agree promoting everything to BigInteger has performance implications I
don't like. These are all boxed numbers inside Daffodil, but still a
BigInteger is more expensive than a boxed primitive number.


> There's also the consideration that if we cast everything to xs:integer
> then we still need to downcast to the expected resulting type, e.g.:
>
> <element name="foo" type="xs:short"
>    dfdl:inputValueCalc="{ ../short * ../short }" />
>
> We could add an implicit downcast to the result of the expression, and
> maybe overflow is just considered an error in that case?

Whether we convert to xs:integer, convert to xs:int (Java style), or promote
to the next bigger size (int * int => long, byte * byte => short), you'd
still need to insert a downcast in this situation, where a short * short
result is going into an element of type short.

If we insert that automatically, then that would be compatible with
behavior today. If that downcast causes a runtime error, that's expected.
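
To make the implicit downcast concrete, here is a minimal Java sketch. It is
not Daffodil code: the class NarrowingDowncast and the helper toShortExact
are hypothetical names used only for illustration. The one assumption is
that the expression is evaluated at int width and the result is
range-checked before it goes into the xs:short element.

```java
final class NarrowingDowncast {
    // Hypothetical helper: narrow an int result to short, raising an error
    // instead of silently wrapping when the value does not fit.
    static short toShortExact(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException(
                "value " + value + " does not fit in xs:short");
        }
        return (short) value;
    }

    public static void main(String[] args) {
        short a = 100, b = 200;
        int product = a * b; // Java promotes short * short to int
        System.out.println(toShortExact(product)); // 20000 fits in a short

        short c = 300;
        try {
            toShortExact(c * c); // 90000 is larger than Short.MAX_VALUE
        } catch (ArithmeticException e) {
            System.out.println("runtime error: " + e.getMessage());
        }
    }
}
```

In-range results behave as they do today, and only an out-of-range final
value becomes a runtime error.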

The change in behavior would be in intermediate results inside an
expression. For example, take an expression like a * b + c where all three
are shorts. If a * b overflows a short, treating that intermediate result as
a short is incorrect behavior. We really do want a * b to create an int,
which is an incompatible change in behavior for the cases where a * b, or
the "+ c", overflows a short.

Though as incompatibilities go, I expect this one is very rarely hit.
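
As a rough illustration of the intermediate-result case, the Java sketch
below contrasts two models of a * b + c. Both are assumptions made for the
example, not a description of what Daffodil actually does: in the first,
every operator result must itself fit in a short; in the second,
intermediates are ints and only the final result is narrowed. All class and
method names are hypothetical.

```java
final class IntermediatePromotion {
    // Hypothetical range check used by both models.
    static short checkedShort(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException(
                "value " + value + " does not fit in xs:short");
        }
        return (short) value;
    }

    // Model 1 (assumed): each operator result is treated as an xs:short,
    // so an overflowing intermediate fails even if the final value fits.
    static short evalShortAtEachStep(short a, short b, short c) {
        short product = checkedShort(a * b);
        return checkedShort(product + c);
    }

    // Model 2: intermediates are computed as int; only the final result
    // is downcast to short.
    static short evalPromoted(short a, short b, short c) {
        return checkedShort(a * b + c);
    }

    public static void main(String[] args) {
        short a = 200, b = 200, c = -30000;
        // a * b is 40000, which overflows a short, but 40000 - 30000 fits.
        System.out.println(evalPromoted(a, b, c));
        try {
            evalShortAtEachStep(a, b, c);
        } catch (ArithmeticException e) {
            System.out.println("intermediate overflow: " + e.getMessage());
        }
    }
}
```

The two models only disagree when an intermediate overflows a short but the
final value does not, which is the rarely hit incompatibility.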

On Thu, Oct 2, 2025 at 8:54 AM Steve Lawrence <[email protected]> wrote:

> I couldn't find that phrasing about casting to an xs:integer in the spec.
> Maybe AI hallucinated?
>
> I did find this in the spec in Section B.1:
>
> > Note that type promotion is different from subtype substitution. For
> > example:
> >
> >     A function that expects a parameter $p of type xs:float can be
> >     invoked with a value of type xs:decimal. This is an example of type
> >     promotion. The value is actually converted to the expected type.
> >     Within the body of the function, $p instance of xs:decimal returns
> >     false.
> >
> >     A function that expects a parameter $p of type xs:decimal can be
> >     invoked with a value of type xs:integer. This is an example of
> >     subtype substitution. The value retains its original type. Within
> >     the body of the function, $p instance of xs:integer returns true.
>
> And here's the definition of subtype substitution:
>
> > [Definition: The use of a value whose dynamic type is derived from an
> > expected type is known as subtype substitution.] Subtype substitution
> > does not change the actual type of a value. For example, if an
> > xs:integer value is used where an xs:decimal value is expected, the
> > value retains its type as xs:integer.
>
> In the case of an xs:short being passed into a function that expects an
> xs:integer, that sounds like it would just be subtype substitution, so we
> would not cast the xs:short to an xs:integer, and inside the function the
> type is treated as an xs:short. But the spec isn't clear to me if that
> implies the result is also an xs:short or if that is cast to something. It
> feels like keeping it a short is very likely to run into
> overflow/underflow.
>
> I also would be hesitant to cast everything to xs:integer, since our
> implementation backs that with java.math.BigInteger. I would guess there's
> a performance hit from switching from primitive types to BigInteger. Not
> sure if that would be enough to notice though, especially since DPath
> expressions aren't usually that common.
>
> There's also the consideration that if we cast everything to xs:integer
> then we still need to downcast to the expected resulting type, e.g.:
>
> <element name="foo" type="xs:short"
>    dfdl:inputValueCalc="{ ../short * ../short }" />
>
> We could add an implicit downcast to the result of the expression, and
> maybe overflow is just considered an error in that case?
>
>
>
> On 2025-10-01 05:17 PM, Mike Beckerle wrote:
> > Ok, I looked at this and got some AI coaching....
> >
> > The phrase in the XPath spec says:
> >
> >    "If both operands are of type xs:integer or are derived from
> >    xs:integer, then the operands are cast to xs:integer and the result
> >    is an xs:integer."
> >
> > This is explicit about operands being derived from xs:integer in that
> > part, but when it says they are cast, it doesn't qualify that in any
> > way, so I think the right interpretation of this is that they are cast
> > to exactly the xs:integer type.
> >
> > ChatGPT agrees: "XPath and XQuery .. deliberately avoid proliferating
> > narrow integer subtypes in arithmetic results. Instead, the specification
> > says:
> >
> >     - For + - * div idiv mod, if both operands are subtypes of
> >       xs:integer, they are *promoted to xs:integer*, not kept at the
> >       narrower type.
> >
> >     - That way, all arithmetic on integer subtypes collapses to
> >       xs:integer."
> >
> > Now, admittedly, I wrote a bunch of that code, and my thought would not
> > have been to do that lazy thing of just casting everything to
> > xs:integer. Rather I would have wanted promotion to have been to the
> > least common supertype for addition and multiplication, and promotion to
> > just the larger of the two arg types for division and subtraction of
> > unsigned types. (Subtraction of signed types has to be treated like
> > addition).
> >
> > So probably if we just did this promotion right the problem wouldn't
> > occur.
> >
> > Certainly having short + short create short is a bug.
> >
> > I am wondering if I made the mistake of taking the *least upper bound*
> > of the arg types, not the least common supertype. The least upper bound
> > of X and X is, well, X.
> >
> > On Wed, Oct 1, 2025 at 2:18 PM Steve Lawrence <[email protected]> wrote:
> >
> >> I'm trying to fix https://issues.apache.org/jira/browse/DAFFODIL-2574.
> >> The core issue is that Java arithmetic operations return Int, even if,
> >> for example, you are adding two Shorts. Our DPath implementation doesn't
> >> expect that, and assumes xs:short + xs:short always results in an
> >> xs:short, that way it knows all the types at compile time and can put in
> >> appropriate conversions.
> >>
> >> I was looking through the XPath/XQuery spec to figure out what the
> >> correct behavior is, and it feels kind of ambiguous. It defines
> >> arithmetic functions like:
> >>
> >> op:numeric-add($arg1 as numeric, $arg2 as numeric) as numeric
> >>
> >> But it doesn't really say what the resulting numeric should be. It
> >> really just says
> >>
> >> op:operation(xs:integer, xs:integer)
> >>
> >> should return "xs:integer", but it's not completely clear if that's
> >> saying the result should be promoted to an xs:integer, or the result
> >> just should derive from xs:integer. The latter is my interpretation,
> >> suggesting we should not promote, and I think is what DPath intends.
> >>
> >> But that then has issues with underflow/overflow--what happens when a
> >> short + short doesn't fit into a short? Do we promote to an int? Do we
> >> error? The spec does say this regarding overflow/underflow:
> >>
> >> For xs:integer operations, implementations that support
> >> limited-precision integer operations ·must· select from the following
> >> options:
> >> They ·may· choose to always raise an error [err:FOAR0002].
> >> They ·may· provide an ·implementation-defined· mechanism that allows
> >> users to choose between raising an error and returning a result that is
> >> modulo the largest representable integer value. See [ISO 10967].
> >>
> >> So we could just detect overflow and error, but it feels like short/byte
> >> operations are likely to overflow. That might break usability, but it
> >> might detect cases people weren't expecting?
> >>
> >> Or we could do what Java does and just promote arithmetic operations to
> >> Int, which is likely to just do the right thing and not overflow. But
> >> that does mean you would likely need to add downcasts that might not be
> >> expected, e.g.
> >>
> >>     <element name="foo" type="xs:short"
> >>       dfdl:inputValueCalc="{ xs:short(../short1 + ../short2) }" />
> >>
> >> In order for DPath to work the way it does, I think we do need to make a
> >> compile time decision. I don't think DPath really wants to promote
> >> things at runtime to whatever type fits the arithmetic result and just
> >> assume everything is a Numeric. But I guess that could be an option too,
> >> and there just might be a little bit of runtime overhead to check types
> >> and arithmetic results.
> >>
> >> Thoughts?
> >>
> >
>
>
