I couldn't find that phrasing about casting to an xs:integer in the spec. Maybe AI hallucinated?

I did find this in the spec in Section B.1:

Note that type promotion is different from subtype substitution. For example:

    A function that expects a parameter $p of type xs:float can be invoked with 
a value of type xs:decimal. This is an example of type promotion. The value is 
actually converted to the expected type. Within the body of the function, $p 
instance of xs:decimal returns false.

    A function that expects a parameter $p of type xs:decimal can be invoked 
with a value of type xs:integer. This is an example of subtype substitution. 
The value retains its original type. Within the body of the function, $p 
instance of xs:integer returns true.

And here's the definition of subtype substitution:

[Definition: The use of a value whose dynamic type is derived from an expected 
type is known as subtype substitution.] Subtype substitution does not change 
the actual type of a value. For example, if an xs:integer value is used where 
an xs:decimal value is expected, the value retains its type as xs:integer.

In the case of an xs:short being passed into a function that expects an xs:integer, that sounds like it would just be subtype substitution, so we would not cast the xs:short to an xs:integer, and inside the function the value is still treated as an xs:short. But it isn't clear to me from the spec whether that implies the result is also an xs:short or whether it is cast to something else. Keeping it a short feels very likely to run into overflow/underflow.

I also would be hesitant to cast everything to xs:integer, since our implementation backs that with java.math.BigInteger. I would guess there's a performance hit from switching from primitive types to BigInteger. Not sure if that would be enough to notice though, especially since DPath expressions aren't usually that common.

There's also the consideration that if we cast everything to xs:integer then we still need to downcast to the expected resulting type, e.g.:

<element name="foo" type="xs:short"
  dfdl:inputValueCalc="{ ../short * ../short }" />

We could add an implicit downcast to the result of the expression, and maybe overflow is just considered an error in that case?
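
As a rough sketch of what that implicit downcast could look like (plain Java, not Daffodil code; DowncastSketch, toShortExact, and multiplyShorts are names made up for this example), the arithmetic is done in a wider type and the result is range-checked before narrowing:

```java
final class DowncastSketch {
    // Hypothetical helper: narrow an int to short, treating overflow as an error
    // instead of silently wrapping.
    static short toShortExact(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            throw new ArithmeticException("value does not fit in xs:short: " + value);
        }
        return (short) value;
    }

    // Java widens both operands to int before multiplying, and the product of
    // two shorts always fits in an int, so only the final narrowing can fail.
    static short multiplyShorts(short a, short b) {
        return toShortExact(a * b);
    }

    public static void main(String[] args) {
        // 100 * 200 = 20000 fits in a short.
        System.out.println(multiplyShorts((short) 100, (short) 200));
        try {
            // 300 * 300 = 90000 does not fit in a short.
            multiplyShorts((short) 300, (short) 300);
        } catch (ArithmeticException e) {
            System.out.println("error: " + e.getMessage());
        }
    }
}
```

Whether the error case should be a processing error or something else is still an open question here.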



On 2025-10-01 05:17 PM, Mike Beckerle wrote:
Ok, I looked at this and got some AI coaching....

The phrase in the XPath spec says:

   "If both operands are of type xs:integer or are derived from xs:integer,
then the operands are cast to xs:integer and the result is an xs:integer."

This is explicit about operands being derived from xs:integer in that part,
but when it says they are cast, it doesn't qualify that in any way, so I
think the right interpretation of this is that they are cast to exactly the
xs:integer type.

ChatGPT agrees:  " XPath and XQuery .. deliberately avoid proliferating
narrow integer subtypes in arithmetic results. Instead, the specification
says:

    - For + - * div idiv mod, if both operands are subtypes of xs:integer,
      they are *promoted to xs:integer*, not kept at the narrower type.

    - That way, all arithmetic on integer subtypes collapses to xs:integer."

Now, admittedly, I wrote a bunch of that code, and my thought would not
have been to do that lazy thing of just casting everything to xs:integer.
Rather I would have wanted promotion to have been to the least common
supertype for addition and multiplication, and promotion to just the larger
of the two arg types for division and subtraction of unsigned types.
(Subtraction of signed types has to be treated like addition.)

So probably if we just did this promotion right the problem wouldn't
occur.

Certainly having short + short create short is a bug.

I am wondering if I made the mistake of taking the *least upper bound* of
the arg types, not least common supertype. The least upper bound of X and X
is, well, X.

On Wed, Oct 1, 2025 at 2:18 PM Steve Lawrence wrote:

I'm trying to fix https://issues.apache.org/jira/browse/DAFFODIL-2574. The core issue is that Java arithmetic operations return Int even if, for example, you are adding two Shorts. Our DPath implementation doesn't expect that, and assumes xs:short + xs:short always results in an xs:short; that way it knows all the types at compile time and can put in appropriate conversions.
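
For illustration, a minimal Java sketch of that promotion (PromotionSketch and its methods are made-up names for this example, not Daffodil code):

```java
final class PromotionSketch {
    // Binary numeric promotion widens both short operands to int before the
    // addition, so the expression a + b has static type int, not short.
    static int addShorts(short a, short b) {
        return a + b;
    }

    // An explicit narrowing cast compiles, but it silently wraps on overflow.
    static short addShortsWrapped(short a, short b) {
        return (short) (a + b);
    }

    public static void main(String[] args) {
        short max = Short.MAX_VALUE; // 32767
        short one = 1;
        System.out.println(addShorts(max, one));        // 32768, as an int
        System.out.println(addShortsWrapped(max, one)); // -32768, wrapped around
    }
}
```

Writing `short s = a + b;` without the cast does not compile, which is the same mismatch DPath runs into when it assumes the result is a short.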

I was looking through the XPath/XQuery spec to figure out what the correct behavior is, and it feels kind of ambiguous. It defines arithmetic functions like:

op:numeric-add($arg1 as numeric, $arg2 as numeric) as numeric

But it doesn't really say what the resulting numeric should be. It really just says

op:operation(xs:integer, xs:integer)

should return "xs:integer", but it's not completely clear if that's saying the result should be promoted to an xs:integer, or that the result should just derive from xs:integer. The latter is my interpretation, suggesting we should not promote, and I think that is what DPath intends.

But that then has issues with underflow/overflow: what happens when a short + short doesn't fit into a short? Do we promote to an int? Do we error? The spec does say this regarding overflow/underflow:

For xs:integer operations, implementations that support limited-precision integer operations ·must· select from the following options:

    They ·may· choose to always raise an error [err:FOAR0002].

    They ·may· provide an ·implementation-defined· mechanism that allows users to choose between raising an error and returning a result that is modulo the largest representable integer value. See [ISO 10967].

So we could just detect overflow and error, but it feels like short/byte operations are likely to overflow. That might hurt usability, but it might also detect cases people weren't expecting.

Or we could do what Java does and just promote arithmetic operations to Int, which is likely to just do the right thing and not overflow. But it does mean you would likely need to add downcasts that might not be expected, e.g.:

    <element name="foo" type="xs:short"
      dfdl:inputValueCalc="{ xs:short(../short1 + ../short2) }" />

In order for DPath to work the way it does, I think we do need to make a compile-time decision. I don't think DPath really wants to promote things at runtime to whatever type fits the arithmetic result and just assume everything is a Numeric. But I guess that could be an option too, and there might just be a little bit of runtime overhead to check types and arithmetic results.

Thoughts?


