On 15/06/18 15:00, ajs6f wrote:
On Jun 15, 2018, at 9:49 AM, Maxime Lefrançois <maxime.lefranc...@emse.fr> 
wrote:

In a nutshell this is what I was thinking about:

Use the standard Java Service Provider API to automatically load things 
found on the classpath:

- In TypeMapper --> a method that uses the Service Provider API to find more 
Datatypes

Should this be a method, or rather additional behavior for getTypeByName, etc.? Are you 
thinking of something like "void getMoreMappings()" which would check for more 
available datatypes?

Jena already uses ServiceLoader for initialization. Can this be used? Or is there some specific advantage to a separate one?

https://jena.apache.org/documentation/notes/system-initialization.html

and code in the custom initializer calls

TypeMapper.getInstance().registerDatatype(....)
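For illustration, classpath discovery of this kind could look roughly like the sketch below. It uses only java.util.ServiceLoader from the JDK; the DatatypeProvider interface and the DatatypeDiscovery class are hypothetical names made up for this example and are not part of Jena. In a real initializer, each discovered datatype would presumably be passed to TypeMapper.getInstance().registerDatatype(...).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

/**
 * Hypothetical SPI (not part of Jena). An extension jar would name its
 * implementation in a META-INF/services file for this interface.
 */
interface DatatypeProvider {
    /** Datatype IRIs this provider wants to have registered. */
    List<String> datatypeIris();
}

final class DatatypeDiscovery {
    private DatatypeDiscovery() {}

    /** Collects the IRIs offered by every provider found on the classpath. */
    static List<String> discover() {
        List<String> iris = new ArrayList<>();
        // ServiceLoader is Iterable; providers are instantiated lazily.
        for (DatatypeProvider provider : ServiceLoader.load(DatatypeProvider.class)) {
            iris.addAll(provider.datatypeIris());
        }
        return iris;
    }
}
```

With no provider on the classpath, discover() simply returns an empty list, so existing behaviour would not change unless an extension is present.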

- Datatype subclasses are not for just one URI, but could be for a set of URIs

Would that be true of Java types, as well?

NodeValue follows the rules for XSD arithmetic.


TypeMapper and NodeValue are not very connected. Types in jena-core don't have arithmetic or comparison.

Maybe there are two different contributions here - for TypeMapper/jena-core, and NodeValue/jena-arq.


- ValueSpaceClassification should not be an enum any more --> maybe use a class 
ValueSpace ...
- should add some interface like NodeValueComparator, with some methods like:
  - canCompare(ValueSpace vs1, ValueSpace vs2)
  - sameAs(NodeValue nv1, NodeValue nv2)
  - compare(NodeValue nv1, NodeValue nv2)

My goal here is to make sure that extensions can be done and that the additional flexibility does not impact the performance of the XPath/XQuery F&O evaluations.

What is the relation to QUDT? http://www.qudt.org

Before we get into the Java detail: I'd like to be sure what is being supported exactly? It can get a bit weird!

Is arithmetic involving numbers also going to be supported? (The playground says "no" for plus and "yes" for multiply - is that right? - but it does not follow XSD, so "1m*2" becomes 2.0m - integer -> decimal.)

Is cdt:length * cdt:length a cdt:area?

(we'll accept answers for Euclidean space :-)

Can the query cast? If so, how does it set the measurement scale?

A "cdt:cast(quantity, unit, datatype)" would be nice.


Should this return a Comparator<NodeValue> instead? (Thinking of sorting.)

In sorting, two NodeValues are always comparable, and it falls back to lexical form and datatype. Comparing by implicit value can get into unstable sorting. The playground says it is:

SELECT ?value {
VALUES ?value {2 4 "1 m/s2 "^^cdt:acceleration "3 m/s2 "^^cdt:acceleration }
} ORDER BY ?value

NB if  "1m/s" is "1 km/s" the accelerations don't sort

I think this is because of instability:

  1.5 < 2ft
  2ft < 1m

but

  1m < 1.5
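
To make the cycle concrete, here is a minimal sketch. It assumes (this is a guess at the mechanism, not a description of Jena's sort code) that two lengths are compared by value and that any other pair falls back to comparing lexical forms. Lit and MixedOrder are hypothetical names, and the lexical forms "2 ft" and "1 m" are written with a space.

```java
import java.util.Comparator;

/** Hypothetical literal: a lexical form plus, for lengths only, a value in metres. */
final class Lit {
    final String lex;
    final Double metres; // null when the literal is a plain number, not a length

    Lit(String lex, Double metres) {
        this.lex = lex;
        this.metres = metres;
    }
}

final class MixedOrder {
    private MixedOrder() {}

    /** Compares by value when both are lengths, otherwise by lexical form. */
    static final Comparator<Lit> BY_VALUE_ELSE_LEXICAL = (a, b) ->
        (a.metres != null && b.metres != null)
            ? Double.compare(a.metres, b.metres)
            : a.lex.compareTo(b.lex);
}
```

Under that assumption, "1.5" sorts before "2 ft" (lexical), "2 ft" before "1 m" (0.6096 m against 1 m), and "1 m" before "1.5" (lexical again), so the ordering is not transitive and a sort has no consistent answer.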

There are two forms of comparison: for the "<" operation and sorting.
Comparison must agree with sorting when comparison isn't an eval exception.


  - add(NodeValue nv1, NodeValue nv2)
  - subtract(NodeValue nv1, NodeValue nv2)
- in the NodeValue class, the methods sameAs(NodeValue nv1, NodeValue nv2) and 
compare(...) should use the Service Provider API to find NodeValueComparators 
on the classpath
- in the NodeValueOps class, the methods divisionNV(NodeValue nv1, NodeValue nv2), 
multiplicationNV(...), additionNV(...), and subtractionNV(...) should use the 
Service Provider API to find more NodeValueComparators on the classpath

One way is to have a new value space "VSPACE_EXT".

This is a NodeValueExt in ARQ with a method to return a handler for operations.

The extension provides NodeValueCDT and the code for the handler.

There is code at the end of NodeValue._setByValue to do a datatype->factory lookup, and the factory returns a NodeValueExt.

If either argument of a binary operator is "VSPACE_EXT", then the
NodeValueExt is used to get the custom evaluation operation.

The existing code remains as-is. The existing VSPACEs aren't converted to providers.

This is the extension mechanism - if an extension is seen, then extension code is called and it has to deal with the arguments - otherwise existing code is used as it is at the moment.
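
A rough sketch of that dispatch, under loud assumptions: Value, ExtValue, Quantity and Ops are hypothetical stand-ins written for this example, not the real NodeValue, NodeValueExt or NodeValueOps classes, and the unit handling is a toy that just joins unit strings.

```java
/** Hypothetical stand-in for ARQ's NodeValue; not the real class. */
class Value {
    final double number;

    Value(double number) { this.number = number; }

    /** True only for values that belong to the extension value space. */
    boolean isExtension() { return false; }
}

/** Stand-in for the proposed NodeValueExt: it supplies its own operator handler. */
abstract class ExtValue extends Value {
    ExtValue(double number) { super(number); }

    @Override boolean isExtension() { return true; }

    abstract Value multiply(Value left, Value right);
}

/** Toy quantity: a number plus a unit string. */
final class Quantity extends ExtValue {
    final String unit;

    Quantity(double number, String unit) { super(number); this.unit = unit; }

    private static String unitOf(Value v) {
        return (v instanceof Quantity) ? ((Quantity) v).unit : "";
    }

    @Override Value multiply(Value left, Value right) {
        String lu = unitOf(left);
        String ru = unitOf(right);
        // Toy rule: a plain number keeps the other operand's unit; two units are joined.
        String unit = lu.isEmpty() ? ru : (ru.isEmpty() ? lu : lu + "." + ru);
        return new Quantity(left.number * right.number, unit);
    }
}

final class Ops {
    private Ops() {}

    static Value multiply(Value a, Value b) {
        // If either argument is an extension value, its handler takes over.
        if (a.isExtension()) return ((ExtValue) a).multiply(a, b);
        if (b.isExtension()) return ((ExtValue) b).multiply(a, b);
        // Otherwise the existing code path runs unchanged.
        return new Value(a.number * b.number);
    }
}
```

The point of the shape is that the check costs one boolean test per operand, so plain XSD arithmetic should not pay for the extension point.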

Hm. Is there some way this could happen via a lookup in TypeMapper? I'd rather 
not see too many paths to the same service impls...

Any thoughts about this?

Yes: thank you so much for doing this excellent work!

+1

        Andy


Best regards,
Maxime Lefrançois



On Sat, 7 Apr 2018 at 15:13, ajs6f <aj...@apache.org> wrote:

We're (well, Andy is) working on 3.7.0 now. We've been trying to maintain
a 6-month or so release cadence, so you've hit a really good time to begin
this work. That having been said, I don't think anyone would say that we
are especially stringent about it, so I wouldn't worry too much about the
timing myself.

ajs6f

On Apr 6, 2018, at 9:36 AM, Maxime Lefrançois <maxime.lefranc...@emse.fr>
wrote:

Well,

I think I have a pretty clear idea how I would do this. We would end up
using a registry, like those for custom functions or datatypes.
That registry would contain an ordered list of SPARQL operator handlers,
pre-filled by one for handling XSD datatypes.
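
As a minimal sketch of that idea (OperatorHandler and OperatorRegistry are hypothetical names for this example only, operands are plain Java objects rather than NodeValues, and only addition is shown):

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical handler: returns empty when it does not apply to the operands. */
interface OperatorHandler {
    Optional<Object> add(Object left, Object right);
}

final class OperatorRegistry {
    // Ordered list: earlier handlers are consulted first.
    private final List<OperatorHandler> handlers = new CopyOnWriteArrayList<>();

    OperatorRegistry() {
        // Pre-filled with a stand-in for the XSD numeric handler.
        handlers.add((l, r) -> (l instanceof Number && r instanceof Number)
            ? Optional.<Object>of(((Number) l).doubleValue() + ((Number) r).doubleValue())
            : Optional.<Object>empty());
    }

    /** Appends an extension handler after the ones already registered. */
    void register(OperatorHandler handler) {
        handlers.add(handler);
    }

    /** Asks each handler in order; the first one that applies wins. */
    Object add(Object left, Object right) {
        for (OperatorHandler handler : handlers) {
            Optional<Object> result = handler.add(left, right);
            if (result.isPresent()) {
                return result.get();
            }
        }
        // Stands in for a SPARQL expression evaluation error.
        throw new IllegalArgumentException("no handler for these operands");
    }
}
```

Because the XSD handler comes first, existing numeric evaluation keeps its place at the head of the list and extensions are only consulted when it declines.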

I am currently requesting permission to sign the Apache Individual
Contributor License Agreement.

What would be the timeline if we wanted this shipped in the next release?

Best,
Maxime

On Tue, 3 Apr 2018 at 15:30, ajs6f <aj...@apache.org> wrote:

I agree. I can imagine plenty of use cases for such a powerful pair of
extension points.

Maxime, how can we help you attack that work? Is there a design that is
already clear to you? Are there any blockers we can help remove?

ajs6f

On Mar 28, 2018, at 5:08 AM, Rob Vesse <rve...@dotnetrdf.org> wrote:

I think work towards Option 2 would be the most valuable to the community.



The SPARQL specification allows for the overloading of any
operator/expression where the spec currently defines the evaluation to be
an error, so extending operators is a natural and valid extension point to
provide.



The Terms of Use for UCUM would probably need us to obtain a licensing
assessment from Apache Legal, as it is a non-standard OSS license even if
the code that implements it is under BSD (which is fine from an Apache
perspective).  Therefore having a well-defined extension mechanism, and
then having UCUM support live outside Apache Jena as an extension
implementation maintained by yourself, would be the easiest approach.



Rob



From: Maxime Lefrançois <maxime.lefranc...@emse.fr>
Reply-To: <dev@jena.apache.org>
Date: Wednesday, 28 March 2018 at 09:29
To: <dev@jena.apache.org>
Subject: Re: Contribution proposal for Jena: support of a datatype for
quantity values



Dear all,



Happy to see you are interested in the UCUM datatypes!



OK, so let's dive into the technical details.



# Compare Jena 3.6.0 and Jena 3.6.0-ucum





https://github.com/apache/jena/compare/master...OpenSensingCity:jena-3.6.0-ucum



# Modules, dependencies, licences



Two modules forked so far: jena-core and jena-arq.

One dependency added to jena-core (after a minor change I made today):



systems.uom:systems-ucum-java8:0.7.2

-> BSD license of systems-uom,

   and license of UCUM http://unitsofmeasure.org/trac/wiki/TermsOfUse



--> this does indeed use an implementation of JSR 363 - Units of Measurement API

(see attached for the transitive dependencies, all from
https://github.com/unitsofmeasurement )



# External module ?



I would have been happy to develop a separate extension of Jena for the
UCUM datatypes.

One of the main reasons why this is not possible was pointed out by
Andy:

I had to add a new value space VSPACE_QUANTITY to overload the SPARQL
operators '<>=' and arithmetic functions '+-*/'.



Indeed, there are two parts: the necessary extensions for operators, and
the units themselves.



We could choose some other unit system than UCUM, but UCUM is very
comprehensive and has different implementations in different programming
languages. It would be possible to implement UCUM datatypes in other
RDF-SPARQL engines.



# possible directions



I see three main possible directions of work there:



1. work on the proposal as is and potentially integrate it completely

2. work on jena-core and jena-arq to make the definition of new
datatypes and the overloading of operators as easy as the definition of
new custom functions --> so that I can easily implement UCUM datatypes as
an extension (and not a fork)

3. add VSPACE_QUANTITY value space and NodeValueQuantity in jena-arq,
and externalize the support for the UCUM systems of unit in an external
module



Best,

Maxime



On Tue, 27 Mar 2018 at 17:16, Andy Seaborne <a...@apache.org> wrote:

Extending the operators for SPARQL is a new value space
VSPACE_QUANTITY.

See (comparison):



https://github.com/OpenSensingCity/jena-ucum/blob/jena-3.6.0-ucum/jena-arq/src/main/java/org/apache/jena/sparql/expr/NodeValue.java#L566

and (multiply)



https://github.com/OpenSensingCity/jena-ucum/blob/jena-3.6.0-ucum/jena-arq/src/main/java/org/apache/jena/sparql/expr/nodevalue/NodeValueOps.java#L283

with a new NodeValueQuantity for javax.measure.Quantity

I'm seeing this as "one-dimensional units" - a quantity and a unit.

Even then, there are two parts - the necessary extensions for operators
and the units themselves to allow for other unit systems (?).

There are new dependencies in jena-arq and jena-core.

http://unitsofmeasurement.github.io/
JSR 363 - Units of Measurement API
BSD-license

and an old version of something is on central:

http://central.maven.org/maven2/javax/measure/unit-api/1.0

if that's the right thing.

---

Maxime - what are the dependencies for this contribution and for which
pieces are they needed?

   Andy

On 27/03/18 15:49, ajs6f wrote:
Bruno raises an interesting question: would this contribution have any
effect (or should it) on jena-spatial? Would it be either necessary or, if
not, appropriate to integrate there? (I'm particularly interested in this
because it might help decide between core and an extension.)


ajs6f

On Mar 26, 2018, at 5:40 PM, Bruno P. Kinoshita <ki...@apache.org>
wrote:

Hi Maxime,
Don't know whether it would be best as part of jena core or in an
extension, but sounds very interesting! Will let others comment on this.
At work, one item in my backlog is to replace jscience by jsr363 -
Units of Measurement

We use it for weather forecast and GIS, with things like wind speed,
rain amount, etc.
I think another GIS library that we use did the switch as well (some
OGC lib I think).
Perhaps it would be nice to consider taking a look at their api for
compatibility with other systems.
Cheers
Bruno


On Tue, 27 Mar 2018 at 2:07, Maxime Lefrançois <maxime.lefranc...@emse.fr> wrote:

Dear all,

I am an Associate Professor at MINES Saint-Étienne, France, working on
Semantic Web and Linked Data. I'd like to let you know about our project
*Custom Datatypes for Quantity Values* [1], which leverages the Unified Code
for Units of Measure (UCUM), a code system intended to include all units of
measures being contemporarily used in international science, engineering,
and business. Using our UCUM Datatypes, one can encode and query quantity
values in a lightweight manner:

PREFIX cdt: <http://w3id.org/lindt/custom_datatypes#>
PREFIX ex: <http://example.org/>

SELECT ?value1 ?value2 ?result
WHERE{
VALUES ( ?value1 ?value2 ) {
   ( "1.0 m/s"^^cdt:speed "2 s"^^cdt:time )
}
BIND( ?value1 * ?value2 AS ?result )
}

Results in


----------------------------------------------------------------
| value1               | value2          | result              |
================================================================
| "1.0 m/s"^^cdt:speed | "2 s"^^cdt:time | "2.0 m"^^cdt:length |
----------------------------------------------------------------

See our demonstration online [2].
It uses *a fork of Jena where we implemented UCUM datatypes* [3] (in
jena-core and jena-arq, with several unit tests). Our implementation uses
the recent JSR 385, Units of Measurement API 2.0, and the UCUM extension
[4].

This is not the first project I have developed in or with Jena.
- I forked it for "Supporting Arbitrary Custom Datatypes in RDF and SPARQL",
which fetches a JavaScript definition at the URI of the datatype [5]
- I develop SPARQL-Generate, an extension of SPARQL implemented on ARQ to
generate RDF from web documents in XML, JSON, CSV, HTML, CBOR, and plain
text with regular expressions [6]


If you agree with me that supporting UCUM datatypes would be a nice addition
to Apache Jena and a nice contribution to the Semantic Web community, I
would be willing to help integrate our contribution with other modules
(jena-tdb, ...), and to help maintain it in the future.

Best regards,
Maxime Lefrançois,
Associate Professor, MINES Saint-Étienne

[1] - http://w3id.org/lindt/custom_datatypes#
[2] - http://w3id.org/lindt/playground.html?example=05-Multiply
[3] - http://w3id.org/lindt/custom_datatypes#implementation
[4] - https://github.com/unitsofmeasurement/uom-systems/tree/master/ucum-java8
[5] - https://ci.mines-stetienne.fr/lindt/spec.html
[6] - https://ci.mines-stetienne.fr/sparql-generate/








