Re: Value types supported by the MicroKernel

2012-03-22 Thread Stefan Guggisberg
On Wed, Mar 21, 2012 at 8:26 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Hi,

 The MicroKernel interface currently says:

 supported property types: string, number

that's outdated, it now says:  string, number, boolean


 In addition there's explicit support for binary blobs.

 For oak-core (see OAK-33 [1]) we'd need a bit more detailed definition
 of the supported value types. A few questions:

the microkernel as currently implemented does not interpret property
value types, it just stores the value token.


 1) Are boolean values (the true and false keywords in JSON) also
 supported? It would be nice if they were.

yes


 2) Are null values (the null keyword) supported? It would be nice if not.

setting a property to null (e.g. ^foo:null ) removes it (as in JCR).

however, adding a node with null valued properties is currently possible.
that should probably be fixed.


 3) Are arrays (i.e. [ ... ]) supported? If yes, are there
 constraints on element types? It would be nice if arrays indeed were
 supported. With the restriction that all elements are scalar values of
 the same type.

it does support arrays but doesn't enforce restrictions.

the microkernel should IMO just support the basic json model.
restrictions should be handled on a higher level.


 4) Are there limits on number values? Most notably, can a number
 that's larger than Long.MAX_VALUE be stored reliably (as a number)?
 Larger than Double.MAX_VALUE? What about things like Double.NaN? Or
 weirder, the other 2^51 NaN values allowed by IEEE754?

whatever you pass in the commit is stored and returned as-is.
as already mentioned, the value is (currently) not interpreted.
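
To illustrate why "stored and returned as-is" matters for large numbers, here is a minimal Python sketch. It is not MicroKernel code: store_opaque and store_as_double are hypothetical helpers, and the second one only models what a store that did interpret numbers as IEEE 754 doubles might do.

```python
# Hypothetical illustration, not part of any MicroKernel implementation.
LONG_MAX = 2**63 - 1  # same value as Java's Long.MAX_VALUE


def store_opaque(token: str) -> str:
    """Keep the JSON value token exactly as it was received."""
    return token


def store_as_double(token: str) -> str:
    """Model a store that parses the token into a double and re-serializes it."""
    return repr(float(token))


token = str(LONG_MAX + 2)        # a number just above Long.MAX_VALUE
opaque = store_opaque(token)     # identical to the input token
lossy = store_as_double(token)   # no longer the same character sequence
```

With an opaque store the token survives unchanged; once the value is interpreted as a double, the original digits can no longer be recovered.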


 5) Are there limits on string values? For example, can I expect to
 store a string value that's larger than 1MB? Larger than 1GB?

the heap is your limit ;)


 6) Presumably strings are in Unicode. Are characters beyond the BMP
 supported? Is it possible for a string value to contain a Unicode
 non-character, for example U+?

erm, haven't thought about that. do you think that there could be a problem?

cheers
stefan


 [1] https://issues.apache.org/jira/browse/OAK-33

 BR,

 Jukka Zitting


Re: Value types supported by the MicroKernel

2012-03-22 Thread Stefan Guggisberg
On Thu, Mar 22, 2012 at 4:22 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Hi,

 On Thu, Mar 22, 2012 at 3:04 PM, Stefan Guggisberg
 stefan.guggisb...@gmail.com wrote:
 On Wed, Mar 21, 2012 at 8:26 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 For oak-core (see OAK-33 [1]) we'd need a bit more detailed definition
 of the supported value types. A few questions:

 the microkernel as currently implemented does not interpret property
 value types, it just stores the value token.

 Can we turn that into a harder API contract, i.e. one that should hold
 true for any MK implementation? If not, we need to define the
 boundaries of what oak-core can expect the MK to store reliably. We
 don't need the answers right away, but ultimately this needs to be
 defined for us to properly resolve OAK-33.

ack and agreed. i am fine with specifying a harder api contract something
along the lines of 'values are treated as opaque character sequences'.


 A related question, will a value like "\u0058" be considered equal to
 "X" for example when merging concurrent changes? What about "[ ]" vs. "[]"
 (i.e. extra whitespace within an empty array)?

hmm, now it gets tricky... if the mk treats values as opaque character
sequences it would consider the above examples non-equal. we'll have to see
whether that's an issue. if it turns out to be an issue we'll have to provide
for smarter merge-logic and perhaps some basic normalization (e.g. removing
ws within arrays).
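
The difference between the two notions of equality can be sketched in a few lines of Python. The function names opaque_equal and parsed_equal are made up for this illustration; the first compares raw value tokens, the second compares the parsed JSON values.

```python
import json


def opaque_equal(a: str, b: str) -> bool:
    """Compare two JSON value tokens as plain character sequences."""
    return a == b


def parsed_equal(a: str, b: str) -> bool:
    """Compare two JSON value tokens after parsing them."""
    return json.loads(a) == json.loads(b)


escaped = '"\\u0058"'  # the JSON text "\u0058"
literal = '"X"'        # the JSON text "X"

same_as_text = opaque_equal(escaped, literal)     # False
same_as_value = parsed_equal(escaped, literal)    # True
arrays_as_text = opaque_equal("[ ]", "[]")        # False
arrays_as_value = parsed_equal("[ ]", "[]")       # True
```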


 2) Are null values (the null keyword) supported? It would be nice if not.

 setting a property to null (e.g. ^foo:null ) removes it (as in JCR).

 however, adding a node with null valued properties is currently possible.
 that should probably be fixed.

 OK, sounds good. So oak-core can treat a case like {foo:null} being
 returned from MicroKernel.getNodes() as an unexpected error, probably
 caused by an administrator directly accessing the MicroKernel.
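
A check of this kind could look roughly like the following Python sketch. It assumes the returned JSON represents child nodes as nested objects, and find_null_properties is a hypothetical helper, not an oak-core API.

```python
import json


def find_null_properties(node_json: str) -> list:
    """Return the paths of all null-valued properties in a node tree."""

    def walk(obj: dict, path: str) -> list:
        found = []
        for name, value in obj.items():
            child = f"{path}/{name}"
            if value is None:
                found.append(child)                # null property: unexpected
            elif isinstance(value, dict):
                found.extend(walk(value, child))   # descend into child node
        return found

    return walk(json.loads(node_json), "")


# Example input with one null at the top level and one in a child node.
offending = find_null_properties('{"foo":null,"bar":{"baz":null,"x":1}}')
```

A caller could treat a non-empty result as the unexpected error described above.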

 3) Are arrays (i.e. [ ... ]) supported? If yes, are there
 constraints on element types? It would be nice if arrays indeed were
 supported. With the restriction that all elements are scalar values of
 the same type.

 it does support arrays but doesn't enforce restrictions.

 the microkernel should IMO just support the basic json model.
 restrictions should be handled on a higher level.

 OK, cool.

 4) Are there limits on number values? Most notably, can a number
 that's larger than Long.MAX_VALUE be stored reliably (as a number)?
 Larger than Double.MAX_VALUE? What about things like Double.NaN? Or
 weirder, the other 2^51 NaN values allowed by IEEE754?

 whatever you pass in the commit is stored and returned as-is.
 as already mentioned, the value is (currently) not interpreted.

 OK. Can we get rid of the "(currently)" qualifier? Also, the "not
 interpreted" part needs to be qualified in terms of things like
 conflict merging. I'd be fine with "stored and used as an opaque,
 unparsed character sequence".

i agree with that wording.

cheers
stefan


 Basically this means that in oak-core or -jcr we need to either a) not
 support all possible Java double values or b) use some custom string
 encoding for such values. I suppose a) is the better alternative, but
 we need to check whether that will cause problems down the line.
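
The two alternatives can be sketched in Python as follows. JSON has no token for NaN or infinity, so a strict serializer has to reject them (option a) or wrap them in a string (option b). Both function names and the "dbl:" prefix are invented for this example.

```python
import json
import math


def encode_double_strict(value: float) -> str:
    """Option a): only accept doubles that JSON can express as numbers."""
    # allow_nan=False makes json.dumps raise ValueError for NaN and +/-inf.
    return json.dumps(value, allow_nan=False)


def encode_double_tagged(value: float) -> str:
    """Option b): hypothetical string encoding for non-finite doubles."""
    if math.isfinite(value):
        return json.dumps(value)
    # The "dbl:" prefix is an assumption made for this sketch only.
    return json.dumps("dbl:" + repr(value))


finite = encode_double_strict(1.5)
tagged_nan = encode_double_tagged(float("nan"))
```

Note that a tag like this does not preserve the payload bits that distinguish one NaN from another.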

 6) Presumably strings are in Unicode. Are characters beyond the BMP
 supported? Is it possible for a string value to contain a Unicode
 non-character, for example U+?

 erm, haven't thought about that. do you think that there could be a problem?

 Like with the Double.NaN case, unless the underlying MicroKernel
 supports the full range of characters, we need to consider adding some
 extra level of encoding (possibly simply the \u escapes) on the
 oak-core or -jcr level. But it sounds like this won't be needed.
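
As a rough Python illustration of the \u escape approach: escape_non_ascii is a hypothetical helper, and U+1F600 and U+FFFE are sample code points chosen for this sketch (one beyond the BMP, one non-character), not values taken from the thread.

```python
import json


def escape_non_ascii(value: str) -> str:
    """Serialize a string as JSON using only ASCII, with \\u escapes."""
    return json.dumps(value, ensure_ascii=True)


beyond_bmp = "\U0001F600"   # a code point outside the BMP
non_character = "\ufffe"    # a Unicode non-character

encoded_pair = escape_non_ascii(beyond_bmp)      # written as a surrogate pair
encoded_nonchar = escape_non_ascii(non_character)

# Parsing the escaped form gives back the original string.
round_trip = json.loads(encoded_pair)
```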

 BR,

 Jukka Zitting


Re: Value types supported by the MicroKernel

2012-03-22 Thread Jukka Zitting
Hi,

On Thu, Mar 22, 2012 at 4:53 PM, Stefan Guggisberg
stefan.guggisb...@gmail.com wrote:
 On Thu, Mar 22, 2012 at 4:22 PM, Jukka Zitting jukka.zitt...@gmail.com wrote:
 Can we turn that into a harder API contract, i.e. one that should hold
 true for any MK implementation? If not, we need to define the
 boundaries of what oak-core can expect the MK to store reliably. We
 don't need the answers right away, but ultimately this needs to be
 defined for us to properly resolve OAK-33.

 ack and agreed. i am fine with specifying a harder api contract something
 along the lines of 'values are treated as opaque character sequences'.

Sounds good.

 A related question, will a value like "\u0058" be considered equal to
 "X" for example when merging concurrent changes? What about "[ ]" vs. "[]"
 (i.e. extra whitespace within an empty array)?

 hmm, now it gets tricky... if the mk treats values as opaque character
 sequences it would consider the above examples non-equal. we'll have to see
 whether that's an issue. if it turns out to be an issue we'll have to provide
 for smarter merge-logic and perhaps some basic normalization (e.g.
 removing ws within arrays).

I suppose we can handle that on the oak-core level, i.e. make sure
that any values sent to the MK are always in normalized form.
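
A minimal sketch of such a normalization step, in Python and under the assumption that the canonical form is "compact separators, no unnecessary escapes". The normalize function is hypothetical; a real oak-core implementation would have to define its own canonical form.

```python
import json


def normalize(token: str) -> str:
    """Re-serialize a JSON value token in one canonical textual form."""
    value = json.loads(token)
    # Compact separators remove optional whitespace; ensure_ascii=False
    # writes characters literally instead of as \u escapes.
    return json.dumps(value, separators=(",", ":"), ensure_ascii=False)


normalized_escape = normalize('"\\u0058"')
normalized_empty = normalize("[ ]")
normalized_list = normalize("[1, 2]")
```

One caveat of this approach is that it parses numbers, so it is unsuitable as-is for number tokens that must be preserved digit for digit.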

BR,

Jukka Zitting