Re: New proposal for type definitions

Darrel Schneider Thu, 22 Dec 2016 16:43:33 -0800

The @refTypeId is hard to understand. It is unclear to me how it interacts
with other things like "dataType" and "subType". I think you can either
specify a dataType/subType OR a @refTypeId. Is this correct? The current
spec makes it look like you can specify both, but your examples show only
one or the other.


If so, wouldn't it be clearer to just allow one of the values of "dataType"
or "subType" to be "@nnnnn", where nnnnn is a number referring to an already
defined typeId?
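
A minimal sketch of that alternative, assuming a hypothetical in-memory map
of already defined type ids. The class name, the method names, the id 12345
and the "com.demo.Address" type are illustrative only and are not part of
Geode or of the proposal:

```java
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper, not a Geode API: interprets a "dataType"/"subType"
// value that is either a plain type name ("String") or a reference ("@12345").
class TypeSpec {

  // Returns the referenced type id, or empty when the value is a plain name.
  static OptionalInt referencedTypeId(String value) {
    if (!value.startsWith("@")) {
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(value.substring(1)));
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("Malformed type reference: " + value, e);
    }
  }

  // Resolves a value to a type name; a reference must point at a defined id.
  static String resolve(String value, Map<Integer, String> definedTypes) {
    OptionalInt id = referencedTypeId(value);
    if (!id.isPresent()) {
      return value;
    }
    String name = definedTypes.get(id.getAsInt());
    if (name == null) {
      throw new IllegalArgumentException("Unknown type id: " + value);
    }
    return name;
  }

  public static void main(String[] args) {
    // Assumed registry content, for illustration only.
    Map<Integer, String> defined = Map.of(12345, "com.demo.Address");
    System.out.println(resolve("String", defined));
    System.out.println(resolve("@12345", defined));
  }
}
```

With a single value like this there is no way to specify both a dataType and
a reference on the same field, which removes the ambiguity.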

Saying a field has to be of a specific type is much stronger than what pdx
currently supports. Pdx only has support for specific basic types and then
the generic "Object" type. If you plan on using the existing pdx registry,
then that type system would need to be expanded to deal with @refTypeId
fields.

Also, the "formatter" field seems like a new feature that is not described
in your proposal. You have a comment that says it applies to Dates and
Doubles, but it seems like your type syntax would allow you to specify it on
any field type.

On Thu, Dec 22, 2016 at 4:32 PM, Darrel Schneider <[email protected]>
wrote:

> Your proposal seems to only handle types for JSON. I do not see that you
> support all the pdx field types.
> Also you have things like List and "subType" which currently have no
> explicit support in the pdx type system.
>
> So do you intend this proposal to be specific to JSON? If so, the gfsh and
> APIs need to make this clear. If not, then your proposal should make sure it
> supports all the existing pdx types.
>
> On Thu, Dec 22, 2016 at 4:27 PM, Darrel Schneider <[email protected]>
> wrote:
>
>> Something I did not see in your proposal was the set of rules that would
>> be used, when a JSON document uses "@typeId", to determine if that type is
>> valid for the current document.
>> For example I think you want to allow the type to have a field that does
>> not exist in the document.
>> I think you also want to say that if the document has a field that does
>> not exist in the type then an exception is thrown.
>>
>> You may also have exceptions for when the document's field data cannot
>> be represented in the type's field. For example the type may say the field
>> is Boolean but the document may have a String whose value is "foobar".
>> Before, the field type was derived from the actual value in the document,
>> but now you can have a mismatch.
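
A minimal sketch of the validation rules guessed at above, assuming a type is
represented as a plain map from field name to expected Java class. The class
and method names are hypothetical and not part of Geode or of the proposal:

```java
import java.util.Map;

// Hypothetical validator, not a Geode API.
class DocumentValidator {

  // Checks a parsed JSON document against the fields declared by a type.
  static void validate(Map<String, Class<?>> typeFields, Map<String, Object> document) {
    for (Map.Entry<String, Object> entry : document.entrySet()) {
      Class<?> expected = typeFields.get(entry.getKey());
      if (expected == null) {
        // The document has a field that does not exist in the type.
        throw new IllegalArgumentException("Field not defined in type: " + entry.getKey());
      }
      Object value = entry.getValue();
      if (value != null && !expected.isInstance(value)) {
        // The value cannot be represented in the type's field.
        throw new IllegalArgumentException("Field " + entry.getKey() + " expects "
            + expected.getSimpleName() + " but got " + value.getClass().getSimpleName());
      }
    }
    // Fields declared by the type but absent from the document are allowed.
  }

  public static void main(String[] args) {
    Map<String, Class<?>> type = Map.of("name", String.class, "active", Boolean.class);
    validate(type, Map.of("name", "Udo")); // "active" is missing, which is fine
    try {
      validate(type, Map.of("active", "foobar")); // String where Boolean is declared
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```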
>>
>>
>> On Thu, Dec 22, 2016 at 4:16 PM, Darrel Schneider <[email protected]>
>> wrote:
>>
>>> One danger of this solution is users may think they can modify a
>>> previously defined type. Since they specify the type they may think they
>>> can just edit the file and reload the types with modified definitions. In
>>> most cases if data has already been serialized using the old type then
>>> modifying the type will lead to data that can no longer be deserialized.
>>>
>>> Are you thinking that these new user defined types would be loaded into
>>> the PDX registry and remembered? And if you later tried to reload the same
>>> type and it differs, the reload would fail? If so then I think this would
>>> keep users from making illegal changes.
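
A minimal sketch of that reload check, assuming a hypothetical in-memory
registry that remembers each definition as a list of "field:type" strings.
None of these names come from Geode or from the proposal:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical registry, not the PDX registry.
class TypeRegistrySketch {
  private final Map<Integer, List<String>> fieldsById = new HashMap<>();

  // Reloading an identical definition is a no-op; a differing one is rejected
  // so data serialized with the old definition stays readable.
  void register(int typeId, List<String> fields) {
    List<String> copy = List.copyOf(fields);
    List<String> existing = fieldsById.putIfAbsent(typeId, copy);
    if (existing != null && !existing.equals(copy)) {
      throw new IllegalStateException(
          "Type " + typeId + " is already defined as " + existing);
    }
  }

  public static void main(String[] args) {
    TypeRegistrySketch registry = new TypeRegistrySketch();
    registry.register(1, List.of("firstName:String", "surname:String"));
    registry.register(1, List.of("firstName:String", "surname:String")); // same, accepted
    try {
      registry.register(1, List.of("firstName:String")); // modified, rejected
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```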
>>>
>>> On Thu, Dec 22, 2016 at 4:11 PM, Darrel Schneider <[email protected]>
>>> wrote:
>>>
>>>> When generating a pdx type for a JSON document couldn't we sort the
>>>> field names from the JSON document so that field order would not generate
>>>> different pdx types?
>>>> Also when choosing a pdx field type if we always picked a "wider" type
>>>> then it would reduce the number of types generated because of different
>>>> field types.
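
A minimal sketch of both ideas, assuming a parsed JSON document is available
as a map of field name to Java value. The class, the method names and the
particular widening choices (all integers to "long", all floating point
values to "double") are illustrative assumptions, not existing pdx behavior:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not a Geode API.
class CanonicalTypeSketch {

  // Picks a "wider" field type so that, say, 41 and 41L map to the same type.
  static String widen(Object value) {
    if (value instanceof Byte || value instanceof Short
        || value instanceof Integer || value instanceof Long) {
      return "long";
    }
    if (value instanceof Float || value instanceof Double) {
      return "double";
    }
    if (value instanceof Boolean) {
      return "boolean";
    }
    if (value instanceof String) {
      return "String";
    }
    return "Object";
  }

  // Builds a type signature with field names in sorted order, so the order in
  // which fields appear in the document no longer matters.
  static String signature(Map<String, Object> document) {
    Map<String, String> sorted = new TreeMap<>();
    for (Map.Entry<String, Object> entry : document.entrySet()) {
      sorted.put(entry.getKey(), widen(entry.getValue()));
    }
    return sorted.toString();
  }

  public static void main(String[] args) {
    Map<String, Object> first = new LinkedHashMap<>();
    first.put("name", "Udo");
    first.put("age", 41);
    Map<String, Object> second = new LinkedHashMap<>();
    second.put("age", 41L);
    second.put("name", "Dan");
    // Different field order and different integer widths, same signature.
    System.out.println(signature(first).equals(signature(second)));
  }
}
```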
>>>>
>>>>
>>>> On Thu, Dec 22, 2016 at 10:02 AM, Udo Kohlmeyer <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi there Dan,
>>>>>
>>>>> You are correct, the thought is there to add a flag to the registry to
>>>>> indicate that a definition is custom and thus should not conflict with
>>>>> the existing ids. Even if the types were to be stored with the current
>>>>> Pdx type definitions, upon loading/registration of the custom type
>>>>> definitions, any conflict will be reported and the custom set will not
>>>>> be registered until all issues are addressed.
>>>>>
>>>>> I also had the opinion of "if they can provide me a typeId, then
>>>>> surely they can provide me with a fully populated JSON document".
>>>>> Referencing the example document from the wiki, a user can be created
>>>>> with just a first name and surname. It is not required to provide
>>>>> currentAddress, previousAddresses, dob, etc. Whilst one could force the
>>>>> client to provide all fields in the JSON document, it is not always
>>>>> possible or feasible to do so. In the POJO world we have a structured
>>>>> data definition and the generation of a type definition is simple. This
>>>>> is because, from a serialization perspective, we always make sure that
>>>>> all fields are serialized. BUT if we were to change the serialization,
>>>>> i.e. not serialize a field because it is null, the type definition
>>>>> behavior would be exactly the same as for JSON. Only, in this case,
>>>>> because we changed the type definition for the 'com.demo.User' object
>>>>> (at runtime), the deserialization step for previous versions would fail.
>>>>>
>>>>> I believe that if we were able to describe WHAT the structure of a JSON
>>>>> document should be and define the type according to that definition, we
>>>>> could improve performance (as we don't have to determine type
>>>>> definitions for every JSON document), be more flexible in consuming JSON
>>>>> documents that are only partially populated and, lastly, avoid
>>>>> generating a vast number of JSON-based type definitions.
>>>>>
>>>>> In addition to the JSON benefits, having a formal way of describing the
>>>>> type definitions will allow us to better maintain the currently
>>>>> registered type definitions. It would also allow customers/clients to
>>>>> create type definitions by hand if they were to lose their type
>>>>> registry.
>>>>>
>>>>> As a final thought, the addition of the external type registration
>>>>> process is not meant to replace the current behavior, but rather to
>>>>> enhance its capabilities. If no external types have been defined OR the
>>>>> client does not provide a '@typeId' tag, the current JSON type
>>>>> definition behavior will stay the same.
>>>>>
>>>>> --Udo
>>>>>
>>>>>
>>>>> On 12/21/16 18:20, Dan Smith wrote:
>>>>>
>>>>>> I'm assuming the type ids here are a different set than the type ids
>>>>>> used with regular PDX serialization, so they won't conflict if the pdx
>>>>>> registry assigns 1 to some class and a user puts @typeId: 1 in their
>>>>>> json?
>>>>>>
>>>>>> I'm concerned that this won't really address the type explosion issue.
>>>>>> Users that are able to go to the effort of adding these typeIds to all
>>>>>> of their json are probably users that can produce consistently
>>>>>> formatted json in the first place. Users that have inconsistently
>>>>>> formatted json are probably not going to want or be able to add these
>>>>>> type ids.
>>>>>>
>>>>>> It might be better for us to pursue a way to store arbitrary documents
>>>>>> that are self describing. Our current approach for json documents
>>>>>> assumes that the documents are all consistently formatted. We infer a
>>>>>> schema for their documents and store the field names in the type
>>>>>> registry and the field values in the serialized data. If we give people
>>>>>> the option to store and query self describing values, then users with
>>>>>> inconsistent json could just use that option and pay the extra storage
>>>>>> cost.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Tue, Dec 20, 2016 at 4:53 PM, Udo Kohlmeyer <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey there,
>>>>>>>
>>>>>>> I've just completed a new proposal on the wiki for a new mechanism
>>>>>>> that could be used to define a type definition for an object.
>>>>>>> https://cwiki.apache.org/confluence/display/GEODE/Custom+External+Type+Definition+Proposal+for+JSON
>>>>>>>
>>>>>>> Primarily the new type definition proposal will hopefully help with
>>>>>>> the "structuring" of JSON document definitions in a manner that will
>>>>>>> allow users to submit JSON documents for data types without the need
>>>>>>> to provide every field of the whole domain object type.
>>>>>>>
>>>>>>> Please review and comment as required.
>>>>>>>
>>>>>>> --Udo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>
