+1

> On 30. Jan 2019, at 15:31, Paul Davis <paul.joseph.da...@gmail.com> wrote:
>
> Jiffy preserves duplicate keys if it's not decoding into a map (in
> which case the last value for a duplicate key wins). It's very much a
> corner case and is supported by hardly any other JSON library,
> so changing that shouldn't be considered a breaking change in my
> opinion.
>
> On Wed, Jan 30, 2019 at 8:21 AM Mike Rhodes <couc...@dx13.co.uk> wrote:
>>
>> From what I recall Jiffy is able to cope with the valid-but-kinda-silly[1]
>> thing where you have multiple JSON keys with the same name, i.e.,
>> { "foo": 1, "foo": 2 }.
>>
>> Are the proposals on the table able to continue this support (or am I wrong
>> about Jiffy)?
>>
>> [1] https://tools.ietf.org/html/rfc8259#section-4, "The names within an
>> object SHOULD be unique.", though
>> https://tools.ietf.org/html/rfc7493#section-2.3 does sensibly close that
>> down.
>>
>> --
>> Mike.
>>
>> On Wed, 30 Jan 2019, at 13:33, Jan Lehnardt wrote:
>>>
>>>
>>>> On 30. Jan 2019, at 14:22, Jan Lehnardt <j...@apache.org> wrote:
>>>>
>>>> Thanks Ilya for getting this started!
>>>>
>>>> Two quick notes on this one:
>>>>
>>>> 1. Note that JSON does not guarantee object key order and that CouchDB has
>>>> never guaranteed it either, and with, say, emit(doc.foo, doc.bar), if either
>>>> emit() parameter was an object, the undefined sort order of SpiderMonkey
>>>> would mix things up. While worth bringing up, this is not a BC break.
>>>>
>>>> 2. This would have the fun property of being able to rename a key inside
>>>> all docs that have that key.
>>>
>>> …in one short operation.
>>>
>>> Best
>>> Jan
>>> --
>>>>
>>>> Best
>>>> Jan
>>>> --
>>>>
>>>>> On 30. Jan 2019, at 14:05, Ilya Khlopotov <iil...@apache.org> wrote:
>>>>>
>>>>> # First proposal
>>>>>
>>>>> In order to overcome FoundationDB limitations on key size (10 kB) and
>>>>> value size (100 kB) we could use the following approach.
>>>>>
>>>>> Below, the paths use slashes for illustration purposes only. We can
>>>>> use nested subspaces, tuples, directories or something else.
>>>>>
>>>>> - Store documents in a subspace or directory (to keep the prefix for a
>>>>> key short)
>>>>> - When we store the document we would enumerate all field names (0 and 1
>>>>> are reserved) and store the mapping table in a key which looks like:
>>>>> ```
>>>>> {DB_DOCS_NS} / {DOC_KEY} / 0
>>>>> ```
>>>>> - Flatten the JSON document (convert it into key-value pairs where the
>>>>> key is `JSON_PATH` and the value is `SCALAR_VALUE`)
>>>>> - Replace elements of `JSON_PATH` with integers from the mapping table we
>>>>> constructed earlier
>>>>> - When we have an array, use `1 / {array_idx}`
>>>>> - Store scalar values in keys which look like the following (we use
>>>>> `JSON_PATH` with integers).
>>>>> ```
>>>>> {DB_DOCS_NS} / {DOC_KEY} / {JSON_PATH}
>>>>> ```
>>>>> - If the scalar value exceeds 100 kB we would split it and store every
>>>>> part under a key constructed as:
>>>>> ```
>>>>> {DB_DOCS_NS} / {DOC_KEY} / {JSON_PATH} / {PART_IDX}
>>>>> ```
>>>>>
>>>>> Since all parts of the document are stored under a common `{DB_DOCS_NS}
>>>>> / {DOC_KEY}` they will be stored on the same server most of the time. The
>>>>> document can be retrieved by using a range query
>>>>> (`txn.get_range("{DB_DOCS_NS} / {DOC_KEY} / 0", "{DB_DOCS_NS} / {DOC_KEY}
>>>>> / 0xFF")`). We can reconstruct the document since the mapping is returned
>>>>> as well.
>>>>>
>>>>> The downside of this approach is that we wouldn't be able to ensure the
>>>>> same order of keys in the JSON object. Currently the `jiffy` JSON encoder
>>>>> respects the order of keys.
>>>>> ```
>>>>> 4> jiffy:encode({[{bbb, 1}, {aaa, 12}]}).
>>>>> <<"{\"bbb\":1,\"aaa\":12}">>
>>>>> 5> jiffy:encode({[{aaa, 12}, {bbb, 1}]}).
>>>>> <<"{\"aaa\":12,\"bbb\":1}">>
>>>>> ```
>>>>>
>>>>> Best regards,
>>>>> iilyak
>>>>>
>>>>> On 2019/01/30 13:02:57, Ilya Khlopotov <iil...@apache.org> wrote:
>>>>>> As you might already know, FoundationDB has a number of limitations
>>>>>> which influence the way we might store JSON documents. The limitations
>>>>>> are:
>>>>>>
>>>>>> | limitation            | recommended value | recommended max | absolute max |
>>>>>> |-----------------------|------------------:|----------------:|-------------:|
>>>>>> | transaction duration  |                   |                 |        5 sec |
>>>>>> | transaction data size |                   |                 |        10 MB |
>>>>>> | key size              |          32 bytes |            1 kB |        10 kB |
>>>>>> | value size            |                   |           10 kB |       100 kB |
>>>>>>
>>>>>> In order to fit the JSON document into 100 kB we would have to partition
>>>>>> it in some way. There are three ways of partitioning the document:
>>>>>> 1. store multiple binary blobs (parts) in different keys
>>>>>> 2. flatten the JSON structure and store every path leading to a scalar
>>>>>> value under its own key
>>>>>> 3. measure the size of different branches of a tree representing the
>>>>>> JSON document (while we parse) and use another key for the branch when
>>>>>> we are about to exceed the limit
>>>>>>
>>>>>> - The first approach is the simplest but it wouldn't allow us to access
>>>>>> parts of the document.
>>>>>> - The downsides of the second approach are:
>>>>>>   - the flattened JSON structure would have long paths, which means
>>>>>>     longer keys
>>>>>>   - a scalar value cannot be more than 100 kB (unless we split it as well)
>>>>>> - The third approach falls short in cases when the structure of the
>>>>>> document doesn't allow branches to be cut off cleanly:
>>>>>>   - it needs complex rules to handle all corner cases
>>>>>>
>>>>>> The goals of this thread are:
>>>>>> - to collect ideas on how to encode and store the JSON document
>>>>>> - to comment on the collected ideas
>>>>>>
>>>>>> Non-goals:
>>>>>> - the storage of metadata for the document, which would be discussed
>>>>>> elsewhere:
>>>>>>   - tombstones
>>>>>>   - edit conflicts
>>>>>>   - revisions
>>>>>>
>>>>>> Best regards,
>>>>>> iilyak
>>>>>>
>>>>
>>>> --
>>>> Professional Support for Apache CouchDB:
>>>> https://neighbourhood.ie/couchdb-support/
>>>>
>>>
>>>
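
For illustration, the flattening steps of the first proposal quoted above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not CouchDB code: the names `flatten`, `to_kv`, `VALUE_LIMIT`, `MAPPING_SLOT` and `ARRAY_MARKER` are hypothetical, plain Python tuples in a dict stand in for FoundationDB keys (no FoundationDB API is called), and scalars are serialized as JSON text, which the thread does not specify. Empty objects and arrays are not represented, and the thread does not say how a `{PART_IDX}` suffix is told apart from a nested path element, so the sketch leaves that open as well.

```python
import json

VALUE_LIMIT = 100_000  # value size limit quoted in the thread (100 kB)
MAPPING_SLOT = 0       # reserved id: key that stores the field-name mapping table
ARRAY_MARKER = 1       # reserved id: precedes an array index inside a path


def flatten(doc):
    """Return (mapping, rows): field name -> integer id, path tuple -> scalar."""
    mapping = {}
    rows = {}

    def field_id(name):
        # Ids start at 2 because 0 and 1 are reserved.
        if name not in mapping:
            mapping[name] = len(mapping) + 2
        return mapping[name]

    def walk(node, path):
        if isinstance(node, dict):
            for name, child in node.items():
                walk(child, path + (field_id(name),))
        elif isinstance(node, list):
            for idx, child in enumerate(node):
                walk(child, path + (ARRAY_MARKER, idx))
        else:
            rows[path] = node  # scalar: string, number, bool or None

    walk(doc, ())
    return mapping, rows


def to_kv(db_ns, doc_key, doc, limit=VALUE_LIMIT):
    """Build the key-value pairs for one document (keys are tuples here)."""
    mapping, rows = flatten(doc)
    prefix = (db_ns, doc_key)
    # {DB_DOCS_NS} / {DOC_KEY} / 0 holds the mapping table.
    kv = {prefix + (MAPPING_SLOT,): json.dumps(mapping).encode()}
    for path, scalar in rows.items():
        blob = json.dumps(scalar).encode()  # assumption: scalars stored as JSON
        if len(blob) <= limit:
            # {DB_DOCS_NS} / {DOC_KEY} / {JSON_PATH}
            kv[prefix + path] = blob
        else:
            # {DB_DOCS_NS} / {DOC_KEY} / {JSON_PATH} / {PART_IDX}
            for part_idx, start in enumerate(range(0, len(blob), limit)):
                kv[prefix + path + (part_idx,)] = blob[start:start + limit]
    return kv
```

With this sketch, `to_kv("docs", "doc1", {"foo": {"bar": [10, "x"]}, "baz": True})` produces one mapping entry under `("docs", "doc1", 0)` plus one entry per scalar; for instance `doc.foo.bar[0]` ends up under the key `("docs", "doc1", 2, 3, 1, 0)`. Because the mapping table is rebuilt per document and objects are walked without any ordering guarantee in the store, the original key order is not recoverable from the rows alone, which is the downside raised in the proposal.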