[ https://issues.apache.org/jira/browse/PHOENIX-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani reassigned PHOENIX-7330:
-------------------------------------

    Assignee: Viraj Jasani

> Introducing Binary JSON (BSON) with Complex Document structures in Phoenix
> --------------------------------------------------------------------------
>
>                 Key: PHOENIX-7330
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7330
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>
> The purpose of this Jira is to introduce a new data type in Phoenix, Binary
> JSON (BSON), to manage more complex document data structures in Phoenix.
> BSON, or Binary JSON, is a binary-encoded serialization of JSON-like
> documents.
> The BSON data type lets users store, update, and query part or all of a
> BsonDocument efficiently, without having to serialize/deserialize the whole
> document to/from binary format. BSON allows deserializing only part of the
> nested documents, so querying or indexing any attribute within the nested
> structure becomes more efficient, as the deserialization happens at runtime.
> Any other document structure would require deserializing the binary into the
> document and then performing the query.
> BSONSpec: [https://bsonspec.org/]
> JSON and BSON are closely related by design. BSON serves as a binary
> representation of JSON data, tailored with specialized extensions for wider
> application scenarios, and finely tuned for efficient data storage and
> traversal. Similar to JSON, BSON facilitates the embedding of objects and
> arrays.
>
> One particular way in which BSON differs from JSON is in its support for some
> more advanced data types. For instance, JSON does not differentiate between
> integers (whole numbers) and floating-point numbers (with decimal
> precision). BSON does distinguish between the two and stores them in the
> corresponding BSON data type (e.g. BsonInt32 vs BsonDouble).
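To illustrate the integer vs. double distinction described above, here is a minimal Python sketch that hand-encodes a flat document following the layout published at bsonspec.org. The helper names (`encode_element`, `encode_document`) are hypothetical and are not part of Phoenix or of any BSON library; only int32 and double values are handled, so this is an illustration of the wire format rather than a usable encoder.

```python
import struct


def encode_element(name, value):
    """Encode one (name, value) pair as a BSON element.

    Hypothetical helper for illustration: supports only Python int
    (written as BSON int32) and float (written as BSON double).
    """
    key = name.encode("utf-8") + b"\x00"  # element names are C strings
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to keep the sketch honest
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, int):
        # type byte 0x10 = 32-bit integer, little-endian
        # (struct.error is raised if the value does not fit in 32 bits)
        return b"\x10" + key + struct.pack("<i", value)
    if isinstance(value, float):
        # type byte 0x01 = 64-bit IEEE 754 double, little-endian
        return b"\x01" + key + struct.pack("<d", value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


def encode_document(fields):
    """Encode a flat dict as a BSON document.

    Layout: int32 total length, the elements, then a trailing 0x00.
    The length counts itself and the trailing null byte.
    """
    body = b"".join(encode_element(k, v) for k, v in fields.items())
    total = 4 + len(body) + 1
    return struct.pack("<i", total) + body + b"\x00"


if __name__ == "__main__":
    as_int = encode_document({"a": 1})
    as_double = encode_document({"a": 1.0})
    # The same numeric value is stored under different type bytes and widths.
    print(as_int.hex())
    print(as_double.hex())
```

The two encodings differ in both the type byte (0x10 vs 0x01) and the payload width (4 vs 8 bytes), which is the distinction that BsonInt32 and BsonDouble capture.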
> Many server-side programming languages offer advanced numeric data types
> (standard ones include integers, regular-precision floating-point numbers,
> i.e. "float", double-precision floating-point numbers, i.e. "double", and
> boolean values), each with its own optimal usage for efficient mathematical
> operations.
> Another key distinction between BSON and JSON is that BSON documents can
> include Date or Binary objects, which cannot be directly represented in pure
> JSON format. BSON also provides the ability to store and retrieve
> user-defined Binary objects. Likewise, by integrating advanced data
> structures like Sets into BSON documents, we can significantly enhance the
> capabilities of Phoenix for storing, retrieving, and updating Binary, Sets,
> Lists, and Documents as nested or complex data types.
> Moreover, the JSON format is both human- and machine-readable, whereas the
> BSON format is only machine-readable. Hence, as part of introducing the BSON
> data type, we also need to provide a user interface that lets users supply
> human-readable JSON as input for the BSON data type.
> This Jira also introduces access and update functions for BSON documents.
> BSON_CONDITION_EXPRESSION can evaluate a condition expression on the
> document fields, similar to how a WHERE clause evaluates a condition
> expression on various columns of the given row(s) for relational tables.
> BSON_UPDATE_EXPRESSION can perform one or more document field updates,
> similar to how UPSERT statements can update one or more columns of the given
> row(s) for relational tables.
> Overall, by combining BSON with functionality already available in Phoenix,
> such as secondary indexes, conditional updates, and high-throughput
> reads/writes, we can evolve Phoenix into a highly scalable document database.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)