[ 
https://issues.apache.org/jira/browse/PHOENIX-7330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viraj Jasani reassigned PHOENIX-7330:
-------------------------------------

    Assignee: Viraj Jasani

> Introducing Binary JSON (BSON) with Complex Document structures in Phoenix
> --------------------------------------------------------------------------
>
>                 Key: PHOENIX-7330
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7330
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>
> The purpose of this Jira is to introduce a new data type in Phoenix, Binary 
> JSON (BSON), to manage more complex document data structures in Phoenix.
> BSON, or Binary JSON, is a binary-encoded serialization of JSON-like 
> documents. The BSON data type lets users store, update, and query part or 
> all of a BsonDocument efficiently, without having to serialize/deserialize 
> the whole document to/from binary format. BSON allows deserializing only 
> part of a nested document at runtime, which makes querying or indexing 
> attributes within the nested structure more efficient. Any other document 
> structure would require deserializing the entire binary into a document 
> before performing the query.
> BSONSpec: [https://bsonspec.org/]
> JSON and BSON are closely related by design. BSON serves as a binary 
> representation of JSON data, tailored with specialized extensions for wider 
> application scenarios, and finely tuned for efficient data storage and 
> traversal. Similar to JSON, BSON facilitates the embedding of objects and 
> arrays.
>  
> One particular way in which BSON differs from JSON is its support for some 
> more advanced data types. For instance, JSON does not differentiate between 
> integers (whole numbers) and floating-point numbers (with decimal 
> precision). BSON does distinguish between the two and stores each in the 
> corresponding BSON data type (e.g. BsonInt32 vs BsonDouble). Many server-side 
> programming languages offer distinct primitive data types (commonly integer, 
> single-precision floating point, i.e. “float”, double-precision floating 
> point, i.e. “double”, and boolean), each with its own optimal usage for 
> efficient mathematical operations.
> Another key distinction between BSON and JSON is that BSON documents can 
> include Date or Binary objects, which cannot be directly represented in pure 
> JSON format. BSON also provides the ability to store and retrieve 
> user-defined Binary objects. Likewise, by integrating advanced data 
> structures like Sets into BSON documents, we can significantly enhance the 
> capabilities of Phoenix for storing, retrieving, and updating Binary, Sets, 
> Lists, and Documents as nested or complex data types.
> Moreover, the JSON format is both human- and machine-readable, whereas the 
> BSON format is only machine-readable. Hence, as part of introducing the BSON 
> data type, we also need to provide a user interface so that users can supply 
> human-readable JSON as input for the BSON data type.
> This Jira also introduces access and update functions for BSON documents.
> BSON_CONDITION_EXPRESSION can evaluate a condition expression on the document 
> fields, similar to how a WHERE clause evaluates a condition expression on 
> various columns of the given row(s) for relational tables.
> BSON_UPDATE_EXPRESSION can update one or more document fields, similar to 
> how UPSERT statements can update one or more columns of the given row(s) 
> for relational tables.
> Overall, by combining BSON with various features available in Phoenix, such 
> as secondary indexes, conditional updates, and high-throughput reads/writes, 
> we can evolve Phoenix into a highly scalable document database.
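
To make the points above about typed numbers and partial deserialization concrete, here is a minimal sketch in Python that hand-encodes a tiny BSON document following the layout on bsonspec.org (element type bytes 0x01 for double, 0x10 for int32, 0x03 for an embedded document). The helper names encode_element, encode_document, and find_field are hypothetical and only support these three types; this is not Phoenix's implementation, only an illustration of why a reader can locate one field while skipping the others unread.

```python
import struct

def encode_element(name, value):
    """Encode one BSON element: type byte, NUL-terminated name, payload."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, int) and not isinstance(value, bool):
        return b"\x10" + key + struct.pack("<i", value)   # int32
    if isinstance(value, float):
        return b"\x01" + key + struct.pack("<d", value)   # 64-bit double
    if isinstance(value, dict):
        return b"\x03" + key + encode_document(value)     # embedded document
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")

def encode_document(fields):
    """Encode a dict as: int32 total length, elements, trailing NUL."""
    body = b"".join(encode_element(k, v) for k, v in fields.items())
    # The length counts its own 4 bytes plus the 1-byte terminator.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"

# Payload sizes of the fixed-width types handled by this sketch.
FIXED_SIZE = {0x01: 8, 0x10: 4}

def find_field(doc, name):
    """Return (type_byte, payload) for a top-level field, or None.

    Elements that do not match are skipped by size alone; their
    payloads are never decoded.
    """
    pos = 4  # skip the document length prefix
    while doc[pos] != 0x00:  # 0x00 marks the end of the document
        type_byte = doc[pos]
        end_of_name = doc.index(b"\x00", pos + 1)
        key = doc[pos + 1:end_of_name].decode("utf-8")
        start = end_of_name + 1
        if type_byte == 0x03:
            # An embedded document carries its own length prefix.
            size = struct.unpack_from("<i", doc, start)[0]
        else:
            size = FIXED_SIZE[type_byte]
        if key == name:
            return type_byte, doc[start:start + size]
        pos = start + size
    return None

doc = encode_document({"qty": 3, "price": 2.5, "meta": {"v": 1}})
print(find_field(doc, "qty"))    # type byte 0x10 (int32)
print(find_field(doc, "price"))  # type byte 0x01 (double)
```

Because every element states its type and every embedded document states its length, a lookup only has to decode the one payload it returns; the same idea would apply recursively to nested fields.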



--
This message was sent by Atlassian Jira
(v8.20.10#820010)