Hi all,

I am working on a project to replace the current RDBMS-based store of the message broker with a file-based database. The current implementation uses LevelDB, which is a key-value data store. The following explains the suggested key schema for the data store and the related design decisions.

*Overview :*

LevelDB is a key-value database in which a value is stored under a unique key. The key-value mapping is one-directional, which means a value can only be retrieved through its key. One of the main features of LevelDB is that it stores keys in lexicographically sorted order (byte-wise, with the default comparator). All keys and values are stored as byte arrays, so the application has to convert them to and from strings.

This LevelDB store implementation uses leveldbjni-1.8 [1], which provides a Java API for LevelDB with the following main functionalities.

1. put(key, value) : stores the given value under the provided key
2. get(key) : returns the value stored under the key
3. delete(key) : deletes the given key
4. write batch (createWriteBatch() and write(batch)) : applies a group of operations atomically
5. iterator() : traverses the stored keys

When designing the key schema in LevelDB, the following factors were mainly considered.

1. Lexicographical order of the stored keys
2. Traversing through the keys
3. Data organization

*Key Schema :*

The key schema was implemented for the following tables of the current RDBMS database.

[image: Screenshot from 2017-08-14 01-13-33.png]

The key schema was mainly designed by analyzing the queries implemented for data retrieval and insertion in the RDBMS. The key schema for the above three tables is shown in the table below.

[image: Screenshot from 2017-08-15 02-11-24.png]

*Key : Value*
    *Purpose*

MESSAGE.$message_id.QUEUE_ID : queue_id
    Stores the queue id of the message.

MESSAGE.$message_id.DLC_QUEUE_ID : dlc_queue_id
    Stores the dlc queue id of the message.

MESSAGE.$message_id.MESSAGE_METADATA : message_metadata
    Stores the metadata of the message.

MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT : message_content
    Stores the message content at the given content offset of the message.

QUEUE.$queue_id.QUEUE_NAME : queue_name
    Stores the name of the queue under its id.

QUEUE.$queue_name.QUEUE_ID : queue_id
    Stores the id of the queue under its name.

QUEUE.$queue_name.$message_id.MESSAGE_METADATA : message_metadata
    Stores the metadata of the messages which belong to the queue.

LAST_MESSAGE_ID
    Stores the last message id.

LAST_QUEUE_ID
    Stores the last queue id.

As can be seen, this schema introduces some data repetition. That is mainly due to the one-directional key-value mapping of LevelDB. For example, two keys (QUEUE.$queue_id.QUEUE_NAME and QUEUE.$queue_name.QUEUE_ID) are required to build the bidirectional relation between queue name and queue id (get the queue name given the queue id, and get the queue id given the queue name). As LevelDB generally has better write performance than an RDBMS, the data repetition may not be much of an overhead when inserting data. Moreover, batch operations can be used for multiple insertions.

The main purpose of using prefixes like MESSAGE and QUEUE in keys is to organize them properly. As LevelDB stores keys lexicographically, these prefixes make sure that message-related and queue-related keys are stored separately, as displayed below. The following shows the keys of the LevelDB store after publishing a JMS message to the broker. It can be clearly seen that the keys are stored in lexicographical order.

[image: Screenshot from 2017-08-14 19-57-13.png]

Organizing keys in this manner also improves the efficiency of traversing the keys with iterators when retrieving and deleting data. As displayed in the diagram below, an iterator starts from the first stored key in the store (or from a key it has been positioned at with seek). When the iterator head reaches a key, it can move either to the next key or to the previous key (similar to a doubly linked list). Hence storing related keys successively improves the efficiency of traversal when retrieving and deleting data, by reducing the seek time.

[image: Screenshot from 2017-08-15 02-11-40.png]

Basically, these are the factors and decisions which have been taken into account in implementing this key schema.
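
To make the schema more concrete, here is a minimal sketch of how the keys could be built, written and scanned. It is not the actual store implementation: a java.util.TreeMap stands in for LevelDB's sorted key space so that the example runs without any native library, and the class and method names (KeySchemaSketch, addQueue, addMessage, keysWithPrefix) are hypothetical. It also assumes that keys are plain ASCII strings, so that Java's String ordering matches the byte-wise ordering of LevelDB, and that ids are written as decimal strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: a TreeMap stands in for LevelDB's sorted key space.
class KeySchemaSketch {

    // Key builders following the proposed schema.
    static String queueNameKey(long queueId) {
        return "QUEUE." + queueId + ".QUEUE_NAME";
    }

    static String queueIdKey(String queueName) {
        return "QUEUE." + queueName + ".QUEUE_ID";
    }

    static String messageKey(long messageId, String suffix) {
        return "MESSAGE." + messageId + "." + suffix;
    }

    // Writes both directions of the queue id <-> queue name relation.
    // In a real store these two puts would belong to one write batch.
    static void addQueue(NavigableMap<String, String> store, long queueId, String queueName) {
        store.put(queueNameKey(queueId), queueName);
        store.put(queueIdKey(queueName), Long.toString(queueId));
    }

    // Writes the per-message keys (queue id and metadata only, for brevity).
    static void addMessage(NavigableMap<String, String> store, long messageId,
                           long queueId, String metadata) {
        store.put(messageKey(messageId, "QUEUE_ID"), Long.toString(queueId));
        store.put(messageKey(messageId, "MESSAGE_METADATA"), metadata);
    }

    // Collects all keys sharing a prefix: jump to the first key that is
    // not smaller than the prefix, then walk forward until the prefix no
    // longer matches. Related keys are adjacent, so the scan stays short.
    static List<String> keysWithPrefix(NavigableMap<String, String> store, String prefix) {
        List<String> keys = new ArrayList<>();
        for (String key : store.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break;
            }
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        addQueue(store, 1, "orders");
        addMessage(store, 42, 1, "example-metadata");
        store.put("LAST_MESSAGE_ID", "42");

        // Keys come back in sorted order, grouped by prefix.
        store.keySet().forEach(System.out::println);

        // All keys that belong to message 42.
        System.out.println(keysWithPrefix(store, "MESSAGE.42."));
    }
}
```

With leveldbjni the same idea would presumably map to put calls on byte arrays collected in a write batch, and to an iterator positioned with seek on the prefix instead of tailMap. One thing the sketch makes visible is that ids written as plain decimal strings sort as text (for example "10" before "2"), which may matter if the order of message ids is relied upon.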
This schema should also be extended to provide functionalities like storing message expiration data. It would be great to have feedback on the proposed schema, especially regarding how to reduce data repetition and further improve efficiency.

[1] https://github.com/fusesource/leveldbjni

Best Regards,
--
*Wishmitha Mendis*
*Intern - Software Engineering*
*WSO2*
*Mobile : +94 777577706*
_______________________________________________
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture