Hi all,

I am working on a project to replace the current RDBMS-based store of the
message broker with a file-based database. The implementation currently
uses LevelDB, which is a key-value data store. Below is an explanation of
the proposed key schema for the data store, along with the related design
decisions.

*Overview :*

LevelDB is a key-value database in which each value is stored under a
unique key. The mapping is one-directional, which means a value can only be
retrieved through its key. One of the main features of LevelDB is that it
stores keys in lexicographically sorted order (by default, a bytewise
comparison of the keys). All keys and values are stored as byte arrays, so
the application has to convert them to and from strings.
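
To make the byte array conversion and the sort order concrete, here is a
minimal sketch using only the Java standard library. The class name KeyBytes
and its methods are hypothetical, UTF-8 is an assumed encoding, and the
comparator only imitates a default bytewise ordering; it is not LevelDB code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, for illustration only.
class KeyBytes {
    // Encode an application-level string as a byte array (UTF-8 is an assumption).
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode a byte array read back from the store.
    static String fromBytes(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Unsigned byte-by-byte comparison; a shorter key sorts before a longer key
    // that it is a prefix of.
    static final Comparator<byte[]> BYTEWISE = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    };

    // Return the keys in the order a bytewise-sorted store would keep them.
    static List<String> sortedKeys(List<String> keys) {
        List<byte[]> encoded = new ArrayList<>();
        for (String k : keys) {
            encoded.add(toBytes(k));
        }
        encoded.sort(BYTEWISE);
        List<String> out = new ArrayList<>();
        for (byte[] b : encoded) {
            out.add(fromBytes(b));
        }
        return out;
    }
}
```

Note that numbers embedded in keys are compared character by character, so
a key containing the id 10 sorts before the same key containing the id 9.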

This LevelDB store implementation uses leveldbjni-1.8 [1], which provides a
Java API for LevelDB with the following main operations.


   1. put(key,value) : stores the given value under the provided key
   2. get(key) : returns the value stored under the key
   3. delete(key) : deletes the given key
   4. batch() : provides atomicity for a group of operations
   5. iterator() : traverses the stored keys
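
The five operations listed above can be pictured with a small in-memory
stand-in. This is only a sketch: SortedStoreSketch is a hypothetical class
backed by a java.util.TreeMap, it uses String keys instead of byte arrays,
and it is not the leveldbjni API. The comments name the leveldbjni calls
that each method is meant to mirror.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a sorted key-value store.
class SortedStoreSketch {
    // TreeMap keeps keys sorted; for ASCII keys this matches bytewise order.
    private final TreeMap<String, String> data = new TreeMap<>();

    // Mirrors db.put(key, value).
    synchronized void put(String key, String value) {
        data.put(key, value);
    }

    // Mirrors db.get(key); this stand-in returns null when the key is absent.
    synchronized String get(String key) {
        return data.get(key);
    }

    // Mirrors db.delete(key).
    synchronized void delete(String key) {
        data.remove(key);
    }

    // Mirrors db.createWriteBatch() followed by db.write(batch).
    // In this sketch a batch is a map in which a null value means "delete".
    synchronized void write(Map<String, String> batch) {
        for (Map.Entry<String, String> e : batch.entrySet()) {
            if (e.getValue() == null) {
                data.remove(e.getKey());
            } else {
                data.put(e.getKey(), e.getValue());
            }
        }
    }

    // Mirrors walking db.iterator() from the first key: keys in sorted order.
    synchronized List<String> keys() {
        return new ArrayList<>(data.keySet());
    }
}
```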


When designing the key schema for LevelDB, the following factors were the
main considerations.


   1. Lexicographical order of the stored keys
   2. Traversing through the keys
   3. Data organization


*Key Schema :*

The key schema was implemented for the following tables of the current
RDBMS database.

[image: Screenshot from 2017-08-14 01-13-33.png]

The key schema was designed mainly by analyzing the queries implemented for
retrieving and inserting data in the RDBMS. The key schema for the three
tables above is shown in the table below.


[image: Screenshot from 2017-08-15 02-11-24.png]

*Key : Value*
    *Purpose*

MESSAGE.$message_id.QUEUE_ID : queue_id
    Stores the queue id of the message.

MESSAGE.$message_id.DLC_QUEUE_ID : dlc_queue_id
    Stores the DLC queue id of the message.

MESSAGE.$message_id.MESSAGE_METADATA : message_metadata
    Stores the metadata of the message.

MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT : message_content
    Stores the message content at a given content offset of the message.

QUEUE.$queue_id.QUEUE_NAME : queue_name
    Stores the name of the queue under its id.

QUEUE.$queue_name.QUEUE_ID : queue_id
    Stores the id of the queue under its name.

QUEUE.$queue_name.$message_id.MESSAGE_METADATA : message_metadata
    Stores the metadata of the messages which belong to the queue.

LAST_MESSAGE_ID
    Stores the last message id.

LAST_QUEUE_ID
    Stores the last queue id.
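
The key patterns in the table can be generated by small helper methods so
that each key is built in exactly one place. The following is a minimal
sketch; the class name KeySchema, the method names, and the choice of long
and int for the ids and the content offset are assumptions made for
illustration.

```java
// Hypothetical key builders for the schema in the table above.
class KeySchema {
    static final String LAST_MESSAGE_ID = "LAST_MESSAGE_ID";
    static final String LAST_QUEUE_ID = "LAST_QUEUE_ID";

    // MESSAGE.$message_id.QUEUE_ID
    static String messageQueueId(long messageId) {
        return "MESSAGE." + messageId + ".QUEUE_ID";
    }

    // MESSAGE.$message_id.DLC_QUEUE_ID
    static String messageDlcQueueId(long messageId) {
        return "MESSAGE." + messageId + ".DLC_QUEUE_ID";
    }

    // MESSAGE.$message_id.MESSAGE_METADATA
    static String messageMetadata(long messageId) {
        return "MESSAGE." + messageId + ".MESSAGE_METADATA";
    }

    // MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT
    static String messageContent(long messageId, int contentOffset) {
        return "MESSAGE." + messageId + "." + contentOffset + ".MESSAGE_CONTENT";
    }

    // QUEUE.$queue_id.QUEUE_NAME
    static String queueName(long queueId) {
        return "QUEUE." + queueId + ".QUEUE_NAME";
    }

    // QUEUE.$queue_name.QUEUE_ID
    static String queueId(String queueName) {
        return "QUEUE." + queueName + ".QUEUE_ID";
    }

    // QUEUE.$queue_name.$message_id.MESSAGE_METADATA
    static String queueMessageMetadata(String queueName, long messageId) {
        return "QUEUE." + queueName + "." + messageId + ".MESSAGE_METADATA";
    }
}
```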

As can be seen, this schema introduces some data repetition, mainly because
of the one-directional key-value mapping of LevelDB. For example, two keys
(QUEUE.$queue_id.QUEUE_NAME and QUEUE.$queue_name.QUEUE_ID) are required to
build the bidirectional relation between the queue name and the queue id
(get the queue name given the queue id, and get the queue id given the
queue name). As LevelDB generally has better write performance than an
RDBMS, the data repetition may not be much of an overhead when inserting
data. Moreover, batch operations can be used for multiple insertions.
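
The bidirectional queue mapping described above could be written as one
batch so that both directions are stored together. The sketch below uses a
java.util.TreeMap as a stand-in for the store; QueueIndexSketch and its
methods are hypothetical names, and with leveldbjni the putAll call would
correspond to a single write batch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the two-key queue mapping.
class QueueIndexSketch {
    // Stand-in for the sorted key-value store.
    final TreeMap<String, String> store = new TreeMap<>();

    // Build both directions of the mapping as one batch.
    static Map<String, String> queueBatch(long queueId, String queueName) {
        Map<String, String> batch = new LinkedHashMap<>();
        batch.put("QUEUE." + queueId + ".QUEUE_NAME", queueName);
        batch.put("QUEUE." + queueName + ".QUEUE_ID", Long.toString(queueId));
        return batch;
    }

    // Store both keys together so neither direction exists without the other.
    void addQueue(long queueId, String queueName) {
        store.putAll(queueBatch(queueId, queueName));
    }

    // Get the queue name given the queue id (null if unknown).
    String nameOf(long queueId) {
        return store.get("QUEUE." + queueId + ".QUEUE_NAME");
    }

    // Get the queue id given the queue name (null if unknown).
    Long idOf(String queueName) {
        String value = store.get("QUEUE." + queueName + ".QUEUE_ID");
        return value == null ? null : Long.valueOf(value);
    }
}
```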

The main purpose of using prefixes like MESSAGE and QUEUE in the keys is to
organize them properly. As LevelDB stores keys lexicographically, these
prefixes ensure that message-related and queue-related keys are stored in
separate groups, as displayed below. The following shows the keys in the
LevelDB store after publishing a JMS message to the broker. It can be
clearly seen that the keys are stored in lexicographical order.

[image: Screenshot from 2017-08-14 19-57-13.png]

Organizing keys in this manner also improves the efficiency of traversing
the keys with iterators when retrieving and deleting data. As displayed in
the diagram below, an iterator starts traversing from the first key in the
store. When the iterator head reaches a key, it can move either to the next
key or to the previous key (similar to a doubly linked list). Hence, storing
related keys successively improves the efficiency of traversal when
retrieving and deleting data by reducing the seek time.


[image: Screenshot from 2017-08-15 02-11-40.png]
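
Because related keys sit next to each other, all keys of one message can be
read or deleted in a single forward pass that stops at the first key
outside the prefix. The following is a minimal sketch of such a prefix scan
over a java.util.TreeMap stand-in; PrefixScanSketch and its methods are
hypothetical. With leveldbjni the same idea would use the iterator's
seek(key) call to position it at the prefix and then step forward.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

// Hypothetical prefix scan over a sorted map that stands in for the store.
class PrefixScanSketch {
    // Collect every key that starts with the prefix, in sorted order.
    static List<String> keysWithPrefix(NavigableMap<String, String> store, String prefix) {
        List<String> out = new ArrayList<>();
        // tailMap positions the scan at the first key >= prefix.
        for (String key : store.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break; // keys are sorted, so no later key can match
            }
            out.add(key);
        }
        return out;
    }

    // Delete every key under the prefix and return how many were removed.
    static int deleteWithPrefix(NavigableMap<String, String> store, String prefix) {
        List<String> keys = keysWithPrefix(store, prefix);
        for (String key : keys) {
            store.remove(key);
        }
        return keys.size();
    }
}
```

The prefix should end with the separator (for example "MESSAGE.1." rather
than "MESSAGE.1") so that the keys of message 10 are not matched when
scanning for message 1.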


These are the main factors and decisions behind this key schema. The schema
still needs to be extended to support functionality such as storing message
expiration data. It would be great to have feedback on the proposed schema,
especially regarding how to reduce data repetition and further improve
efficiency.

[1] https://github.com/fusesource/leveldbjni


Best Regards,
-- 

*Wishmitha Mendis*

*Intern - Software Engineering*
*WSO2*

*Mobile : +94 777577706*