Re: [Architecture] Proposed Key Schema for LevelDB Store of Message Broker

2017-11-22 Thread Wishmitha Mendis
Hi all,

A file-based DTX store was implemented using LevelDB to store
distributed-transaction-related data. As XA transactions (two-phase commit)
are used to handle distributed transactions in MB, the related XID has to be
stored along with the node id and branch id. Each transaction consists of
message enqueue/dequeue operations (or both). In the prepare phase of a
transaction, the metadata and content of the messages to be
enqueued/dequeued have to be stored temporarily. When the transaction
reaches the commit phase, the enqueue/dequeue operations of the transaction
are committed and the temporary records written during the prepare phase are
deleted. LevelDB WriteBatch [1] was used to implement these operations.

Following is the key schema for the DTX store. These keys are defined to
hold the related data temporarily during the prepare phase of the
transaction.

Key Schema :

Key
    Value

DTX_ENTRY : $xid : $node_id : BRANCH_ID
    Branch ID

DTX_ENTRY : $xid : $node_id : FORMAT_CODE
    Format code

DTX_ENTRY : $xid : $node_id : GLOBAL_ID
    Global ID

DTX_ENQUEUE : $xid : MESSAGE_METADATA : $message_id
    Metadata of the message which should be enqueued

DTX_ENQUEUE : $xid : MESSAGE_CONTENT : $message_id : $offset
    Content chunk of the message which should be enqueued

DTX_DEQUEUE : $xid : MESSAGE_METADATA : $message_id
    Metadata of the message which should be dequeued

DTX_DEQUEUE : $xid : MESSAGE_CONTENT : $message_id : $offset
    Content chunk of the message which should be dequeued

DTX_DEQUEUE : $xid : DESTINATION_NAME : $message_id
    Destination of the message which should be dequeued
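The prepare/commit flow over these keys can be sketched in miniature. The example below is only an illustration, not code from the implementation: a TreeMap stands in for LevelDB's lexicographically sorted key space, the key strings follow the table above, and the permanent MESSAGE.* key written on commit is an assumed shape. In the real store both steps would be applied atomically in a single LevelDB WriteBatch.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class DtxCommitSketch {
    // TreeMap stands in for LevelDB's lexicographically sorted key space.
    static final TreeMap<String, String> store = new TreeMap<>();

    // Prepare phase: stage the enqueue under a temporary DTX key.
    static void prepareEnqueue(String xid, long messageId, String metadata) {
        store.put("DTX_ENQUEUE : " + xid + " : MESSAGE_METADATA : " + messageId, metadata);
    }

    // Commit phase: move staged records to their permanent keys and drop the
    // temporary DTX records. In the real store both steps would go into a
    // single LevelDB WriteBatch so they apply atomically.
    static void commit(String xid) {
        String prefix = "DTX_ENQUEUE : " + xid + " : MESSAGE_METADATA : ";
        // subMap gives the contiguous key range sharing the prefix.
        SortedMap<String, String> staged = store.subMap(prefix, prefix + "\uffff");
        Map<String, String> permanent = new TreeMap<>();
        for (Map.Entry<String, String> e : staged.entrySet()) {
            String messageId = e.getKey().substring(prefix.length());
            permanent.put("MESSAGE." + messageId + ".MESSAGE_METADATA", e.getValue());
        }
        staged.clear();           // delete the temporary prepare-phase records
        store.putAll(permanent);  // write the committed records
    }

    public static void main(String[] args) {
        prepareEnqueue("xid-1", 7, "meta");
        commit("xid-1");
        System.out.println(store); // {MESSAGE.7.MESSAGE_METADATA=meta}
    }
}
```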

Implementation :

The file-based DTX store is implemented by providing the methods of the
DtxStore interface. [2]


[1]https://github.com/google/leveldb/blob/master/doc/index.md#atomic-updates
[2] Pull Request : https://github.com/wso2/andes/pull/944/files
Best Regards,

On Fri, Nov 10, 2017 at 11:53 AM, Wishmitha Mendis <wishmi...@wso2.com>
wrote:

> Hi all,
>
> The key schema for the LevelDB store was updated for the following reasons.
>
>    1. Reduce data replication
>    2. Minimize the lookup time for iterators when retrieving data
>
>
> When updating the schema, the lexicographically ordered storage mechanism
> of LevelDB was taken into consideration.
>
> Updated Key Schema :
>
> Key
>     Value
>
> MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT
>     Message content chunk
>
> MESSAGE.$message_id.DESTINATION_NAME
>     Name of the queue/topic where the message is stored
>
> MESSAGE.$message_id.DLC_STATUS
>     “1” if the message is in the dead letter channel, else “0”
>
> DESTINATION.MESSAGE_COUNT.$destination_name
>     Message count of the queue/topic
>
> DESTINATION.$destination_name.MESSAGE_METADATA.$message_id
>     Metadata of the message which is stored in the queue/topic
>
> DESTINATION.$destination_name.MESSAGE_EXPIRATION_TIME.$message_id
>     Expiration time of the message which is stored in the queue/topic
>
> Comparison :
>
> When compared to the previous schema, this schema has less data
> repetition. Previously, the message metadata and expiration time were
> stored both under the message and the destination, whereas in this schema
> metadata and expiration time are stored only under the destination.
>
> Attributes like QUEUE_ID, DLC_QUEUE_ID were removed from the schema, as
> all the queues/topics are guaranteed to have a unique name.
>
> In the previous key schema, metadata and expiration time of a message in a
> destination were stored successively.
>
> [image: Screenshot from 2017-11-10 10-58-12.png]
>
> With this schema, when an iterator looks up all message metadata in a
> destination, it has to traverse all the keys, including the keys related
> to expiration time. This lookup time can be reduced by separating the
> expiration time and message metadata keys as follows.
>
> [image: Screenshot from 2017-11-10 11-06-40.png]
>
> Now the iterator does not have to traverse the expiration-time-related
> keys, as they are separated from the message-metadata-related keys. This is
> used in the updated schema. Hence, the lookup time for both metadata and
> expiration time will be roughly halved compared to the previous schema.
>
> As the total number of keys per message is reduced in this schema,
> publisher throughput should increase as well.
>
> Dead Letter Channel :
>
> The dead letter channel is treated as a destination (queue) to which
> messages are moved once the maximum number of redelivery attempts is
> reached. The DLC_STATUS of the message is updated from "0" to “1” once the
> message is moved to the dead letter channel.
>
> On Wed, Aug 16, 2017 at 5:24 PM, Hasitha Hir

Re: [Architecture] Proposed Key Schema for LevelDB Store of Message Broker

2017-08-15 Thread Wishmitha Mendis
Hi Asanka,

1. We can initially use a Network File System as you mentioned for HA.
However, LevelDB data replication is used in the ActiveMQ replicated store.
[1]

2. Yes, the iterator has to traverse all the related keys to get message
data. This is why the key schema is designed in such a way that all the
messages in a queue are stored successively, to reduce the traversal time.
And as you said, it is more complex than in an RDBMS, as the data cannot be
retrieved by simply executing a query. However, LevelDB can still provide
much faster data retrieval than an RDBMS due to its high performance.
Hence, even though the data retrieval operation is complex, it is not much
of an overhead in terms of overall performance.

(Additional : most RDBMS engines use file-based stores underneath. As an
example, LevelDB is used as a database engine in MariaDB. [2] Hence, even
when executing a query in an RDBMS, this kind of traversal may occur
underneath.)
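To make the "stored successively" point concrete, here is a small stand-in sketch (my own illustration, not code from the implementation): a TreeMap plays the role of LevelDB's sorted key space, and the scan seeks to the queue's key prefix and stops as soon as a key falls outside it, much as an iterator seek on LevelDB would.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class QueueScanSketch {
    // Collect metadata for one queue by jumping to its key prefix and
    // stopping as soon as a key leaves the prefix range. Because related
    // keys are stored successively, only that queue's keys are visited.
    static List<String> metadataForQueue(TreeMap<String, String> store, String queueName) {
        String prefix = "QUEUE." + queueName + ".";
        List<String> result = new ArrayList<>();
        // tailMap(prefix) mirrors iterator.seek(prefix) on sorted keys.
        for (Map.Entry<String, String> e : store.tailMap(prefix).entrySet()) {
            if (!e.getKey().startsWith(prefix)) break; // left the queue's key range
            result.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<String, String> store = new TreeMap<>();
        store.put("QUEUE.orders.1.MESSAGE_METADATA", "m1");
        store.put("QUEUE.orders.2.MESSAGE_METADATA", "m2");
        store.put("QUEUE.payments.9.MESSAGE_METADATA", "m9");
        System.out.println(metadataForQueue(store, "orders")); // [m1, m2]
    }
}
```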

3. Data cannot be inserted while traversing. Traversal over the keys is
done by an iterator, which should eventually be closed after the operation
completes. A sample is shown below. [3]

import static org.fusesource.leveldbjni.JniDBFactory.asString;

DBIterator iterator = db.iterator();
try {
  for (iterator.seekToFirst(); iterator.hasNext(); iterator.next()) {
    String key = asString(iterator.peekNext().getKey());
    String value = asString(iterator.peekNext().getValue());
    System.out.println(key + " = " + value);
  }
} finally {
  // Make sure you close the iterator to avoid resource leaks.
  iterator.close();
}

These iterators are mainly used in methods such as getMetaDataList() and
deleteMessages() in the implementation. The iterator should be closed in
those methods, as displayed in the above code.

4. Yes, this will be a performance limitation. Throughput gets noticeably
lower when publishing/retrieving messages in a multi-threaded environment.
Even though LevelDB is capable of providing higher throughput than an RDBMS
even in a multi-threaded environment according to the test results,
concurrent access to the DB can become a bottleneck. The main purpose of
this PoC is actually to develop a generic key schema, so that we can switch
between file-based stores and select the optimal one for the message broker.

5. LevelDB does not have built-in transaction support. Therefore,
transactions have to be implemented as an external layer within the
application. Currently I am working on this and exploring how transactions
are implemented in the ActiveMQ LevelDB store. [4] I will post a separate
thread on LevelDB transactions.
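For what it's worth, the external layer can be as simple as staging writes locally and applying them only on commit. The sketch below is purely my own illustration under that assumption (a TreeMap stands in for the store, and the commit step would correspond to a single WriteBatch in LevelDB); it is not the ActiveMQ approach.

```java
import java.util.TreeMap;

public class TxLayerSketch {
    // A minimal external transaction layer: writes are staged locally and
    // merged into the store (TreeMap stand-in) only on commit, so a failed
    // or rolled-back transaction leaves no partial state behind.
    final TreeMap<String, String> store = new TreeMap<>();
    final TreeMap<String, String> staged = new TreeMap<>();

    void put(String key, String value) { staged.put(key, value); }

    void commit()   { store.putAll(staged); staged.clear(); }
    void rollback() { staged.clear(); }

    public static void main(String[] args) {
        TxLayerSketch tx = new TxLayerSketch();
        tx.put("MESSAGE.1.MESSAGE_METADATA", "meta");
        tx.rollback();
        System.out.println(tx.store.isEmpty()); // true: nothing was committed
        tx.put("MESSAGE.2.MESSAGE_METADATA", "meta");
        tx.commit();
        System.out.println(tx.store.size()); // 1
    }
}
```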


[1] http://activemq.apache.org/replicated-leveldb-store.html
[2] https://mariadb.com/kb/en/mariadb/leveldb/
[3] https://github.com/fusesource/leveldbjni
[4] https://github.com/apache/activemq/tree/master/activemq-leveldb-store

On Tue, Aug 15, 2017 at 11:21 AM, Asanka Abeyweera <asank...@wso2.com>
wrote:

> Hi Wishmitha,
>
>1. How are we going to support HA deployment with LevelDB? Are we
>going to use a network file system or replicate data?
>2. If we wanted to get a set of message matching a given queue ID, do
>we have to traverse all messages to get that? In RDBMS this is easier to do
>with a where clause.
>3. What happens if we insert data while traversing?
>4. It seems "*only a single process (possibly multi-threaded) can
>access a particular database at a time*"[1] in LevelDB. Will this be a
>bottleneck when we need to access the DB concurrently?
>5. How are the transactions handled in LevelDB? When
>implementing distributed transactions feature we required row level
>locking instead of table level locking. Does LevelDB support that?
>
> [1] https://github.com/google/leveldb
>
> On Tue, Aug 15, 2017 at 10:47 AM, Wishmitha Mendis <wishmi...@wso2.com>
> wrote:
>
>> Hi Sumedha,
>>
>> The Java library for LevelDB (leveldbjni) creates the database as follows
>> as mentioned in the docs. [1]
>>
>> Options options = new Options();
>> DB db = factory.open(new File("example"), options);
>>
>> This will create the database in a directory on a given path. And in the
>> library docs, it is mentioned that the library supports several platforms
>> if not specifically configured. Therefore using this library does not
>> require to ship LevelDB and it also won't take away platform agnostic
>> installation capability of MB. However the implementation is currently only
>> tested on Linux, I will test it on Windows and other platforms and let you
>> know.
>>
>> When considering the LevelDB architecture, it is already used as a broker
>> store in ActiveMQ. [2] [3] This proves that LevelDB has the architectural
>> capability to efficiently insert and delete messages in a broker.
>>
>> [1] https://github.com/fusesou

Re: [Architecture] Proposed Key Schema for LevelDB Store of Message Broker

2017-08-15 Thread Wishmitha Mendis
Hi Sajith,

Yes, we are aware that ActiveMQ does not currently recommend the LevelDB
store. That is mainly because ActiveMQ is focused on improving KahaDB,
which is the default store of their broker [1], and they introduced the
LevelDB store mainly to increase broker performance [2]. As mentioned in
[2], the LevelDB store can provide certain advantages over the current
KahaDB store of ActiveMQ, and there can be certain trade-offs in using
LevelDB as a broker store. Our primary objective is to identify those
trade-offs and select the optimal file-based data store through this PoC.

[1]
http://activemq.2283324.n4.nabble.com/DISCUSS-LevelDB-deprecation-td4719227.html
[2] https://github.com/apache/activemq/tree/master/activemq-leveldb-store

Best Regards,

On Tue, Aug 15, 2017 at 10:59 AM, Sajith Kariyawasam <saj...@wso2.com>
wrote:

> Hi Wishmitha,
>
> As per ActiveMQ [1], it says "The LevelDB store has been deprecated and is
> no longer supported or recommended for use. The recommended store is
> KahaDB" ... Are we aware of the reason LevelDB is not recommended in
> ActiveMQ? In that case, are we on the correct path in using LevelDB?
>
> [1]  http://activemq.apache.org/leveldb-store.html
>
> On Tue, Aug 15, 2017 at 10:47 AM, Wishmitha Mendis <wishmi...@wso2.com>
> wrote:
>
>> Hi Sumedha,
>>
>> The Java library for LevelDB (leveldbjni) creates the database as follows
>> as mentioned in the docs. [1]
>>
>> Options options = new Options();
>> DB db = factory.open(new File("example"), options);
>>
>> This will create the database in a directory on a given path. And in the
>> library docs, it is mentioned that the library supports several platforms
>> if not specifically configured. Therefore using this library does not
>> require to ship LevelDB and it also won't take away platform agnostic
>> installation capability of MB. However the implementation is currently only
>> tested on Linux, I will test it on Windows and other platforms and let you
>> know.
>>
>> When considering the LevelDB architecture, it is already used as a broker
>> store in ActiveMQ. [2] [3] This proves that LevelDB has the architectural
>> capability to efficiently insert and delete messages in a broker.
>>
>> [1] https://github.com/fusesource/leveldbjni
>> [2] http://activemq.apache.org/leveldb-store.html
>> [3] https://github.com/apache/activemq/tree/master/activemq-leveldb-store
>>
>> Best Regards,
>>
>> On Tue, Aug 15, 2017 at 2:29 AM, Sumedha Rubasinghe <sume...@wso2.com>
>> wrote:
>>
>>> Hi Wishmitha,
>>> Would leveldb architecture be efficient for a message broker where
>>> removing delivered messages is very frequent?
>>>
>>> This requires WSO2 Message Broker to ship leveldb. leveldb (
>>> https://github.com/google/leveldb) has native distributions for
>>> platforms. AFAIC this will take away platform agnostic installation
>>> capability of MB.
>>>
>>>
>>>
>>> On Tue, Aug 15, 2017 at 2:20 AM, Wishmitha Mendis <wishmi...@wso2.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am working on a project to replace the current RDBMS based database
>>>> of the message broker store with a file based database system. Currently
>>>> the implementation is carried out in LevelDB which is a key-value based
>>>> data store. The following is an explanation of suggested key schema for the
>>>> data store with related design decisions.
>>>>
>>>> *Overview :*
>>>>
>>>> LevelDB is a key value based database where a value can be stored under
>>>> a certain unique key. This key-value mapping is one directional which means
>>>> a value only can be retrieved by accessing corresponding key. One of the
>>>> main features in LevelDB is that it stores keys in a lexicographically
>>>> (alphabetically) sorted order. All the keys and values are stored in byte
>>>> array format in the database store which should be accordingly converted to
>>>> string format within the application.
>>>>
>>>> For this LevelDB store implementation leveldbjni-1.8[1] is used which
>>>> provides a Java based API for LevelDB by providing following main
>>>> functionalities.
>>>>
>>>>
>>>>    1. put(key,value) : stores given value under the provided key
>>>>    2. get(key) : returns corresponding value to the key
>>>>    3.

Re: [Architecture] Proposed Key Schema for LevelDB Store of Message Broker

2017-08-14 Thread Wishmitha Mendis
Hi Sumedha,

The Java library for LevelDB (leveldbjni) creates the database as follows,
as described in its docs. [1]

import static org.fusesource.leveldbjni.JniDBFactory.factory;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.Options;
import java.io.File;

Options options = new Options();
options.createIfMissing(true);
DB db = factory.open(new File("example"), options);

This creates the database in a directory at the given path. The library
docs also mention that it supports several platforms without any specific
configuration. Therefore, using this library does not require shipping
LevelDB separately, and it won't take away the platform-agnostic
installation capability of MB. However, the implementation is currently
only tested on Linux; I will test it on Windows and other platforms and let
you know.

Regarding the LevelDB architecture, it is already used as a broker store in
ActiveMQ. [2] [3] This indicates that LevelDB has the architectural
capability to insert and delete messages efficiently in a broker.

[1] https://github.com/fusesource/leveldbjni
[2] http://activemq.apache.org/leveldb-store.html
[3] https://github.com/apache/activemq/tree/master/activemq-leveldb-store

Best Regards,

On Tue, Aug 15, 2017 at 2:29 AM, Sumedha Rubasinghe <sume...@wso2.com>
wrote:

> Hi Wishmitha,
> Would leveldb architecture be efficient for a message broker where
> removing delivered messages is very frequent?
>
> This requires WSO2 Message Broker to ship leveldb. leveldb (
> https://github.com/google/leveldb) has native distributions for
> platforms. AFAIC this will take away platform agnostic installation
> capability of MB.
>
>
>
> On Tue, Aug 15, 2017 at 2:20 AM, Wishmitha Mendis <wishmi...@wso2.com>
> wrote:
>
>> Hi all,
>>
>> I am working on a project to replace the current RDBMS based database of
>> the message broker store with a file based database system. Currently the
>> implementation is carried out in LevelDB which is a key-value based data
>> store. The following is an explanation of suggested key schema for the data
>> store with related design decisions.
>>
>> *Overview :*
>>
>> LevelDB is a key value based database where a value can be stored under a
>> certain unique key. This key-value mapping is one directional which means a
>> value only can be retrieved by accessing corresponding key. One of the main
>> features in LevelDB is that it stores keys in a lexicographically
>> (alphabetically) sorted order. All the keys and values are stored in byte
>> array format in the database store which should be accordingly converted to
>> string format within the application.
>>
>> For this LevelDB store implementation leveldbjni-1.8[1] is used which
>> provides a Java based API for LevelDB by providing following main
>> functionalities.
>>
>>
>>    1. put(key,value) : stores given value under the provided key
>>    2. get(key) : returns corresponding value to the key
>>    3. delete(key) : deletes given key
>>    4. batch() : provides atomicity for the operations
>>    5. iterator() : traverse through the stored keys
>>
>>
>> When designing the key schema in Level DB the following factors are
>> mainly considered.
>>
>>
>>    1. Lexicographical order of the stored keys
>>    2. Traversing through the keys
>>    3. Data organization
>>
>>
>> *Key Schema :*
>>
>> The key schema implementation was carried out for following tables of the
>> current RDBMS database.
>>
>> [image: Screenshot from 2017-08-14 01-13-33.png]
>>
>> The key schema is mainly designed by analyzing implemented queries for
>> data retrieval and inserting in the RDBMS. The key schema for above three
>> tables is represented below table.
>>
>>
>> [image: Screenshot from 2017-08-15 02-11-24.png]
>>
>> *Key : Value*
>>     *Purpose*
>>
>> MESSAGE.$message_id.QUEUE_ID : queue_id
>>     Stores queue id of the message.
>>
>> MESSAGE.$message_id.DLC_QUEUE_ID : dlc_queue_id
>>     Stores dlc queue id of the message.
>>
>> MESSAGE.$message_id.MESSAGE_METADATA : message_metadata
>>     Stores metadata of the message.
>>
>> MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT : message_content
>>     Stores message content for a given message offset of the message.
>>
>> QUEUE.$queue_id.QUEUE_NAME : queue_name
>>     Stores name of the queue under the id.
>>
>> QUEUE.$queue_name.QUEUE_ID : queue_id
>>     Stores id of the queue under the name.
>>
>> QUEUE.$queue_name.message_id. MESSAGE_METADATA : me

[Architecture] Proposed Key Schema for LevelDB Store of Message Broker

2017-08-14 Thread Wishmitha Mendis
Hi all,

I am working on a project to replace the current RDBMS-based database of
the message broker store with a file-based database system. Currently the
implementation is carried out in LevelDB, which is a key-value data store.
The following is an explanation of the suggested key schema for the data
store, along with the related design decisions.

*Overview :*

LevelDB is a key-value database where a value is stored under a unique key.
This key-value mapping is one-directional, meaning a value can only be
retrieved by accessing its corresponding key. One of the main features of
LevelDB is that it stores keys in lexicographically (alphabetically) sorted
order. All keys and values are stored as byte arrays in the database store,
and have to be converted to strings within the application accordingly.

For this LevelDB store implementation, leveldbjni-1.8 [1] is used, which
provides a Java-based API for LevelDB with the following main operations.


   1. put(key,value) : stores the given value under the provided key
   2. get(key) : returns the value corresponding to the key
   3. delete(key) : deletes the given key
   4. batch() : provides atomicity for a group of operations
   5. iterator() : traverses the stored keys


When designing the key schema in LevelDB, the following factors were mainly
considered.


   1. Lexicographical order of the stored keys
   2. Traversal through the keys
   3. Data organization


*Key Schema :*

The key schema implementation was carried out for the following tables of
the current RDBMS database.

[image: Screenshot from 2017-08-14 01-13-33.png]

The key schema was mainly designed by analyzing the queries implemented for
data retrieval and insertion in the RDBMS. The key schema for the above
three tables is presented in the table below.


[image: Screenshot from 2017-08-15 02-11-24.png]

*Key : Value*
    *Purpose*

MESSAGE.$message_id.QUEUE_ID : queue_id
    Stores the queue id of the message.

MESSAGE.$message_id.DLC_QUEUE_ID : dlc_queue_id
    Stores the DLC queue id of the message.

MESSAGE.$message_id.MESSAGE_METADATA : message_metadata
    Stores the metadata of the message.

MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT : message_content
    Stores a message content chunk for the given offset of the message.

QUEUE.$queue_id.QUEUE_NAME : queue_name
    Stores the name of the queue under its id.

QUEUE.$queue_name.QUEUE_ID : queue_id
    Stores the id of the queue under its name.

QUEUE.$queue_name.$message_id.MESSAGE_METADATA : message_metadata
    Stores the metadata of the messages which belong to the queue.

LAST_MESSAGE_ID
    Stores the last message id.

LAST_QUEUE_ID
    Stores the last queue id.

As can be seen, this schema involves a fair amount of data repetition. That
is mainly due to the one-directional key-value mapping of LevelDB. As an
example, two keys (QUEUE.$queue_id.QUEUE_NAME , QUEUE.$queue_name.QUEUE_ID)
are required to build the bidirectional relation between the queue name and
the queue id (get the queue name given the queue id, and get the queue id
given the queue name). As LevelDB has better write performance than an
RDBMS, this repetition may not be much of an overhead when inserting data.
Moreover, batch operations can be used for multiple insertions.
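The two mirrored queue keys can be written together as a single logical step. The sketch below is an illustration only, with a TreeMap standing in for the store; in the real store the two puts would go into one WriteBatch so that both directions of the mapping appear atomically.

```java
import java.util.TreeMap;

public class QueueMappingSketch {
    // Because LevelDB lookups are one-directional, the queue name <-> id
    // relation needs two mirrored keys, written together (atomically via a
    // batch in the real store).
    static void registerQueue(TreeMap<String, String> store, int queueId, String queueName) {
        store.put("QUEUE." + queueId + ".QUEUE_NAME", queueName);
        store.put("QUEUE." + queueName + ".QUEUE_ID", String.valueOf(queueId));
    }

    public static void main(String[] args) {
        TreeMap<String, String> store = new TreeMap<>();
        registerQueue(store, 3, "orders");
        // Both lookup directions now work against the same store.
        System.out.println(store.get("QUEUE.3.QUEUE_NAME"));    // orders
        System.out.println(store.get("QUEUE.orders.QUEUE_ID")); // 3
    }
}
```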

The main purpose of using prefixes like MESSAGE and QUEUE in keys is to
organize them properly. As LevelDB stores keys lexicographically, these
prefixes ensure that message-related and queue-related keys are stored
separately, as displayed below. The following shows the keys of the LevelDB
store after publishing a JMS message to the broker; it can be clearly seen
that the keys are stored in lexicographical order.

[image: Screenshot from 2017-08-14 19-57-13.png]
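The same grouping effect can be reproduced in miniature with a sorted map (used here only as a stand-in for LevelDB's key ordering): regardless of insertion order, all MESSAGE.* keys come out before all QUEUE.* keys.

```java
import java.util.TreeMap;

public class KeyOrderSketch {
    public static void main(String[] args) {
        // Insert in arbitrary order; the sorted map (like LevelDB's key
        // space) groups MESSAGE.* keys before QUEUE.* keys automatically.
        TreeMap<String, String> store = new TreeMap<>();
        store.put("QUEUE.orders.QUEUE_ID", "1");
        store.put("MESSAGE.100.MESSAGE_METADATA", "meta");
        store.put("MESSAGE.100.0.MESSAGE_CONTENT", "chunk");
        store.put("QUEUE.1.QUEUE_NAME", "orders");
        System.out.println(store.keySet());
        // [MESSAGE.100.0.MESSAGE_CONTENT, MESSAGE.100.MESSAGE_METADATA,
        //  QUEUE.1.QUEUE_NAME, QUEUE.orders.QUEUE_ID]
    }
}
```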

Organizing keys in such a manner also improves the efficiency of traversing
the keys with iterators when retrieving and deleting data. As displayed in
the diagram below, an iterator starts traversing from the first stored key
in the store. When the iterator head reaches a key, it can move either to
the next key or to the previous key (similar to a doubly linked list).
Hence, storing related keys successively reduces seek time, improving the
efficiency of retrieval and deletion.


[image: Screenshot from 2017-08-15 02-11-40.png]


Basically, these are the factors and decisions behind this key schema. The
schema should be extended to provide functionality such as storing message
expiration data. It would be great to have feedback on the proposed schema,
especially regarding how to reduce data repetition and further improve
efficiency.

[1] https://github.com/fusesource/leveldbjni


Best Regards,
-- 

*Wishmitha Mendis*

*Intern - Software Engineering*
*WSO2*

*Mobile : +94 777577706*
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture



Re: [Architecture] File Based Database for Message Broker Store

2017-07-15 Thread Wishmitha Mendis
Hi Kevin,

I just tried jedis for the initial implementation, no particular reason to
be honest. I will try the other Java clients you mentioned and figure out
which fits the project scenario best.
Best Regards,

On Fri, Jul 14, 2017 at 11:17 PM, Kevin Ratnasekera <ke...@wso2.com> wrote:

> Hi Wishmitha,
>
> There are many java clients available for redis, there are several very
> good clients, see comparison [1]. Any reason using jedis over others?
>
> [1] https://redis.io/clients#java
>
> Regards
> Kevin
>
>
>
> On Fri, Jul 14, 2017 at 11:02 PM, Wishmitha Mendis <wishmi...@wso2.com>
> wrote:
>
>> Hi Indika,
>>
>> I have started a basic implementation to test the concurrency handling,
>> message order and transaction support of Redis. Basic message producers and
>> consumers were implemented and messages are stored in a basic Redis queue.
>> GitHub : https://github.com/Wishmitha/RedisTesting
>>
>> Best Regards,
>>
>> On Thu, Jul 13, 2017 at 6:51 PM, Indika Sampath <indi...@wso2.com> wrote:
>>
>>> Hi Wishmitha,
>>>
>>> No SQL database is not an option as we have prior experience with
>>> Cassandra. Basically, the important part is to find out performance and
>>> fault tolerance capability in the database. Message broker
>>> read/write/delete records all the time and database should be able to
>>> handle high concurrency. Some metadata and message content write as byte
>>> data. Also, message order matters when delivery. Hence database should
>>> capable to sort records rather than bringing that logic into broker code.
>>> As you mentioned transaction is another factor we are looking in the
>>> database. Shall we start POC with Redis and see how it goes?
>>>
>>> Cheers!
>>>
>>> On Thu, Jul 13, 2017 at 6:07 PM, Wishmitha Mendis <wishmi...@wso2.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am working on a project to replace the current RDBMS based database
>>>> of the message broker store with a file based database system. I have been
>>>> researching on potential file based databases which can be used in the
>>>> project scenario. I have evaluated the pros and cons of the each of the
>>>> potential databases mainly based on read and write performances,
>>>> transaction support, data structure (queue) implementation support and
>>>> overhead of replacement.
>>>>
>>>> A summary of database evaluation is represented in the following table.
>>>>
>>>>
>>>> Database : LevelDB
>>>> (A simple key value storage; used in ActiveMQ, later replaced with KahaDB)
>>>>
>>>> Pros :
>>>>    1. Fast due to sorted keys
>>>>    2. Simple transaction support with the batch() operation
>>>>    3. Sublevel feature (provides more organization and facilitates
>>>>    transactions)
>>>>    4. LiveStream feature (able to query data that is still being received)
>>>>    5. Supports triggers, group-by (map-reduce), join (fan-out), etc.
>>>>
>>>> Cons :
>>>>    1. Performance decreases in concurrent / multi-threaded scenarios
>>>>    (improved in RocksDB)
>>>>    2. No object-relational mapping
>>>>    3. Poor performance when memory is not enough
>>>>
>>>> Database : MongoDB
>>>> (Document oriented; can define JSON schemas)
>>>>
>>>> Pros :
>>>>    1. Direct object mapping (good replacement for RDBMS)
>>>>    2. Can implement queue-like data structures (store messages in separate
>>>>    documents and delete them once delivered / already implemented data
>>>>    structures : https://www.npmjs.com/package/mongodb-queue)
>>>>    3. Cap

Re: [Architecture] File Based Database for Message Broker Store

2017-07-14 Thread Wishmitha Mendis
Hi Indika,

I have started a basic implementation to test Redis's concurrency
handling, message ordering and transaction support. Basic message
producers and consumers were implemented, and messages are stored in a
basic Redis queue.
GitHub : https://github.com/Wishmitha/RedisTesting
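The core property such a producer/consumer test checks (messages come
out in the order they were produced, even with the producer running
concurrently) can be sketched without a live Redis server. This is a
hypothetical stand-alone sketch using Python's thread-safe queue, not
the code from the repository above:

```python
import threading
import queue

def producer(q, n):
    # Push sequenced messages, as a broker producer would push to Redis.
    for i in range(n):
        q.put(i)
    q.put(None)  # sentinel: no more messages

def consume_all(q):
    # Pop messages until the sentinel, recording the order received.
    received = []
    while True:
        msg = q.get()
        if msg is None:
            return received
        received.append(msg)

q = queue.Queue()
t = threading.Thread(target=producer, args=(q, 1000))
t.start()
received = consume_all(q)
t.join()

# The queue must preserve producer order (FIFO), just as a Redis
# list-based queue should.
assert received == list(range(1000))
```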

Best Regards,

On Thu, Jul 13, 2017 at 6:51 PM, Indika Sampath <indi...@wso2.com> wrote:

> Hi Wishmitha,
>
> A NoSQL database is not an option, as we have prior experience with
> Cassandra. Basically, the important part is to find out the performance
> and fault-tolerance capability of the database. The message broker
> reads, writes and deletes records all the time, and the database should
> be able to handle high concurrency. Some metadata and message content
> are written as byte data. Also, message order matters at delivery, so
> the database should be capable of sorting records rather than pushing
> that logic into the broker code. As you mentioned, transactions are
> another factor we are looking for in the database. Shall we start a POC
> with Redis and see how it goes?
>
> Cheers!
>

[Architecture] File Based Database for Message Broker Store

2017-07-13 Thread Wishmitha Mendis
Hi all,

I am working on a project to replace the current RDBMS-based database of
the message broker store with a file-based database system. I have been
researching potential file-based databases that can be used in this
scenario, and have evaluated the pros and cons of each candidate, mainly
based on read and write performance, transaction support, support for
implementing the required data structures (queues), and the overhead of
replacement.

A summary of the evaluation follows.


LevelDB (a simple key-value store; used in ActiveMQ before being
replaced by KahaDB)

Pros:
1. Fast reads and writes due to sorted keys
2. Simple transaction support via the batch() operation
3. Sublevel feature (provides more organization and facilitates
   transactions)
4. LiveStream feature (able to query data that is still being received)
5. Supports triggers, group-by (map-reduce), joins (fan-out), etc.

Cons:
1. Performance decreases in concurrent/multi-threaded scenarios
   (improved in RocksDB)
2. No object-relational mapping
3. Poor performance when memory is insufficient
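The "simple transaction support" above refers to LevelDB's WriteBatch: a
buffered set of puts and deletes that is applied to the store
atomically. A minimal in-memory sketch of those semantics (only a model
of the behaviour, not LevelDB's actual implementation):

```python
class WriteBatch:
    """Buffered puts/deletes, applied to the store all-or-nothing."""
    def __init__(self):
        self.ops = []

    def put(self, key, value):
        self.ops.append(("put", key, value))

    def delete(self, key):
        self.ops.append(("delete", key, None))

def write(store, batch):
    # Apply the whole batch to a staging copy first, so a failure
    # part-way through leaves the original store untouched.
    staged = dict(store)
    for op, key, value in batch.ops:
        if op == "put":
            staged[key] = value
        else:
            staged.pop(key, None)
    store.clear()
    store.update(staged)

store = {"msg:1": b"hello"}
batch = WriteBatch()
batch.delete("msg:1")         # dequeue one message...
batch.put("msg:2", b"world")  # ...and enqueue another, atomically
write(store, batch)
assert store == {"msg:2": b"world"}
```

This all-or-nothing property is what makes batch() usable as a basic
transaction primitive for paired enqueue/dequeue operations.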

MongoDB (document oriented; can define JSON schemas)

Pros:
1. Direct object mapping (a good replacement for an RDBMS)
2. Can implement queue-like data structures (store messages in separate
   documents and delete them once delivered; an existing implementation:
   https://www.npmjs.com/package/mongodb-queue)
3. Capped collections (automatically remove old documents to make space
   for new ones)

Cons:
1. No transaction support across multiple documents
2. Lower performance compared to other NoSQL databases
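The capped-collection behaviour mentioned above can be illustrated with
a small in-memory analogy; this models only the eviction policy (oldest
documents dropped once the cap is reached), it is not MongoDB code:

```python
from collections import deque

# A capped collection holds a fixed amount of data: once full, inserting
# a new document evicts the oldest one. deque(maxlen=...) has the same
# eviction policy.
capped = deque(maxlen=3)
for doc in ["m1", "m2", "m3", "m4", "m5"]:
    capped.append(doc)

# The two oldest documents were dropped to make room for the new ones.
assert list(capped) == ["m3", "m4", "m5"]
```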

Cassandra (column oriented: a key is denoted by a row and an attribute
by a column, which can change dynamically; node-cluster configuration
where all nodes are identical)

Pros:
1. High availability (no single point of failure; if a node fails,
   other identical nodes are available)
2. No data loss, due to replication across identical nodes
3. Easy to model (a column store is very similar to RDBMS tables)
4. Query language similar to SQL (CQL)
5. High performance (no master-node concept; any node can serve a
   query, so there is no single bottleneck)
6. Transaction implementation

Cons:
1. Does not suit queue structures (a message queue implementation is an
   anti-pattern:
   http://www.datastax.com/dev/blog/cassandra-anti-patterns-queues-and-queue-like-datasets)
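The reason the DataStax article calls queues an anti-pattern is that
Cassandra deletes only write tombstone markers until compaction runs, so
each poll of a queue-like partition must scan past every
already-consumed row. A rough simulation of that growing scan cost (a
hypothetical model, not Cassandra internals):

```python
# Model a partition as an ordered list of rows, where "delete" only
# marks a tombstone rather than reclaiming space.
rows = [("msg:%d" % i, "live") for i in range(100)]

def poll(rows):
    # Scan from the start, skipping tombstones, to find the next
    # live message; return it and the number of rows scanned.
    scanned = 0
    for i, (key, state) in enumerate(rows):
        scanned += 1
        if state == "live":
            rows[i] = (key, "tombstone")  # consume == tombstone
            return key, scanned
    return None, scanned

_, first_scan = poll(rows)
for _ in range(49):
    poll(rows)
_, later_scan = poll(rows)

# Each successive poll scans one more tombstone than the last.
assert first_scan == 1
assert later_scan == 51
```

The scan cost grows linearly with the number of consumed messages, which
is why queue-like workloads degrade on Cassandra.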

Redis (a simple key-value store; in-memory database; supports data
structures such as lists, capped lists, hashes and sets:
https://redis.io/topics/data-types-intro)

Pros:
1. Fast read/write performance
2. Provides persistence by writing to disk at configurable intervals
   (snapshots: https://redis.io/topics/persistence)
3. Can implement message queue structures
   (http://fiznool.com/blog/2016/02/24/building-a-simple-message-queue-with-redis/)
4. Can be used to implement message stores
   (https://redis.io/topics/pubsub)
5. Built-in basic transaction support (MULTI/EXEC commands:
   https://redis.io/topics/transactions)

Cons:
1. Possible loss of data (e.g. in case of a power outage)
2. Depends on CPU performance when the data volume is very high
   (https://redis.io/topics/persistence)
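The queue pattern from the blog post linked above is essentially LPUSH
on the producer side and (B)RPOP on the consumer side. A server-free
sketch of those list semantics (an in-memory stand-in written for
illustration, not the Redis client API):

```python
class MiniRedisList:
    """In-memory stand-in for a single Redis list key."""
    def __init__(self):
        self.items = []

    def lpush(self, value):
        # LPUSH prepends to the head of the list.
        self.items.insert(0, value)

    def rpop(self):
        # RPOP removes from the tail; LPUSH + RPOP together give FIFO.
        return self.items.pop() if self.items else None

q = MiniRedisList()
for msg in ["m1", "m2", "m3"]:
    q.lpush(msg)                      # producer side

out = [q.rpop(), q.rpop(), q.rpop()]  # consumer side
assert out == ["m1", "m2", "m3"]      # delivered in producer order
assert q.rpop() is None               # empty queue
```

In a real deployment the consumer would use BRPOP to block until a
message arrives instead of polling an empty list.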

RocksDB (a simple key-value store, evolved from LevelDB; in-memory
database)

Pros:
1. Fast read/write performance (faster than LevelDB)
2. Can implement queue structures
   (https://github.com/facebook/rocksdb/wiki/Implement-Queue-Service-Using-RocksDB)
3. Supports concurrency
4. Highly configurable (pluggable architecture)
5. Supports persistence (memtables and transactional logs are managed
   in memory; a memtable is flushed to an SST file to provide
   persistence)

Cons:
1. Possible loss of data (e.g. in case of a power outage)
2. No built-in transaction support (have to use TransactionDB:
   https://github.com/facebook/rocksdb/wiki/Transactions)
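TransactionDB (in its optimistic mode) detects conflicts by checking at
commit time whether any key the transaction touched was modified after
the transaction began. A toy version of that check (a hypothetical
sketch of the idea, not the RocksDB API):

```python
class ConflictError(Exception):
    pass

class Store:
    def __init__(self):
        self.data = {}
        self.versions = {}  # per-key write counters

    def put(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

class OptimisticTxn:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.versions)  # versions at txn start
        self.writes = {}

    def put(self, key, value):
        self.writes[key] = value              # buffer until commit

    def commit(self):
        for key in self.writes:
            # Conflict: someone wrote this key after we started.
            if self.store.versions.get(key, 0) != self.snapshot.get(key, 0):
                raise ConflictError(key)
        for key, value in self.writes.items():
            self.store.put(key, value)

store = Store()
txn = OptimisticTxn(store)
txn.put("balance", 90)
store.put("balance", 50)  # a concurrent writer sneaks in
try:
    txn.commit()
    committed = True
except ConflictError:
    committed = False
assert committed is False  # the stale transaction is rejected
```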


According to this evaluation, I suggest Redis as the most suitable
database for the message broker store. Even though it carries some risk
of data loss, its persistence is configurable, unlike in other key-value
stores, and it provides fast read and write performance compared to the
other databases, along with basic transaction support.

This is my own evaluation and opinion; feedback would be greatly
appreciated, especially regarding additional criteria to consider and
other candidate databases.

Best Regards

-- 

*Wishmitha Mendis*

*Intern - Software Engineering*
*WSO2*

*Mobile : +94 777577706*
___
Architecture mailing list
Architecture@wso2.org
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture