Hi Asitha,

I agree the content should be written before the meta data. What I meant
was not having a separate process to do the content clean-up, but rather
going with the solution proposed by Hasitha, where the message count is
maintained in memory instead of in the DB.

Also, if we're going to duplicate both the message content and the meta data
per node, it should not be a problem, as was initially mentioned. However, if
instead of duplicating we're going to share the content among all the nodes,
then we cannot maintain a local reference count anyway, since even if the
reference count goes to 0 locally there can be other nodes with subscribers
referring to the same content.

The solution I proposed was to address the problem of losing the in-memory
counts when a node gets killed. If a node is killed and the in-memory
reference counts are lost, then when the node is restarted it will first
check for the IDs which have not been purged, by comparing the meta data and
the content, and purge the orphaned content.
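
To make it concrete, here is a rough sketch of that start-up reconciliation.
The accessor interface below is my own placeholder for illustration, not the
actual message store API; the point is only the comparison between the
content and meta data column families.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Rough sketch only. ContentStore and its methods are hypothetical,
// used purely to illustrate the start-up clean-up.
interface ContentStore {
    List<Long> getAllContentMessageIds();
    List<Long> getAllMetadataMessageIds();
    void deleteContent(long messageId);
}

public class StartupContentCleaner {

    // Purge content rows whose IDs no longer have corresponding meta data,
    // i.e. {MessageContentCF} \ {MessageMetaData}.
    public void purgeOrphanedContent(ContentStore store) {
        Set<Long> contentIds = new HashSet<Long>(store.getAllContentMessageIds());
        Set<Long> metadataIds = new HashSet<Long>(store.getAllMetadataMessageIds());

        contentIds.removeAll(metadataIds);

        for (Long messageId : contentIds) {
            store.deleteContent(messageId);
        }
    }
}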

Thanks,
Pamod


On Sun, Oct 5, 2014 at 1:01 PM, Asitha Nanayakkara <asi...@wso2.com> wrote:

> Hi Pamod,
>
> In a clustered set-up, when other nodes are running, they store the
> message content for a topic first and then store the message meta data.
> This is not done atomically. If, while this is happening, we start another
> node with logic to scan the database and delete inconsistent content, it
> can pick up some of the new topic messages whose content has been stored
> but whose metadata is still being written, and delete that content too.
> This will leave the database with messages that have meta data but no
> corresponding content. I think there is a possibility of this happening if
> there is a working cluster with topic messages being published at a high
> rate with high publishing concurrency, and a new node is started at the
> same time. Correct me if I'm wrong.
>
> Yes, for each message we will have to store the content, store the
> metadata and update the reference count. But we can increment the
> reference count per message, not per duplicate metadata (since we know how
> many duplicates of metadata we need). If there is a bigger performance hit
> due to the DB update call, it's better to go with the in-memory approach
> rather than trying to clean the content at start-up, I guess.
>
> Thanks.
>
> On Sun, Oct 5, 2014 at 12:20 PM, Pamod Sylvester <pa...@wso2.com> wrote:
>
>> Hi,
>>
>> How would this approach impact performance? It would result in a DB
>> operation each time a message is published, as well as each time a
>> subscriber acks, wouldn't it?
>>
>> I agree with you that maintaining the counters in memory could result in
>> message content being persisted in the DB with no way of deleting it if
>> the node gets killed.
>>
>> Also, what about the possibility of checking, at node start-up, which
>> message content needs to be deleted? There would be a comparison between
>> the meta data and the content column families, and all the IDs which are
>> in the content table but not in the meta data CF would be purged:
>>
>> {MessageContentCF} \ {MessageMetaData} = Message Content to be deleted.
>>
>> This can affect the start-up time of the node, but IMO it will not affect
>> the performance of the main flows.
>>
>> WDYT ?
>>
>> Thanks,
>> Pamod
>>
>> On Sun, Oct 5, 2014 at 11:09 AM, Asitha Nanayakkara <asi...@wso2.com>
>> wrote:
>>
>>> Hi Hasitha,
>>>
>>> In this approach, if a node holding reference counts gets killed, all
>>> the details regarding the reference counts are lost, right? Is there a
>>> way to delete the content then?
>>>
>>> Btw, what if we keep the reference count in the database? Something
>>> similar to what we have for queue message counting now (we create a
>>> counter when a queue is created and then increment/decrement the count
>>> when messages are received and sent).
>>>
>>> What I suggest is: when a topic message is created we add a reference
>>> counter for the message (via a new AndesContextStore method
>>> createReferenceCounter(long messageID)); when the meta data is
>>> duplicated we increment the counter; and when an acknowledgment is
>>> received we decrement the counter (two methods in the context store to
>>> increment/decrement counts). We will also have a scheduled task that
>>> periodically checks for messages with a reference count of zero and
>>> deletes their content. This way, by using a separate insert statement to
>>> create a ref counter and a separate statement to update the count, we
>>> can avoid writing vendor-specific SQL queries for reference counting
>>> (for RDBMS). Since the idea is to recommend Cassandra for the
>>> MessageStore and an RDBMS AndesContextStore, we would be better off that
>>> way. Plus this will avoid the need to track reference counts in memory,
>>> so we do not lose the reference counts when a node gets killed. WDYT?
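>>>
>>> Just to sketch what those context store additions and the scheduled task
>>> could look like (the names below are my assumptions for illustration,
>>> not existing AndesContextStore methods):
>>>
>>> import java.util.List;
>>>
>>> // Hypothetical additions to the context store -- names are illustrative only.
>>> interface TopicReferenceCountStore {
>>>     void createReferenceCounter(long messageId);             // INSERT a row with count = 0
>>>     void incrementReferenceCount(long messageId, int delta); // UPDATE count = count + delta
>>>     void decrementReferenceCount(long messageId, int delta); // UPDATE count = count - delta
>>>     List<Long> getMessageIdsWithZeroReferenceCount();        // SELECT id WHERE count = 0
>>>     void removeReferenceCounter(long messageId);             // DELETE the counter row
>>> }
>>>
>>> // Scheduled task that periodically deletes content for messages whose count reached zero.
>>> class ZeroRefContentDeletionTask implements Runnable {
>>>     private final TopicReferenceCountStore refCountStore;
>>>     private final ContentStore contentStore;  // hypothetical accessor, as in the start-up clean-up sketch above
>>>
>>>     ZeroRefContentDeletionTask(TopicReferenceCountStore refCountStore, ContentStore contentStore) {
>>>         this.refCountStore = refCountStore;
>>>         this.contentStore = contentStore;
>>>     }
>>>
>>>     public void run() {
>>>         for (Long messageId : refCountStore.getMessageIdsWithZeroReferenceCount()) {
>>>             contentStore.deleteContent(messageId);
>>>             refCountStore.removeReferenceCounter(messageId);
>>>         }
>>>     }
>>> }
>>>
>>> // e.g. schedule via java.util.concurrent.Executors.newSingleThreadScheduledExecutor()
>>> //          .scheduleWithFixedDelay(task, 30, 30, TimeUnit.SECONDS);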
>>>
>>>
>>> Thanks
>>>
>>> On Sun, Oct 5, 2014 at 6:57 AM, Hasitha Hiranya <hasit...@wso2.com>
>>> wrote:
>>>
>>>> Hi Team,
>>>>
>>>>
>>>> Following is my vision on integrating topics into MB.
>>>>
>>>> >> We duplicate metadata per subscriber. It will not create a big
>>>> overhead.
>>>> >> We do not duplicate content per subscriber, but we duplicate content
>>>> per node.
>>>> >> I hereby assume that we do handle acks for topics. We need some
>>>> research on that.
>>>>
>>>> When a topic subscriber is created:
>>>> 1. Qpid creates a temp queue.
>>>> 2. Qpid creates a binding for that queue to the topic exchange, using
>>>> the topic name as the binding key.
>>>> 3. Qpid creates a subscription for the temp queue.
>>>>
>>>> When a topic subscriber is closed, Qpid does the above 3 things in
>>>> reverse order.
>>>>
>>>> Adhering to this model,
>>>>
>>>> 1. We store metadata in the same way we do for normal queues.
>>>> 2. We use the same SlotDelivery worker and the flusher. There is
>>>> NOTHING called a topic delivery worker.
>>>> 3. When showing in the UI, we filter the durable ones and show them.
>>>> 4. When a subscriber closes, the queue is deleted. We do the same thing
>>>> as for normal queues.
>>>> 5. Whenever we insert metadata, we duplicate the metadata for each temp
>>>> queue (per subscriber). We know the nodes where the subscribers lie, so
>>>> we can duplicate the content for those nodes (one copy per node).
>>>> 6. We need to introduce new per-subscriber tracking in the on-flight
>>>> message tracker, which is common for queues as well. When metadata is
>>>> inserted for a message ID we increase a count.
>>>>     When an ack comes for that metadata we decrement it. If it reaches
>>>> zero, the content is ready to be removed. We do not track this count
>>>> globally, as we have a copy of the content per node. Thus the reference
>>>> count does not need to be global; it is local in-memory tracking (see
>>>> the rough sketch after this list).
>>>> 7. Queue change handler - if delete - execute the normal delete (remove
>>>> all metadata) and decrement the reference counts. The thread that
>>>> deletes content will detect that and will delete it offline. This way
>>>> the content is removed only when all subscribers are gone.
>>>>
>>>> 8. We should be careful about hierarchical topics. We use our maps to
>>>> identify queues bound to a topic. The MQTT/AMQP confusion should be
>>>> solved there.
>>>>
>>>> *Thanks *
>>>>
>>>>
>>>> --
>>>> *Hasitha Abeykoon*
>>>> Senior Software Engineer; WSO2, Inc.; http://wso2.com
>>>> *cell:* *+94 719363063*
>>>> *blog: **abeykoon.blogspot.com* <http://abeykoon.blogspot.com>
>>>>
>>>>
>>>
>>>
>>> --
>>> *Asitha Nanayakkara*
>>> Software Engineer
>>> WSO2, Inc. http://wso2.com/
>>> Mob: + 94 77 85 30 682
>>>
>>>
>>
>>
>> --
>> *Pamod Sylvester *
>>  *Senior Software Engineer *
>> Integration Technologies Team, WSO2 Inc.; http://wso2.com
>> email: pa...@wso2.com cell: +94 77 7779495
>>
>
>
>
> --
> *Asitha Nanayakkara*
> Software Engineer
> WSO2, Inc. http://wso2.com/
> Mob: + 94 77 85 30 682
>
>


-- 
*Pamod Sylvester *
 *Senior Software Engineer *
Integration Technologies Team, WSO2 Inc.; http://wso2.com
email: pa...@wso2.com cell: +94 77 7779495