Hi RocketMQ Community,

I think upgrading the metadata management architecture is a good place to 
start the evolution of RocketMQ's architecture. 

Currently, RocketMQ maintains metadata consistency through full connection. 
For example, each broker registers with every nameserver so that all 
nameservers see the same routing information, and each consumer instance sends 
heartbeats (carrying subscription information) to the brokers so that all 
brokers see the same subscription information. However, this kind of 
consistency maintenance is weak. An unreliable network or delays can leave the 
views inconsistent, which has caused a lot of issues.
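
To illustrate the weakness, here is a minimal sketch in Python. It is not 
RocketMQ code: the NameServer class, register_everywhere() and the addresses 
are made up for illustration, and the "unreachable" set simply stands in for a 
network failure between one broker and one nameserver.

```python
class NameServer:
    """Hypothetical in-memory nameserver that keeps its own routing view."""

    def __init__(self, name):
        self.name = name
        self.routes = {}  # broker name -> broker address

    def register(self, broker, address):
        self.routes[broker] = address


def register_everywhere(broker, address, nameservers, unreachable=()):
    """Full-connection registration: one independent request per nameserver.

    `unreachable` models the nameservers this broker cannot reach right now.
    """
    for ns in nameservers:
        if ns.name in unreachable:
            # The request is lost; this view stays stale until a later
            # registration from the same broker gets through.
            continue
        ns.register(broker, address)


ns_a, ns_b = NameServer("ns-a"), NameServer("ns-b")
register_everywhere("broker-1", "10.0.0.1:10911", [ns_a, ns_b])
# broker-2 registers while it is partitioned away from ns-b.
register_everywhere("broker-2", "10.0.0.2:10911", [ns_a, ns_b],
                    unreachable={"ns-b"})

views_agree = ns_a.routes == ns_b.routes
print(views_agree)  # False: ns-b has never heard of broker-2
```

Nothing in this scheme lets ns-a and ns-b notice that they disagree, so a 
client's view depends on which nameserver it happens to ask.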

On the other hand, since RocketMQ 4.5.0, we have used the Raft protocol 
(DLedger) to solve the consistency problem of log replication. DLedger is a 
Raft-based log storage library. When we first designed it, we hoped to apply 
it to consistent metadata storage as well. If RocketMQ's metadata is stored as 
a log whose consistency is guaranteed by the Raft protocol (DLedger), the 
metadata consistency issue will be solved.
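
The reasoning behind "metadata as a log" can be sketched as follows. This is 
only an illustration in Python and does not use the DLedger API: Entry and 
MetadataReplica are hypothetical names, and the log is assumed to have already 
been replicated and committed in the same order on every node, which is the 
part Raft would provide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One metadata change recorded in the (assumed replicated) log."""
    op: str                      # "put" or "delete"
    key: str
    value: Optional[str] = None


class MetadataReplica:
    """Hypothetical node that rebuilds metadata by replaying the log."""

    def __init__(self):
        self.state = {}
        self.applied = 0         # number of log entries applied so far

    def catch_up(self, log, commit_index):
        """Apply committed entries strictly in log order."""
        while self.applied < commit_index:
            entry = log[self.applied]
            if entry.op == "put":
                self.state[entry.key] = entry.value
            elif entry.op == "delete":
                self.state.pop(entry.key, None)
            self.applied += 1


log = [
    Entry("put", "topic/orders", "queues=8"),
    Entry("put", "topic/payments", "queues=4"),
    Entry("delete", "topic/orders"),
    Entry("put", "topic/orders", "queues=16"),
]

fast, slow = MetadataReplica(), MetadataReplica()
fast.catch_up(log, commit_index=4)
slow.catch_up(log, commit_index=2)   # a lagging node sees an older prefix
slow.catch_up(log, commit_index=4)   # once caught up, it must match

assert fast.state == slow.state
```

A lagging node may be behind, but it can never hold a view that contradicts 
the others, because every node applies the same entries in the same order.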

So I have submitted RIP-18, "Metadata management architecture upgrade", which 
describes the specific plan in more detail. I hope to hear more voices from 
the community, so please tell me your thoughts by replying to this email or 
commenting on the Google Doc.

Best Regards!
Rongtong Jin

RIP-18 Metadata management architecture upgrade
https://docs.google.com/document/d/1hQxlbtlMDwNxyVDGsIIUpDNWwfS6hP0PGKY9-A2KUOA/edit?usp=sharing


> -----Original Message-----
> From: "Gosling Von" <[email protected]>
> Sent: 2019-01-30 17:54:47 (Wednesday)
> To: dev <[email protected]>
> Cc: 
> Subject: [DISCUSS] Thought of The Evolution of The Next Decade Architecture
> for RocketMQ
> 
> Hi,
> 
> I would like to wish everyone a happy new year, especially those in the
> Eastern Hemisphere. I think that when you see this topic, you already know
> what I want to say :-)
> 
> After more than six years of scrutiny by the community and the market,
> Apache RocketMQ has been widely used in financial and e-commerce online
> transactions. Known data shows that, in China alone, RocketMQ covers more
> than 40% of traditional messaging scenarios. With the globalization of the
> community over the past two years, this growth has spread all over the
> world. However, through continuous community activities, including technical
> exchanges with experts from Microsoft, Berkeley, and elsewhere, coupled with
> the emergence of IoT, AI, blockchain and other scenarios around the world, I
> began to think about the architecture evolution of RocketMQ. I hope we can
> make it the data infrastructure of the cloud computing era, so that it
> serves us better in the next decade.
> 
> 
> First of all, the overall architecture will adopt the separation of storage
> and computing together with a pluggable design. Regarding the separation of
> storage and computing, I know that this is a controversial topic in the
> industry. You may also have seen that Twitter gave up EventBus, its
> messaging solution in which the serving and storage layers are decoupled;
> one of the important reasons given was that it "introduces an additional
> hop". That's right: usually, you don't need that much. But what I want to
> express here is that the value of separating storage and computing is like
> the single responsibility principle in software design: each layer can focus
> on one concern. For example, if the messaging engine is deployed at the
> edge, we could arrange for computing nodes to be deployed on demand. Because
> it is a computationally intensive task, we can focus on improving computing
> power and response speed without worrying about the machine, operation and
> maintenance costs brought by storage.
> 
> As another case, RocketMQ storage can be regarded as a kind of time-series
> storage. It provides not only storage of single records but also bulk
> storage, but in any case it is data-type-independent, sequential,
> append-only storage. Under this architecture, realizing the current
> transaction capability still involves some complications, especially if you
> want to make RocketMQ a one-stop microservice transaction solution. We have
> already tried this. Known feedback from banks is that they have made some
> modifications to the storage in their financial systems. For example, when
> the file storage is replaced with relational, NoSQL or NewSQL storage, the
> benefits are enhanced maintainability and enhanced transaction processing
> capabilities. In this sense, we could make storage pluggable in RocketMQ
> 5.0. By default we will provide storage optimized for sequential appends,
> which is also the best storage implementation with respect to disk seeks.
> 
> But this also raises another question: how to improve the ability to query
> and process data. Here I want to share another preliminary design idea: we
> could continue to use a data structure such as Commitlog to store the
> original data, and then build the index or intermediate aggregation results
> based on Commitlog. At present, our index structure is not well integrated
> and utilized; starting from this, we could continue to modify and optimize
> the index. In addition, we can use DPDK/SPDK and an atomic increment of the
> write position to achieve the best lock-free design. Considering the data
> that has already been committed, this whole line of work needs a great deal
> of exploration, even including cooperation with some other communities and
> universities. So at this level, I think we could give RocketMQ the ability
> to deploy storage and computing separately, while the storage itself is
> pluggable and can be replaced as needed.
> 
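
To make the idea in the quoted paragraph above concrete, here is a minimal 
sketch in Python of a pluggable store whose index is derived from the log 
rather than stored as primary data. It is not RocketMQ's storage code: 
MessageStore, InMemoryCommitLog and build_topic_index are hypothetical names, 
and the in-memory list only stands in for an append-only file.

```python
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """Hypothetical plug-in point: any backend that can append and read."""

    @abstractmethod
    def append(self, topic: str, body: bytes) -> int:
        """Store one record and return its offset."""

    @abstractmethod
    def read(self, offset: int) -> tuple:
        """Return the (topic, body) record stored at `offset`."""


class InMemoryCommitLog(MessageStore):
    """Default backend in this sketch: a sequential, append-only log."""

    def __init__(self):
        self._records = []

    def append(self, topic, body):
        self._records.append((topic, body))
        return len(self._records) - 1

    def read(self, offset):
        return self._records[offset]

    def __len__(self):
        return len(self._records)


def build_topic_index(store, end_offset):
    """Derive a topic -> offsets index by scanning the log up to end_offset.

    The index holds no data of its own, so it can be dropped and rebuilt
    from the log at any time.
    """
    index = {}
    for offset in range(end_offset):
        topic, _body = store.read(offset)
        index.setdefault(topic, []).append(offset)
    return index


store = InMemoryCommitLog()
store.append("orders", b"o-1")
store.append("payments", b"p-1")
store.append("orders", b"o-2")

index = build_topic_index(store, len(store))
orders = [store.read(offset)[1] for offset in index["orders"]]
```

Because build_topic_index() only depends on the MessageStore interface, a 
relational or NoSQL backend could be swapped in without touching the indexing 
logic, which is the property the pluggable design is after.
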
> Second, support the OpenMessaging standard. I think many of you have already
> noticed the new messaging standard drafted by Alibaba, Yahoo and other
> companies. I am also the chair of this project. The blueprint of this
> standard addresses a very important problem: interoperability, not only
> between different messaging vendors, but also between the upstream and
> downstream of messaging. To the user, this interoperability shows up as
> consistency of the API or the protocol. Although we think that the API is
> also a kind of protocol, I want to emphasize that protocol consistency has
> been attempted in countless scenarios. But so far, I personally have not
> seen a particularly versatile and simple solution; whether it is AMQP, MQTT,
> or RSocket, which has recently gained wide recognition, there has not been
> much innovation at this level. And we want to avoid duplicating innovation.
> At this point, an API-layer standard is particularly important, so RocketMQ
> 5.0 will focus on supporting OpenMessaging standardization in API testing.
> For future multi-language support, we hope that this set of APIs can
> completely solve the problems you currently encounter with RocketMQ in
> multiple languages.
> 
> Native support for multiple protocols is, I think, also very important. So
> in 5.0, we could restructure the remoting module to provide pluggable
> transport-layer protocol support in the computing node. HTTP/2 may be our
> default protocol. On top of that, we are also considering integrating
> TCP-based MQTT and UDP-based CoAP. Of course, we also clearly see that with
> the gradual spread of 5G networks, we may have to actively follow the needs
> of the market. In any case, we could provide a flexible wire-protocol
> extension for when we want to support more domain-specific protocols. This
> is something we must consider carefully.
> 
> 
> A lightweight streaming engine based on messaging is a very natural thought.
> I am also an early explorer of streaming, but the so-called streaming we
> built in previous years was, strictly speaking, a pseudo-scenario. Why? In
> most cases we don't actually need to deploy a streaming engine; we could
> achieve the same effect with messaging alone. In addition, in stream
> computing scenarios, messaging and storage are very important, so why don't
> we let messaging natively support the scheduling and computation of task
> nodes, with our built-in storage helping us further? We only need to provide
> a library package that makes it easy for messaging to have streaming
> capabilities. As for subsequent SQL processing, CEP, FaaS and so on, I
> believe they are the evolution of this programming model.
> 
> We have talked about this before: RocketMQ is a unified messaging platform
> integrating computing, storage and scheduling. Today I am sharing my rough
> thoughts on the evolution of the overall architecture of RocketMQ 5.0. I
> also hope to hear the opinions of the community, including the thoughts of
> other PMC members and committers. Next, we could call for RIP discussions on
> the details. I hope more PMC members or committers could act as shepherds of
> the RIPs, making delivery more reliable in 2019.
> 
> 
> Best Regards,
> Von Gosling
