Hi Von,

Thank you for your suggestions. I have answered your queries on the doc. Once you have replied, I will write a formal proposal following the Google guidelines <https://google.github.io/gsocguides/student/writing-a-proposal> for review, as the submission period has now started.

Thanks,
Sohaib

On Mon, Mar 12, 2018 at 9:48 AM, Von Gosling <[email protected]> wrote:

> Hi Sohaib,
>
> I have reviewed the doc and made some suggestions on the concerns you raised.
>
> To all other GSoC students: could you follow the same practice as Sohaib? Looking forward to your proposals on Google Docs.
>
> Best Regards,
> Von Gosling
>
> On Mar 9, 2018, at 15:17, Sohaib Iftikhar <[email protected]> wrote:
> >
> > Hi Yukon,
> >
> > What do you suggest for the key store itself? Do you propose writing it ourselves, or using an existing solution and writing a layer on top?
> >
> > Thanks,
> > Sohaib
> >
> > On Fri, Mar 9, 2018 at 6:20 AM, yukon <[email protected]> wrote:
> >
> >> ```
> >> Personally, I find RAFT to be much simpler to implement. However, I do not expect to reinvent the wheel here.
> >> ```
> >>
> >> That's absolutely right, there is no need to reinvent the wheel; there are many existing implementations of Raft: https://raft.github.io/
> >>
> >> ```
> >> I don't think using key store to persist all the messages is a good idea.
> >> ```
> >>
> >> Yes, storing an ID is enough.
> >>
> >> On Thu, Mar 8, 2018 at 3:32 PM, Sohaib Iftikhar <[email protected]> wrote:
> >>
> >>> Hi Dexin,
> >>>
> >>> Thank you for your suggestions. I will try to answer as much as I can and leave the rest to the RocketMQ team.
> >>>
> >>> 1. The idea with incremental IDs is actually quite good. But @Yukon mentioned that duplication can also be controlled by the application (a special KV property), in which case different producers may produce the same message, which then needs to be deduplicated on the broker. SessionId+IncrementalId won't work in this scenario, I believe. But we can switch to the more efficient storage you described when the user is not specifying these special keys.
> >>> Also, I proposed storing keys for only a fixed time interval.
> >>> For all practical purposes this would still remain constant time. [Log base 2 of 10^10 is still just 33 :) ]. It does add the extra cost of communication, but this would be the case in both scenarios.
> >>> 2. As for consensus, the ideas I presented were pretty abstract, so I mentioned a couple of algorithms that could potentially be used. Personally, I find RAFT to be much simpler to implement. However, I do not expect to reinvent the wheel here. I strongly believe that in this case we can build upon some tested existing solution.
> >>>
> >>> Regards,
> >>> Sohaib
> >>>
> >>> On Thu, Mar 8, 2018 at 1:31 AM, 李 德鑫 <[email protected]> wrote:
> >>>
> >>>> Hi Sohaib,
> >>>>
> >>>> I'm a student applying for GSoC too, and I've read all of your discussion on the mailing list.
> >>>>
> >>>> I have some questions about your design, and some of them may need to be answered by the RocketMQ team, so I am sending them here for discussion.
> >>>>
> >>>> I don't think using key store to persist all the messages is a good idea. Since MQ is based on O(1) data structures, the key store would harm performance.
> >>>>
> >>>> I think we can learn from the TCP protocol.
> >>>>
> >>>> In producer-broker communication, we can assign an incremental id to every message sent in the same session, and the session id should be persisted on disk by the producer. The broker then only needs to maintain a map from session id to expected message id (this is how Kafka does it), since there are far more messages than producers. However, a K/V store is still needed, so we have to ask the RocketMQ team how many producers are active at the same time in practice.
> >>>>
> >>>> The same idea applies to consumer-broker communication.
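
A minimal sketch of the session-id-plus-incremental-id idea described above, for illustration only. The class and method names (`SessionSequenceTracker`, `accept`) are hypothetical and are not part of RocketMQ or Kafka; the sketch keeps the map in memory, whereas the thread assumes some persistent K/V store, and it does not address the application-level-key case raised in the reply.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical broker-side tracker: one expected sequence number per producer session. */
class SessionSequenceTracker {
    // session id -> next sequence number the broker expects from that session
    private final Map<String, Long> expectedSeq = new HashMap<>();

    /** Returns true if the message is new and should be stored, false if it is a duplicate. */
    synchronized boolean accept(String sessionId, long seq) {
        long expected = expectedSeq.getOrDefault(sessionId, 0L);
        if (seq < expected) {
            return false; // already seen: a retry of an earlier send
        }
        // seq >= expected: accept and advance. A real broker would also have to
        // decide how to treat gaps (seq > expected); this sketch simply skips ahead.
        expectedSeq.put(sessionId, seq + 1);
        return true;
    }

    /** Number of sessions tracked; this is what bounds the size of the map. */
    synchronized int trackedSessions() {
        return expectedSeq.size();
    }
}
```

The state grows with the number of producer sessions rather than with the number of messages, which is the point being made about storage cost.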
> >>>>
> >>>> About the consensus algorithm: I think RocketMQ should already have an implementation. I don't know what it is, but maybe you can reuse it. If you do have to implement one, in my opinion there is no need to implement both Paxos and Raft, since they solve the same kind of problem.
> >>>>
> >>>> Regards,
> >>>>
> >>>> Dexin
> >>>>
> >>>> ________________________________
> >>>> From: Sohaib Iftikhar <[email protected]>
> >>>> Sent: March 7, 2018 18:15:51
> >>>> To: [email protected]
> >>>> Subject: Re: [GSOC|ROCKETMQ-124] Support non-redundant message delivery mechanism
> >>>>
> >>>> Hi Yukon,
> >>>>
> >>>> Thanks for your reply. Yes, it would be nice to concretely define the scope of this project, as the doc is a bit ambitious for just a summer. Should you (or anyone else) have questions, suggestions, or clarifications, I'd be glad to discuss more details.
> >>>>
> >>>> Thanks,
> >>>> Sohaib
> >>>>
> >>>> On Wed, Mar 7, 2018 at 8:58 AM, yukon <[email protected]> wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> A Google doc is better for discussion. Your design is great; now we can discuss more details based on it.
> >>>>>
> >>>>> Any advice from the RocketMQ community is welcome.
> >>>>>
> >>>>> Appreciate your efforts.
> >>>>>
> >>>>> Regards,
> >>>>> yukon
> >>>>>
> >>>>> On Wed, Mar 7, 2018 at 5:15 AM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> @Yukon Thank you for your reply. This clears up some doubts.
> >>>>>>
> >>>>>> Sorry for the delay; I was somewhat occupied with another project. I have created an initial design doc. Since email is a bit cumbersome for feedback, I wrote this document in two formats:
> >>>>>>
> >>>>>> 1. In the form of a Google document: https://docs.google.com/document/d/1KSpXGNDH0HF5E27lfKJxJnjIjPtlP1Q-M6rj3yZde24.
> >>>>>> The document is open for comments to all users without signing in. I would appreciate it if you put your name before the comment so I can identify whom to follow up with.
> >>>>>>
> >>>>>> 2. As a markdown file on GitHub: https://github.com/sohaibiftikhar/rocketmq/blob/gsoc_design/gsoc_design.md.
> >>>>>> Comments on this can be made on the commit: https://github.com/sohaibiftikhar/rocketmq/commit/dfd55fc69f430fc024217a3b20dde31717334e62
> >>>>>>
> >>>>>> After I have received a certain amount of feedback I will try to incorporate it and put up a subsequent version for review. Please tell me which method suits you better (gdoc or GitHub) for review and we can continue with that for subsequent versions.
> >>>>>>
> >>>>>> Lastly, the document is a couple of pages long, so I appreciate your patience and your help.
> >>>>>> Looking forward to your opinions.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Sohaib
> >>>>>>
> >>>>>> On Mon, Mar 5, 2018 at 1:01 PM, yukon <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi Sohaib,
> >>>>>>>
> >>>>>>> Sorry for the late reply; we can move this project forward now ~
> >>>>>>>
> >>>>>>> ```
> >>>>>>> I would at some point like to post design ideas to this problem privately to get it reviewed by the development community but not make it publicly available so that it cannot be plagiarised.
> >>>>>>> ```
> >>>>>>>
> >>>>>>> You can send your design ideas to me directly or to our PMC list ([email protected]) if you want to keep your ideas private. But please don't break away from the community.
> >>>>>>>
> >>>>>>> I hope you have already understood the goal of this project.
> >>>>>>> RocketMQ currently supports at-least-once delivery, so an obvious way to achieve exactly-once is to remove duplicated messages.
> >>>>>>>
> >>>>>>> Returning to your original questions:
> >>>>>>>
> >>>>>>> 1. What defines a redundant message?
> >>>>>>>
> >>>>>>> A message id is generated when a new message is created, so this id can be used to identify a message. The user can also specify a unique business-related property to identify a message.
> >>>>>>>
> >>>>>>> Redundant messages occur when the network is broken or reconnected, when a rebalance[1] is triggered, etc.
> >>>>>>>
> >>>>>>> 2. Is there a timeline on the redundant messages?
> >>>>>>>
> >>>>>>> Yes. Keeping all messages non-redundant is expensive, so let's consider this question within a certain time window ~
> >>>>>>>
> >>>>>>> Looking forward to your design.
> >>>>>>>
> >>>>>>> [1]. https://github.com/apache/rocketmq/blob/master/client/src/main/java/org/apache/rocketmq/client/impl/consumer/RebalanceService.java
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> yukon
> >>>>>>>
> >>>>>>> On Fri, Mar 2, 2018 at 9:31 PM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> @Zhanhui Thanks for the response. This is not a campaign; it's just part of GSoC (https://summerofcode.withgoogle.com/). Community help is gladly welcomed. In fact, it is recommended :)
> >>>>>>>>
> >>>>>>>> @KaiYuan Thanks for your suggestions. I will come up with a flow chart for the proposed solution this weekend.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Sohaib
> >>>>>>>>
> >>>>>>>> On Fri, Mar 2, 2018 at 3:41 AM, Zhanhui Li <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Sohaib,
> >>>>>>>>>
> >>>>>>>>> I have been sort of busy these days.
> >>>>>>>>> Sorry to reply so late!
> >>>>>>>>>
> >>>>>>>>> Not sure what "deadline" you are referring to. If this is part of a campaign, I have to admit I am not aware of the regulations or of what kind of help I should offer while staying fair to other similar issues that may arise.
> >>>>>>>>>
> >>>>>>>>> Regards!
> >>>>>>>>>
> >>>>>>>>> Zhanhui Li
> >>>>>>>>>
> >>>>>>>>> On Mar 1, 2018, at 3:43 AM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi guys,
> >>>>>>>>>>
> >>>>>>>>>> It would be nice to have some feedback on this, as the deadline is not too far :)
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Sohaib
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>
> >>>>>>>>>> -- Man is still the most extraordinary computer of all.--
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Feb 26, 2018 at 10:36 AM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>>>>>> Thank you for the pointers to the code. This was super helpful. The multiple keys could probably be serialized better than by separating them with a space, but that is already legacy, I suppose.
> >>>>>>>>>>
> >>>>>>>>>> Firstly, filters like Bloom or cuckoo are probabilistic. They can help make things faster but definitely cannot be used as the only solution. Hence, in the end, we will still need a persistent keystore/distributed set. My plan was to make this keystore distributed (Raft guarantees etc.). The keystore can also hold a persistent filter on its end. If a broker collapses, it can renew/refresh its filter from the keystore,
> >>>>>>>>>> which eliminates the crash problems you mention. The problem here could be maintaining filter performance when keys are removed from the keystore (e.g. the sliding windows mentioned in my previous mail). Periodic refreshing of the filters can help solve this, but I am open to suggestions on how to make this better.
> >>>>>>>>>>
> >>>>>>>>>> I think implementing a distributed set on the client cluster has its caveats. As I understand RocketMQ, we do not have control over the disk space/memory on the client end, so we probably only have a constant amount. A distributed set on the client would also need to be persistent, e.g. if a client restarts or recovers. This basically means we need a keystore on the client instead of the broker cluster, which probably puts too much responsibility on the client cluster. A different approach would be to ensure that the offsets are always in sync with the broker. Since the broker only serves unique messages (based on the proposed solution on the producer/broker end), all we need to ensure is that a client does not consume messages with the same offset twice.
> >>>>>>>>>>
> >>>>>>>>>> Please suggest improvements if this does not look like the correct approach. It would also be great if someone could come up with a completely different approach so that we can weigh up the pros and cons.
> >>>>>>>>>>
> >>>>>>>>>> Thanks for reading this through and looking forward to your opinions.
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sohaib
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>
> >>>>>>>>>> -- Man is still the most extraordinary computer of all.--
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Feb 26, 2018 at 3:58 AM, Zhanhui Li <[email protected]> wrote:
> >>>>>>>>>> Hi Sohaib,
> >>>>>>>>>>
> >>>>>>>>>> About multiple key support, the following code pointers should clarify your doubt:
> >>>>>>>>>> The org.apache.rocketmq.common.message.Message class has overloaded setKeys methods, allowing you to set multiple keys via a string (separated by space…sorry, we have not yet unified all separators, hoping this does not confuse you) or a collection.
> >>>>>>>>>>
> >>>>>>>>>> When the broker builds the index for a message with multiple keys, multiple index entries are inserted into the index file.
> >>>>>>>>>> See org.apache.rocketmq.store.index.IndexService#buildIndex
> >>>>>>>>>>
> >>>>>>>>>> In terms of eliminating message duplication, personally, I wish we had an exactly-once feature covering the whole cluster and the complete send-store-consume process. A rough idea is to route each message to a broker according to its unique key and a routing rule; the serving broker then ensures uniqueness of the message according to the key (as you said, bloom filter/cuckoo filter, etc.). Things might look simple, but the issues reside in scenarios where the cluster is experiencing membership changes: for example, what if a broker crashes?
> >>>>>>>>>> We might need to propagate the bloom-filter bitset synchronously to other brokers serving the same topics. What if a new broker joins the cluster and starts to serve? I do not mean this is too complex to implement. Instead, this is a pretty interesting topic and a fancy feature to have. Alternatively, we might defer eliminating duplicates to the consumption phase, using some kind of distributed set. For sure, the idea I am proposing suffers from the same challenges, including membership changes.
> >>>>>>>>>>
> >>>>>>>>>> Guys of the dev board, any insights on this issue?
> >>>>>>>>>>
> >>>>>>>>>> Zhanhui Li
> >>>>>>>>>>
> >>>>>>>>>> On Feb 26, 2018, at 2:47 AM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Hi Zhanhui,
> >>>>>>>>>>>
> >>>>>>>>>>> I have a doubt about these multiple keys. If I am wrong in any of the assumptions I make, please point it out.
> >>>>>>>>>>>
> >>>>>>>>>>> If there is support for multiple keys, I cannot see it in the code. The class Message only stores a single key in the property map against the property name "KEYS". Is this done in the same way as tags? That is, are different keys separated with ' || '? So basically, as a user of the producer API, it is the user's responsibility to ensure that the different keys are separated with the correct separator. I can see an obvious problem here. What if the key contains this special character ' || '?
> >>>>>>>>>>> But maybe this event is rare and hence not important. Could you point me to some source/doc that explains this part? I was looking at the index section of rocketmq-store, but I have not been able to understand the indexing process completely for now. I will keep reading the source to get a better idea.
> >>>>>>>>>>>
> >>>>>>>>>>> Moving on to the implementation details. Here is a broad idea of one possible way to approach it.
> >>>>>>>>>>>
> >>>>>>>>>>> The attempt is to remove duplicate messages. In this issue, I would like to aim at eliminating duplicate messages at the producer/broker end. For now, we do not concern ourselves with the duplicate messages caused by unwritten consumer offsets, as these two issues have different solutions.
> >>>>>>>>>>> One way to solve this problem at the producer/broker end could be to have a distributed key store that stores the messages. We can make it configurable such that this distributed store stores all messages or works as a sliding window, keeping only the messages from the last X seconds as specified by the user. We can have a layer on top to check set membership, such as a bloom filter or a cuckoo filter (https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf), to help performance.
> >>>>>>>>>>> Every message pushed in by a producer is checked first against the filter and, in case of a positive result, against this key store. If the message is found, it is discarded. This helps remove duplicates completely from a producer perspective. The core of this idea is the distributed key store, which would be completely separate from the current message storage. Since the concept of a distributed key store or key/value store is not novel, there are two ways to do this:
> >>>>>>>>>>> 1. Implement it ourselves. This would be high effort but bring no external dependencies.
> >>>>>>>>>>> 2. Use a key-value store such as Redis (which already has timeouts and persistence but a large memory footprint) or some other disk-based storage for set membership. This would introduce an external dependency, but development time would be reduced significantly.
> >>>>>>>>>>> I am inclined towards implementing it myself, as this would avoid dependencies on other products, especially since RocketMQ is currently a self-reliant system. In addition, my past experience with building such a store should come in handy.
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to know the opinions of the development community on this approach, and suggestions for improving it. Looking forward to your responses to this.
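
The filter-then-store check described above can be sketched as follows. This is a single-process illustration under stated assumptions, not the proposed distributed design: `SlidingWindowDeduper` and `firstSeen` are hypothetical names, the key store is an in-memory map, the Bloom filter uses two simple hashes derived from `String.hashCode()`, and timestamps passed in are assumed to be non-decreasing.

```java
import java.util.BitSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: reject keys already seen within the last windowMillis. */
class SlidingWindowDeduper {
    private static final int BITS = 1 << 16;
    private final long windowMillis;
    // Insertion-ordered, so the oldest keys sit at the head (timestamps assumed non-decreasing).
    private final LinkedHashMap<String, Long> store = new LinkedHashMap<>();
    private BitSet filter = new BitSet(BITS);

    SlidingWindowDeduper(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Returns true if the key is new within the window, i.e. the message should be kept. */
    synchronized boolean firstSeen(String key, long nowMillis) {
        evictExpired(nowMillis);
        // A negative filter answer is definitive; a positive one must be confirmed in the store.
        if (mightContain(key) && store.containsKey(key)) {
            return false;
        }
        store.put(key, nowMillis);
        addToFilter(filter, key);
        return true;
    }

    private void evictExpired(long nowMillis) {
        boolean removed = false;
        Iterator<Map.Entry<String, Long>> it = store.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMillis - it.next().getValue() < windowMillis) {
                break; // everything after this entry is newer
            }
            it.remove();
            removed = true;
        }
        if (removed) {
            // A plain Bloom filter cannot delete, so rebuild it from the surviving keys.
            BitSet rebuilt = new BitSet(BITS);
            for (String k : store.keySet()) {
                addToFilter(rebuilt, k);
            }
            filter = rebuilt;
        }
    }

    private boolean mightContain(String key) {
        return filter.get(h1(key)) && filter.get(h2(key));
    }

    private static void addToFilter(BitSet bits, String key) {
        bits.set(h1(key));
        bits.set(h2(key));
    }

    // Two cheap, correlated hashes; good enough for illustration only.
    private static int h1(String key) {
        return Math.floorMod(key.hashCode(), BITS);
    }

    private static int h2(String key) {
        return Math.floorMod(key.hashCode() * 31 + 17, BITS);
    }
}
```

Rebuilding the filter on every eviction is the simplest correct choice here but costs time linear in the number of stored keys; a cuckoo filter, which supports deletion, or a less frequent rebuild would avoid that.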
> >>>>>>>>>>>
> >>>>>>>>>>> ====<question unrelated to issue>=====
> >>>>>>>>>>> To increase my familiarity with the code base and to help prove that I am familiar with the tools and technologies in place, it would be great if I could be pointed to some low-effort issues that I could help out with. In case there are no 'newbie' issues available, I could help improve the comments inside the codebase. I noticed some source files with no explanations, which could be documented via comments to help onboard new contributors faster.
> >>>>>>>>>>> ====</question unrelated to issue>=====
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks a lot for reading this through and looking forward to your opinions.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Sohaib
> >>>>>>>>>>>
> >>>>>>>>>>> On Sat, Feb 24, 2018 at 11:50 AM, Zhanhui Li <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Sohaib,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Happy to know you are interested in RocketMQ.
> >>>>>>>>>>>>
> >>>>>>>>>>>> First, let me answer the questions you raised.
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Can there be multiple tags?
> >>>>>>>>>>>> No. At present, the storage engine allows a single tag only. Subscriptions are allowed to use a combination of tags. The current model should meet your business needs. If not, please let us know.
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Keys (similar question to the above)
> >>>>>>>>>>>> RocketMQ builds its index using message keys. A single message may have multiple keys.
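
To make the multiple-keys point concrete, here is a small self-contained sketch of space-separated key serialization and the separator collision raised elsewhere in this thread. `KeyJoiner` is a hypothetical helper written for illustration; it does not call or reproduce RocketMQ's `Message` class.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper mimicking "several keys stored in one space-separated string". */
class KeyJoiner {
    static final String KEY_SEPARATOR = " ";

    /** Joins the keys into the single string that would be stored under one property. */
    static String join(List<String> keys) {
        return String.join(KEY_SEPARATOR, keys);
    }

    /** Splits the stored string back into individual keys. */
    static List<String> split(String joined) {
        return Arrays.asList(joined.split(KEY_SEPARATOR));
    }
}
```

With this scheme, `KeyJoiner.split(KeyJoiner.join(Arrays.asList("order 1", "user-7")))` yields three keys rather than two, because the first key itself contains the separator; callers would have to avoid or escape spaces in keys.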
> >>>>>>>>>>>>
> >>>>>>>>>>>> - About redundant messages
> >>>>>>>>>>>> From my understanding, you are trying to eliminate duplicate messages. True, there are various causes of message duplication, ranging from message delivery to consumption. Discussion on this topic is warmly welcome. If you have any ideas to contribute on this issue, the developer board is happy to discuss them.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Zhanhui Li
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Feb 24, 2018, at 11:17 AM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> My earlier email message seems to have gotten lost, so I will try again. Please see the original message below for the discussion.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>> Sohaib Iftikhar
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -- Man is still the most extraordinary computer of all.--
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Feb 20, 2018 at 1:54 AM, Sohaib Iftikhar <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I am interested in working on this issue (https://issues.apache.org/jira/browse/ROCKETMQ-124) as part of GSoC 2018. I have a few questions about it. I am not sure if this discussion needs to be on the JIRA issue or here. Feel free to correct me if this is the wrong platform.
> >>>>>>>>>>>>>> Also, while I have worked with distributed pub-sub systems, I am still fairly new to RocketMQ, so my understanding of it may be incorrect. I apologise if that is the case and would be happy to stand corrected.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Following are my questions:
> >>>>>>>>>>>>>> 1. What defines a redundant message?
> >>>>>>>>>>>>>> The constructor that I see for a message is as follows:
> >>>>>>>>>>>>>> Message(String topic, String tags, String keys, int flag, byte[] body, boolean waitStoreMsgOK)
> >>>>>>>>>>>>>> Possible candidates to me are topic, tags (can there be multiple tags? I could not find an example for this. If yes, how are they separated?), keys (similar question to the above) and of course the body. Is there something that I have missed here? Is there something that we do not need to consider?
> >>>>>>>>>>>>>> 2. Is there a timeline on the redundant messages? What I mean by this is: is there a time limit after which a message with similar content is allowed? From what I gather, no such thing was mentioned. This would mean storing all the messages. Depending on the requirements, this may or may not be the best solution. It might be desirable that no duplicates are allowed within a certain (sliding) time window.
> >>>>>>>>>>>>>> This allows ignoring duplicate messages that were generated very close to each other (or within the window indicated). Depending on this requirement, the implementation may become a little more involved.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For now, these are the only questions. I have ideas about possible implementations that need review, but I will mention them once the specifications are clear to me. As a final question, I would at some point like to post design ideas to this problem privately to get it reviewed by the development community but not make it publicly available so that it cannot be plagiarised. What platform/method can I use to do that? Or is submitting a draft to the Google platform the only possible way to accomplish this?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks a lot for reading this through and looking forward to your inputs.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Sohaib Iftikhar
