Camel & MongoDB guidance

Ron Cecchini Tue, 05 Feb 2019 11:59:54 -0800

Hi, guys.

This isn't a burning emergency.  I was just looking for some Camel-specific 
design guidance / feedback if anyone has any.  I'm also really new to Mongo, so 
I have some questions about it that aren't specific to Camel.


As usual, my actual question is relatively short, but I spend a lot of words 
getting there....  Don't waste too much time reading this.  Seriously.  Only 
read this if you have nothing else to do!  If you know of some code examples 
that sort of do what I describe below, feel free to just point me in that 
direction instead of addressing the numerous individual questions.

Thank you SO MUCH in advance for anyone who makes it to the end of this 
message... And THANK YOU for your suggestions!

----------

To date, all of my Camel routes have been pretty straightforward:

1. read from JMS (RabbitMQ) or UDP
2. do something to the data in a Processor
3. write to JMS

And this pattern has worked well for the half-dozen things I've been tasked to 
do!

All the logic has been in the Processor, which sometimes includes a Producer to 
write to a particular route endpoint if need be.

So far, I have not implemented any logic in my Java DSL routes.

But I would like to change that and start experimenting and playing with 
putting more (or as much as possible) in the route.

----------

My latest task is again super-simple, but this time includes a Mongo data store 
with 2 collections, "history" and "current".  The steps are:

1. read a Google protobuf from JMS
2. pull out an ID from the protobuf
3. store the document in a Mongo "history" collection using the ID as the "_id"
   (the "history" collection is basically a record of everything off the wire)
4. try to find the ID in another Mongo collection, "current"; i.e. a document
   with "_id" = ID
   a. if there is a document, grab it and "merge" it with the current protobuf
      (the "merge" details are irrelevant), and pass the result along
   b. if a document was not found, then use the document that was just written
      to "history" in step 3
5. store the document from step 4 (which is either a new or "merged" document)
   to "current"
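
The steps above can be sketched without Camel or Mongo at all, which may help separate the control flow from the plumbing. This is only a minimal in-memory sketch: the class name HistoryCurrentSketch, the use of plain Strings for documents, and the caller-supplied merge function are all assumptions for illustration, not part of any Camel or MongoDB API. One difference worth noting: a HashMap silently overwrites a repeated key, whereas a MongoDB insert with an already-used "_id" fails with a duplicate-key error.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical in-memory stand-in for the "history" and "current" collections.
final class HistoryCurrentSketch {
    final Map<String, String> history = new HashMap<>();
    final Map<String, String> current = new HashMap<>();

    // (existing document, incoming document) -> merged document
    private final BinaryOperator<String> merge;

    HistoryCurrentSketch(BinaryOperator<String> merge) {
        this.merge = merge;
    }

    // Runs steps 3-5 for one message and returns what ends up in "current".
    String handle(String id, String doc) {
        // Step 3: record the message. A Map overwrites on a repeated key;
        // a Mongo insert with a duplicate "_id" would be rejected instead.
        history.put(id, doc);

        // Step 4: look the ID up in "current".
        String existing = current.get(id);

        // Step 4a merges with what was found; step 4b falls back to the new document.
        String result = (existing == null) ? doc : merge.apply(existing, doc);

        // Step 5: store the result in "current".
        current.put(id, result);
        return result;
    }
}
```

With a toy merge such as (a, b) -> a + "+" + b, the first message for an ID passes through unchanged and the second one is merged with the first.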

----------

I have a basic route successfully running and populating the "history" 
collection in Mongo with the messages I read from the JMS.

I was in the process of fleshing out the Processor to do step 4; i.e. checking 
Mongo to see if it already knows this ID, and doing the "merge", etc.

But I decided to take a step back and see how much of this can actually be done 
in the route.

----------

First of all, say this is the protobuf I get from JMS:

my_proto {
  ...
  details {
    my_id: "123456780-big-long-string-123456789"
  }
}

I convert the above protobuf to JSON, and then embed it in a new JSON, and 
store this in Mongo:

  {
    "_id"   : <my_id>,
    "proto" : <my_proto serialized to JSON>
  }
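
As a rough illustration of that wrapper, here is a sketch that builds it as a nested Map. It assumes the protobuf has already been turned into a Map<String, Object> (for example by parsing its JSON form); the class and method names are made up for this example. The MongoDB Java driver's org.bson.Document is itself a Map<String, Object>, so the same shape should carry over, but that step is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of Camel or the Mongo driver.
final class WrapperSketch {
    // Builds { "_id": <my_id>, "proto": <message as a nested map> }.
    static Map<String, Object> wrap(String myId, Map<String, Object> protoAsMap) {
        // LinkedHashMap keeps "_id" first, matching the layout shown above.
        Map<String, Object> wrapper = new LinkedHashMap<>();
        wrapper.put("_id", myId);
        wrapper.put("proto", protoAsMap);
        return wrapper;
    }
}
```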

----------

My initial brain-dead way of doing things was starting to look like this:

    from("rabbitmq: ...")
         .unmarshal(protoFormat)
         .process(new ProtobufMongoProcessor());

    from("direct:history")
         .to("mongodb:mongodbBean?database=mydatabase&collection=history&operation=insert");

    from("direct:current")
         .to("mongodb:mongodbBean?database=mydatabase&collection=current&operation=insert");

My ProtobufMongoProcessor() does:

1. convert the protobuf to JSON and get the ID
2. create another JSON "wrapper" document with the original message embedded in
   it
3. use a Producer to write the document to "direct:history" to store it in the
   "history" collection
4. use Mongo client calls to search for the ID in the "current" collection
   if a document is found:
       pull out the "proto", merge the data, create a new JSON wrapper document
   else:
       use the wrapper document from step 2
5. use a Producer to write the document to "direct:current" to store it in the
   "current" collection

----------

But I know I can do better.  I'm mixing Mongo client calls in the Processor
and then writing to Camel routes, etc.

This can be cleaned up.  For example, maybe a Processor really isn't necessary 
for anything other than doing a "merge"?

If that's the case, then I'd like to simplify things as part of this learning 
lesson.

The initial questions I have are:

1. I know Mongo doesn't need to store things in JSON format, but would you?  Or 
would you just store the Protobuf?

2. If you convert it to JSON, would you create the "wrapper" JSON with the 
original message embedded in it?  I.e. are there pros and cons to embedding 
the original message, or should it just be plopped into Mongo with "_id" set to 
the ID?

3. Since the ID is already in the protobuf, and thus you can find documents by 
that ID, would you even bother setting the "_id", or would you just let Mongo 
generate the "_id"?  Are there pros and cons?

4. If you *don't* set "_id", can you search/filter on a field that is itself 
"embedded"; i.e. not at the top-level?  In my case, the ID is in a sub-field 
("details"), so what would that "query filter" document look like?  Does Mongo 
index all levels of a document?
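
As a sketch for question 4: MongoDB query filters can address embedded fields with dot notation, so with the wrapper layout shown earlier the filter would look like { "proto.details.my_id" : "<the ID>" }. Only "_id" is indexed automatically; any other field, nested or not, needs an explicitly created index. The code below is not driver code. It is a hypothetical plain-Java helper that only illustrates how a dotted path walks a nested document, and it ignores the real matching rules for arrays and for null versus missing fields.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration of dot-notation paths; not MongoDB's matcher.
final class DotPathSketch {
    // Walks nested maps along a dotted path such as "proto.details.my_id".
    // Returns null if any segment along the way is missing or is not a map.
    static Object resolve(Map<String, Object> doc, String dottedPath) {
        Object node = doc;
        for (String key : dottedPath.split("\\.")) {
            if (!(node instanceof Map)) {
                return null;
            }
            node = ((Map<?, ?>) node).get(key);
        }
        return node;
    }

    // True when every dotted path in the filter resolves to an equal value.
    static boolean matches(Map<String, Object> doc, Map<String, Object> filter) {
        for (Map.Entry<String, Object> entry : filter.entrySet()) {
            if (!Objects.equals(resolve(doc, entry.getKey()), entry.getValue())) {
                return false;
            }
        }
        return true;
    }
}
```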

5. Is it possible to do any kind of Mongo search ("filter") if the original 
document was stored in Protobuf format instead of JSON?

----------

Finally, could I ultimately do everything in a simple route?  (except for any 
"merge", which would need a Processor.)

Say I don't have to convert the protobuf to JSON.  And say I *do* want to set 
the "_id".

Can I have something like:

    from("rabbitmq: ...")
         .unmarshal(protoFormat)
         // get the ID from the protobuf and set the "_id" and then plop the
         // protobuf into the "history" collection
         // XXX: how to access my_proto.details.my_id ?
         .setBody().constant("{ \"_id\" : ${SOMETHING} }")
         .to("mongodb:mongodbBean?database=mydatabase&collection=history&operation=insert")
         // XXX: somehow save the current Exchange/Body
         // now search the "current" collection for the ID
         // XXX: is "_id" still set?
         .to("mongodb:mongodbBean?database=mydatabase&collection=current&operation=findById")
         // now have some kind of logic depending on whether or not a document
         // was found
         .choice()
             .when(body().isNotNull())
                 // XXX: how to pass both this new Exchange/Body and the
                 // saved Exchange/Body??
                 .process(new MergeProcessor())
             .otherwise()
                 // XXX: nothing found in Mongo.  set the current Exchange/Body
                 // to the saved Exchange/Body
         .end()
         // XXX: are save & insert the same here?
         .to("mongodb:mongodbBean?database=mydatabase&collection=current&operation=save");

----------

THANK YOU again to anyone who has read this far.
