Message-oriented architectures & 4D

David Adams Tue, 06 Sep 2016 10:36:46 -0700

I thought I'd start another thread about message-oriented architectures in
4D. This is something I've been doing one way or another for about 30 years
(aaaah!) and is a particular interest of mine. It also intersects and
overlaps with some of my other passions. 4D V16 sounds like it will help a
whole lot more 4D developers to start to think about breaking tasks up
suitably for message/queue-based processing. Good news.


Thomas Maul gave a great presentation about this the other day to the
4DMethod user group, already mentioned recently. It sounds like the new
CALL WORKER command enables you to push a method call from one process to a
worker process. From what I understood, you're always calling workers on
the same machine. I've long used processors on remote machines for a
variety of reasons, so I want to offer some suggestions for people before
they get started. If you ever get to the point where you need to extend
what you're doing with CALL WORKER, a few simple tricks can make the
transition easier.

Before that, some opinions about who should be worrying about what. I'm
getting the impression that the cooperative vs. pre-emptive thread
discussion in the 4D world is becoming One of Those Things. For most
developers on most applications I suspect that the discussion *is
completely irrelevant.* Here's the deal: modern versions of 4D Server on
modern hardware can easily handle decent loads without any fancy tricks.
I'm used to breaking up large processing tasks into bits for accelerated
processing and, well, most tasks aren't like that. Some are a good match,
most aren't easy or sensible to split up.

Anyway, I don't know what the distribution curve is on 4D apps relative to
tapping out 4D Server...but I suspect that the far right side, "4D Server
is struggling and we've done everything sensible" is a small minority.
(I've worked on only a couple of such systems.) It's an important minority,
but most of us won't ever deal with it...or deal with it often. So, if
you're not in that group, worrying about threading is probably more about
entertainment value and interest than any practical requirement. If
you're in the group that's tapping out 4D Server and you're feeling
nervous...you need to be moving towards a distributed architecture anyway.

Yeah, you can't cluster 4D Server...so there is an absolute limit on how
far you can go. Sometimes people farm out the database portion to
MySQL/MariaDB/SQL Server/PostgreSQL/Etc., but that is always a fairly
painful rewrite. Another approach is to slice off time-consuming processing
tasks that aren't heavily database-bound. That's easier. What kind of
tasks? That depends on the app, but things like document processing, logs,
complex calculations, interacting with remote services, etc. It's really
based on what your app does.

So, back to CALL WORKER. As I understand it, it's all happening on a single
machine. That's great for lots and lots of stuff, but it doesn't really
help with scaling or getting work off of 4D or 4D Server. Instead, you need
some kind of messaging system to send jobs to another program/machine. The
other program can be a compiled-and-merged 4D app (sweet!), PHP (meh?),
Python, Ruby, Language du Jour...whatever. So long as you can send along
enough detail for the remote app to do the job, you're good as gold. If
you're interested in some of the issues/features/trade-offs in messaging
systems, I recommend reading these 'tutorials':

https://www.rabbitmq.com/getstarted.html

These articles also give you a very nice overview of different messaging
patterns. The right pattern depends on your task. For people stepping
outside of 4D to an existing messaging infrastructure, Amazon's Simple
Queue Service (SQS) is the most obvious choice. Similar but a bit
different is IronMQ. RabbitMQ is less likely but has an appealing
architecture (persistent TCP connections rather than HTTP with/without
"long-polling".) In any case, all of these services have more in common
than they don't.

But back to 4D and CALL WORKER...and the tips about setting things up from
day one to make it easier to switch to/add on support for SQS (or some
other service). Simple:

* Make the only parameter a message in JSON format.

* In each and every message, include three pieces of meta-data:

-- A message signature/name. Like "Calculate_Nightly_Report" or
"Dispatch_Nightly_Report."

-- A version number. Eh? What if you change the job? If you have a rogue
client out there using an old version, what do you do? How do you even
know? With a version number, your client code can test that it knows how to
process the message it's receiving, fall back to handle the old format,
write an error when getting an out-of-date message, etc. This is all super
easy to do if you think to include versions from the start.

-- A unique ID. I'd use a UUID. You may never need this but you might. If
you end up having to trace/log messages through the system, having a static
ID for each message can be a life-saver.
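
To make the envelope concrete, here is a minimal sketch in Python (one of the client languages mentioned above). The field names `name`, `version`, `id`, and `body`, the `SUPPORTED_VERSIONS` table, and both helper functions are assumptions made up for illustration; nothing here is defined by 4D or CALL WORKER.

```python
import json
import uuid

# Versions of each message type this (hypothetical) client can process.
SUPPORTED_VERSIONS = {"Calculate_Nightly_Report": {1, 2}}


def build_message(name, version, body):
    """Wrap a job in an envelope carrying the three pieces of metadata."""
    envelope = {
        "name": name,             # message signature
        "version": version,       # format version of this message type
        "id": str(uuid.uuid4()),  # unique ID for tracing/logging
        "body": body,
    }
    return json.dumps(envelope)


def check_message(raw):
    """Parse a message and refuse names or versions this client doesn't know."""
    message = json.loads(raw)
    known = SUPPORTED_VERSIONS.get(message["name"])
    if known is None:
        raise ValueError("unknown message: " + message["name"])
    if message["version"] not in known:
        raise ValueError(
            "unsupported version %s of %s" % (message["version"], message["name"])
        )
    return message
```

A receiver would call `check_message` first and only then hand the message to whatever code does the job. Raising an error is just one option; falling back to an older handler or logging the out-of-date message would slot into the same place.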

As far as the message body goes, there are a lot of things that you can do.
You may be able to pack everything that you need into the JSON or you might
need/want to store the data elsewhere. That could be in 4D, a remote
database, or a file somewhere. In that case, the body should include a
resource reference that your clients know how to interpret and resolve.
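
As a rough sketch of the inline-versus-reference idea, assuming a body that carries either a `data` key or a `resource` key (both names are invented here, as is the `file` reference type), a Python client might resolve it like this:

```python
import json


def resolve_body(body):
    """Return the job data, whether it is inline or behind a reference."""
    if "data" in body:
        # Everything needed was packed into the message itself.
        return body["data"]
    ref = body["resource"]
    if ref["type"] == "file":
        # The message only points at the data; go and fetch it.
        with open(ref["path"], "r", encoding="utf-8") as handle:
            return json.load(handle)
    raise ValueError("unknown resource type: " + ref["type"])
```

Other reference types (a record ID, a URL) would be further branches; the point is only that every client agrees on how to read the reference.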

I keep saying "clients" because when you're building out a system like
this, you may very well want to end up including code written in something
other than 4D. There are tons of scripts and services that use other
languages that are cheaper and easier to use than rewriting everything in
4D. So long as they can understand JSON (and everything does), you're just
going to have to get your message adapter sorted out and you're good to go.
No more one-server limit.
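
One way to picture the "message adapter" is as a thin layer with a single `send` operation, so the code that creates jobs never knows where they go. The sketch below is in Python with invented class names. `RecordingTransport` is only a placeholder for a remote adapter; a real one would call the client library of whichever service you pick.

```python
import json
import queue


class LocalTransport:
    """Deliver messages to an in-process queue (stand-in for a local worker)."""

    def __init__(self):
        self.inbox = queue.Queue()

    def send(self, raw):
        self.inbox.put(raw)


class RecordingTransport:
    """Placeholder for a remote adapter; it just remembers what was sent."""

    def __init__(self):
        self.sent = []

    def send(self, raw):
        self.sent.append(raw)


def dispatch(transport, message):
    """Serialize the message once; the transport decides where it goes."""
    transport.send(json.dumps(message))
```

Because both transports take the same JSON string, swapping one for the other doesn't touch the code that builds the messages.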

Right, shifting topics again. What I've just outlined is the sort of stuff
that matters for big systems...and I already said that I suspect that most
4D projects don't have those sorts of problems to solve. But can they still
benefit from message-oriented architectures? 100% yes. Once you get started
thinking this way, you'll find good uses in every program you build,
single-user or multi-user. Below are a few categories of examples:

Efficient Use of a Single Resource
I'm into log files and I'm not alone...Anyway, a hassle with log files (or
any other shared-write file) is this typical cycle:

Wait
Lock
Write
Release

Ugh. And then you're dealing with the file system. Which, for me, is like
turning on the light in the bathroom at night and seeing a giant spider.
(In Aus, we have Huntsmen...they're huge, they're pitch black, and they
wave their pincers at you and hiss. Nasty looking.) I hate dealing with the
file system more than I have to. If you have to do this over and over from
multiple processes or machines, it's just stupidly inefficient. Also, if
you go for the kind of error-checking drama typical of my code, you end up
with screens of code to do something really simple. Boring. This is a
perfect example of where a messaging system helps. Instead of
wait-lock-write-release, you have *one* and only one process with the file.
It's the only one that writes so it opens it up (or creates it), writes and
closes on shutdown. No fighting for or waiting on locks. All of the other
processes send messages to the writer process. The change in throughput is
staggering, as you would intuit. CALL WORKER is fine for this if you're on
one machine. Heck, if you want to log some things live to screen, you could
even use CALL FORM too, why not?
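
The single-writer pattern looks roughly like this in Python, using a thread and an in-memory queue as stand-ins for a 4D worker process and its message box. The function names and the `_STOP` sentinel are assumptions for the sake of the example.

```python
import queue
import threading

_STOP = object()  # sentinel telling the writer to shut down


def start_log_writer(path):
    """Start the one thread that owns the log file; return (inbox, thread)."""
    inbox = queue.Queue()

    def run():
        # Open once, write every line that arrives, close on shutdown.
        with open(path, "a", encoding="utf-8") as log:
            while True:
                line = inbox.get()
                if line is _STOP:
                    break
                log.write(line + "\n")

    writer = threading.Thread(target=run)
    writer.start()
    return inbox, writer


def stop_log_writer(inbox, writer):
    """Let the writer finish the queued lines, then wait for it to exit."""
    inbox.put(_STOP)
    writer.join()
```

Every other thread just calls `inbox.put("something happened")` and carries on. Nobody except the writer ever touches the file, so there is no wait-lock-write-release cycle.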

Shared Use of a Limited Resource
Let's say you've got some kind of physical or logical limitation on a
resource. Perhaps it's a piece of lab equipment, a special printer, some
software licensed to a particular CPU, or something locked to a specific IP
address. And let's say that you want everyone in the office to have access.
Boom, messaging and you're there. Run a background process on
the machine with physical/license access and push jobs to that one machine.
Easy as.

Background Tasks
I love this stuff. I'm a big believer in pre-processing data for reports
and various sorts of intelligence generation. You don't need people to
manage this stuff...just program background processes to do it at night (or
whenever you like.) A lot of 4D systems seem to store vast stacks of data
and then do nothing more with it. Why not track trends, create useful
graphs, highlight oddities, integrity check the data, look for possible
user-entry errors, etc. Part of the problem is probably that 4D's native
reporting tools kind of suck...but there are lots of options there now.
Yeah, grind through your data looking for orphan records, out-of-balance
accounts, etc. so that people can jump on problems quickly.

You Want More Cores? I'll Give you More Cores
People are the world's slowest, most error-prone computer peripherals.
Computers must really get tired of us. No wonder they take us out milliseconds after
the Singularity ;-0 Anyway, if you're using 4D Server and have a bunch of
remote machines connected, they've got processors. You can use them during
downtime or in the background. Why not?

Anyway, those are some obvious examples that I've used in 4D down the years
regularly. My go-to approach for an all-4D solution is to use a table with
records. Each task has a status and various other bits of data. In one of
those high-stress systems mentioned earlier, all 4D records can prove
unworkable...but they're great for ordinary systems. And, if you use the
JSON format mentioned earlier, you can move jobs to CALL WORKER or
even an outside program as your system evolves. For those of you who use
IP arrays and the like up on the server...think again. 4D's a database!
Instead of writing your own lock manager with semaphores, use the record
lock built into 4D. You get it for free. Another nice thing about records
is that you can make them persist across shutdowns (if you like), view them
easily, etc.
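
For anyone who wants to see the shape of a record-based queue, here is a small sketch using Python and SQLite purely as a stand-in for a 4D table. The table layout, status values, and function names are all assumptions. SQLite has no per-record lock like 4D's, so the conditional UPDATE plays that role here.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Create the jobs table if needed and return the connection."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " message TEXT NOT NULL)"
    )
    return db


def add_job(db, message_json):
    """Store one JSON message as a pending job and return its record ID."""
    with db:
        cur = db.execute("INSERT INTO jobs (message) VALUES (?)", (message_json,))
    return cur.lastrowid


def claim_next_job(db):
    """Claim the oldest pending job, or return None if there is none."""
    with db:
        row = db.execute(
            "SELECT id, message FROM jobs WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Only succeeds if the job is still pending, so two workers
        # cannot both claim the same record.
        claimed = db.execute(
            "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'pending'",
            (row[0],),
        )
        if claimed.rowcount == 0:
            return None  # another worker got there first
    return row


def finish_job(db, job_id):
    """Mark a claimed job as done."""
    with db:
        db.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))
```

The `message` column holds the same JSON envelope described earlier, which is what lets a job move from a record to another transport later without being rewritten.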

Anyway, a bit of a shotgun message here...not supremely well-organized,
I'll grant. If anyone is interested in talking more, on-list or off, let me
know.