Dear piler-users,
I've recently been working on the multitenancy feature, along
with a scaling option where emails can span several worker nodes.
It involves 3 tasks: getting
#1: the matching sphinx indices
#2: metadata from the mysql database
#3: the actual email or attachments
#1 is done with distributed indices, and #2 seems to be working in my
3-node test lab (although a few details are not finished yet); however,
#3 is a bit challenging.
Basically I can see two paths:
a) The gui connects to the proper worker node via mysql and nfs, and
retrieves the data
b) Each worker node has an api, eg.
http://node[1,2,...]/api/method[1,2,...]
and this api performs all tasks required by the gui node, eg.
"give me the metadata for these serial ids", or "give me message 1234",
etc.
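Just to illustrate what I mean (the endpoint paths, parameter names and
the token header below are made-up examples, not an actual piler api),
the gui-side calls for the two methods above might be built like this:

```python
# Hypothetical client helpers for the worker node api sketched above.
# The /api/metadata and /api/message paths, the "ids" parameter and the
# X-Piler-Token header are assumptions for illustration only.
from urllib.parse import urlencode

API_TOKEN = "fixed-secret-code"  # the "simple and fixed code" in each request

def auth_headers():
    """Headers every request to a worker node would carry."""
    return {"X-Piler-Token": API_TOKEN}

def metadata_url(node, serial_ids):
    """Build the URL asking a worker node for metadata of some serial ids."""
    query = urlencode({"ids": ",".join(str(s) for s in serial_ids)})
    return "http://%s/api/metadata?%s" % (node, query)

def message_url(node, serial_id):
    """Build the URL asking a worker node for a single message."""
    return "http://%s/api/message/%d" % (node, serial_id)

print(metadata_url("node1", [1234, 1235]))
print(message_url("node2", 1234))
```

The gui node would fan these requests out to the worker nodes and merge
the results, instead of talking to each node's mysql and nfs directly.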
Method a) requires non-trivial mysql connection pooling and mounting
the worker nodes via nfs. It may work, but I'm not sure it scales
effectively on the gui side.
So I think I'll try method b) and write a quick and dirty minimal api
to do these tasks. Unfortunately it requires a webserver (I'd bet on
nginx), and you have to restrict access to the api at the iptables
and/or nginx level, since the worker nodes can't really authenticate
the request (beyond requiring a simple fixed code in each request) -
so only the gui node should be able to reach the api URI.
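The nginx-level restriction could be as simple as an allow/deny rule
on the api location (the 10.0.0.10 gui address, the /api/ prefix and
the backend port below are example values, not piler defaults):

```nginx
# Only the gui node may reach the worker api; everyone else gets 403.
location /api/ {
    allow 10.0.0.10;   # gui node (example address)
    deny  all;
    # hand the request to whatever serves the api on this worker node
    proxy_pass http://127.0.0.1:8080;
}
```

An equivalent iptables rule on the worker node would drop port 80/443
traffic from any source other than the gui node's address.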
Method b) gives you more flexibility, and with an appropriate (and
probably changing) worker node mapping it may provide HA and better
performance as well.
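One simple scheme for such a worker node mapping is a deterministic
hash of the serial id; this is just a sketch of the idea (the node
list and the hash-based scheme are my illustration, not necessarily
how piler EE will assign messages to nodes):

```python
# Map a message's serial id to the worker node that holds it.
# A deterministic mapping lets the gui compute the target node without
# a lookup; changing the node list changes the mapping, which is why
# the mapping would "probably change" as nodes are added or removed.
import hashlib

NODES = ["node1", "node2", "node3"]  # example worker nodes

def node_for(serial_id, nodes=NODES):
    """Pick a worker node for a serial id, deterministically."""
    h = int(hashlib.sha1(str(serial_id).encode()).hexdigest(), 16)
    return nodes[h % len(nodes)]

print(node_for(1234))  # same serial id always maps to the same node
```

For HA the gui could fall back to the next node in the list when the
mapped node is down, assuming the data is replicated accordingly.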
When this feature is completed, I'll release the first publicly
available version of the piler enterprise edition (EE) binary packages.
Best regards,
Janos