Dear piler-users,

I've been working on the multitenancy feature, along with a scaling
option where the archived emails are spread across several worker nodes.

It involves 3 tasks: getting

#1: the matching sphinx indices
#2: metadata from the mysql database
#3: the actual email or attachments

#1 is done with distributed indices, and #2 seems to be working in my
3-node test lab (although a few details are not finished yet). However,
#3 is a bit challenging.

Basically I can see two paths:

a) The gui connects to the proper worker node via mysql and nfs, and
retrieves the data

b) Each worker node has an api, e.g. http://node[1,2,...]/api/method[1,2,...],
and this api performs all tasks required by the gui node, e.g.
"give me the metadata for these serial ids", or "give me message 1234", etc.

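To make method b) a bit more concrete, here is a rough sketch of how the
gui side might build such requests. It is only an illustration: the node
names, the "metadata" and "message" method names and the query parameters
are all made up, since the real api is not defined yet.

```python
from urllib.parse import urlencode

# Hypothetical worker hostnames; the real naming scheme is not decided.
WORKERS = {1: "node1", 2: "node2", 3: "node3"}


def api_url(node, method, **params):
    """Build the URL the gui would request on a given worker node."""
    url = "http://%s/api/%s" % (WORKERS[node], method)
    query = urlencode(sorted(params.items()))
    return url + ("?" + query if query else "")


def metadata_url(node, serial_ids):
    # "give me the metadata for these serial ids" (method name is assumed)
    ids = ",".join(str(i) for i in serial_ids)
    return api_url(node, "metadata", ids=ids)


def message_url(node, message_id):
    # "give me message 1234" (method name is assumed)
    return api_url(node, "message", id=message_id)
```

For example, message_url(2, 1234) gives http://node2/api/message?id=1234.
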
Method a) requires non-trivial mysql connection pooling, and mounting
the worker nodes via nfs. It may work, but I'm not sure it scales
effectively on the gui side.

So I think I'll try method b) and write a quick and dirty minimal api
to do these tasks. Unfortunately it requires a webserver (I'm betting on
nginx), and you have to restrict access to the api at the iptables and/or
nginx level, so that only the gui can reach the api URI. The worker nodes
can't really authenticate the request, apart from checking that a simple,
fixed code is present in it.
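
The fixed code check on the worker side could look roughly like the sketch
below. The code value, the gui address and the function name are
placeholders, and the address test is only a second line of defence behind
the iptables/nginx rules, not a replacement for them.

```python
import hmac

# Placeholder values, to be replaced per installation.
API_CODE = "change-me"            # the simple, fixed code shared with the gui
GUI_ADDRESSES = {"10.0.0.10"}     # hypothetical address of the gui node


def is_allowed(remote_addr, presented_code):
    """Return True only for requests from the gui that carry the fixed code."""
    if remote_addr not in GUI_ADDRESSES:
        return False
    # compare_digest avoids leaking the code through timing differences
    return hmac.compare_digest(presented_code.encode(), API_CODE.encode())
```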

Method b) gives you more flexibility, and with an appropriate (and
probably changing) worker node mapping it may provide HA and better
performance as well.
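
A minimal sketch of what such a worker node mapping might look like on the
gui side, assuming (and this is only an assumption) that every tenant's
mail is kept on more than one worker. The tenant names, the node names and
the is_up callback are hypothetical.

```python
# Hypothetical mapping: tenant -> worker nodes holding a copy, in order
# of preference. In practice this would live in a config file or database
# so that it can change without touching the gui code.
NODE_MAP = {
    "tenant-a": ["node1", "node2"],
    "tenant-b": ["node3", "node1"],
}


def pick_node(tenant, is_up):
    """Return the first reachable worker node for the given tenant."""
    for node in NODE_MAP[tenant]:
        if is_up(node):
            return node
    raise LookupError("no worker node available for %s" % tenant)
```

If node1 is down, pick_node("tenant-a", ...) falls back to node2, which is
where the HA part would come from.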

When this feature is completed, I'll release the first publicly available
version of the piler enterprise edition (EE) binary packages.

Best regards,
Janos
