On 20/03/13 09:26, Ferran Jorba wrote:
Hello Lars,

I think there should be no problem in your case. On your host, you
would install one RabbitMQ server (the broker) - on the broker you
would create 1 RabbitMQ virtual host per Apache virtual host. For each
Invenio installation you would start 1 worker.
Great, good to know it's so simple!
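
As a rough illustration of the "one RabbitMQ virtual host per Apache virtual host" layout described above, the sketch below builds one AMQP broker URL per installation. The site names, user names and passwords are hypothetical placeholders, not values from this thread. On the broker side, the matching virtual hosts and users would typically be created with `rabbitmqctl add_vhost`, `rabbitmqctl add_user` and `rabbitmqctl set_permissions`.

```python
from urllib.parse import quote


def broker_url(vhost, user, password, host="localhost", port=5672):
    """Build an AMQP URL pointing at a single RabbitMQ virtual host."""
    # Every component is percent-encoded so that a vhost such as "/"
    # (or a password with special characters) stays unambiguous.
    return "amqp://{user}:{password}@{host}:{port}/{vhost}".format(
        user=quote(user, safe=""),
        password=quote(password, safe=""),
        host=host,
        port=port,
        vhost=quote(vhost, safe=""),
    )


# Hypothetical installations sharing one host: one RabbitMQ vhost each.
INSTALLATIONS = {
    "site-a": {"user": "invenio_a", "password": "secret-a"},
    "site-b": {"user": "invenio_b", "password": "secret-b"},
}

# Each installation's worker and web application would be configured
# with its own URL, e.g. Celery("site-a", broker=BROKER_URLS["site-a"]).
BROKER_URLS = {
    name: broker_url(name, cfg["user"], cfg["password"])
    for name, cfg in INSTALLATIONS.items()
}
```

Because each installation only ever sees its own virtual host, tasks queued by one site cannot be picked up by another site's worker.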

[...]
Do you install each Invenio installation in virtual environments? If
not, this might be the only issue, however I think at most a
worker-start-script per Invenio installation would need to be created.
No, we are not using virtual environments; I'm trying to keep it as flat
as possible, and I'm not using any extra layer that I don't really need.

OK, what we will have to do then is install a "start worker script" into <path to some invenio>/bin, which will ensure that the right worker is started.
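
A minimal sketch of what such a start script might look like, written here in Python. It is not the script that was eventually shipped. Two things are assumptions: that the installation's Python modules live under `<prefix>/lib/python`, and the name of the Celery application module, which is passed in by the caller because the thread does not name it. The `celery -A <module> worker --loglevel=<level>` command line is the standard Celery worker invocation.

```python
import os


def worker_command(app_module, loglevel="info"):
    """Return the argv for a Celery worker serving one installation."""
    # app_module is a placeholder supplied by the caller; the real
    # module name for an Invenio installation is not given here.
    return ["celery", "-A", app_module, "worker", "--loglevel=" + loglevel]


def worker_env(prefix, base_env=None):
    """Return an environment in which this installation's code is found first."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the installation keeps its modules in <prefix>/lib/python.
    lib_dir = os.path.join(prefix, "lib", "python")
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = lib_dir if not existing else lib_dir + os.pathsep + existing
    return env


def start_worker(prefix, app_module):
    """Replace the current process with the worker; does not return."""
    argv = worker_command(app_module)
    os.execvpe(argv[0], argv, worker_env(prefix))
```

A per-installation script would then only need to call `start_worker()` with its own prefix and module name, so each installation starts the worker bound to its own code without relying on a virtual environment.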

Alternatively, we are also thinking of a "lite"-solution, so you won't
even need to install a broker (RabbitMQ) and start the Celery
workers. Celery has a flag so that it can run tasks synchronously
instead of asynchronously (so the lite version would seem slower, but
still do the job in the end).
In which situations does it «seem slower»?  In the end-user front-end or
in the librarians'/systems back-end?

Anywhere a Celery task would be called, front-end or back-end. However, a Celery task would normally only be used for a request that takes a long time.

One example would be an export of, say, 2000 records in format X. With a Celery task, we can give an instant reply to the user, saying that we are working on it and will send an email once the result is ready (as when you download a folder from Google Docs). With a synchronous task, the end user has to wait for a reply from the server until the export is done.

Another example could be WebDeposit: the user uploads a file, and a Celery task extracts metadata from the PDF in the background while the user starts entering information in the deposit form. With a synchronous task, the user has to wait.

So in most cases you definitely do want to run Celery, but for small installations without big requirements the synchronous mode can be an easy way to get up and running.

A distributed task queue is one of the most effective ways of improving the responsiveness of an application: do in the background everything that the system needs done but that the user doesn't have to wait for.
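
To make the difference concrete, here is a small self-contained sketch of the export example. It does not use Celery: a standard-library thread pool stands in for the broker and worker, and `export_records` is a made-up placeholder that just sleeps. The synchronous handler corresponds to the "lite" mode (presumably Celery's `CELERY_ALWAYS_EAGER` setting, though the flag is not named above), while the asynchronous handler corresponds to handing the task to a worker.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def export_records(record_ids, fmt):
    """Stand-in for a slow export; sleeps instead of doing real work."""
    time.sleep(0.2)
    return ["{0}:{1}".format(fmt, rid) for rid in record_ids]


def handle_request_sync(record_ids, fmt):
    # "Lite" mode: the caller is blocked until the export has finished.
    return {"status": "done", "result": export_records(record_ids, fmt)}


# One background thread plays the role of the Celery worker.
_executor = ThreadPoolExecutor(max_workers=1)


def handle_request_async(record_ids, fmt):
    # Queue mode: hand the work off and answer the user immediately.
    future = _executor.submit(export_records, record_ids, fmt)
    return {"status": "accepted", "future": future}


ids = list(range(2000))

start = time.perf_counter()
sync_reply = handle_request_sync(ids, "X")
sync_wait = time.perf_counter() - start

start = time.perf_counter()
async_reply = handle_request_async(ids, "X")
async_wait = time.perf_counter() - start

# The result is collected later, e.g. when the notification email is sent.
async_result = async_reply["future"].result()
```

Both paths produce the same export; the only difference is how long the user waits for the first reply (`sync_wait` versus `async_wait`).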

Cheers,
Lars


Currently there's an overlap between bibsched and Celery, and we
haven't completely sorted out what goes where. For now, bibsched is
still the master of bibupload and friends. In the short term it seems
most natural that Celery would take over bibtasklets + new
territories. In the long run, we'll have to gain some experience
first.
Sure.  Thanks for answering so fast,

Ferran

Cheers,
Lars

On 20/03/13 08:48, Ferran Jorba wrote:
Hello Lars,

I've finished initial integration of Invenio in Celery for next:
http://invenio-software.org/repo/personal/invenio-lnielsen/commit/?h=next-celery&id=6d09ef545f03edfa6d7c77cd3a2447873b16c87e

It basically follows what we discussed in DevForum
(https://invenio-software.org/wiki/Tools/Celery/InvenioIntegration). Take
a look if you have a minute and let me know if there are issues.
[...]

Yes, please, I have a question: at UAB, we have more than one Invenio
installation in the same host, installed as plain users (not root nor
www-data), and served by Apache (specifically, apache-itk) with virtual
hosts.  Will this celery integration be compatible with our setup?

Thanks,

Ferran

PS And congratulations on your zenodo branch.  It looks gorgeous!  We
     are constantly looking at it for inspiration, and we are taking some
     ideas for our forthcoming 1.1 upgrade.


--
Lars Holm Nielsen
Software Engineer

CERN, IT Department, Digital Library Technology Section
Office 513/1-014
Tel: +41 22 76 79182
Cel: +41 76 672 8927

