I use Pig with several PigServer instances inside a multi-threaded, long-lived
daemon process. I allocate a new PigServer dedicated to each single-threaded
job and shut it down after the job completes. The daemons spend their time
processing jobs, so I think I'm pretty close to your use case.

There are still some thread-safety bugs due to static variables that are not
protected by ThreadLocal. I've patched SchemaTupleFrontend and
SchemaTupleBackend to fix this; the bug appears even if you don't activate the
schema code generation feature. SchemaTupleFrontend and SchemaTupleBackend
also "leak" files: they create temporary files and folders with delete-on-exit
set, but when Pig runs in daemons that live for ages, those files are never
deleted. I say "leak" because this approach is fine when Pig is used in batch
mode (the JVM starts, Pig runs a few script jobs, the JVM shuts down). With
long-lived daemons the JVM never shuts down, so delete-on-exit is never
triggered. I ended up with inode saturation of my /tmp folder because of this.
I have a patch for it too, but I haven't submitted it to the community yet:
I'm having a hard time setting up a proper Pig dev environment, so I cannot be
sure I'm not breaking something elsewhere.
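
As an illustration only (this is not the patch mentioned above and not Pig
code), the delete-on-exit problem can be avoided by tying temporary files to
the lifetime of a job rather than to the lifetime of the JVM. The sketch below
assumes a hypothetical helper class, JobScratchDir, and a hypothetical
runOneJob method; the file names are made up as well. It uses only the
standard java.nio.file API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of Pig: a per-job scratch directory that is
// removed when the job ends instead of when the JVM exits.
final class JobScratchDir implements AutoCloseable {
    private final Path dir;

    JobScratchDir(String prefix) throws IOException {
        // Deliberately no File.deleteOnExit(): in a daemon that hook never runs.
        this.dir = Files.createTempDirectory(prefix);
    }

    Path path() {
        return dir;
    }

    @Override
    public void close() throws IOException {
        // Reverse order visits children before their parent directories,
        // so every directory is empty by the time it is deleted.
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.deleteIfExists(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    // Simulates one job: writes some "generated" files, then cleans up.
    // Returns the directory that was used so a caller can check it is gone.
    static Path runOneJob() {
        Path used;
        try (JobScratchDir scratch = new JobScratchDir("job-scratch-")) {
            used = scratch.path();
            Files.createFile(used.resolve("Generated.java"));
            Path classes = Files.createDirectories(used.resolve("classes"));
            Files.createFile(classes.resolve("Generated.class"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return used;
    }

    public static void main(String[] args) {
        Path p = runOneJob();
        System.out.println(p + " still exists: " + Files.exists(p));
    }
}
```

With try-with-resources the directory is removed as soon as the job finishes,
whether it succeeds or fails, so the number of inodes in use stays bounded by
the number of jobs running at the same time.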

In conclusion: running many threads in long-lived daemons will trigger two
problems with the current Pig version: inode saturation and thread-safety
issues. The inode saturation problem also exists in the JarManager class when
dealing with the PIG UDL auto-generated jar file (I have a patch for this one
too). I will check JIRA to see whether these bugs have already been filed and
propose my patches. Even though I cannot really test them with Pig's unit
tests, they have been running in production on my side for months without a
glitch, so maybe that's enough testing to push the patches to the community :)
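
For readers unfamiliar with the thread-safety side, here is a minimal sketch
of the general technique of replacing a shared static variable with a
ThreadLocal. It is not the actual change made to SchemaTupleFrontend or
SchemaTupleBackend; PerThreadState, runJob and runConcurrently are
hypothetical names used purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a class that keeps per-job state in a static field.
final class PerThreadState {
    // Each thread gets its own StringBuilder instead of sharing one static
    // instance, so concurrent jobs cannot overwrite each other's state.
    private static final ThreadLocal<StringBuilder> CURRENT_JOB =
            ThreadLocal.withInitial(StringBuilder::new);

    static String runJob(String jobId) {
        StringBuilder state = CURRENT_JOB.get();
        try {
            state.append(jobId);
            Thread.yield(); // give other threads a chance to interleave
            state.append(":done");
            return state.toString();
        } finally {
            // Pool threads are reused, so clear the slot when the job ends.
            CURRENT_JOB.remove();
        }
    }

    // Runs the given number of jobs on a small pool and returns their
    // results in submission order.
    static List<String> runConcurrently(int jobs) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < jobs; i++) {
                final String id = "job-" + i;
                Callable<String> task = () -> runJob(id);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(8));
    }
}
```

The remove() call in the finally block matters in a daemon: worker threads
live as long as the process, so anything left in a ThreadLocal would otherwise
be carried over into the next job that runs on the same thread.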

-----Original Message-----
From: Xuefu Zhang [mailto:xzh...@cloudera.com]
Sent: Tuesday, April 28, 2015 17:35
To: dev@pig.apache.org
Cc: user
Subject: Re: many users can use one pigserver?

PigServer has state, which isn't meant to be shared by multiple user sessions. 
On the other hand, PIG-1784 made PigServer thread-safe, so depending on your
version, you may choose to have multiple instances in your Tomcat, one for
each user session.

On Tue, Apr 28, 2015 at 1:51 AM, 李运田 <cumt...@163.com> wrote:

> I have a big web ETL system which uses Pig as its ETL tool. When I start
> Tomcat, I also start one PigServer, and all users share that single
> PigServer. When these users use the system, could there be some unknown
> errors?
