Re: [galaxy-dev] python egg cache exists error

2012-09-24 Thread Nate Coraor
For Test/Main, I have the user's ~/.bash_profile set $PYTHON_EGG_CACHE on a per-node basis. This could also be done per-node and per-pty to ensure uniqueness per job. --nate On Sep 18, 2012, at 11:24 AM, James Taylor wrote: Interesting. If I'm reading this correctly the problem is happening
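
A minimal sketch of the same idea done from Python rather than from ~/.bash_profile, assuming it runs before pkg_resources extracts any egg. The helper names and the "egg_cache" directory layout are hypothetical, and it is written for Python 3 (not the Python 2 of the original thread). It gives each process on each node its own cache directory:

```python
import os
import socket
import tempfile


def unique_egg_cache(base_dir=None, hostname=None, pid=None):
    """Return an egg-cache path that is unique per host and per process.

    The directory layout is an assumption made for illustration only.
    """
    base_dir = base_dir or tempfile.gettempdir()
    hostname = hostname or socket.gethostname()
    pid = os.getpid() if pid is None else pid
    return os.path.join(base_dir, "egg_cache", hostname, str(pid))


def use_unique_egg_cache(base_dir=None):
    """Create the directory and export PYTHON_EGG_CACHE for this process."""
    path = unique_egg_cache(base_dir)
    os.makedirs(path, exist_ok=True)
    # pkg_resources consults this environment variable when it needs a
    # place to unpack zipped eggs.
    os.environ["PYTHON_EGG_CACHE"] = path
    return path
```

A per-process cache trades disk space and repeated extraction for isolation, so a per-node setting may be preferable when only one job runs per node at a time.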

Re: [galaxy-dev] python egg cache exists error

2012-09-19 Thread Jorrit Boekel
For completeness, here are two tracebacks (there were more similar ones) from the same job: /mnt/galaxyData/tmp/job_working_directory/000/75/task_4: Traceback (most recent call last): File "./scripts/extract_dataset_part.py", line 25, in <module> import galaxy.model.mapping #need to load this

Re: [galaxy-dev] python egg cache exists error

2012-09-19 Thread Jorrit Boekel
I added this snippet to the top of my extract_dataset_part.py: pkg_resources.require("simplejson") # wait until this process' PID is the first PID of all processes with the same name, then import while True: with os.popen("ps ax|grep extract_dataset_part.py |grep -v grep|awk '{print $1}'")

Re: [galaxy-dev] python egg cache exists error

2012-09-18 Thread Jorrit Boekel
Hi again, I have looked into this matter a little more, and it looks like this is what happens:
- the tasked job is split
- task commands are sent to the workers (I am running 8-core High-CPU Extra Large workers on EC2)
- per task, the worker runs env.sh for the respective tool
- per task, worker

Re: [galaxy-dev] python egg cache exists error

2012-09-18 Thread James Taylor
Interesting. If I'm reading this correctly, the problem is happening inside pkg_resources? (galaxy.eggs unzips eggs, but I think it does so at install [fetch_eggs] time, not at run time, which would avoid this.) If so, this would seem to be a locking bug in pkg_resources. Dannon, we could put a guard
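
One possible shape for such a guard, sketched under assumptions: this is not code from Galaxy or pkg_resources, and the name import_guard and the lock file location are hypothetical. It uses an exclusive advisory lock from the standard fcntl module (Unix only) so that only one process at a time runs the guarded block:

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def import_guard(lock_path=None):
    """Hold an exclusive advisory lock for the duration of the with-block."""
    # Hypothetical default location; any path visible to all competing
    # processes on the node would do.
    lock_path = lock_path or os.path.join(tempfile.gettempdir(), "egg_import.lock")
    with open(lock_path, "w") as lock_file:
        # Blocks until no other process holds the lock on this file.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            yield lock_path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

The egg-backed import would then go inside `with import_guard():`. The lock only serializes processes that open the same lock file, and flock may not behave reliably on network filesystems, so a node-local path is the safer assumption.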

[galaxy-dev] python egg cache exists error

2012-09-14 Thread Jorrit Boekel
Dear list, I am running galaxy-dist on Amazon EC2 through Cloudman, and am using enable_tasked_jobs to run jobs in parallel. Yes, I know it's not recommended in production. My jobs usually get split into 72 parts, and sometimes (but not always, maybe in 30-50% of cases), errors are