Hi all,

I am trying to run a Kallisto command on an Apache Beam worker. The table below describes my steps in the Apache Beam pipeline code and on a local Debian compute machine (a new machine). I used both for debugging and comparison. On the local machine, the execution completes with no issues. On Apache Beam, the command does not complete and no error is reported, which makes it very challenging to debug.
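Since the failure produces no error, one way to make it visible might be to run the command with a timeout and log everything it prints. The following is only a minimal sketch under some assumptions: the worker is a POSIX system, Python 3.7 or later is available, and `run_logged` and `timeout_s` are hypothetical names introduced here for illustration, not part of Apache Beam or Kallisto.

```python
import logging
import os
import signal
import subprocess


def run_logged(cmd, timeout_s=600):
    """Run a shell command and log its combined stdout/stderr.

    Returns (returncode, output). returncode is None if the command
    was killed because it ran longer than timeout_s seconds.
    Hypothetical helper for illustration; POSIX only.
    """
    proc = subprocess.Popen(
        cmd,
        shell=True,
        stdin=subprocess.DEVNULL,   # the child cannot block waiting for input
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into stdout, as in the original call
        text=True,
        start_new_session=True,     # own process group, so the whole tree can be killed
    )
    try:
        output, _ = proc.communicate(timeout=timeout_s)
        returncode = proc.returncode
    except subprocess.TimeoutExpired:
        # Kill the shell and everything it started, then collect what was printed so far.
        os.killpg(proc.pid, signal.SIGKILL)
        output, _ = proc.communicate()
        returncode = None
        logging.error("Command timed out after %s s: %s", timeout_s, cmd)

    for line in output.splitlines():
        logging.info("subprocess: %s", line)
    if returncode not in (0, None):
        logging.error("Command exited with code %s: %s", returncode, cmd)
    return returncode, output
```

Inside a DoFn this could be called as `run_logged(". {}; conda activate; {}".format(script, cmd6), timeout_s=1800)`, so that a hang shows up as a logged timeout, along with whatever the command printed before it stalled, instead of a silent restart.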
The only issue I am familiar with in the Kallisto package occurs when there is not enough disk space for the input and the output. I have listed the resource settings for the local and remote machines. Please let me know if there is another way to manage the resources.

Thank you,
Eila

Task: resources
  Local:
    n1-standard-8 (8 vCPUs, 30 GB memory)
    60 GB persistent disk
  Apache worker:
    GoogleCloudOptions.disk_size_gb = 60
    GoogleCloudOptions.worker_machine_type = 'n1-standard-4'

Task: anaconda
  Local:
    Created base environment with Kallisto package
  Apache worker:
    Created base environment with Kallisto package

Task: command
  Local:
    from subprocess import Popen, PIPE, STDOUT
    import logging

    script = "/home/eila_orielresearch_org/etc/profile.d/conda.sh"
    cmd1 = ". {}; env".format(script)
    cmd2 = "echo finished kallisto"
    cmd3 = "echo before init"
    cmd4 = "conda init --all"
    cmd5 = "conda activate"
    cmd6 = "kallisto quant -t 2 -i release-99_transcripts.idx --single -l 200 -s 20 -o srr SRR2144345.fastq"
    cmd7 = "conda deactivate"
    final = Popen("{}; {}; {}; {}; {}; {}; {}".format(cmd1, cmd2, cmd3, cmd4, cmd5, cmd6, cmd7),
                  shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
    stdout, nothing = final.communicate()
    stdout
  Apache worker:
    from subprocess import Popen, PIPE, STDOUT
    import logging

    script = "/opt/userowned/etc/profile.d/conda.sh"
    cmd1 = ". {}; env".format(script)
    cmd2 = "echo finished kallisto"
    cmd3 = "echo before init"
    cmd4 = "conda init --all"
    cmd5 = "conda activate"
    cmd6 = "kallisto quant -t 2 -i release-99_transcripts.idx --single -l 200 -s 20 -o srr SRR2144345.fastq"
    cmd7 = "conda deactivate"
    final = Popen("{}; {}; {}; {}; {}; {}; {}".format(cmd1, cmd2, cmd3, cmd4, cmd5, cmd6, cmd7),
                  shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
    stdout, nothing = final.communicate()
    stdout

Task: output
  Local:
    eila_orielresearch_org@instance-1:~/srr$ ls -lt
    total 8548
    -rw-r--r-- 1 eila_orielresearch_org eila_orielresearch_org 2174869 May 11 16:19 abundance.h5
    -rw-r--r-- 1 eila_orielresearch_org eila_orielresearch_org 6570911 May 11 16:19 abundance.tsv
    -rw-r--r-- 1 eila_orielresearch_org eila_orielresearch_org     371 May 11 16:19 run_info.json
  Apache worker:
    No output. Hanging on the yellow command. No error. Restarting DoFn execution.

--
Eila <http://www.orielresearch.com>
Meetup <https://www.meetup.com/Deep-Learning-In-Production/>