[ https://issues.apache.org/jira/browse/BEAM-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ismaël Mejía reassigned BEAM-9007:
----------------------------------

    Assignee:     (was: Aizhamal Nurmamat kyzy)

> beam.DoFn setup() will call several times when using python subprocess
> ----------------------------------------------------------------------
>
>                 Key: BEAM-9007
>                 URL: https://issues.apache.org/jira/browse/BEAM-9007
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.15.0, 2.16.0
>         Environment: python 3.5
> apache-beam[gcp] == 2.16.*
> google-cloud-storage == 1.23.*
> google-resumable-media == 0.5.*
> googleapis-common-protos == 1.6.*
> grpc-google-logging-v2 == 0.11.*
>            Reporter: Hokuto Tateyama
>            Priority: Minor
>
> Hello.
> I'm trying to run a make command on Dataflow to build the OpenCV source, which
> is written in C++.
> I assumed that the *setup()* function of *beam.DoFn* runs only once, before
> processing starts.
> So I ran the build commands from the setup() function, and they ran
> successfully.
> h1. Problem
> After processing runs, the setup() function runs again and re-executes the
> build commands several times. I confirmed this in my Stackdriver logs.
> h1. Code
> This is the code I run on Dataflow. I define command_list in the class that
> inherits from beam.DoFn and call run_cmd() from setup().
> ・Runs the command lines.
> {code:python}
> import logging
> import subprocess
> from typing import Any, Dict, List
>
>
> def run_cmd(command_list: List[List[str]], shell: bool = False) -> List[Dict[str, Any]]:
>     """Run each command in order and collect its combined stdout/stderr."""
>     outputs = []
>     try:
>         for cmd in command_list:
>             logging.info(cmd)
>             proc = subprocess.check_output(
>                 cmd, shell=shell, stderr=subprocess.STDOUT,
>                 universal_newlines=True)
>             outputs.append({"Input: ": cmd, "Output: ": proc})
>     except subprocess.CalledProcessError as e:
>         # The first failing command ends the loop; later commands are skipped.
>         logging.warning("Return code:{}, Output:{}".format(e.returncode, e.output))
>     return outputs
> {code}
> ・The command list passed to run_cmd().
> {code:python}
> command_list = [
>     ["cat /etc/issue"],
>     ["apt-get --assume-yes update"],
>     [
>         "apt-get --assume-yes install --no-install-recommends ffmpeg git software-properties-common"
>     ],
>     ["apt-get install -y software-properties-common"],
>     [
>         'add-apt-repository -s "deb http://security.ubuntu.com/ubuntu bionic-security main"'
>     ],
>     [
>         "apt-get install -y build-essential checkinstall cmake unzip pkg-config yasm unzip"
>     ],
>     ["apt-get -y install git gfortran python3-dev"],
>     [
>         "apt-get -y install libjpeg62-turbo-dev libpng-dev libpng16-16 libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev libxine2-dev libv4l-dev"
>     ],
>     ["apt-get -y install libjpeg-dev libpng-dev libtiff-dev libtbb-dev"],
>     [
>         "apt-get -y install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev libatlas-base-dev libxvidcore-dev libx264-dev libgtk-3-dev"
>     ],
>     ["apt-get clean"],
>     ["rm -rf /var/lib/apt/lists/*"],
>     ["git clone https://github.com/opencv/opencv.git"],
>     ["git clone https://github.com/opencv/opencv_contrib.git"],
>     ["cd opencv_contrib"],
>     ["git checkout -b 3.4.3 refs/tags/3.4.3"],
>     ["cd ../opencv/"],
>     ["git checkout -b 3.4.3 refs/tags/3.4.3"],
>     ["mkdir build"],
>     ["cd build"],
>     [
>         "cmake -D CMAKE_BUILD_TYPE=Release \
>         -D CMAKE_INSTALL_PREFIX=/usr/local \
>         -D WITH_TBB=ON \
>         -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib/modules .."
>     ],
>     ["make -j8"],
>     ["make install"],
>     ["echo /usr/local/lib > /etc/ld.so.conf.d/opencv.conf"],
>     ["ldconfig -v"]
> ]
> {code}
> h1. Question
> In summary, I'm wondering whether this is a bug in Apache Beam.
> # Why is setup() called several times?
> # Is there a way to run these commands only once for the whole run? These are
> the approaches I tried:
> ## Using os.system() instead of subprocess. I think subprocess creates another
> process in setup(), so it cannot detect that the process finished
> successfully.
> ## Writing the commands in setup.py and running them as a CustomCommand
> [https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/]
>
> Regards, Collonville



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
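
On question 2, one possible approach is to make the build idempotent instead of relying on how often setup() runs. As far as I understand, Beam calls setup() once per DoFn instance, and a runner may create several instances on the same worker, which would be consistent with the repeated builds in the logs. The sketch below is not a Beam API: `run_once_per_machine` and the marker path are hypothetical names, and it assumes a POSIX worker where `fcntl.flock` is available. It takes an exclusive file lock, runs the build only if a marker file is missing, and writes the marker only after the build returns without raising.

```python
import fcntl
import os
from typing import Callable


def run_once_per_machine(marker_path: str, build: Callable[[], None]) -> bool:
    """Run build() unless marker_path already exists.

    Returns True if build() ran in this call, False if it was skipped.
    Hypothetical helper for illustration; not part of apache_beam.
    """
    lock_path = marker_path + ".lock"
    with open(lock_path, "w") as lock_file:
        # Block until no other process on this machine holds the lock,
        # so concurrent setup() calls cannot start the build twice.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if os.path.exists(marker_path):
                return False
            build()
            # Reached only if build() did not raise.
            with open(marker_path, "w") as marker:
                marker.write("done\n")
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Inside setup(), this might be called as `run_once_per_machine("/tmp/opencv_build.done", lambda: run_cmd(command_list, shell=True))`, where the path is only an example. Two caveats: the guard deduplicates within one machine, so every new worker VM would still build once; and run_cmd() as posted catches `CalledProcessError`, so the build callback would need to raise on failure for the marker to be skipped. Separately, each `cd` entry in command_list runs in its own subprocess, so the directory change does not carry over to the next command; passing `cwd=` to `subprocess.check_output` is the usual way to run a command in a given directory.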