My code is not perfect yet. I will write a detailed design document, then implement this feature based on the outcome of our discussion.

------------------ Original Message ------------------
From: "wenhemin" <[email protected]>;
Sent: May 18, 2020 (Monday) 7:50 PM
To: <[email protected]>; "dev" <[email protected]>;
Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow

Thanks for writing the detailed documentation.
I think this is also a missing feature of DS.

About the extension points:
1. Can SSH tasks be merged into shell tasks? Essentially, they all execute shell commands.
2. About the dummy task: DS already has a function to disable nodes. I don't know whether it meets this requirement.

The script from AirFlow to Dolphin is great.

> On May 18, 2020, at 09:28, <[email protected]> wrote:
>
> OK, thanks!
>
> First, I will make sure the jar's license allows use in an open-source project.
>
> Second, I think we need to discuss this in more depth. I have written a more detailed document; please check the attachment. I have also sent the document to DaiLidong.
>
> Third, I will send you the error that occurs when the SSH connection pool is not used.
>
> ------------------ Original Message ------------------
> From: "wenhemin" <[email protected]>;
> Sent: May 14, 2020 (Thursday) 7:26 PM
> To: <[email protected]>;
> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>
> Great!
> I wonder whether SSH tasks can be merged into shell tasks, with local or remote execution of the script configured on the front end.
> About the SSH connection pool: I do not see why a connection pool is necessary.
>
> BTW, looking at the code, it introduces additional jar packages. You also need to make sure their licenses allow use in an open-source project.
>
>> On May 14, 2020, at 16:20, <[email protected]> wrote:
>>
>> 1. The priority between these tasks also depends on the Dolphin DAG definition. The next task is not executed until the preceding task has finished.
>>
>> 2. I added an SSH task as an extension. I also use local params to configure the SSH host, user, and password.
>>
>> E.g.:
>> public static AbstractTask newTask(TaskExecutionContext taskExecutionContext, Logger logger)
>>         throws IllegalArgumentException {
>>     // A task whose params contain "enable": false is replaced by a DummyTask.
>>     Boolean enable = JSONUtils.parseObject(taskExecutionContext.getTaskParams()).getBoolean("enable");
>>     if (Boolean.FALSE.equals(enable)) {
>>         return new DummyTask(taskExecutionContext, logger);
>>     }
>>     switch (EnumUtils.getEnum(TaskType.class, taskExecutionContext.getTaskType())) {
>>         case SHELL:
>>             return new ShellTask(taskExecutionContext, logger);
>>         case PROCEDURE:
>>             return new ProcedureTask(taskExecutionContext, logger);
>>         case SQL:
>>             return new SqlTask(taskExecutionContext, logger);
>>         case MR:
>>             return new MapReduceTask(taskExecutionContext, logger);
>>         case SPARK:
>>             return new SparkTask(taskExecutionContext, logger);
>>         case FLINK:
>>             return new FlinkTask(taskExecutionContext, logger);
>>         case PYTHON:
>>             return new PythonTask(taskExecutionContext, logger);
>>         case HTTP:
>>             return new HttpTask(taskExecutionContext, logger);
>>         case DATAX:
>>             return new DataxTask(taskExecutionContext, logger);
>>         case SQOOP:
>>             return new SqoopTask(taskExecutionContext, logger);
>>         case SSH:
>>             // New task type added for this feature.
>>             return new SSHTask(taskExecutionContext, logger);
>>         default:
>>             logger.error("unsupport task type: {}", taskExecutionContext.getTaskType());
>>             throw new IllegalArgumentException("not support task type");
>>     }
>> }
>>
>> 3. I am not sure whether it supports Windows.
>>
>> ------------------ Original Message ------------------
>> From: "wenhemin" <[email protected]>;
>> Sent: May 14, 2020 (Thursday) 3:46 PM
>> To: <[email protected]>;
>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>
>> Sorry, my previous description was not very clear.
>>
>> I want to ask some questions:
>> 1. How is the priority between SSH tasks controlled? Some SSH tasks may already be waiting for execution.
>> 2. As I understand it, the problem you want to solve is executing remote SSH scripts in batches.
>> So I am not sure how this function would be used.
>> 3. I don't know whether this supports Windows.
>>
>>> On May 13, 2020, at 20:56, <[email protected]> wrote:
>>>
>>> I use a spin lock. Here is my code. Of course, it's not perfect; I only ran a test. To my surprise, the execution result is the same as AirFlow's.
>>>
>>> ------------------ Original Message ------------------
>>> From: "whm_777" <[email protected]>;
>>> Sent: May 13, 2020 (Wednesday) 7:21 PM
>>> To: <[email protected]>;
>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>>
>>> You can raise the maximum number of SSH connections on Linux.
>>> If you use an SSH connection pool, how do you control SSH priority?
>>>
>>>> On May 13, 2020, at 18:01, <[email protected]> wrote:
>>>>
>>>> First, thanks!
>>>>
>>>> I use more than 100 task nodes, but SSH connections are limited.
>>>>
>>>> ------------------ Original Message ------------------
>>>> From: "whm_777" <[email protected]>;
>>>> Sent: May 13, 2020 (Wednesday) 5:50 PM
>>>> To: <[email protected]>;
>>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>>>
>>>> E.g.
>>>> rtn_code=`ssh -o ServerAliveInterval=60 -p xxxx [email protected] 'shell command >/dev/null 2>&1; echo $?'`
>>>> if [ "$rtn_code" -eq 0 ]; then
>>>>     echo "success"
>>>>     exit 0
>>>> else
>>>>     echo "failure"
>>>>     exit 1
>>>> fi
>>>>
>>>> Batch shell commands are not supported.
>>>> Multiple servers can be split into multiple task nodes.
>>>>
>>>>> On May 13, 2020, at 17:40, <[email protected]> wrote:
>>>>>
>>>>> Could you give me an example? Thanks.
>>>>>
>>>>> By the way, I have more than 100 tasks in one DAG. These tasks connect to two other servers to execute, so the SSH tasks need a pool to manage the connections. For now I use JSch and have implemented a simple pool.
>>>>>
>>>>> ------------------ Original Message ------------------
>>>>> From: "wenhemin" <[email protected]>;
>>>>> Sent: May 13, 2020 (Wednesday) 5:24 PM
>>>>> To: "dev" <[email protected]>;
>>>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>>>>
>>>>> The shell node supports remote calls and can get the result code of the remote command.
>>>>>
>>>>> > On May 13, 2020, at 15:16, <[email protected]> wrote:
>>>>> >
>>>>> > Dear all,
>>>>> >
>>>>> > Support Linux SSH Task
>>>>> >
>>>>> > For example, in my project, the workflow's tasks need to execute shell scripts located in different directories on different servers. When a worker executes these shell scripts, it must log in to these servers as the same user. The worker can also get the execution state from these servers. We can configure each server's host, user, and password.
>>>>> >
>>>>> > SSH Task is very useful for most users
>>>>> >
>>>>> > In DolphinScheduler, most tasks are executed on servers that are not workers. These servers each run their own fixed services. We only need to pass different parameters to schedule these shell scripts for execution.
>>>>> >
>>>>> > Python has a module to execute SSH scripts
>>>>> >
>>>>> > Python has a module that can execute SSH shell scripts: paramiko.
>>>>> >
>>>>> > Others
>>>>> >
>>>>> > I found this described in a previous feature request, but the description was relatively simple.
>>>>> > Feature URL
>>>>> >
>>>>> > In addition, it is very inconvenient for me to run remote tasks through the Shell Task. Here is my script. I don't know if there's a better way.
>>>>> > sshpass -p 'password' ssh user@host echo 'ssh success' echo 'Hello World' -> /home/dolphinscheduler/test/hello.txt echo 'end'
>>>>> >
>>>>> > Support dummy task like airflow
>>>>> >
>>>>> > For example, in my project, there is a productized DAG file.
>>>>> > The file contains different modules, some of which depend on each other and some of which do not. When customers purchase different modules, we need to set some tasks as dummy tasks: those that belong to modules that were not purchased and that the purchased modules do not depend on. With this setting, the dummy tasks are not actually executed. The benefits of this setup are product unity and diagram integrity. In Airflow, such tasks are run by the DummyOperator.
>>>>> >
>>>>> > **Implementation**
>>>>> >
>>>>> > A dummy task is easy to implement, but it needs to work together with the other task types. When a task's execution type is set to dummy, the task is executed as a dummy task and the real task is not executed.
>>>>> >
>>>>> > By the way, I have already implemented these two features in my fork branch. Can they be supported in a follow-up release?
>>>>>
>>>>
>>>
>>> <SSHClient.java><SSHPool.java><SSHTask.java>
>>
>
>
> <??????????????Dolphin????????????.pdf>
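
A minimal, standalone sketch of the dummy-task dispatch discussed in this thread. It does not use DolphinScheduler: `Task`, `RealTask`, `DummyTask`, `newTask`, and `runChain` are hypothetical stand-ins, not the project's classes. The only thing taken from the thread is the rule that a task whose `enable` flag is explicitly false is replaced by a no-op task, while a missing flag leaves the task enabled.

```java
import java.util.ArrayList;
import java.util.List;

public class DummyTaskSketch {

    /** Hypothetical stand-in for a scheduler task; not DolphinScheduler's AbstractTask. */
    interface Task {
        void run(List<String> log);
    }

    /** A task that would do real work; here it only records that it ran. */
    static final class RealTask implements Task {
        private final String name;
        RealTask(String name) { this.name = name; }
        @Override public void run(List<String> log) { log.add("ran " + name); }
    }

    /** A disabled task: completes immediately without doing the real work. */
    static final class DummyTask implements Task {
        private final String name;
        DummyTask(String name) { this.name = name; }
        @Override public void run(List<String> log) { log.add("skipped " + name); }
    }

    /** Only an explicit false disables the task; null (flag absent) keeps it enabled. */
    static Task newTask(String name, Boolean enable) {
        if (Boolean.FALSE.equals(enable)) {
            return new DummyTask(name);
        }
        return new RealTask(name);
    }

    /** Runs a linear chain of tasks in order and returns a log of what happened. */
    static List<String> runChain(String[] names, Boolean[] enabled) {
        List<String> log = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            newTask(names[i], enabled[i]).run(log);
        }
        return log;
    }

    public static void main(String[] args) {
        String[] names = {"extract", "module_b", "report"};
        Boolean[] enabled = {true, false, null}; // null: flag absent, task stays enabled
        System.out.println(runChain(names, enabled));
        // prints [ran extract, skipped module_b, ran report]
    }
}
```

Because the dummy task still takes its place in the chain, downstream tasks run as usual and the DAG keeps the same shape for every customer, which is the point made about product unity above.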
