Egor Pakhomov created SPARK-3714:
------------------------------------

Summary: Spark workflow scheduler
Key: SPARK-3714
URL: https://issues.apache.org/jira/browse/SPARK-3714
Project: Spark
Issue Type: New Feature
Components: Project Infra
Reporter: Egor Pakhomov
Priority: Minor

[Design doc | https://docs.google.com/document/d/1q2Q8Ux-6uAkH7wtLJpc3jz-GfrDEjlbWlXtf20hvguk/edit?usp=sharing]

The Spark stack is currently hard to use in production processes because it lacks the following features:
* Scheduling Spark jobs
* Retrying a failed Spark job in a large pipeline
* Sharing a context among the jobs in a pipeline
* Queuing jobs

A typical use case for such a platform would be: wait for new data, process the new data, train ML models on it, compare the new model with the previous one, and, if the comparison succeeds, replace the current production model in its HDFS directory with the new one.
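
To make the use case concrete, here is a rough sketch of a sequential runner that queues steps, retries a failed step, and passes one shared context between steps. It is only an illustration and is not taken from the design doc. All names (`run_with_retry`, `run_pipeline`, `demo`, the context keys) are hypothetical, and the steps are plain Python functions standing in for real Spark jobs; the "model" is just a number, and promotion updates a dictionary entry instead of an HDFS directory.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]          # shared state passed to every step
Step = Callable[[Context], None]  # a step reads and updates the context


def run_with_retry(step: Step, ctx: Context, max_attempts: int = 3,
                   delay_s: float = 0.0) -> int:
    """Run one step, retrying on any exception; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            step(ctx)
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise  # out of retries: fail the whole pipeline
            time.sleep(delay_s)
    raise ValueError("max_attempts must be at least 1")


def run_pipeline(steps: List[Tuple[str, Step]], ctx: Context,
                 max_attempts: int = 3) -> Dict[str, int]:
    """Run queued steps in order with a shared context."""
    attempts: Dict[str, int] = {}
    for name, step in steps:
        attempts[name] = run_with_retry(step, ctx, max_attempts)
    return attempts


def demo() -> Tuple[Context, Dict[str, int]]:
    """Hypothetical pipeline: new data -> process -> train -> compare/promote."""
    calls = {"train": 0}

    def wait_for_data(ctx: Context) -> None:
        ctx["data"] = [1.0, 2.0, 3.0, 4.0]  # stand-in for newly arrived data

    def process(ctx: Context) -> None:
        ctx["clean"] = [x for x in ctx["data"] if x > 1.0]

    def train(ctx: Context) -> None:
        calls["train"] += 1
        if calls["train"] == 1:
            raise RuntimeError("simulated transient failure")
        # Stand-in for a model quality score.
        ctx["new_score"] = sum(ctx["clean"]) / len(ctx["clean"])

    def promote_if_better(ctx: Context) -> None:
        ctx["promoted"] = ctx["new_score"] > ctx["current_score"]
        if ctx["promoted"]:
            ctx["current_score"] = ctx["new_score"]

    ctx: Context = {"current_score": 2.5}
    steps = [("wait", wait_for_data), ("process", process),
             ("train", train), ("promote", promote_if_better)]
    return ctx, run_pipeline(steps, ctx)
```

Calling `demo()` runs the four steps in order; the simulated failure in `train` is retried once before the comparison step runs. A real scheduler would additionally need time- or data-based triggers and a Spark context shared across jobs, which this sketch does not attempt.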