Egor Pakhomov created SPARK-3714:
------------------------------------

             Summary: Spark workflow scheduler
                 Key: SPARK-3714
                 URL: https://issues.apache.org/jira/browse/SPARK-3714
             Project: Spark
          Issue Type: New Feature
          Components: Project Infra
            Reporter: Egor Pakhomov
            Priority: Minor


[Design doc|https://docs.google.com/document/d/1q2Q8Ux-6uAkH7wtLJpc3jz-GfrDEjlbWlXtf20hvguk/edit?usp=sharing]
The Spark stack is currently hard to use in production processes because it 
lacks the following features:

* Scheduling Spark jobs
* Retrying a failed Spark job in a large pipeline
* Sharing a context among the jobs in a pipeline
* Queuing jobs
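
To make the list above more concrete, here is a rough sketch of a job queue 
whose jobs share one context and are retried on failure. It is plain Python, 
not Spark code: Job, run_queue and max_retries are hypothetical names, and the 
shared dict only stands in for a shared Spark context. Scheduling (deciding 
when the queue starts) is left out.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, Dict, List

# Assumption: a plain dict stands in for a shared Spark context plus cached data.
Context = Dict[str, Any]


@dataclass
class Job:
    name: str
    run: Callable[[Context], None]  # reads from and writes to the shared context
    max_retries: int = 2            # extra attempts allowed after the first failure


def run_queue(jobs: List[Job], context: Context) -> List[str]:
    """Run jobs in FIFO order, retrying each failed job up to max_retries times.

    Returns the names of the completed jobs. If a job still fails after all
    retries, its exception is re-raised and the remaining jobs are not run.
    """
    queue: Deque[Job] = deque(jobs)
    completed: List[str] = []
    while queue:
        job = queue.popleft()
        for attempt in range(job.max_retries + 1):
            try:
                job.run(context)
                completed.append(job.name)
                break
            except Exception:
                # Give up only when the last allowed attempt has failed.
                if attempt == job.max_retries:
                    raise
    return completed
```

Because every job receives the same context object, a later job can pick up 
what an earlier one produced without reloading it, which is the point of 
sharing a context in a pipeline.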

A typical use case for such a platform would be: wait for new data, process 
it, train ML models on it, compare the new model with the previous one, and, 
if the comparison succeeds, replace the current production model in its HDFS 
directory with the new one.
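
The use case above could look roughly like the following sketch of a single 
pipeline run. It assumes that "success" means the new model scores strictly 
higher than the previous one on the new data. All names (run_once, process, 
train, evaluate, store) are hypothetical, the dict named store stands in for 
the HDFS directory of the production model, and the "wait for new data" step 
is left to the caller.

```python
from typing import Any, Callable, Dict


def run_once(
    new_data: Any,
    process: Callable[[Any], Any],
    train: Callable[[Any], Any],
    evaluate: Callable[[Any, Any], float],
    store: Dict[str, Any],
) -> bool:
    """Process new data, train a candidate model, and promote it if it wins.

    `store` stands in for the HDFS directory holding the production model.
    Returns True if the production model was replaced.
    """
    prepared = process(new_data)
    candidate = train(prepared)
    current = store.get("production_model")
    candidate_score = evaluate(candidate, prepared)
    # Assumption: with no previous model, the candidate is promoted as is;
    # otherwise it must score strictly higher than the current model.
    if current is None or candidate_score > evaluate(current, prepared):
        store["production_model"] = candidate
        return True
    return False
```

A workflow scheduler would call something like run_once each time new data 
arrives, and would retry the failed step instead of the whole run.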




