Today I found an answer in this thread [1] which seems worth
exploring.
Michael, if you are reading this, it would be helpful if you could share
more about your Spark deployment in production.
Thanks and Regards
Noorul
[1]
http://apache-spark-user-list.1001560.n3.nabble.com/How-do-you-run-your-spark-app-tp7935p7958.html
Noorul Islam K M noo...@noorul.com writes:
Hi all,
We have a cloud application, to which we are adding a reporting service.
For this we have narrowed down to use Cassandra + Spark for data store
and processing respectively.
Since the cloud application is separate from the Cassandra + Spark
deployment, what is the ideal method for the application to interact
with the Spark master?
We have been evaluating spark-job-server [1], which is a RESTful layer
on top of Spark.
Are there other such tools, or better approaches that we could explore?
We are evaluating the following requirements against spark-job-server:
1. Provide a platform for applications to submit jobs
2. Provide RESTful APIs using which applications will interact with the
server
- Upload jar for running jobs
- Submit job
- Get job list
- Get job status
- Get job result
3. Provide support for kill/restart job
- Kill job
- Restart job
4. Support job priority
5. Queue up job submissions if resources are not available
6. Troubleshoot job execution
- Failure – job logs
- Measure performance
7. Manage cluster deployment
- Bootstrap, scale up/down (add, remove, replace nodes)
8. Monitor cluster deployment
- Health report: report metrics – CPU, memory, # of jobs, Spark
processes
- Alert DevOps about threshold limit of these metrics
- Alert DevOps about job failures
- Self healing?
9. Security
- AAA (authentication, authorization, accounting) for job submissions
10. High availability/Redundancy
- This is for the spark-jobserver component itself
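To make requirement 2 concrete, here is a minimal sketch of how an
application might call spark-job-server's REST API. The endpoint paths
(/jars/<appName>, /jobs?appName=...&classPath=..., /jobs/<jobId>) follow
the spark-job-server README; the JobServerClient wrapper itself is a
hypothetical helper written for illustration, not part of the project.

```python
import urllib.parse

class JobServerClient:
    """Builds request URLs for a spark-job-server instance.

    Assumes the server listens on its default port, 8090.
    The caller would POST/GET these URLs with any HTTP client.
    """

    def __init__(self, base_url="http://localhost:8090"):
        self.base_url = base_url.rstrip("/")

    def upload_jar_url(self, app_name):
        # POST the jar bytes to /jars/<appName> to register an app
        return f"{self.base_url}/jars/{app_name}"

    def submit_job_url(self, app_name, class_path, sync=False):
        # POST to /jobs?appName=...&classPath=... starts a job;
        # passing sync=true makes the call block until the result is ready
        params = {"appName": app_name, "classPath": class_path}
        if sync:
            params["sync"] = "true"
        return f"{self.base_url}/jobs?" + urllib.parse.urlencode(params)

    def job_status_url(self, job_id):
        # GET /jobs/<jobId> returns the job's status and, once
        # finished, its result
        return f"{self.base_url}/jobs/{job_id}"

client = JobServerClient()
print(client.submit_job_url("myapp", "com.example.WordCount"))
# → http://localhost:8090/jobs?appName=myapp&classPath=com.example.WordCount
```

Job kill (requirement 3) maps to DELETE on the same /jobs/<jobId> path;
priorities and queueing (requirements 4 and 5) are not covered by these
endpoints and would need to be handled elsewhere.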
Any help is appreciated!
Thanks and Regards
Noorul