[
https://issues.apache.org/jira/browse/SAMZA-334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062599#comment-14062599
]
Chris Riccomini commented on SAMZA-334:
---------------------------------------
_(This post is mostly about auto-scaling, but I think it's related to this
JIRA.)_
bq. I would really like to see an existing system that has successfully
implemented auto scaling.
The word "auto-scaling" is kind of ill-defined, and a bit of a loaded term.
There are many potential implementations, some of which are very simple, and
others that are very complicated. On the simple side, [~jghoman] asked for a
way to automatically adjust container sizes based on time of day. This would
allow us to shrink our resource usage in off-peak hours.
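To make that concrete, here's a minimal sketch of what a time-of-day sizing
rule could look like. None of this exists in Samza today; the class name, the
off-peak window, and the sizes are all made up, purely to illustrate the idea.
{code:java}
import java.time.LocalTime;

// Hypothetical time-of-day sizing rule: use a smaller container during a
// configured off-peak window, and the normal (peak) size otherwise.
public class TimeOfDaySizer {
  private final LocalTime offPeakStart = LocalTime.of(22, 0);
  private final LocalTime offPeakEnd = LocalTime.of(6, 0);
  private final int peakMemoryMb = 4096;
  private final int offPeakMemoryMb = 1024;

  public int containerMemoryMb(LocalTime now) {
    // The off-peak window wraps around midnight: 22:00 -> 06:00.
    boolean offPeak = now.isAfter(offPeakStart) || now.isBefore(offPeakEnd);
    return offPeak ? offPeakMemoryMb : peakMemoryMb;
  }
}
{code}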
I think it's important to examine auto-scaling not from the job-level, but
holistically from the cluster-level. Clearly, in an empty cluster, all jobs
will benefit from auto-scaling. I believe complications are likely to arise
when multiple jobs with auto-scaling are running in a single cluster. Off the
top of my head, I see two auto-scaling patterns: one is predictable and the
other is unpredictable.
A cluster that has several jobs with predictable auto-scaling patterns (e.g.
time-of-day based) could benefit from even the simple auto-scaling style
described above in cases where two jobs are complementary (e.g. one job
auto-scales from 7-9am every day, and the other auto-scales from 7-9pm every
day). In this case, you can reduce overall cluster size by letting the two jobs
share the excess resources at different times of day.
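To put rough (made up) numbers on the saving: if each of those two jobs needs
10 containers during its two-hour peak and 4 the rest of the day, sizing both
for peak costs 20 containers, whereas letting them trade the same 6 spare
containers back and forth means the cluster only ever needs 14.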
A cluster that has jobs with unpredictable auto-scaling patterns (e.g. job is
down randomly, and auto-scales to catch up when it restarts) could still
benefit from auto-scaling, though the benefit is likely just to be at a more
abstract "cluster utilization" level. Most ops guys I know track this as a
measure of efficiency. If you have excess resources, and they can be used by
someone, they should be.
To me, the predictable use-case seems easier to implement, but not very useful.
It's likely that all time-based jobs will peak at the same time. I have little
actual evidence of this (it's just intuition), but most of our jobs process
things like page views, ad clicks, etc. These events all scale with traffic on
our site, which means their throughput requirements all peak at the same times.
In essence, this means that our cluster is going to be required to have the
capacity to run all of our jobs at peak ANYWAY, so why not just schedule the
jobs with peak resources, and call it quits?
The unpredictable use-case seems much harder to implement, but more useful. In
a basic implementation, you could imagine sharing all excess cluster resources
equally among all lagging jobs in the cluster. This would drastically increase
cluster utilization, and also shorten recovery times for lagging jobs, in the
average case. I do worry about runaway jobs that are INEFFICIENTLY using
resources in the cluster, but that's a separate problem.
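As a sketch of the "split the spare capacity evenly" idea (this is not an
existing Samza/YARN API; the allocator, its inputs, and the notion of a
"lagging" job are all assumptions here):
{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical allocator: divide spare cluster containers evenly across the
// jobs that are currently lagging, handing out the remainder one at a time.
public class SpareCapacityAllocator {
  public Map<String, Integer> allocate(int spareContainers, List<String> laggingJobs) {
    Map<String, Integer> extra = new HashMap<>();
    if (laggingJobs.isEmpty() || spareContainers <= 0) {
      return extra;
    }
    int perJob = spareContainers / laggingJobs.size();
    int remainder = spareContainers % laggingJobs.size();
    for (String job : laggingJobs) {
      int grant = perJob + (remainder > 0 ? 1 : 0);
      if (remainder > 0) {
        remainder--;
      }
      extra.put(job, grant);
    }
    return extra;
  }
}
{code}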
The more that I look at this problem, the more it starts to sound like a
scheduling problem, at heart. This sounds especially true in the unpredictable
auto-scaling pattern. I don't really want to sign up for writing a scheduler at
this point, personally.
bq. We should also need to think about how something like Samza SQL would be
able to take advantage of these properties and dynamically decide on these
values.
Yea, I agree. [~raulcf] did a really brief one-day investigation into this, and
found a lot of gaps. It's really difficult to figure out, a priori, how many
resources you need without starting the job first. It seems much more doable
to start the job with a rough guess, and then adjust resources based on
observed metrics.
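Just to illustrate that feedback loop (the lag thresholds, the step size, and
the method are all invented for the example, not real Samza APIs):
{code:java}
// Hypothetical metric-driven resizer: look at consumer lag after the job is
// running and nudge the container count up or down one step at a time.
public class LagBasedResizer {
  private static final long SCALE_UP_LAG = 1_000_000;  // messages behind
  private static final long SCALE_DOWN_LAG = 10_000;

  public int suggestContainerCount(long consumerLag, int currentContainers, int maxContainers) {
    if (consumerLag > SCALE_UP_LAG && currentContainers < maxContainers) {
      return currentContainers + 1;
    }
    if (consumerLag < SCALE_DOWN_LAG && currentContainers > 1) {
      return currentContainers - 1;
    }
    return currentContainers;
  }
}
{code}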
The other comment I have is that there's got to be a ton of literature on this
stuff. If someone is interested in the subject, it'd be good to get a list of
papers so we can actually understand the problem space.
> Need for asymmetric container config
> ------------------------------------
>
> Key: SAMZA-334
> URL: https://issues.apache.org/jira/browse/SAMZA-334
> Project: Samza
> Issue Type: Improvement
> Components: container
> Affects Versions: 0.8.0
> Reporter: Chinmay Soman
>
> The current (and upcoming) partitioning scheme(s) suggest that there might be
> a skew in the amount of data ingested and computation performed across
> different containers for a given Samza job. This directly affects the amount
> of resources required by a container - which today are completely symmetric.
> Case A] Partitioning on Kafka partitions
> For instance, consider a partitioner job which reads data from different
> Kafka topics (having different partition layouts). In this case, it's possible
> that a lot of topics have a smaller number of Kafka partitions. Consequently,
> the containers processing these lower-numbered partitions would need more
> resources than those responsible for the higher-numbered partitions.
> Case B] Partitioning based on Kafka topics
> Even in this case, it's very easy for some containers to be doing more work
> than others - leading to a skew in resource requirements.
> Today, the container config is based on the requirements of the worst-case
> (doing the most work) container. Needless to say, this leads to resource
> wastage. A better approach would consider the true requirement per container
> (instead of per job).