Replies inline
On 3/9/17, 11:24 AM, "Jagadish Venkatraman" <[email protected]> wrote:
I understand you are receiving messages from *all* partitions (but fewer
messages from some partitions).
Some questions:
1. Is it possible that you may have saturated the capacity of the entire
container?
>> Possible, but which metric(s) would tell me that?
2. What is the time you spend inside *process* and *window* for the
affected container? (How does it compare with other containers?). The
metrics of interest are *process_ns* and *window_ns.*
>> process-ns is ~200,000 ns; window-ns happens once every 60 secs and is < 1 ms.
3. What is the number of messages per second you process for the affected
containers? (How does it compare with other containers?). The metric of
interest is *process-calls*. You can also look at per-task process calls
<https://github.com/apache/samza/blob/master/samza-core/src/main/scala/org/apache/samza/container/TaskInstanceMetrics.scala>,
and per-partition messages read
<https://github.com/apache/samza/blob/master/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemConsumerMetrics.scala>
>> I was quoting process-calls from the metrics registry. For task 2, it's
>> lower than for task 1, as I stated earlier.
4. How many partitions do you have on the input topics?
>> 100 each, which we are processing with 50 containers (2 tasks each, with a
>> thread pool of size 5).
4.1 Have you tried increasing the # of partitions and re-deploying with
more containers? (I take it that you are running on YARN.)
>> No, since we don’t have the ability to change input partitions right now.
>> Another thing to add is that the amount of time we spend blocked is generally
quite high, which we believe is mainly because our processing is not fast
enough?
5. What metric are you referring to here? Do you mean the time spent in the
*process* and *window* call? If you're not processing "fast enough", it
will be helpful to instrument where you are spending most of the time.
>> Referring to block-ns. From my understanding, since the stores are
>> synchronized, once the run loop asks the tasks to process, it then waits
>> (blocks) until at least one of them is ready to receive more. So multiple
>> threads may not really be helping us?
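The knobs that control this are the container thread pool and per-task
concurrency. With the default concurrency of 1, each task has at most one
process() call in flight, so a container running 2 tasks can keep at most 2
pool threads busy regardless of the pool size. Roughly, in the job config
(values here are illustrative, not a recommendation):

# size of the container thread pool used to run the tasks' synchronous work
job.container.thread.pool.size=5
# max messages processed concurrently per task (default 1; >1 requires a thread-safe task)
task.max.concurrency=1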
You can rely on Samza's MetricRegistry to configure / report your custom
metrics.
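For example, something roughly like the following (untested sketch; the class,
store, group, and metric names are made up, and the key/value types depend on
your serdes):

import org.apache.samza.config.Config;
import org.apache.samza.metrics.Timer;
import org.apache.samza.storage.kv.KeyValueStore;
import org.apache.samza.system.IncomingMessageEnvelope;
import org.apache.samza.task.InitableTask;
import org.apache.samza.task.MessageCollector;
import org.apache.samza.task.StreamTask;
import org.apache.samza.task.TaskContext;
import org.apache.samza.task.TaskCoordinator;

public class MyJoinTask implements StreamTask, InitableTask {
  private KeyValueStore<String, byte[]> store;
  private Timer joinGetNs;
  private Timer joinPutNs;

  @Override
  @SuppressWarnings("unchecked")
  public void init(Config config, TaskContext context) {
    // "join-store" and the "my-join" metrics group are placeholders.
    store = (KeyValueStore<String, byte[]>) context.getStore("join-store");
    joinGetNs = context.getMetricsRegistry().newTimer("my-join", "join-get-ns");
    joinPutNs = context.getMetricsRegistry().newTimer("my-join", "join-put-ns");
  }

  @Override
  public void process(IncomingMessageEnvelope envelope, MessageCollector collector,
      TaskCoordinator coordinator) {
    String key = (String) envelope.getKey();

    long t0 = System.nanoTime();
    byte[] other = store.get(key);                    // lookup side of the inner join
    joinGetNs.update(System.nanoTime() - t0);

    // ... when 'other' is non-null, build the joined record and send it via collector ...

    long t1 = System.nanoTime();
    store.put(key, (byte[]) envelope.getMessage());   // remember this side for later matches
    joinPutNs.update(System.nanoTime() - t1);
  }
}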
>> Most of the time spent is in the get and put from RocksDB. According to the
>> Samza metrics (KeyValueStorageEngine.store-put-ns), our PUTs are much more
>> expensive than GETs. Any idea on how one would go about profiling that? We
>> are using level compaction with 3 write buffers of 500M each. The average input
>> message is ~1 KB and we commit (and thus flush the store) every 90 seconds. We are
>> also using an object cache of 2 million objects with a write cache of 1M objects
>> (rough config mapping below).
>> Input messages are ~5k/sec/partition, aggregate across both input streams that
>> we are joining.
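>> For reference, the store settings above map to roughly the following job config
>> (store name is a placeholder; worth double-checking whether the write-buffer size
>> setting is per buffer or total in the version we run):
>>
>> stores.join-store.rocksdb.compaction.style=level
>> stores.join-store.rocksdb.num.write.buffers=3
>> # sized for the ~500M write buffers mentioned above
>> stores.join-store.container.write.buffer.size.bytes=500000000
>> stores.join-store.object.cache.size=2000000
>> # assuming the 1M-object write cache is the cached store's write batch
>> stores.join-store.write.batch.size=1000000
>> # commit (and therefore store flush) interval
>> task.commit.ms=90000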
On Thu, Mar 9, 2017 at 5:20 AM, Ankit Malhotra <[email protected]>
wrote:
> Replies inline.
>
> --
> Ankit
>
> > On Mar 9, 2017, at 12:34 AM, Jagadish Venkatraman <[email protected]> wrote:
> >
> > We can certainly help you debug this more. Some questions:
> >
> > 1. Are you processing messages (at all) from the "suffering" containers?
> > (You can verify that by observing metrics/ logging etc.)
> Processing messages for sure. But mainly from one of the 2 partitions that
> the container is reading from.
>
> The overall number of process calls for task 1 is much higher than the
> process calls for task 2, and the diff is approximately the lag on the
> container, which is all from task 2.
> >
> > 2. If you are indeed processing messages, is it possible the impacted
> > containers are not able to keep up with the surge in the data? You can try
> > re-partitioning your input topics (and increasing the number of containers).
> We were trying the async loop by having 2 tasks on the same container
> and multiple threads processing messages. The process is a simple inner
> join with a get and put into the RocksDB store.
>
> We saw that both get and put latencies for the suffering task were higher
> than for the task that was chugging along.
> >
> > 3. If you are not processing messages, can you maybe provide us with your
> > stack trace? It will be super-helpful to find out if (or where) containers
> > are stuck.
> >
> >
>
> Another thing to add is that the amount of time we spend blocked is generally
> quite high, which we believe is mainly because our processing is not fast
> enough? To add more color, our RocksDB store's average get across all tasks
> is around 20,000 ns but the average put is 5x or more. We have the object cache
> enabled.
>
> > Thanks,
> > Jagadish
> >
> >
> > On Wed, Mar 8, 2017 at 1:05 PM, Ankit Malhotra <[email protected]>
> > wrote:
> >
> >> Hi,
> >>
> >> While consuming from 2 partitions to join 2 streams, we see that
> >> some containers start suffering, in that lag (messages behind the high
> >> watermark) for one of the tasks starts skyrocketing while the other one is
> >> ~ 0.
> >>
> >> We are using default values for buffer sizes and the fetch threshold, are
> >> using 4 threads in the pool, and are using the default
> >> RoundRobinMessageChooser.
> >>
> >> Happy to share more details/config if it can help to debug this further.
> >>
> >> Thanks
> >> Ankit
> >>
> >
> >
> > --
> > Jagadish V,
> > Graduate Student,
> > Department of Computer Science,
> > Stanford University
>
--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University