Hi Ambarish,
- Benchmark the operator separately. You can write a test that calls the
operator lifecycle methods in a loop and measures the time taken to process a
fixed number of records. You can then use this figure to work out the time
your operator actually needs to sustain your input rate.
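
    // Operator, port, tuple, context and count are placeholders for your own code.
    Operator operator = new Operator();
    operator.setup(context);   // setup() takes an OperatorContext; a test/mock one is enough here
    operator.beginWindow(1);
    long start = System.currentTimeMillis();
    for (int i = 0; i < count; i++) {
      operator.port.process(tuple);   // feed one tuple to the input port
    }
    operator.endWindow();
    long end = System.currentTimeMillis();
    LOG.info("time taken to process {} items is {} millis", count, (end - start));
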
This test does not account for the time taken to serialize and de-serialize
tuples; you need to account for that separately.
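
To turn the measured time into a capacity estimate, something along these
lines may help. It is only a sketch: the class and method names and the
sample numbers (1,000,000 tuples in 4,000 ms, an input rate of 400,000
tuples/s) are made up for illustration, and it still ignores the
serialization cost mentioned above.

```java
public class ThroughputEstimate {

    /** Tuples per second that one operator instance handled in the benchmark. */
    public static double tuplesPerSecond(long count, long elapsedMillis) {
        return count * 1000.0 / elapsedMillis;
    }

    /** Smallest number of partitions whose combined rate covers the input rate. */
    public static int partitionsNeeded(double inputRatePerSecond, double perInstanceRate) {
        return (int) Math.ceil(inputRatePerSecond / perInstanceRate);
    }

    public static void main(String[] args) {
        long count = 1_000_000L;       // hypothetical: tuples processed in the test
        long elapsedMillis = 4_000L;   // hypothetical: (end - start) from the test
        double inputRate = 400_000.0;  // hypothetical: tuples/s arriving in production

        double rate = tuplesPerSecond(count, elapsedMillis);
        System.out.println("per-instance rate (tuples/s): " + rate);
        System.out.println("partitions needed: " + partitionsNeeded(inputRate, rate));
    }
}
```

If the number of partitions comes out higher than what you run today, the
operator is probably falling behind and latency will keep growing.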
- If your operator communicates with external systems, check whether those
systems are introducing latency.
- GC pauses also contribute to operator latency. Monitor your operator's GC
stats by enabling GC logging. You can add the following property to the
application's properties.xml file to enable it:

<property>
  <name>dt.application.*.attr.containerJvmOpts</name>
  <value>-Xloggc:<LOG_DIR>/gc.log -verbose:gc -XX:+PrintGCDateStamps</value>
</property>

- The system on which the operator runs may be slow because many other
processes are competing for CPU. This typically happens when several
CPU-heavy operators are deployed on a single node / container, or when other
Hadoop applications, such as a Hive query, are running there. You can use
system utilities such as top to monitor system CPU usage, or use the RTS UI
to see all operators deployed on a node and the CPU usage of each.
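
For the GC and CPU points above, a quick check from inside the JVM is also
possible using the standard java.lang.management beans. This is a minimal
sketch, not something Apex or RTS provides: the class name JvmHealthSnapshot
and its methods are hypothetical, and it only reports on the JVM it runs in,
so you would have to call it from your operator (for example, log it every
few windows) to see that container's numbers.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class JvmHealthSnapshot {

    /** Total milliseconds spent in GC so far, summed over all collectors of this JVM. */
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();  // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** 1-minute system load average per core; negative if the platform does not report it. */
    public static double loadPerCore() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage() / os.getAvailableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("GC time so far (ms): " + totalGcTimeMillis());
        System.out.println("load per core: " + loadPerCore());
    }
}
```

A GC time that grows quickly between two calls, or a load per core that
stays well above 1, points to the same problems the gc.log and top would
show.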
- Tushar.
On Tue, Feb 21, 2017 at 12:21 PM, Tushar Gosavi <[email protected]>
wrote:
> Hi Ambarish,
>
> You could add the following property to your application's properties.xml file.
>
> <property>
> <name>dt.application.*.attr.containerJvmOpts</name>
> <value>-Xloggc:<LOG_DIR>/gc.log -verbose:gc
> -XX:+PrintGCDateStamps</value>
> </property>
>
> When this property is used, you will see a gc.log file in the container
> directory.
>
> - Tushar.
>
>
> On Tue, Feb 21, 2017 at 12:06 PM, Ambarish Pande
> <[email protected]> wrote:
> > Hello,
> >
> > I tried enabling GC logs in the Hadoop configuration. Do I need to enable
> > it in Datatorrent RTS somewhere or in my app? If so, how should I do it?
> >
> > Thank You
> >
> > On Fri, Feb 17, 2017 at 1:20 PM, Ashwin Chandra Putta
> > <[email protected]> wrote:
> >>
> >> It is probably not able to keep up with the throughput and might need
> >> more partitions. Check the CPU and memory utilization of its container.
> >> If memory allocated to container is too low, it might be hitting GC too
> >> often. You can enable GC logging and check GC logs.
> >>
> >> Regards,
> >> Ashwin.
> >>
> >>
> >> On Feb 16, 2017 11:12 PM, "Ambarish Pande" <[email protected]>
> >> wrote:
> >>
> >> Hello,
> >>
> >> I wanted to know why my operator latency is increasing. Which logs
> >> should I check to get any idea about that? I have checked the container
> >> dt.log, stderr and stdout, but I cannot find anything there.
> >>
> >> Thank You
> >>
> >>
> >
>