Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman


> On Nov. 9, 2016, 11:18 p.m., Prateek Maheshwari wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > What's the execution unit here? The samza-rest server? The monitor? The 
> > container being monitored?
> 
> Shanthoosh Venkataraman wrote:
> I think `execution unit` is a misnomer. By execution unit, I meant the 
> origin of the metrics (physical hostname in which the samza rest process is 
> running/SamzaRestMonitor literal/SamzaRestResource literal etc)

Discussed offline. Exposing containerName through samza-rest config might be 
unnecessary as of now. Hardcoding it to hostname on which `samza-rest` is 
running.


- Shanthoosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155511
---


On Nov. 10, 2016, 1:04 a.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 10, 2016, 1:04 a.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > How exactly is this used when metrics are reported? 
> > 
> > Is there a reason for this to be configurable / adding a new config 
> > key? 
> > 
> > If it's configurable, we should document it somewhere?
> 
> Shanthoosh Venkataraman wrote:
> Container name here refers to the execution unit from which the metrics 
> gets reported. It's used when instantiating the MetricsReporter. If this 
> configuration is undefined, it's defaulted to hostName on which SamzaRest is 
> running. Resources will report metrics alongside monitors. Added relevent 
> documentation for it.
> 
> Jagadish Venkatraman wrote:
> Is there a use case where I would want to configure a different container 
> name? If not, this need not be a config? (we can default it to `samza-rest`)
> 
> Shanthoosh Venkataraman wrote:
> For LI use case, using hostname(or SamzaRest) would suffice.
> My understanding is that the containerName denotes either the logical 
> host or the physical host or some unique tag to identify the origin from 
> which the metrics gets reported from. 
> This container name could be appended into metrics reported depending 
> upon MetricsReporter implementations. 
> Hence, I think it will be useful to let users configure it. 
> For instance, some implementations could generate a unique string as 
> container name based upon environment and populate it in the config to 
> uniquely identify the environment it's reporting the metrics from.

Discussed offline. Exposing containerName through samza-rest config might be 
unnecessary as of now. Hardcoding it to hostname on which `samza-rest` is 
running.


- Shanthoosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155372
---


On Nov. 10, 2016, 1:04 a.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 10, 2016, 1:04 a.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/
---

(Updated Nov. 10, 2016, 1:04 a.m.)


Review request for samza and Jake Maes.


Repository: samza


Description
---

This patch aims at enabling users to define custom reporters to send metrics 
from the monitors. Configurations required for the definition of the metrics 
reporters follows the same convention as of the samza jobs.


Diffs (updated)
-

  docs/learn/documentation/versioned/rest/monitors.md 
46678bbe5fed99f767c3324dc9578ee1a64cec66 
  samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
PRE-CREATION 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
e0468ee89c89fd720834461771ebb36475475bcb 
  
samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
 f24beb1e099dd44b15b475e0a4a7f70560c6965e 
  samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
47b0663637f6db187d86961377ee3ee203b73fdb 
  samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
  samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
PRE-CREATION 
  
samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 
8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 

Diff: https://reviews.apache.org/r/53297/diff/


Testing
---

Unit tests are done to verify the intended functionality.


Thanks,

Shanthoosh Venkataraman



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/
---

(Updated Nov. 10, 2016, 12:04 a.m.)


Review request for samza and Jake Maes.


Repository: samza


Description
---

This patch aims at enabling users to define custom reporters to send metrics 
from the monitors. Configurations required for the definition of the metrics 
reporters follows the same convention as of the samza jobs.


Diffs (updated)
-

  docs/learn/documentation/versioned/rest/monitors.md 
46678bbe5fed99f767c3324dc9578ee1a64cec66 
  docs/learn/documentation/versioned/rest/overview.md 
c382f032843cce696a445ff110e87a8198cc96d7 
  samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
PRE-CREATION 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
e0468ee89c89fd720834461771ebb36475475bcb 
  
samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
 f24beb1e099dd44b15b475e0a4a7f70560c6965e 
  samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
47b0663637f6db187d86961377ee3ee203b73fdb 
  samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
  samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
PRE-CREATION 
  
samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 
8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 

Diff: https://reviews.apache.org/r/53297/diff/


Testing
---

Unit tests are done to verify the intended functionality.


Thanks,

Shanthoosh Venkataraman



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman


> On Nov. 9, 2016, 11:18 p.m., Prateek Maheshwari wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > What's the execution unit here? The samza-rest server? The monitor? The 
> > container being monitored?

I think `execution unit` is a misnomer. By execution unit, I meant the origin 
of the metrics (physical hostname in which the samza rest process is 
running/SamzaRestMonitor literal/SamzaRestResource literal etc)


- Shanthoosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155511
---


On Nov. 9, 2016, 10:50 p.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 9, 2016, 10:50 p.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   docs/learn/documentation/versioned/rest/overview.md 
> c382f032843cce696a445ff110e87a8198cc96d7 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > How exactly is this used when metrics are reported? 
> > 
> > Is there a reason for this to be configurable / adding a new config 
> > key? 
> > 
> > If it's configurable, we should document it somewhere?
> 
> Shanthoosh Venkataraman wrote:
> Container name here refers to the execution unit from which the metrics 
> gets reported. It's used when instantiating the MetricsReporter. If this 
> configuration is undefined, it's defaulted to hostName on which SamzaRest is 
> running. Resources will report metrics alongside monitors. Added relevent 
> documentation for it.
> 
> Jagadish Venkatraman wrote:
> Is there a use case where I would want to configure a different container 
> name? If not, this need not be a config? (we can default it to `samza-rest`)

For LI use case, using hostname(or SamzaRest) would suffice.
My understanding is that the containerName denotes either the logical host or 
the physical host or some unique tag to identify the origin from which the 
metrics gets reported from. 
This container name could be appended into metrics reported depending upon 
MetricsReporter implementations. 
Hence, I think it will be useful to let users configure it. 
For instance, some implementations could generate a unique string as container 
name based upon environment and populate it in the config to uniquely identify 
the environment it's reporting the metrics from.


- Shanthoosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155372
---


On Nov. 9, 2016, 10:50 p.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 9, 2016, 10:50 p.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   docs/learn/documentation/versioned/rest/overview.md 
> c382f032843cce696a445ff110e87a8198cc96d7 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Prateek Maheshwari


> On Nov. 5, 2016, 11:47 a.m., Jake Maes wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java, line 
> > 75
> > 
> >
> > I don't think the MetricsConfig constructure takes a subset. 
> > 
> > I think it takes the root and expects to find the "metrics" prefix
> 
> Jake Maes wrote:
> s/constructure/constructor
> 
> phonetic brain fail
> 
> Shanthoosh Venkataraman wrote:
> Yes, that's true. It expects the root and finds the metrics prefix. 
> Hence, stripPrefix is passed on as false, so that prefix isn't removed. This 
> just selects the config subset that starts with METRICS_PREFIX, without 
> removing the prefix. The goal is to not pass on the entire config object and 
> pass only metrics related config into MetricsConfig constructor.

Regarding the goal: That's not how the XConfig classes are intended to be used, 
so it's not really necessary to do this. For example, see the implicit 
conversion b/w Config and XConfig. Prefer passing in config.


- Prateek


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155061
---


On Nov. 9, 2016, 2:50 p.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 9, 2016, 2:50 p.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   docs/learn/documentation/versioned/rest/overview.md 
> c382f032843cce696a445ff110e87a8198cc96d7 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Jagadish Venkatraman


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > How exactly is this used when metrics are reported? 
> > 
> > Is there a reason for this to be configurable / adding a new config 
> > key? 
> > 
> > If it's configurable, we should document it somewhere?
> 
> Shanthoosh Venkataraman wrote:
> Container name here refers to the execution unit from which the metrics 
> gets reported. It's used when instantiating the MetricsReporter. If this 
> configuration is undefined, it's defaulted to hostName on which SamzaRest is 
> running. Resources will report metrics alongside monitors. Added relevent 
> documentation for it.

Is there a use case where I would want to configure a different container name? 
If not, this need not be a config? (we can default it to `samza-rest`)


- Jagadish


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155372
---


On Nov. 9, 2016, 10:50 p.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 9, 2016, 10:50 p.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   docs/learn/documentation/versioned/rest/overview.md 
> c382f032843cce696a445ff110e87a8198cc96d7 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java, line 
> > 65
> > 
> >
> > Why is this taking in a concrete class - ReadableMetricsRegistry ? 
> > Prefer to have this `MetricsRegistry` instead.
> > 
> > We should operate at the level of interfaces whereever possible.

ReadableMetricsRegistry is an interface, not a concrete class. This particular 
type is required for registering with MetricsReporter instances.


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > docs/learn/documentation/versioned/rest/monitors.md, line 95
> > 
> >
> > Documentation looks different (than the rest of the page). Would be 
> > nice if this was instructional in nature.
> > 
> > /s/create/create and report.
> > /s/theirs/yours

Done.


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > docs/learn/documentation/versioned/rest/monitors.md, line 99
> > 
> >
> > 1. s/Please refer/You can refer
> > 2. `custom` is probably redundant.
> > 3. The link and anchor text placement is weird. 
> > 4. Prefer to be precise with documentation. (For example, this line has 
> > the word `metrics reporters` thrice)

Done.


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > docs/learn/documentation/versioned/rest/monitors.md, line 100
> > 
> >
> > This line is not needed IMO. (If we choose to retain, it, should 
> > atleast be re-worded)

Removed.


> On Nov. 9, 2016, 12:09 a.m., Jagadish Venkatraman wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > How exactly is this used when metrics are reported? 
> > 
> > Is there a reason for this to be configurable / adding a new config 
> > key? 
> > 
> > If it's configurable, we should document it somewhere?

Container name here refers to the execution unit from which the metrics gets 
reported. It's used when instantiating the MetricsReporter. If this 
configuration is undefined, it's defaulted to hostName on which SamzaRest is 
running. Resources will report metrics alongside monitors. Added relevent 
documentation for it.


- Shanthoosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155372
---


On Nov. 9, 2016, 10:50 p.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 9, 2016, 10:50 p.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   docs/learn/documentation/versioned/rest/overview.md 
> c382f032843cce696a445ff110e87a8198cc96d7 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/
---

(Updated Nov. 9, 2016, 10:50 p.m.)


Review request for samza and Jake Maes.


Repository: samza


Description
---

This patch aims at enabling users to define custom reporters to send metrics 
from the monitors. Configurations required for the definition of the metrics 
reporters follows the same convention as of the samza jobs.


Diffs (updated)
-

  docs/learn/documentation/versioned/rest/monitors.md 
46678bbe5fed99f767c3324dc9578ee1a64cec66 
  docs/learn/documentation/versioned/rest/overview.md 
c382f032843cce696a445ff110e87a8198cc96d7 
  samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
PRE-CREATION 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
e0468ee89c89fd720834461771ebb36475475bcb 
  
samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
 f24beb1e099dd44b15b475e0a4a7f70560c6965e 
  samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
47b0663637f6db187d86961377ee3ee203b73fdb 
  samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
  samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
PRE-CREATION 
  
samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala 
8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 

Diff: https://reviews.apache.org/r/53297/diff/


Testing
---

Unit tests are done to verify the intended functionality.


Thanks,

Shanthoosh Venkataraman



Re: Review Request 53297: Initial version of adding metrics into samza rest.

2016-11-09 Thread Shanthoosh Venkataraman


> On Nov. 9, 2016, 12:31 a.m., Prateek Maheshwari wrote:
> > samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java, 
> > line 36
> > 
> >
> > Minor: private constructors for helper classes are pretty universal, 
> > don't think they need a comment.

Done.


> On Nov. 9, 2016, 12:31 a.m., Prateek Maheshwari wrote:
> > docs/learn/documentation/versioned/rest/monitors.md, line 100
> > 
> >
> > This section could be more concise.

Changed, removed some of the docs.


> On Nov. 9, 2016, 12:31 a.m., Prateek Maheshwari wrote:
> > samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java, 
> > line 40
> > 
> >
> > This javadoc doesn't add any more information than the method signature 
> > already provides. Prefer removing.

Removed.


> On Nov. 9, 2016, 12:31 a.m., Prateek Maheshwari wrote:
> > samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java, line 55
> > 
> >
> > What is this config, and what does container refer to here? The 
> > samza-rest service container?

Container name here refers to the execution unit from which the metrics gets 
reported. It's used when instantiating the MetricsReporter. If this 
configuration is undefined, it's defaulted to hostName on which SamzaRest is 
running. Resources will report metrics alongside monitors.  Added relevent 
documentation for it.


> On Nov. 9, 2016, 12:31 a.m., Prateek Maheshwari wrote:
> > samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java, 
> > line 45
> > 
> >
> > Not sure this specific method (for classloading this particular class 
> > from configs) deserves a new Util class. This is a general pattern used in 
> > other places, maybe we should extract that as a util instead.
> > 
> > May be worth revisiting if/how we want to share code b/w samza-rest and 
> > samza-core. Ideally samza-rest shouldn't depend on samza-core, and shared 
> > stuff should be pulled out to samza-api or samza-common (which doesn't 
> > exist yet).

Discussed offline. Extracting utils(and other common functionalities) shared 
between samza-core & samza-rest into samza-common is a longer term goal which 
is required to remove dependencies. However, for the scope of this patch, doing 
this will be a overkill. Punted for now.


- Shanthoosh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53297/#review155380
---


On Nov. 9, 2016, 10:50 p.m., Shanthoosh Venkataraman wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53297/
> ---
> 
> (Updated Nov. 9, 2016, 10:50 p.m.)
> 
> 
> Review request for samza and Jake Maes.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> This patch aims at enabling users to define custom reporters to send metrics 
> from the monitors. Configurations required for the definition of the metrics 
> reporters follows the same convention as of the samza jobs.
> 
> 
> Diffs
> -
> 
>   docs/learn/documentation/versioned/rest/monitors.md 
> 46678bbe5fed99f767c3324dc9578ee1a64cec66 
>   docs/learn/documentation/versioned/rest/overview.md 
> c382f032843cce696a445ff110e87a8198cc96d7 
>   samza-core/src/main/java/org/apache/samza/util/MetricsReporterLoader.java 
> PRE-CREATION 
>   samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
> e0468ee89c89fd720834461771ebb36475475bcb 
>   
> samza-core/src/main/scala/org/apache/samza/metrics/ContainerProcessManagerMetrics.scala
>  f24beb1e099dd44b15b475e0a4a7f70560c6965e 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestConfig.java 
> 47b0663637f6db187d86961377ee3ee203b73fdb 
>   samza-rest/src/main/java/org/apache/samza/rest/SamzaRestService.java 
> 2a3e83a24a5343bb53b93fc9d0a647c1b253714b 
>   samza-rest/src/test/java/org/apache/samza/rest/TestSamzaRestService.java 
> PRE-CREATION 
>   
> samza-yarn/src/main/scala/org/apache/samza/job/yarn/SamzaAppMasterMetrics.scala
>  8a5b4aaea6e11a5af999f12d50e5b6135dbc70ca 
> 
> Diff: https://reviews.apache.org/r/53297/diff/
> 
> 
> Testing
> ---
> 
> Unit tests are done to verify the intended functionality.
> 
> 
> Thanks,
> 
> Shanthoosh Venkataraman
> 
>



Re: Review Request 53282: SAMZA-1043: Samza performance improvements

2016-11-09 Thread Prateek Maheshwari

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53282/#review155478
---


Ship it!




Looks pretty good, thanks!


samza-core/src/main/java/org/apache/samza/util/ThrottlingExecutor.java (line 
134)


Maybe make private if not being tested.



samza-core/src/main/scala/org/apache/samza/config/MetricsConfig.scala (line 23)


Unused import?



samza-core/src/main/scala/org/apache/samza/config/MetricsConfig.scala (line 23)






samza-core/src/main/scala/org/apache/samza/config/MetricsConfig.scala (line 23)






samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala (line 
155)






samza-core/src/main/scala/org/apache/samza/util/Util.scala (line 393)






samza-core/src/main/scala/org/apache/samza/util/Util.scala (line 393)


s/TimerClock/HighResolutionClock


- Prateek Maheshwari


On Nov. 9, 2016, 11:10 a.m., Xinyu Liu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53282/
> ---
> 
> (Updated Nov. 9, 2016, 11:10 a.m.)
> 
> 
> Review request for samza, Chris Pettitt, Jake Maes, and Navina Ramesh.
> 
> 
> Repository: samza
> 
> 
> Description
> ---
> 
> In the recent experiments of samza batch job (consuming hdfs data on hadoop), 
> the results are subpar to map/reduce and spark. By looking at the metrics 
> closely, we found two basic problems:
> 
> 1) Not enough data to process. This is spotted as the unprocessed message 
> queue length was zero for quite a lot of times.
> 
> 2) Not process fast enough. We found samza performed closely in both median 
> size records (100B) and small record (10B), while spark can scale very well 
> in the small record (over 1M/s).
> 
> The first problem is solved by increasing the buffer size. This ticket is to 
> address the second problem, which contains three major improvements:
> 
> - Option to turn off timer metrics calculation: one of the main time spent in 
> samza processing turns out to be just keeping the timer metrics. While it is 
> useful in debugging, it becomes a bottleneck when running a stable job with 
> high performance. In my testing job which consumes 8M mock data, it took 30 
> secs with timer metrics on. After turning it off, it only took 14 secs.
> 
> - Java coding improvements: The AsyncRunLoop code can be further optimized 
> for efficiency. Some of the thread-safe data structure I am using is not for 
> optimal performance (Collections.synchronizedSet). I switched to use 
> CopyOnWriteArraySet, which has far better performance due to more reads and 
> small set size.
> 
> - Specific handling for in-order processing improvements: AsyncRunLoop 
> handles the callbacks regardless of whether it's in-order or out-of-order 
> (max concurrency > 1), which incurs quite some cost. By simplying the logic 
> for in-order handling, the performance gains.
> 
> After all three improvements, my test job with mock input (8M messages) can 
> be processed within 8 sec (down from org 30 secs), so it's 1M/s for one cpu 
> core.
> 
> For the performance benchmark jobs running in Hadoop, we also see a 4 times 
> improvement with all the fixes above. Please take a look at the attached 
> spreedsheet (see the numbers with fix(turn off the timing metrics) and 
> fix2(all three together).
> 
> 
> Diffs
> -
> 
>   samza-core/src/main/java/org/apache/samza/container/RunLoopFactory.java 
> 609a956a1f2fa97419c2f66fe2fb6876aaaeecd0 
>   samza-core/src/main/java/org/apache/samza/task/AsyncRunLoop.java 
> 8fac8155c7f64e67d4a39ec6943f98da1e1d63d9 
>   samza-core/src/main/java/org/apache/samza/task/CoordinatorRequests.java 
> 052b3b91ec609ca6288662cfa2d3e71b0273d020 
>   samza-core/src/main/java/org/apache/samza/task/TaskCallbackImpl.java 
> 9b700998d2af040c6734289f7f28bbd78c36bd2c 
>   samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java 
> 132cf59eb593524a4cac134aeceeeb37a4c74b1f 
>   samza-core/src/main/java/org/apache/samza/util/HighResolutionClock.java 
> 69ba441ed087305dfe4e1272b00fad67b644e13f 
>   
> samza-core/src/main/java/org/apache/samza/util/SystemHighResolutionClock.java 
> 2e65b603bc959273e679eba3ea9f89c7c0c4664d 
>   samza-core/src/main/java/org/apache/samza/util/ThrottlingExecutor.java 
> d1298fc40680e5ad4db7067c9ef02f0266dffc1d 
>   samza-core/src/main/java/org/apache/samza/util/Utils.java 
> 

Re: Review Request 53282: SAMZA-1043: Samza performance improvements

2016-11-09 Thread Xinyu Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53282/
---

(Updated Nov. 9, 2016, 7:10 p.m.)


Review request for samza, Chris Pettitt, Jake Maes, and Navina Ramesh.


Changes
---

Some final touch on the rb. Please take a look.


Repository: samza


Description
---

In the recent experiments of samza batch job (consuming hdfs data on hadoop), 
the results are subpar to map/reduce and spark. By looking at the metrics 
closely, we found two basic problems:

1) Not enough data to process. This is spotted as the unprocessed message queue 
length was zero for quite a lot of times.

2) Not process fast enough. We found samza performed closely in both median 
size records (100B) and small record (10B), while spark can scale very well in 
the small record (over 1M/s).

The first problem is solved by increasing the buffer size. This ticket is to 
address the second problem, which contains three major improvements:

- Option to turn off timer metrics calculation: one of the main time spent in 
samza processing turns out to be just keeping the timer metrics. While it is 
useful in debugging, it becomes a bottleneck when running a stable job with 
high performance. In my testing job which consumes 8M mock data, it took 30 
secs with timer metrics on. After turning it off, it only took 14 secs.

- Java coding improvements: The AsyncRunLoop code can be further optimized for 
efficiency. Some of the thread-safe data structure I am using is not for 
optimal performance (Collections.synchronizedSet). I switched to use 
CopyOnWriteArraySet, which has far better performance due to more reads and 
small set size.

- Specific handling for in-order processing improvements: AsyncRunLoop handles 
the callbacks regardless of whether it's in-order or out-of-order (max 
concurrency > 1), which incurs quite some cost. By simplying the logic for 
in-order handling, the performance gains.

After all three improvements, my test job with mock input (8M messages) can be 
processed within 8 sec (down from org 30 secs), so it's 1M/s for one cpu core.

For the performance benchmark jobs running in Hadoop, we also see a 4 times 
improvement with all the fixes above. Please take a look at the attached 
spreedsheet (see the numbers with fix(turn off the timing metrics) and fix2(all 
three together).


Diffs (updated)
-

  samza-core/src/main/java/org/apache/samza/container/RunLoopFactory.java 
609a956a1f2fa97419c2f66fe2fb6876aaaeecd0 
  samza-core/src/main/java/org/apache/samza/task/AsyncRunLoop.java 
8fac8155c7f64e67d4a39ec6943f98da1e1d63d9 
  samza-core/src/main/java/org/apache/samza/task/CoordinatorRequests.java 
052b3b91ec609ca6288662cfa2d3e71b0273d020 
  samza-core/src/main/java/org/apache/samza/task/TaskCallbackImpl.java 
9b700998d2af040c6734289f7f28bbd78c36bd2c 
  samza-core/src/main/java/org/apache/samza/task/TaskCallbackManager.java 
132cf59eb593524a4cac134aeceeeb37a4c74b1f 
  samza-core/src/main/java/org/apache/samza/util/HighResolutionClock.java 
69ba441ed087305dfe4e1272b00fad67b644e13f 
  samza-core/src/main/java/org/apache/samza/util/SystemHighResolutionClock.java 
2e65b603bc959273e679eba3ea9f89c7c0c4664d 
  samza-core/src/main/java/org/apache/samza/util/ThrottlingExecutor.java 
d1298fc40680e5ad4db7067c9ef02f0266dffc1d 
  samza-core/src/main/java/org/apache/samza/util/Utils.java 
472e0a59d5aa992b136292c8a3347c311e2cd606 
  samza-core/src/main/scala/org/apache/samza/config/MetricsConfig.scala 
c3fd8bfb2e16a4c5146d34682d04cb1d4e9bbe72 
  samza-core/src/main/scala/org/apache/samza/container/SamzaContainer.scala 
e0468ee89c89fd720834461771ebb36475475bcb 
  samza-core/src/main/scala/org/apache/samza/system/SystemConsumers.scala 
e2aed5b1c2e77a914268963b21809380972037b6 
  samza-core/src/main/scala/org/apache/samza/util/Util.scala 
c4836f202f7eda1d4e71eac94fd48e46207b0316 
  samza-core/src/test/java/org/apache/samza/task/TestAsyncRunLoop.java 
6000ffaf2b8723d48a72e58b571f242a42dc8128 
  samza-core/src/test/java/org/apache/samza/task/TestAsyncStreamAdapter.java 
99e1e18bcfa6bca1e275d8ae030a77ff8d70a4eb 
  samza-core/src/test/java/org/apache/samza/task/TestTaskCallbackImpl.java 
f1dbf35165e6ddfc02e3522887c25d78a4bbfcd7 
  samza-core/src/test/java/org/apache/samza/task/TestTaskCallbackManager.java 
d7110f34a9eae6e9ffc15b4982bfbb180da88b2d 
  samza-core/src/test/java/org/apache/samza/util/TestThrottlingExecutor.java 
0276e6b17bc9ad9413611189b2e9ff2b8793694c 
  
samza-kv/src/main/scala/org/apache/samza/storage/kv/BaseKeyValueStorageEngineFactory.scala
 c975893a42689732c39c39600fecacee843bf9d6 

Diff: https://reviews.apache.org/r/53282/diff/


Testing
---

./gradlew build

Tested in the yarn hadoop cluster with different kinds of jobs.


File Attachments


hdfs performance