[jira] [Updated] (PIO-115) Cache name-to-ID lookups for Storage app & channel

Mars Hall (JIRA) Tue, 22 Aug 2017 14:19:20 -0700

     [ 
https://issues.apache.org/jira/browse/PIO-115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mars Hall updated PIO-115:
--------------------------
    Description: 
When stress testing the Universal Recommender with high-concurrency HTTP/REST 
queries, we observed that the majority of Elasticsearch traffic consisted of 
requests resolving the Storage app's name & channel, over and over and over 
again! In this case, [each per-query call to 
`LEventStore.findByEntity`|https://github.com/heroku/predictionio-engine-ur/blob/master/src/main/scala/URAlgorithm.scala#L694]
 re-resolves the app name to an ID.

Implement memoization for the function that performs these name-to-ID lookups, 
so that only one set of lookups is performed per process for each app+channel 
combination.

  was:
When stress testing the Universal Recommender with high-concurrency HTTP/REST 
queries, we observed that the majority of Elasticsearch traffic consisted of 
requests resolving the Storage app's name & channel, over and over and over 
again! In this case, [each per-query call to 
`LEventStore.findByEntity`|https://github.com/heroku/predictionio-engine-ur/blob/master/src/main/scala/URAlgorithm.scala#L694]
 re-resolves the app name to an ID.

This changeset implements memoization for the function that performs these 
name-to-ID lookups, so that only one set of lookups is performed per process 
for each app+channel combination. As a result, we've seen overall throughput 
increase 📈 and error rate drop dramatically 📉.

This common optimization affects all storage backends, not just Elasticsearch.


> Cache name-to-ID lookups for Storage app & channel
> --------------------------------------------------
>
>                 Key: PIO-115
>                 URL: https://issues.apache.org/jira/browse/PIO-115
>             Project: PredictionIO
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.11.0-incubating
>            Reporter: Mars Hall
>            Assignee: Mars Hall
>
> When stress testing the Universal Recommender with high-concurrency HTTP/REST 
> queries, we observed that the majority of Elasticsearch traffic consisted of 
> requests resolving the Storage app's name & channel, over and over and over 
> again! In this case, [each per-query call to 
> `LEventStore.findByEntity`|https://github.com/heroku/predictionio-engine-ur/blob/master/src/main/scala/URAlgorithm.scala#L694]
>  re-resolves the app name to an ID.
>
> Implement memoization for the function that performs these name-to-ID 
> lookups, so that only one set of lookups is performed per process for each 
> app+channel combination.
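
For illustration, the memoization described above could be sketched roughly as 
follows. This is a minimal sketch, not the actual PredictionIO change: the 
object name `AppChannelCache`, the `lookup` method, the `resolve` callback, and 
the ID types are all hypothetical and stand in for whatever function performs 
the real name-to-ID resolution against the storage backend.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical sketch: these names are illustrative, not PredictionIO's API.
object AppChannelCache {
  // Key: (app name, optional channel name). Value: (app ID, optional channel ID).
  type Key = (String, Option[String])
  type Ids = (Int, Option[Int])

  // One cache per process (JVM), shared by all request threads.
  private val cache = new ConcurrentHashMap[Key, Ids]()

  /** Returns cached IDs, calling `resolve` only on the first request for a key. */
  def lookup(appName: String, channelName: Option[String])(resolve: Key => Ids): Ids =
    cache.computeIfAbsent(
      (appName, channelName),
      new JFunction[Key, Ids] {
        override def apply(key: Key): Ids = resolve(key)
      }
    )
}

object Demo {
  def main(args: Array[String]): Unit = {
    var backendCalls = 0
    // Stand-in for the storage round trip (e.g. an Elasticsearch query).
    val resolve: AppChannelCache.Key => AppChannelCache.Ids = { _ =>
      backendCalls += 1
      (1, None)
    }
    // Simulate many queries that all name the same app and channel.
    (1 to 1000).foreach(_ => AppChannelCache.lookup("ur-app", None)(resolve))
    println(s"backend calls: $backendCalls") // prints "backend calls: 1"
  }
}
```

Under these assumptions, a thousand queries for the same app+channel pair 
trigger a single backend lookup. Note that a cache like this is never 
invalidated, so a process would keep serving the old IDs until it restarts if 
an app or channel were deleted and re-created under the same name.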



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
