[ 
https://issues.apache.org/jira/browse/ATLAS-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

venkata madugundu updated ATLAS-683:
------------------------------------
    Attachment: RedisTypeCacheProvider.java

FYI - Attached an experimental Redis-based implementation of the cache provider
interface. Having two implementations helped me validate that the cache
provider interface is complete.

Tested it with Atlas and Redis running on the same machine. All Type, Entity,
Query, and Lineage operations worked fine. Ran quick_start.py and sanity-tested
the UI by navigating, running queries, and looking at the Lineage graph.
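
For readers without the attachment, here is a minimal sketch of what checking a
cache provider interface against more than one implementation could look like.
Everything in it is an assumption made for illustration: the method names on
TypeCacheProvider, the use of String for type definitions, and the
LocalTypeCacheProvider and TypeCacheContract classes are hypothetical and are
not taken from the attached patch or from RedisTypeCacheProvider.java.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface for illustration; not the API from the attached patch.
interface TypeCacheProvider {
    boolean has(String typeName);
    String get(String typeName);                       // returns null when absent
    void put(String typeName, String typeDefinition);
    void remove(String typeName);
    void clear();
}

// In-process implementation, the kind a standalone server could default to.
final class LocalTypeCacheProvider implements TypeCacheProvider {
    private final Map<String, String> types = new ConcurrentHashMap<>();

    @Override public boolean has(String typeName) { return types.containsKey(typeName); }
    @Override public String get(String typeName) { return types.get(typeName); }
    @Override public void put(String typeName, String typeDefinition) { types.put(typeName, typeDefinition); }
    @Override public void remove(String typeName) { types.remove(typeName); }
    @Override public void clear() { types.clear(); }
}

// Provider-agnostic check: any implementation (local, remote-backed) should pass it.
final class TypeCacheContract {
    static boolean holdsFor(TypeCacheProvider cache) {
        cache.clear();
        cache.put("hive_table", "{\"typeName\":\"hive_table\"}");
        boolean stored = cache.has("hive_table") && cache.get("hive_table") != null;
        cache.remove("hive_table");
        boolean removed = !cache.has("hive_table") && cache.get("hive_table") == null;
        return stored && removed;
    }

    public static void main(String[] args) {
        System.out.println(holdsFor(new LocalTypeCacheProvider())); // prints "true"
    }
}
```

A second implementation (for example one backed by a remote store) would be
passed to the same holdsFor check, so gaps in the interface show up as soon as
one provider cannot satisfy it.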

> Refactor local type-system cache with cache provider interface
> --------------------------------------------------------------
>
>                 Key: ATLAS-683
>                 URL: https://issues.apache.org/jira/browse/ATLAS-683
>             Project: Atlas
>          Issue Type: Sub-task
>    Affects Versions: 0.7-incubating
>            Reporter: venkata madugundu
>            Assignee: venkata madugundu
>            Priority: Critical
>              Labels: high-availability, performance, scalability
>             Fix For: 0.7-incubating
>
>         Attachments: ATLAS-683-1.patch, ATLAS-683.patch, 
> RedisTypeCacheProvider.java
>
>
> As noted in ATLAS-488, the local type-system cache makes the Atlas runtime
> stateful and prevents multiple Atlas instances from being active in a cluster.
> Either the type-cache should be synced across Atlas instances (on all type
> create/update requests) or the type-cache should be moved out of Atlas to
> something like a distributed cache.
> 1. As a first step, the local type-cache code in TypeSystem.java can be
> carved out behind an interface like TypeCacheProvider (whose
> default implementation for a standalone Atlas server would just use
> an in-process local cache). The cache provider implementation itself could be
> specified as an optional configuration property. Expert users of Atlas can
> choose to inject a custom cache provider, which would likely hit a distributed
> cache. We are evaluating the use of a distributed cache.
> 2. As a second step, some more refactoring can be done to minimize
> the calls made to TypeSystem for type lookup queries. Essentially, in a given
> transaction/request, once a type lookup is done, it should not be queried
> again. A request-scoped variable (Guice would probably help with that
> scoping) can hold all the lookups made in a request. This might sound like a
> cache of a cache, but I think it should help reduce the hits to the cache
> provider if the provider is hitting a remote cache.
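
The request-scoped lookup idea in step 2 can be illustrated with a small
sketch in plain Java. The names RequestTypeLookups and RequestTypeLookupsDemo
are hypothetical, the provider is modeled as a simple Function rather than the
actual cache provider interface, and the sketch assumes one instance is created
per request and used from a single thread. With Guice, such an object could
presumably be bound in request scope instead of being created by hand.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: remembers type lookups for the lifetime of one request.
final class RequestTypeLookups<T> {
    private final Function<String, T> provider;          // e.g. a call into a remote cache
    private final Map<String, T> seen = new HashMap<>(); // assumes single-threaded use per request

    RequestTypeLookups(Function<String, T> provider) {
        this.provider = provider;
    }

    T lookup(String typeName) {
        // The provider is called only on the first lookup of a given name.
        // A null result is not remembered and would be fetched again.
        return seen.computeIfAbsent(typeName, provider);
    }
}

final class RequestTypeLookupsDemo {
    public static void main(String[] args) {
        int[] providerHits = {0};
        // Stand-in for a provider that hits a remote cache; counts its invocations.
        Function<String, String> remote = name -> {
            providerHits[0]++;
            return "def:" + name;
        };

        RequestTypeLookups<String> request = new RequestTypeLookups<>(remote);
        request.lookup("hive_table");
        request.lookup("hive_table");   // served from the per-request map
        request.lookup("hive_column");

        System.out.println(providerHits[0]); // prints "2"
    }
}
```

Because the map lives only as long as the request object, type updates made by
other requests are picked up on the next request without any invalidation
logic in this layer.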



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
