[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Wang updated ZOOKEEPER-4261:
---------------------------------
    Description: 
I am working on a project that uses ZooKeeper as the authoritative KV store. The 
write TPS will be quite low, but the read TPS will be around 10K, from roughly 
10K clients. So my idea is simple: put a cache layer (e.g. HTTP) in front of 
ZK; the problem then becomes how to keep the cache updated.

After some investigation, I came up with 2 solutions:
 # Run an Observer inside my cache, so that the cache is updated 
incrementally and can be persisted in a standard ZK data dir.
 # [Use Curator's CuratorCache recipe|https://curator.apache.org/curator-recipes/curator-cache.html].

With approach 2, a session disconnection forces re-reading every key in the KV 
store, and I would have to implement a persistent store myself if I want to 
keep the data locally.

For approach 1, I ran into a question:

Each server in the ensemble needs a serverid, which is read from data/myid and 
is limited to the range [0, 255]. In my case I would like to run thousands of 
Observer processes, which would easily break that limit.
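For context, here is a minimal sketch of how an Observer is currently wired into an ensemble, each with its own id (hostnames and ids below are hypothetical):

```
# conf/zoo.cfg on the Observer host
peerType=observer
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
# every Observer needs its own unique server.N entry with the :observer suffix
server.4=obs1.example.com:2888:3888:observer

# contents of dataDir/myid on the Observer host
4
```

With thousands of Observers, maintaining a distinct server.N/myid pair per process is exactly the bookkeeping this issue proposes to eliminate.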

After looking into the code, I don't think a unique serverid per Observer is 
necessary (I tested an Observer with a very large id, and it works). I think we 
should drop the serverid requirement for Observers, so that applications can be 
built on Observers easily and benefit from streaming the transaction log.

 

 


> Observer do not need a serverid
> -------------------------------
>
>                 Key: ZOOKEEPER-4261
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4261
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: server
>    Affects Versions: 3.6.2
>            Reporter: Jian Wang
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
