[sheepdog] [PATCH V3] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Yunkai Zhang
From: Yunkai Zhang V3: - make send_light_req() only return -1 or 0, and update related upper code V2: - rename send_empty_req() to send_light_req(), and move it to net.c - cleanup two more places: collie/node.c collie/debug.c >
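The return convention described in V3 (the helper returns only -1 or 0, and the upper code decides the exit status) could look roughly like the sketch below. It is an illustration only: `light_req_stub()`, `run_subcommand()` and the `MY_EXIT_*` values are hypothetical names, not sheepdog's real functions or the codes in exits.h, and the stub merely simulates the network call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical exit codes for the command-line tool (not sheepdog's exits.h). */
enum { MY_EXIT_SUCCESS = 0, MY_EXIT_FAILURE = 1 };

/*
 * Hypothetical library-style helper: returns 0 on success and -1 on any
 * failure, whether the connection failed or the response reported an error.
 * connect_ok and result_ok simulate those two outcomes.
 */
static int light_req_stub(int connect_ok, int result_ok)
{
	if (!connect_ok) {
		fprintf(stderr, "failed to connect\n");
		return -1;
	}
	if (!result_ok) {
		fprintf(stderr, "request failed\n");
		return -1;
	}
	return 0;
}

/* The command callback maps the helper's 0/-1 onto the tool's exit codes. */
static int run_subcommand(int connect_ok, int result_ok)
{
	int ret = light_req_stub(connect_ok, result_ok);

	/* The helper's whole contract: nothing but 0 or -1. */
	assert(ret == 0 || ret == -1);

	return ret < 0 ? MY_EXIT_FAILURE : MY_EXIT_SUCCESS;
}
```

Keeping exit codes out of the helper means it can be reused by callers that are not command callbacks, which is the concern raised in the thread about library functions.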

Re: [sheepdog] [PATCH v4 6/8] object cache: reclaim cached objects when cache reaches the max size

2012-07-26 Thread Liu Yuan
On 07/27/2012 12:27 PM, levin li wrote: > +static inline void mark_cache_flush(struct object_cache *cache) > +{ > + cache->in_flush = 1; > +} > + > +static inline void end_cache_flush(struct object_cache *cache) > +{ > + cache->in_flush = 0; > +} > + There is no need to add these two helpe

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Yunkai Zhang
On Fri, Jul 27, 2012 at 2:10 PM, Liu Yuan wrote: > On 07/27/2012 02:02 PM, Yunkai Zhang wrote: >> In fact, my first internal version returned EXIT_XXX in send_light_req(), >> but I found that the EXIT_xxx codes defined in exits.h do not seem >> appropriate for library functions (compared to exe_req()/send

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Liu Yuan
On 07/27/2012 02:02 PM, Yunkai Zhang wrote: > In fact, my first internal version returned EXIT_XXX in send_light_req(), > but I found that the EXIT_xxx codes defined in exits.h do not seem > appropriate for library functions (compared to exe_req()/send_req() and > other system calls). Library functions usua

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Yunkai Zhang
On Fri, Jul 27, 2012 at 1:44 PM, Liu Yuan wrote: > On 07/27/2012 12:37 PM, Yunkai Zhang wrote: >> I just want to keep the new code consistent with the original version. >> > > The old code is broken -- not consistent in all places, and it didn't > obey the same rule for error codes. > >> The upper c

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Liu Yuan
On 07/27/2012 12:37 PM, Yunkai Zhang wrote: > I just want to keep the new code consistent with the original version. > The old code is broken -- not consistent in all places, and it didn't obey the same rule for error codes. > The upper code, such as cluster_shutdown, will return different EXIT >

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Yunkai Zhang
On Fri, Jul 27, 2012 at 11:35 AM, Liu Yuan wrote: > On 07/27/2012 11:22 AM, Yunkai Zhang wrote: >> The top code, such as cluster_shutdown/cluster_recover/..., needs to >> use -1 and 1 to distinguish two error types: >> 1) failed to connect or failed to call exe_req() >> 2) succeeded in calling exe_req(),

[sheepdog] [PATCH v4 8/8] object cache: add a object_list for cache entry for cache deleting

2012-07-26 Thread levin li
From: levin li When deleting an entire VDI cache, an object list is easier to traverse than an rb-tree Signed-off-by: levin li --- sheep/object_cache.c | 55 + 1 files changed, 32 insertions(+), 23 deletions(-) diff --git a/sheep/object

[sheepdog] [PATCH v4 7/8] object cache: refactor object_cache_remove()

2012-07-26 Thread levin li
From: levin li Since all the cache entries are not stored in memory, we cannot only remove an entry from the dirty tree/list; we should also remove it from the object tree/list. Signed-off-by: levin li --- sheep/object_cache.c | 29 ++--- 1 files changed, 6 insertions(+), 23

[sheepdog] [PATCH v4 6/8] object cache: reclaim cached objects when cache reaches the max size

2012-07-26 Thread levin li
From: levin li This patch does reclaiming work when the total size of cached objects reaches the max size specified by the user. I did it in the following way: 1. check the object tree for the object entry to determine whether the cache entry exists and whether it's being reclaimed; if it's being reclaimed

[sheepdog] [PATCH v4 5/8] object cache: schedule the object cache in a lru list

2012-07-26 Thread levin li
From: levin li We put all the cached objects into a global LRU list. When the object cache is referenced (read/write), we move the object to the head of the LRU list; then, when the cache reaches the max size, we can reclaim objects from the end of the LRU list. Signed-off-by: levin li --- sheep/object_cac
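The LRU scheme described in this patch (move to the head on access, reclaim from the tail) can be sketched as below. This is a minimal single-threaded illustration, not the code from sheep/object_cache.c: `struct lru_entry`, `struct lru_list` and the `lru_*` functions are hypothetical names, and the locking a real multi-threaded cache would need is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache entry linked into a doubly linked LRU list. */
struct lru_entry {
	unsigned long oid;              /* object id, illustrative only */
	struct lru_entry *prev, *next;
};

/* Circular list with a sentinel: head.next is the most recently used
 * entry, head.prev the least recently used one. */
struct lru_list {
	struct lru_entry head;
};

static void lru_init(struct lru_list *l)
{
	l->head.prev = l->head.next = &l->head;
}

static void lru_unlink(struct lru_entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->prev = e->next = NULL;
}

/* Insert an entry at the head (most recently used position). */
static void lru_add(struct lru_list *l, struct lru_entry *e)
{
	e->next = l->head.next;
	e->prev = &l->head;
	l->head.next->prev = e;
	l->head.next = e;
}

/* On every read/write, move the entry back to the head. */
static void lru_touch(struct lru_list *l, struct lru_entry *e)
{
	assert(e->prev && e->next);     /* entry must already be on the list */
	lru_unlink(e);
	lru_add(l, e);
}

/* Reclaim from the tail: returns the least recently used entry,
 * or NULL if the list is empty. */
static struct lru_entry *lru_reclaim(struct lru_list *l)
{
	struct lru_entry *victim = l->head.prev;

	if (victim == &l->head)
		return NULL;
	lru_unlink(victim);
	return victim;
}
```

A caller would invoke `lru_reclaim()` repeatedly while the total cached size is above the configured maximum, skipping or re-adding entries that cannot be reclaimed yet.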

[sheepdog] [PATCH v4 4/8] object cache: use rwlock to replace mutex lock for per-vdi cache

2012-07-26 Thread levin li
From: levin li Signed-off-by: levin li --- sheep/object_cache.c | 36 ++-- 1 files changed, 18 insertions(+), 18 deletions(-) diff --git a/sheep/object_cache.c b/sheep/object_cache.c index d59b966..bb8aa96 100644 --- a/sheep/object_cache.c +++ b/sheep/object_

[sheepdog] [PATCH v4 3/8] object cache: merge active and inactive dirt_tree/list

2012-07-26 Thread levin li
From: levin li Since we will share the same entry in both object_tree and dirty_tree, it would make things more complicated to use two dirty trees/lists, so merge them together and use a lock when flushing. Signed-off-by: levin li --- sheep/object_cache.c | 108 +++

[sheepdog] [PATCH v4 2/8] object cache: add object cache tree for every VDI

2012-07-26 Thread levin li
From: levin li Add an object cache tree for every VDI to keep track of all the objects cached by the VDI, for the reclaiming work. When sheep starts, we should also read the cached objects on disk that were created by the previous run; otherwise, these cached objects may cause a disk leak. Signe

[sheepdog] [PATCH v4 1/8] sheep: use cmd argument -w to specify a max cache size

2012-07-26 Thread levin li
From: levin li Signed-off-by: levin li --- sheep/sheep.c | 17 ++--- sheep/sheep_priv.h |2 ++ 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/sheep/sheep.c b/sheep/sheep.c index 380a129..9f371ba 100644 --- a/sheep/sheep.c +++ b/sheep/sheep.c @@ -53,7 +53

[sheepdog] [PATCH v4 0/8] object cache reclaim

2012-07-26 Thread levin li
From: levin li v3 > v4: 1. remove SD_RES_CACHE_REFERENCING, just return -1 if object can not be reclaimed in reclaim_object() 2. put the in_flush flags setting in object_cache_push() 3. when cache is in flush, do not stop reclaiming, but just skip the dirty entries. v2 > v3: 1. use

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Liu Yuan
On 07/27/2012 11:22 AM, Yunkai Zhang wrote: > The top code, such as cluster_shutdown/cluster_recover/..., needs to > use -1 and 1 to distinguish two error types: > 1) failed to connect or failed to call exe_req() > 2) succeeded in calling exe_req(), but the response's result isn't SD_RES_SUCCESS. These t

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Yunkai Zhang
On Fri, Jul 27, 2012 at 11:11 AM, Liu Yuan wrote: > With a second review, I think this patch need more reviews. > > On 07/26/2012 09:46 PM, Yunkai Zhang wrote: >> +int send_light_req(struct sd_req *hdr, const char *host, int port) >> +{ >> + int fd, ret; >> + struct sd_rsp *rsp = (struct s

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Liu Yuan
With a second review, I think this patch need more reviews. On 07/26/2012 09:46 PM, Yunkai Zhang wrote: > +int send_light_req(struct sd_req *hdr, const char *host, int port) > +{ > + int fd, ret; > + struct sd_rsp *rsp = (struct sd_rsp *)hdr; > + unsigned rlen, wlen; > + > + fd = c

Re: [sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Liu Yuan
On 07/26/2012 09:46 PM, Yunkai Zhang wrote: > V2: > - rename send_empty_req() to send_light_req(), and move it to net.c > - cleanup two more places: collie/node.c collie/debug.c > >8 > > There are several callbacks of collie comm

Re: [sheepdog] [sheepdog-users] Several difficulties with sheepdog (from 0.4.0-0+tek2b-10 deb package)

2012-07-26 Thread David Douard
On 26/07/2012 19:59, Bastian Scholz wrote: > On 2012-07-26 18:53, Jens WEBER wrote: >> In case of a crash, like your network error, you have a problem if >> one node doesn't have a full copy. So 3 nodes must have 3 copies. Or >> use redundant network links, so that situation can't happen. For me some

[sheepdog] [PATCH] collie: cleanup callbacks of collie command when they send header only requests

2012-07-26 Thread Yunkai Zhang
From: Yunkai Zhang V2: - rename send_empty_req() to send_light_req(), and move it to net.c - cleanup two more places: collie/node.c collie/debug.c >8 There are several callbacks of collie command send requests which only contai

Re: [sheepdog] [PATCH] collie: cleanup cluster subcommand callbacks when they send empty request

2012-07-26 Thread Yunkai Zhang
On Thu, Jul 26, 2012 at 5:52 PM, Liu Yuan wrote: > On 07/26/2012 05:31 PM, Yunkai Zhang wrote: >> Cluster subcommand callbacks, such as: cluster_shutdown/cluster_cleanup/..., >> share almost the same code: all of them send an empty request which only >> contains a request header without body content. >

Re: [sheepdog] [PATCH] collie: cleanup cluster subcommand callbacks when they send empty request

2012-07-26 Thread Liu Yuan
On 07/26/2012 05:31 PM, Yunkai Zhang wrote: > Cluster subcommand callbacks, such as: cluster_shutdown/cluster_cleanup/..., > share almost the same code: all of them send an empty request which only > contains a request header without body content. > > So let's abstract the common part of them into

Re: [sheepdog] [Sheepdog] [PATCH] logger: initialize log_level

2012-07-26 Thread Liu Yuan
On 07/26/2012 05:42 PM, Liu Yuan wrote: > On 08/09/2011 09:52 PM, MORITA Kazutaka wrote: >> +static int log_level = LOG_INFO; > > If logger isn't inited, so what do you do for log_level = LOG_INFO ? > > can we see the lost message with this patch? I am suspicious of it > Oops, my email client s

Re: [sheepdog] [Sheepdog] [PATCH] logger: initialize log_level

2012-07-26 Thread Liu Yuan
On 08/09/2011 09:52 PM, MORITA Kazutaka wrote: > +static int log_level = LOG_INFO; If logger isn't inited, so what do you do for log_level = LOG_INFO ? can we see the lost message with this patch? I am suspicious of it Thanks, Yuan -- sheepdog mailing list sheepdog@lists.wpkg.org http://lists.w

[sheepdog] [PATCH] collie: cleanup cluster subcommand callbacks when they send empty request

2012-07-26 Thread Yunkai Zhang
From: Yunkai Zhang Cluster subcommand callbacks, such as: cluster_shutdown/cluster_cleanup/..., share almost the same code: all of them send an empty request which only contains a request header without body content. So let's abstract the common part of them into a new function: send_empty_req()

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> On 07/26/2012 04:06 PM, Dietmar Maurer wrote: > > But recovery and cleanup actions can take several hours, so it is > > quite hard to find a window on such system? > > We are always optimizing the recovery performance. For now, 30 nodes with > dozens of TB data, the recovery process is less than

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 04:06 PM, Dietmar Maurer wrote: > But recovery and cleanup actions can take several hours, so it is quite hard > to find a window > on such system? We are always optimizing the recovery performance. For now, 30 nodes with dozens of TB data, the recovery process is less than 30 mins.

Re: [sheepdog] [PATCH v3 6/8] object cache: reclaim cached objects when cache reaches the max size

2012-07-26 Thread levin li
On 2012-07-26 15:55, Liu Yuan wrote: > On 07/26/2012 03:17 PM, levin li wrote: >> +static inline int cache_is_flushing(struct object_cache *cache) >> +{ >> +return cache->in_flush; >> +} >> + > > better renamed as cache_in_flush() > >> +static inline int entry_is_reclaiming(struct object_cac

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> It is only safe to run the command if none of the nodes is recovering, you > said you have node failure event all the time, so there isn't any window for > you to run the command. Your target is systems with 1000 nodes. Let's say the average lifetime of a node is about 3 years. So you get about one n
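The estimate in this message can be checked with a quick calculation, assuming (as the message appears to) that failures are spread evenly over time and that every node fails once per lifetime:

```latex
\[
\frac{1000\ \text{nodes}}{3\ \text{years} \times 365\ \text{days/year}}
  = \frac{1000}{1095}\ \text{failures/day}
  \approx 0.91\ \text{failures/day}
\]
```

Under those assumptions a 1000-node cluster sees a node event roughly once a day, which is the basis for the concern about finding a recovery-free window.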

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:59 PM, Dietmar Maurer wrote: > How do I know in advance that no node failure will happen during 'cluster > cleanup'? If I remember correctly, it is not a problem if a node failure happens during cluster cleanup. Levin introduced this command. Levin, can you explain whether there is any ra

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> On 07/26/2012 03:51 PM, Dietmar Maurer wrote: > > I simply do not understand what you say, sorry. > > > > You first say that I have to run 'collie cluster cleanup' (else my storage > > runs > full). > > Now you say I should not use it? > > It is only safe to run the command if none of the nodes

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:51 PM, Dietmar Maurer wrote: > I simply do not understand what you say, sorry. > > You first say that I have to run 'collie cluster cleanup' (else my storage > runs full). > Now you say I should not use it? It is only safe to run the command if none of the nodes is recovering, y

Re: [sheepdog] [PATCH v3 6/8] object cache: reclaim cached objects when cache reaches the max size

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:17 PM, levin li wrote: > +static inline int cache_is_flushing(struct object_cache *cache) > +{ > + return cache->in_flush; > +} > + better renamed as cache_in_flush() > +static inline int entry_is_reclaiming(struct object_cache_entry *entry) > +{ > + int flags = uatomic_r

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> On 07/26/2012 03:35 PM, Dietmar Maurer wrote: > >> On 07/26/2012 03:26 PM, Dietmar Maurer wrote: > >>> But there is always a chance that nodes connect, so you can never do > >>> that > >> safely. > >> > >> Having run several clusters for several months, I don't see node > >> events (failing or adding a new

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:35 PM, Dietmar Maurer wrote: >> On 07/26/2012 03:26 PM, Dietmar Maurer wrote: >>> But there is always a chance that nodes connect, so you can never do that >> safely. >> >> Having run several clusters for several months, I don't see node events (failing >> or >> adding a new node) happen

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:34 PM, Dietmar Maurer wrote: > Or the other way around: Why don't you run automatically when it is easy to > detect > when it is safe to run cluster cleanup? I think I have said it already, if you think it is *easy*, please show us the code. Thanks, Yuan

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:25 PM, Dietmar Maurer wrote: > I you pointed out a way to solve a problem. > It is an idea that hasn't proven itself to be an efficient approach to *solve* the problem. We can only solve the problem with the code. > But it seems that you do not think it is a problem at all. I don't

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> On 07/26/2012 03:26 PM, Dietmar Maurer wrote: > > But there is always a chance that nodes connect, so you can never do that > safely. > > Having run several clusters for several months, I don't see node events (failing or > adding a new node) happening 100% of the time. What? The claim is that nodes d

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> > On 07/26/2012 03:15 PM, Dietmar Maurer wrote: > > > Which leads me to another interesting question. How does the 'collie > > > cluster cleanup' decide when it is safe to purge an object? > > > > When recovery on all the nodes is done (none of the nodes is doing > > recovery), then it is safe to

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:26 PM, Dietmar Maurer wrote: > But there is always a chance that nodes connect, so you can never do that > safely. Having run several clusters for several months, I don't see node events (failing or adding a new node) happening 100% of the time. Thanks, Yuan

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> On 07/26/2012 03:15 PM, Dietmar Maurer wrote: > > Which leads me to another interesting question. How does the 'collie > > cluster cleanup' decide when it is safe to purge an object? > > When recovery on all the nodes is done (none of the nodes is doing recovery), > then it is safe to type colli

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> On 07/26/2012 03:09 PM, Dietmar Maurer wrote: > > So the previous suggestion would be a way to solve that problem cleanly. > > Before you show the cleaner code, I don't think you can claim it 'clean'. > Instead, I guess the idea you suggest will need a complicated > implementation when robu

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:15 PM, Dietmar Maurer wrote: > Which leads me to another interesting question. How does the 'collie cluster > cleanup' decide > when it is safe to purge an object? When recovery on all the nodes is done (none of the nodes is doing recovery), then it is safe to type collie cluster

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:09 PM, Dietmar Maurer wrote: > So the previous suggestion would be a way to solve that problem cleanly. Before you show the cleaner code, I don't think you can claim it 'clean'. Instead, I guess the idea you suggest will need a complicated implementation when robust to handl

[sheepdog] [PATCH v3 7/8] object cache: refactor object_cache_remove()

2012-07-26 Thread levin li
From: levin li Since all the cache entries are not stored in memory, we cannot only remove an entry from the dirty tree/list; we should also remove it from the object tree/list. Signed-off-by: levin li --- sheep/object_cache.c | 29 ++--- 1 files changed, 6 insertions(+), 23

[sheepdog] [PATCH v3 8/8] object cache: add a object_list for cache entry for cache deleting

2012-07-26 Thread levin li
From: levin li When deleting an entire VDI cache, an object list is easier to traverse than an rb-tree Signed-off-by: levin li --- sheep/object_cache.c | 51 - 1 files changed, 29 insertions(+), 22 deletions(-) diff --git a/sheep/object

[sheepdog] [PATCH v3 6/8] object cache: reclaim cached objects when cache reaches the max size

2012-07-26 Thread levin li
From: levin li This patch does reclaiming work when the total size of cached objects reaches the max size specified by the user. I did it in the following way: 1. check the object tree for the object entry to determine whether the cache entry exists and whether it's being reclaimed; if it's being reclaimed

[sheepdog] [PATCH v3 4/8] object cache: use rwlock to replace mutex lock for per-vdi cache

2012-07-26 Thread levin li
From: levin li Signed-off-by: levin li --- sheep/object_cache.c | 36 ++-- 1 files changed, 18 insertions(+), 18 deletions(-) diff --git a/sheep/object_cache.c b/sheep/object_cache.c index d59b966..bb8aa96 100644 --- a/sheep/object_cache.c +++ b/sheep/object_

[sheepdog] [PATCH v3 5/8] object cache: schedule the object cache in a lru list

2012-07-26 Thread levin li
From: levin li We put all the cached objects into a global LRU list. When the object cache is referenced (read/write), we move the object to the head of the LRU list; then, when the cache reaches the max size, we can reclaim objects from the end of the LRU list. Signed-off-by: levin li --- sheep/object_cac

[sheepdog] [PATCH v3 3/8] object cache: merge active and inactive dirt_tree/list

2012-07-26 Thread levin li
From: levin li Since we will share the same entry in both object_tree and dirty_tree, it would make things more complicated to use two dirty trees/lists, so merge them together and use a lock when flushing. Signed-off-by: levin li --- sheep/object_cache.c | 108 +++

[sheepdog] [PATCH v3 2/8] object cache: add object cache tree for every VDI

2012-07-26 Thread levin li
From: levin li Add an object cache tree for every VDI to keep track of all the objects cached by the VDI, for the reclaiming work. When sheep starts, we should also read the cached objects on disk that were created by the previous run; otherwise, these cached objects may cause a disk leak. Signe

[sheepdog] [PATCH v3 1/8] sheep: use cmd argument -w to specify a max cache size

2012-07-26 Thread levin li
From: levin li Signed-off-by: levin li --- sheep/sheep.c | 17 ++--- sheep/sheep_priv.h |2 ++ 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/sheep/sheep.c b/sheep/sheep.c index 380a129..9f371ba 100644 --- a/sheep/sheep.c +++ b/sheep/sheep.c @@ -53,7 +53

[sheepdog] [PATCH v3 0/8] object cache reclaim

2012-07-26 Thread levin li
From: levin li v2 > v3: 1. use uatomic_cmpxchg() in cache_in_reclaim() to check and set the reclaiming flags atomically. 2. remove -W cmd arg, use -w instead to specify the max cache size 3. rename object_cache_access_begin{end} to get{put}_cache_entry() 4. move read{write}_cache_object() an
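Item 1 of the changelog, checking and setting a flag in one atomic step, can be illustrated with a compare-and-swap loop. The sketch below uses C11 `<stdatomic.h>` in place of the `uatomic_cmpxchg()` call from the Userspace RCU library that the patch mentions. `struct cache_entry_sketch`, `ENTRY_RECLAIM_BIT` and `try_mark_reclaiming()` are hypothetical names, not the patch's actual code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define ENTRY_RECLAIM_BIT 0x1u  /* hypothetical flag bit */

/* Hypothetical stand-in for a cache entry carrying a flags word. */
struct cache_entry_sketch {
	atomic_uint flags;
};

/*
 * Try to mark the entry as being reclaimed. Returns true if this caller
 * set the bit, false if another caller had already set it. The check and
 * the update happen in one compare-and-swap, so two threads cannot both
 * believe they own the reclaim.
 */
static bool try_mark_reclaiming(struct cache_entry_sketch *e)
{
	unsigned int old;

	assert(e != NULL);
	old = atomic_load(&e->flags);

	do {
		if (old & ENTRY_RECLAIM_BIT)
			return false;   /* someone else is reclaiming */
		/* On failure the CAS reloads 'old', and the loop re-checks the bit. */
	} while (!atomic_compare_exchange_weak(&e->flags, &old,
					       old | ENTRY_RECLAIM_BIT));
	return true;
}
```

A separate load followed by a store would leave a window in which two threads both see the bit clear; the compare-and-swap closes that window.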

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> We have a command collie cluster cleanup to manually remove these unused > objects. Also, restarting sheep will trigger an internal cleanup operation of > that sheep too. Which leads me to another interesting question. How does the 'collie cluster cleanup' decide when it is safe to purge an ob

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> Due to the fact that we cannot easily reach agreement (one of the most > difficult aspects of distributed systems), we don't actually remove the object > that is already migrated. > > We have a command collie cluster cleanup to manually remove these unused > objects. Also, restarting sheep will t

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Liu Yuan
On 07/26/2012 03:01 PM, Dietmar Maurer wrote: >>> How do you detect the point in time when it is safe to remove an >>> object (because all new members have all data they need)? >> >> Sorry, I'm afraid I don't understand the question. Which objects do we need >> to remove? Can you give an example?

Re: [sheepdog] read/write during recovery

2012-07-26 Thread Dietmar Maurer
> > How do you detect the point in time when it is safe to remove an > > object (because all new members have all data they need)? > > Sorry, I'm afraid I don't understand the question. Which objects do we need > to remove? Can you give an example? When we re-balance. For example 1.) obj is sto