[sheepdog] [PATCH] func/test: change functional test output for __vdi_list

2014-12-16 Thread Teruaki Ishizaki
Change the expected output of functional tests that use __vdi_list,
in association with adding block_size_shift information
to the vdi list command.

Signed-off-by: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
---
 tests/functional/016.out |2 +-
 tests/functional/029.out |   18 +++---
 tests/functional/030.out |  158 +++---
 tests/functional/031.out |   20 +++---
 tests/functional/039.out |   42 ++--
 tests/functional/040.out |8 +-
 tests/functional/041.out |   70 ++--
 tests/functional/043.out |   24 
 tests/functional/044.out |2 +-
 tests/functional/046.out |   24 
 tests/functional/047.out |4 +-
 tests/functional/048.out |6 +-
 tests/functional/052.out |   64 +-
 tests/functional/059.out |   24 
 tests/functional/060.out |   80 
 tests/functional/062.out |2 +-
 tests/functional/068.out |   12 ++--
 tests/functional/072.out |   12 ++--
 tests/functional/076.out |8 +-
 tests/functional/077.out |4 +-
 tests/functional/078.out |   18 +++---
 tests/functional/079.out |   16 +++---
 tests/functional/080.out |8 +-
 tests/functional/083.out |4 +-
 tests/functional/088.out |8 +-
 tests/functional/091.out |8 +-
 tests/functional/092.out |   20 +++---
 tests/functional/096.out |   66 ++--
 28 files changed, 366 insertions(+), 366 deletions(-)

diff --git a/tests/functional/016.out b/tests/functional/016.out
index ab648d4..50e23d7 100644
--- a/tests/functional/016.out
+++ b/tests/functional/016.out
@@ -2,7 +2,7 @@ QA output created by 016
 using backend plain store
 Failed to create snapshot for base, maybe snapshot id (0) or tag (tag) is 
existed
 there should be no vdi
-  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
+  NameIdSizeUsed  SharedCreation time   VDI id  Copies  
Tag   Block Size Shift
 there should be no object
 STORE  DATAVDI VMSTATE ATTRLEDGER  STALE
 0  0   3   0   0   0   0
diff --git a/tests/functional/029.out b/tests/functional/029.out
index 23117f7..7c50653 100644
--- a/tests/functional/029.out
+++ b/tests/functional/029.out
@@ -6,15 +6,15 @@ To create replicated vdi, set -c x
 To create erasure coded vdi, set -c x:y
   x(2,4,8,16)  - number of data strips
   y(1 to 15)   - number of parity strips
-  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
-  test50   20 MB   20 MB  0.0 MB DATE   fd2c304:2  
-  test40   20 MB   20 MB  0.0 MB DATE   fd2de3  4  
-  test70   20 MB   20 MB  0.0 MB DATE   fd2f964:4  
-  test60   20 MB   20 MB  0.0 MB DATE   fd31494:3  
-  test30   20 MB   20 MB  0.0 MB DATE   fd3662  3  
-  test20   20 MB  0.0 MB   20 MB DATE   fd3816  2  
-  test90   20 MB   20 MB  0.0 MB DATE   fd4094   16:7  
-  test80   20 MB   20 MB  0.0 MB DATE   fd42474:5  
+  NameIdSizeUsed  SharedCreation time   VDI id  Copies  
Tag   Block Size Shift
+  test50   20 MB   20 MB  0.0 MB DATE   fd2c304:222
+  test40   20 MB   20 MB  0.0 MB DATE   fd2de3  422
+  test70   20 MB   20 MB  0.0 MB DATE   fd2f964:422
+  test60   20 MB   20 MB  0.0 MB DATE   fd31494:322
+  test30   20 MB   20 MB  0.0 MB DATE   fd3662  322
+  test20   20 MB  0.0 MB   20 MB DATE   fd3816  222
+  test90   20 MB   20 MB  0.0 MB DATE   fd4094   16:722
+  test80   20 MB   20 MB  0.0 MB DATE   fd42474:522
 Looking for the object 0xfd38150001 (vid 0xfd3816 idx 1, 2 copies) with 23 
nodes
 
 127.0.0.1:7000 doesn't have the object
diff --git a/tests/functional/030.out b/tests/functional/030.out
index 5b386ab..00f50a9 100644
--- a/tests/functional/030.out
+++ b/tests/functional/030.out
@@ -5,36 +5,36 @@ Index Tag Snapshot Time
 Index  Tag Snapshot Time
 1  s1  DATE
 2  s2  DATE
-  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
-s test11   10 MB   12 MB  0.0 MB DATE   fd32fc  6  
-s test12   10 MB   12 MB  0.0 MB DATE   fd32fd  6  
-  test10   10 MB  0.0 MB   12 MB DATE   fd32fe  6  
-s test21   10 MB   12 MB  0.0 MB DATE   fd3815  3  
-s test22   10 MB   12 MB  0.0 MB DATE   fd3816  3  
-  test20   10 MB  0.0 MB   12 MB DATE   fd3817  3  
+  NameIdSizeUsed  SharedCreation time   VDI id  Copies  
Tag   Block Size Shift
+s test11   10 MB   12 

[sheepdog] [PATCH] sheep: fix bug for not saving block_size_shift to cluster config

2014-12-16 Thread Teruaki Ishizaki
This patch fixes a bug where the block_size_shift info was forgotten
after cluster shutdown and restart of sheepdog.

Add block_size_shift info to cluster config file.
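
A minimal sketch of the symptom this fixes, assuming the cluster is formatted
with a non-default block size shift (the -z option name for cluster format and
the value 22 are assumptions taken from the related patch series and the test
output, not from this patch itself):

dog cluster format -c 3 -z 22
dog cluster shutdown
# restart all sheep daemons ...
dog cluster info
# before this fix, the block_size_shift chosen at format time was not
# restored from the on-disk config after the restart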

Signed-off-by: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
---
 sheep/config.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/sheep/config.c b/sheep/config.c
index 383a1ed..dfad5fd 100644
--- a/sheep/config.c
+++ b/sheep/config.c
@@ -11,7 +11,7 @@
 
 #include "sheep_priv.h"
 
-#define SD_FORMAT_VERSION 0x0005
+#define SD_FORMAT_VERSION 0x0006
 #define SD_CONFIG_SIZE 40
 
 static struct sheepdog_config {
@@ -21,7 +21,7 @@ static struct sheepdog_config {
uint8_t store[STORE_LEN];
uint8_t shutdown;
uint8_t copy_policy;
-   uint8_t __pad;
+   uint8_t block_size_shift;
uint16_t version;
uint64_t space;
 } config;
@@ -64,6 +64,7 @@ static int get_cluster_config(struct cluster_info *cinfo)
	cinfo->nr_copies = config.copies;
	cinfo->flags = config.flags;
	cinfo->copy_policy = config.copy_policy;
+	cinfo->block_size_shift = config.block_size_shift;
	memcpy(cinfo->store, config.store, sizeof(config.store));
 
return SD_RES_SUCCESS;
@@ -155,6 +156,7 @@ int set_cluster_config(const struct cluster_info *cinfo)
config.copies = cinfo-nr_copies;
config.copy_policy = cinfo-copy_policy;
config.flags = cinfo-flags;
+   config.block_size_shift = cinfo-block_size_shift;
memset(config.store, 0, sizeof(config.store));
pstrcpy((char *)config.store, sizeof(config.store),
(char *)cinfo-store);
-- 
1.7.1

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [sheepdog-users] sheepdog 0.9 vs live migration

2014-12-16 Thread Bastian Scholz

Hi Hitoshi,

sorry, second try to send to the list...

Am 2014-12-16 10:51, schrieb Hitoshi Mitake:

if I remove the VDI lock the live migration works correctly:
$ dog vdi lock unlock test-vm-disk

but after the live migration I can't relock the VDI.



Thanks for your report. As you say, live migration and vdi locking
seem to conflict. I'll work on it later. But I'm not familiar
with qemu's live migration feature, so it will take some time. Could you
add an issue to our launchpad tracker as a reminder?


You had two qemu instances temporarily running which access
the same vdi on different hosts.

A similar problem exists with drbd (a kind of RAID 1 over the network
on two nodes), which in its default configuration lets only one node
(the Primary node) access the drbd device. But for live
migration both nodes (or better, both qemu instances) need
access to the drbd device.

For this, drbd has a dual primary mode which can be enabled
temporarily through the drbd admin utility with a command line switch.
In my environment I let libvirt handle this task by writing
a simple hook script for libvirt, which enables the dual primary
mode when a live migration starts and disables it when it finishes.

For sheepdog it would be nice (at least for me) if it were
possible to unlock the vdi, migrate the guest to a new node and
lock the vdi again, as sketched below. (I don't know if this is possible
to implement in sheepdog; allow a second lock on an already locked vdi and
clear the old lock automatically after the old qemu is destroyed?)
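
A rough sketch of that idea, in the style of the drbd/libvirt hook mentioned
above; the vdi and domain names are made up, and the final relock step is
exactly what does not work yet according to this thread:

VDI=test-vm-disk
DOMAIN=test-vm

# before migration: drop the lock so the destination qemu can open the vdi
dog vdi lock unlock $VDI

# live-migrate the guest (libvirt side)
virsh migrate --live $DOMAIN qemu+ssh://dest-host/system

# after migration one would want to re-acquire the lock here;
# as reported above, relocking currently fails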

Cheers

Bastian

--
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH RFT 0/4] garbage collect needless VIDs and inode objects

2014-12-16 Thread Valerio Pachera
2014-12-15 10:36 GMT+01:00 Hitoshi Mitake mitake.hito...@lab.ntt.co.jp:
 Current sheepdog never recycles VIDs. But it will cause problems
 e.g. VID space exhaustion, too much garbage inode objects.

I've been testing this branch and it seems to work.
I use a script that creates 3 vdis and 3 snapshots for each (writing 10M of
data), then removes them and looks for objects whose names start with
80*.
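
A minimal sketch of that kind of script (vdi names, sizes and mount points
are illustrative, not the exact script used):

for v in test1 test2 test3; do
    dog vdi create -P $v 1G
    for i in 1 2 3; do
        dd if=/dev/urandom bs=1M count=10 | dog vdi write $v
        dog vdi snapshot $v
    done
done
# delete all snapshots and vdis, then look for leftover inode objects
find /mnt/sheep -name '80*'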

With all snap active
/mnt/sheep/1/80fd3663
/mnt/sheep/0/80fd3818
/mnt/sheep/0/80fd32fc
/mnt/sheep/0/80fd32fd
/mnt/sheep/0/80fd32fe

After removing all snap
/mnt/sheep/1/80fd3663
/mnt/sheep/0/80fd3818
/mnt/sheep/0/80fd32fc
/mnt/sheep/0/80fd32fd
/mnt/sheep/0/80fd32fe

After removing all vdi
empty

sheep -v
Sheepdog daemon version 0.9.0_25_g24ef77f

But I found a repeatable sheepdog crash!
I noticed it happening when I ran the script a second time.
The crash occurs after I recreate a vdi with the same name and
then take a snapshot of it.

Dec 16 12:12:42   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40067, op=DEL_VDI, result=00
Dec 16 12:12:47   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40069, op=DEL_VDI, data=(not string)
Dec 16 12:12:47   INFO [main] run_vid_gc(2106) all members of the
family (root: fd3662) are deleted
Dec 16 12:12:47   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40069, op=DEL_VDI, result=00
Dec 16 12:13:57   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40072, op=NEW_VDI, data=(not string)
Dec 16 12:13:57   INFO [main] post_cluster_new_vdi(133)
req-vdi.base_vdi_id: 0, rsp-vdi.vdi_id: fd32fc
Dec 16 12:13:57   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40072, op=NEW_VDI, result=00
Dec 16 12:14:12   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40074, op=NEW_VDI, data=(not string)
Dec 16 12:14:13   INFO [main] post_cluster_new_vdi(133)
req-vdi.base_vdi_id: 0, rsp-vdi.vdi_id: fd3815
Dec 16 12:14:13   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40074, op=NEW_VDI, result=00
Dec 16 12:14:23   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40076, op=NEW_VDI, data=(not string)
Dec 16 12:14:23   INFO [main] post_cluster_new_vdi(133)
req-vdi.base_vdi_id: 0, rsp-vdi.vdi_id: fd3662
Dec 16 12:14:23   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
client=127.0.0.1:40076, op=NEW_VDI, result=00
Dec 16 12:14:34   INFO [main] rx_main(830) req=0x7f314400d310, fd=26,
client=127.0.0.1:40078, op=NEW_VDI, data=(not string)
Dec 16 12:14:34  EMERG [main] crash_handler(268) sheep exits
unexpectedly (Segmentation fault).
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) sheep.c:270: crash_handler
Dec 16 12:14:34  EMERG [main] sd_backtrace(847)
/lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7f31515cc02f]
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) vdi.c:64:
lookup_vdi_family_member
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) vdi.c:109: update_vdi_family
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) vdi.c:396: add_vdi_state
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) ops.c:674:
cluster_notify_vdi_add
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) group.c:948: sd_notify_handler
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) zookeeper.c:1252:
zk_event_handler
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) event.c:210: do_event_loop
Dec 16 12:14:34  EMERG [main] sd_backtrace(833) sheep.c:963: main
Dec 16 12:14:34  EMERG [main] sd_backtrace(847)
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfc)
[0x7f3150badeac]
Dec 16 12:14:34  EMERG [main] sd_backtrace(847) sheep() [0x405fa8]

How to reproduce:

dog cluster format -c 2
dog vdi create -P  test 1G
dog vdi snapshot test
dd if=/dev/urandom bs=1M count=10 | dog vdi write test
dog vdi delete -s 1 test
dog vdi delete test
echo 'Recreating vdi test'
dog vdi create -P  test 1G
dog vdi snapshot test   -- at this point, sheep crashes
dog vdi list
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


[sheepdog] Fwd: [PATCH 2/2] sheep: forbid revival of orphan objects

2014-12-16 Thread Valerio Pachera
2014-12-11 8:00 GMT+01:00 Hitoshi Mitake mitake.hito...@lab.ntt.co.jp:
 Current recovery process can cause revival of orphan objects. This
 patch solves this problem.

sheep -v
Sheepdog daemon version 0.9.0_18_g7215788

It works fine!

Dec 16 15:00:20   INFO [main] main(966) shutdown
Dec 16 15:00:20   INFO [main] zk_leave(989) leaving from cluster
Dec 16 15:00:37   INFO [main] md_add_disk(343) /mnt/sheep/0, vdisk nr
206, total disk 1
Dec 16 15:00:37   INFO [main] md_add_disk(343) /mnt/sheep/1, vdisk nr
279, total disk 2
Dec 16 15:00:37 NOTICE [main] get_local_addr(522) found IPv4 address
Dec 16 15:00:37   INFO [main] send_join_request(1016) IPv4
ip:192.168.10.7 port:7000 going to join the cluster
Dec 16 15:00:37 NOTICE [main] nfs_init(611) nfs server service is not compiled
Dec 16 15:00:37   INFO [main] main(958) sheepdog daemon (version
0.9.0_18_g7215788) started
Dec 16 15:00:38   INFO [rw 30221] prepare_object_list(1100) skipping
object list reading from IPv4 ip:192.168.10.7 port:7000 becauseit is
marked as excluded node
Dec 16 15:00:38   INFO [main] recover_object_main(930) object recovery
progress   2%
Dec 16 15:00:38   INFO [main] recover_object_main(930) object recovery
progress   3%

There's only this corner case to fix:
all vdis are removed and then the disconnected node rejoins the cluster

Dec 16 14:55:09   INFO [main] zk_leave(989) leaving from cluster
Dec 16 14:55:40   INFO [main] md_add_disk(343) /mnt/sheep/0, vdisk nr
206, total disk 1
Dec 16 14:55:40   INFO [main] md_add_disk(343) /mnt/sheep/1, vdisk nr
279, total disk 2
Dec 16 14:55:40 NOTICE [main] get_local_addr(522) found IPv4 address
Dec 16 14:55:40   INFO [main] send_join_request(1016) IPv4
ip:192.168.10.7 port:7000 going to join the cluster
Dec 16 14:55:40 NOTICE [main] nfs_init(611) nfs server service is not compiled
Dec 16 14:55:40   INFO [main] main(958) sheepdog daemon (version
0.9.0_18_g7215788) started
Dec 16 14:55:41   INFO [rw 30049] prepare_object_list(1100) skipping
object list reading from IPv4 ip:192.168.10.5 port:7000 becauseit is
marked as excluded node
Dec 16 14:55:41  ERROR [rw 30049] sheep_exec_req(1170) failed No
object found, remote address: 192.168.10.5:7000, op name: GET_HASH
Dec 16 14:55:41  ERROR [rw 30068] sheep_exec_req(1170) failed No
object found, remote address: 192.168.10.5:7000, op name: GET_HASH
Dec 16 14:55:41  ERROR [rw 30068] sheep_exec_req(1170) failed No
object found, remote address: 192.168.10.6:7000, op name: GET_HASH
Dec 16 14:55:41  ERROR [rw 30072] sheep_exec_req(1170) failed No
object found, remote address: 192.168.10.5:7000, op name: GET_HASH
Dec 16 14:55:41   INFO [main] recover_object_main(930) object recovery
progress   1%
cut
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 2/2] sheep: forbid revival of orphan objects

2014-12-16 Thread Valerio Pachera
2014-12-16 15:07 GMT+01:00 Valerio Pachera siri...@gmail.com:
 It works fine!
...
 There's only this corner case to fix:
 all vdi are removed then and the disconnected node joins back the cluster

Please note that the same logic should apply to multi-device:

create some vdi
unplug a disk
remove some vdi
plug back the disk
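
A command-level sketch of those steps (assuming the md plug/unplug
subcommands and an illustrative device path):

dog vdi create -P test 1G
dog node md unplug /mnt/sheep/1
dog vdi delete test
dog node md plug /mnt/sheep/1
# the replugged disk still holds objects of the deleted vdi, which triggers
# the READ_PEER errors below during recovery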

This still causes

Dec 16 15:10:16   INFO [main] recover_object_main(930) object recovery
progress  74%
Dec 16 15:10:16  ERROR [rw 30554] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.5:7000, op name:
READ_PEER
Dec 16 15:10:16  ERROR [rw 30553] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.5:7000, op name:
READ_PEER
Dec 16 15:10:16  ERROR [rw 30500] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.5:7000, op name:
READ_PEER
Dec 16 15:10:16  ERROR [rw 30500] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.4:7000, op name:
READ_PEER
Dec 16 15:10:16  ERROR [rw 30554] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.4:7000, op name:
READ_PEER
Dec 16 15:10:16  ALERT [rw 30500] recover_replication_object(419)
cannot access any replicas of fd32fc0013 at epoch 2
Dec 16 15:10:16  ALERT [rw 30500] recover_replication_object(420)
clients may see old data
Dec 16 15:10:16  ERROR [rw 30500] recover_replication_object(427) can
not recover oid fd32fc0013
Dec 16 15:10:16  ERROR [rw 30500] recover_object_work(600) failed to
recover object fd32fc0013
Dec 16 15:10:16  ALERT [rw 30554] recover_replication_object(419)
cannot access any replicas of fd32fc000b at epoch 2
Dec 16 15:10:16  ALERT [rw 30554] recover_replication_object(420)
clients may see old data
Dec 16 15:10:16  ERROR [rw 30554] recover_replication_object(427) can
not recover oid fd32fc000b
Dec 16 15:10:16  ERROR [rw 30554] recover_object_work(600) failed to
recover object fd32fc000b
Dec 16 15:10:16  ERROR [rw 30553] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.4:7000, op name:
READ_PEER
Dec 16 15:10:16  ERROR [rw 30552] sheep_exec_req(1170) failed Network
error between sheep, remote address: 192.168.10.5:7000, op name:
READ_PEER
Dec 16 15:10:16  ALERT [rw 30553] recover_replication_object(419)
cannot access any replicas of fd32fc0012 at epoch 2
Dec 16 15:10:16  ALERT [rw 30553] recover_replication_object(420)
clients may see old data
Dec 16 15:10:16  ERROR [rw 30553] recover_replication_object(427) can
not recover oid fd32fc0012
Dec 16 15:10:16  ERROR [rw 30553] recover_object_work(600) failed to
recover object fd32fc0012

Notice I'm not using the option --enable-diskvnodes.

Thank you.
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


[sheepdog] Build failed in Jenkins: sheepdog-build #574

2014-12-16 Thread sheepdog-jenkins
See http://jenkins.sheepdog-project.org:8080/job/sheepdog-build/574/changes

Changes:

[mitake.hitoshi] sheep, dog: add block_size_shift option to cluster format 
command

[mitake.hitoshi] sheep, dog: add selectable object_size support of VDI operation

[mitake.hitoshi] dog: revert the change for output of dog vdi list manually

--
[...truncated 51 lines...]
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for size_t... yes
checking for working alloca.h... yes
checking for alloca... yes
checking for dirent.h that defines DIR... yes
checking for library containing opendir... none required
checking for ANSI C header files... (cached) yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking arpa/inet.h usability... yes
checking arpa/inet.h presence... yes
checking for arpa/inet.h... yes
checking fcntl.h usability... yes
checking fcntl.h presence... yes
checking for fcntl.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking netdb.h usability... yes
checking netdb.h presence... yes
checking for netdb.h... yes
checking netinet/in.h usability... yes
checking netinet/in.h presence... yes
checking for netinet/in.h... yes
checking for stdint.h... (cached) yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking sys/ioctl.h usability... yes
checking sys/ioctl.h presence... yes
checking for sys/ioctl.h... yes
checking sys/param.h usability... yes
checking sys/param.h presence... yes
checking for sys/param.h... yes
checking sys/socket.h usability... yes
checking sys/socket.h presence... yes
checking for sys/socket.h... yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking syslog.h usability... yes
checking syslog.h presence... yes
checking for syslog.h... yes
checking for unistd.h... (cached) yes
checking for sys/types.h... (cached) yes
checking getopt.h usability... yes
checking getopt.h presence... yes
checking for getopt.h... yes
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking sys/sockio.h usability... no
checking sys/sockio.h presence... no
checking for sys/sockio.h... no
checking utmpx.h usability... yes
checking utmpx.h presence... yes
checking for utmpx.h... yes
checking urcu.h usability... yes
checking urcu.h presence... yes
checking for urcu.h... yes
checking urcu/uatomic.h usability... yes
checking urcu/uatomic.h presence... yes
checking for urcu/uatomic.h... yes
checking for an ANSI C-conforming const... yes
checking for uid_t in sys/types.h... yes
checking for inline... inline
checking for size_t... (cached) yes
checking whether time.h and sys/time.h may both be included... yes
checking for working volatile... yes
checking size of short... 2
checking size of int... 4
checking size of long... 8
checking size of long long... 8
checking sys/eventfd.h usability... yes
checking sys/eventfd.h presence... yes
checking for sys/eventfd.h... yes
checking sys/signalfd.h usability... yes
checking sys/signalfd.h presence... yes
checking for sys/signalfd.h... yes
checking sys/timerfd.h usability... yes
checking sys/timerfd.h presence... yes
checking for sys/timerfd.h... yes
checking whether closedir returns void... no
checking for error_at_line... yes
checking for mbstate_t... yes
checking for working POSIX fnmatch... yes
checking for pid_t... yes
checking vfork.h usability... no
checking vfork.h presence... no
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... yes
checking for working vfork... (cached) yes
checking whether gcc needs -traditional... no
checking for stdlib.h... (cached) yes
checking for GNU libc compatible malloc... yes
checking for working memcmp... yes
checking for stdlib.h... (cached) yes
checking for GNU libc compatible realloc... yes
checking sys/select.h usability... yes
checking sys/select.h presence... yes
checking for sys/select.h... yes
checking for sys/socket.h... (cached) yes
checking types of arguments for select... int,fd_set *,struct timeval *
checking return type of signal handlers... void
checking for vprintf... yes
checking for _doprnt... no
checking for alarm... yes
checking for alphasort... yes
checking for atexit... yes
checking for bzero... yes
checking for dup2... yes
checking for endgrent... yes
checking for endpwent... yes
checking for fcntl... yes
checking for getcwd... yes
checking for getpeerucred... no
checking for getpeereid... no
checking for gettimeofday... yes
checking for inet_ntoa... yes

Re: [sheepdog] [PATCH] func/test: change functional test output for __vdi_list (1/2)

2014-12-16 Thread Hitoshi Mitake

At Tue, 16 Dec 2014 19:20:27 +0900,
Teruaki Ishizaki wrote:
 
 Change output of functional test using __vdi_list
 in assosiation with adding block_size_shift information
 to vdi list command.
 
 Signed-off-by: Teruaki Ishizaki ishizaki.teruaki@lab.ntt.co.jp
 ---
  tests/functional/016.out |2 +-
  tests/functional/029.out |   18 +++---
  tests/functional/030.out |  158 +++---
  tests/functional/031.out |   20 +++---
  tests/functional/039.out |   42 ++--
  tests/functional/040.out |8 +-
  tests/functional/041.out |   70 ++--
  tests/functional/043.out |   24 
  tests/functional/044.out |2 +-
  tests/functional/046.out |   24 
  tests/functional/047.out |4 +-
  tests/functional/048.out |6 +-
  tests/functional/052.out |   64 +-
  tests/functional/059.out |   24 
  tests/functional/060.out |   80 
  tests/functional/062.out |2 +-
  tests/functional/068.out |   12 ++--
  tests/functional/072.out |   12 ++--
  tests/functional/076.out |8 +-
  tests/functional/077.out |4 +-
  tests/functional/078.out |   18 +++---
  tests/functional/079.out |   16 +++---
  tests/functional/080.out |8 +-
  tests/functional/083.out |4 +-
  tests/functional/088.out |8 +-
  tests/functional/091.out |8 +-
  tests/functional/092.out |   20 +++---
  tests/functional/096.out |   66 ++--
  28 files changed, 366 insertions(+), 366 deletions(-)

Applied, thanks.
Hitoshi

 
 diff --git a/tests/functional/016.out b/tests/functional/016.out
 index ab648d4..50e23d7 100644
 --- a/tests/functional/016.out
 +++ b/tests/functional/016.out
 @@ -2,7 +2,7 @@ QA output created by 016
  using backend plain store
  Failed to create snapshot for base, maybe snapshot id (0) or tag (tag) is existed
  there should be no vdi
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
  there should be no object
  STORE	DATA	VDI	VMSTATE	ATTR	LEDGER	STALE
  0	0	3	0	0	0	0
 diff --git a/tests/functional/029.out b/tests/functional/029.out
 index 23117f7..7c50653 100644
 --- a/tests/functional/029.out
 +++ b/tests/functional/029.out
 @@ -6,15 +6,15 @@ To create replicated vdi, set -c x
  To create erasure coded vdi, set -c x:y
x(2,4,8,16)  - number of data strips
y(1 to 15)   - number of parity strips
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test50   20 MB   20 MB  0.0 MB DATE   fd2c304:2  
 -  test40   20 MB   20 MB  0.0 MB DATE   fd2de3  4  
 -  test70   20 MB   20 MB  0.0 MB DATE   fd2f964:4  
 -  test60   20 MB   20 MB  0.0 MB DATE   fd31494:3  
 -  test30   20 MB   20 MB  0.0 MB DATE   fd3662  3  
 -  test20   20 MB  0.0 MB   20 MB DATE   fd3816  2  
 -  test90   20 MB   20 MB  0.0 MB DATE   fd4094   16:7  
 -  test80   20 MB   20 MB  0.0 MB DATE   fd42474:5  
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test50   20 MB   20 MB  0.0 MB DATE   fd2c304:222
 +  test40   20 MB   20 MB  0.0 MB DATE   fd2de3  422
 +  test70   20 MB   20 MB  0.0 MB DATE   fd2f964:422
 +  test60   20 MB   20 MB  0.0 MB DATE   fd31494:322
 +  test30   20 MB   20 MB  0.0 MB DATE   fd3662  322
 +  test20   20 MB  0.0 MB   20 MB DATE   fd3816  222
 +  test90   20 MB   20 MB  0.0 MB DATE   fd4094   16:722
 +  test80   20 MB   20 MB  0.0 MB DATE   fd42474:522
  Looking for the object 0xfd38150001 (vid 0xfd3816 idx 1, 2 copies) with 23 nodes
  
  127.0.0.1:7000 doesn't have the object
 diff --git a/tests/functional/030.out b/tests/functional/030.out
 index 5b386ab..00f50a9 100644
 --- a/tests/functional/030.out
 +++ b/tests/functional/030.out
 @@ -5,36 +5,36 @@ Index		Tag		Snapshot Time
  Index		Tag		Snapshot Time
  1		

Re: [sheepdog] [PATCH] func/test: change functional test output for __vdi_list (2/2)

2014-12-16 Thread Hitoshi Mitake

 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0  8.0 MB  8.0 MB  0.0 MB DATE   7c2b25  1  
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0  8.0 MB  8.0 MB  0.0 MB DATE   7c2b25  1  
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0  8.0 MB  8.0 MB  0.0 MB DATE   7c2b25  1  
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test 0  8.0 MB  8.0 MB  0.0 MB DATE   7c2b25  122
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test 0  8.0 MB  8.0 MB  0.0 MB DATE   7c2b25  122
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test 0  8.0 MB  8.0 MB  0.0 MB DATE   7c2b25  122
  finish checkrepair test
 diff --git a/tests/functional/076.out b/tests/functional/076.out
 index b179a24..c21daeb 100644
 --- a/tests/functional/076.out
 +++ b/tests/functional/076.out
 @@ -1,7 +1,7 @@
  QA output created by 076
  using backend plain store
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0   40 MB  0.0 MB  0.0 MB DATE   7c2b25  16:15  
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test 0   40 MB  0.0 MB  0.0 MB DATE   7c2b25  16:1522
  using backend plain store
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0   40 MB  0.0 MB  0.0 MB DATE   7c2b252:1  
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test 0   40 MB  0.0 MB  0.0 MB DATE   7c2b252:122
 diff --git a/tests/functional/077.out b/tests/functional/077.out
 index 191d39f..0657acc 100644
 --- a/tests/functional/077.out
 +++ b/tests/functional/077.out
 @@ -1,7 +1,7 @@
  QA output created by 077
  using backend plain store
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0   12 MB  0.0 MB  0.0 MB DATE   7c2b25  3  
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test 0   12 MB  0.0 MB  0.0 MB DATE   7c2b25  322
  [127.0.0.1:7000] oid 007c2b25 is missing.
  test lost 1 object(s).
  fixed missing 7c2b25
 diff --git a/tests/functional/078.out b/tests/functional/078.out
 index e98c1d2..4ee8002 100644
 --- a/tests/functional/078.out
 +++ b/tests/functional/078.out
 @@ -1,11 +1,11 @@
  QA output created by 078
  using backend plain store
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test10   20 MB  0.0 MB  0.0 MB DATE   fd32fc4:2  
 -  test30   20 MB  0.0 MB  0.0 MB DATE   fd36622:1  
 -  test20   20 MB  0.0 MB  0.0 MB DATE   fd3815  2  
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test40   20 MB  0.0 MB  0.0 MB DATE   fd2de34:2  
 -  test10   20 MB  0.0 MB  0.0 MB DATE   fd32fc4:2  
 -  test30   20 MB  0.0 MB  0.0 MB DATE   fd36622:1  
 -  test20   20 MB  0.0 MB  0.0 MB DATE   fd3815  2  
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test10   20 MB  0.0 MB  0.0 MB DATE   fd32fc4:222
 +  test30   20 MB  0.0 MB  0.0 MB DATE   fd36622:122
 +  test20   20 MB  0.0 MB  0.0 MB DATE   fd3815  222
 +  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag   Block Size Shift
 +  test40   20 MB  0.0 MB  0.0 MB DATE   fd2de34:222
 +  test10   20 MB  0.0 MB  0.0 MB DATE   fd32fc4:222
 +  test30   20 MB  0.0 MB  0.0 MB DATE   fd36622:122
 +  test20   20 MB  0.0 MB  0.0 MB DATE   fd3815  222
 diff --git a/tests/functional/079.out b/tests/functional/079.out
 index 7f0949d..021ccfe 100644
 --- a/tests/functional/079.out
 +++ b/tests/functional/079.out
 @@ -1,11 +1,11 @@
  QA output created by 079
  using backend plain store
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0   16 PB  0.0 MB  0.0 MB DATE   7c2b25  3  
 -  NameIdSizeUsed  SharedCreation time   VDI id  Copies  Tag
 -  test 0   16 PB   64 MB  0.0 MB DATE   7c2b25  3  
 +  NameIdSizeUsed  

Re: [sheepdog] [PATCH] sheep: fix bug for not saving block_size_shift to cluster config

2014-12-16 Thread Hitoshi Mitake
At Tue, 16 Dec 2014 19:32:17 +0900,
Teruaki Ishizaki wrote:
 
 This patch fixes bugs that block_size_shift info was forgotten
 after cluster shutdown and start sheepdog.
 
 Add block_size_shift info to cluster config file.
 
 Signed-off-by: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
 ---
  sheep/config.c |6 --
  1 files changed, 4 insertions(+), 2 deletions(-)

Applied, thanks.
Hitoshi

 
 diff --git a/sheep/config.c b/sheep/config.c
 index 383a1ed..dfad5fd 100644
 --- a/sheep/config.c
 +++ b/sheep/config.c
 @@ -11,7 +11,7 @@
  
  #include sheep_priv.h
  
 -#define SD_FORMAT_VERSION 0x0005
 +#define SD_FORMAT_VERSION 0x0006
  #define SD_CONFIG_SIZE 40
  
  static struct sheepdog_config {
 @@ -21,7 +21,7 @@ static struct sheepdog_config {
   uint8_t store[STORE_LEN];
   uint8_t shutdown;
   uint8_t copy_policy;
 - uint8_t __pad;
 + uint8_t block_size_shift;
   uint16_t version;
   uint64_t space;
  } config;
 @@ -64,6 +64,7 @@ static int get_cluster_config(struct cluster_info *cinfo)
   cinfo-nr_copies = config.copies;
   cinfo-flags = config.flags;
   cinfo-copy_policy = config.copy_policy;
 + cinfo-block_size_shift = config.block_size_shift;
   memcpy(cinfo-store, config.store, sizeof(config.store));
  
   return SD_RES_SUCCESS;
 @@ -155,6 +156,7 @@ int set_cluster_config(const struct cluster_info *cinfo)
   config.copies = cinfo-nr_copies;
   config.copy_policy = cinfo-copy_policy;
   config.flags = cinfo-flags;
 + config.block_size_shift = cinfo-block_size_shift;
   memset(config.store, 0, sizeof(config.store));
   pstrcpy((char *)config.store, sizeof(config.store),
   (char *)cinfo-store);
 -- 
 1.7.1
 
 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 http://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH 2/2] sheep: forbid revival of orphan objects

2014-12-16 Thread Hitoshi Mitake
At Tue, 16 Dec 2014 15:18:18 +0100,
Valerio Pachera wrote:
 
 2014-12-16 15:07 GMT+01:00 Valerio Pachera siri...@gmail.com:
  It works fine!
 ...
  There's only this corner case to fix:
  all vdi are removed then and the disconnected node joins back the cluster
 
 Please, notice that the same logic should apply to multi device:
 
 create some vdi
 unplug a disk
 remove some vdi
 plug back the disk
 
 This still causes
 
 Dec 16 15:10:16   INFO [main] recover_object_main(930) object recovery
 progress  74%
 Dec 16 15:10:16  ERROR [rw 30554] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.5:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ERROR [rw 30553] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.5:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ERROR [rw 30500] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.5:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ERROR [rw 30500] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.4:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ERROR [rw 30554] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.4:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ALERT [rw 30500] recover_replication_object(419)
 cannot access any replicas of fd32fc0013 at epoch 2
 Dec 16 15:10:16  ALERT [rw 30500] recover_replication_object(420)
 clients may see old data
 Dec 16 15:10:16  ERROR [rw 30500] recover_replication_object(427) can
 not recover oid fd32fc0013
 Dec 16 15:10:16  ERROR [rw 30500] recover_object_work(600) failed to
 recover object fd32fc0013
 Dec 16 15:10:16  ALERT [rw 30554] recover_replication_object(419)
 cannot access any replicas of fd32fc000b at epoch 2
 Dec 16 15:10:16  ALERT [rw 30554] recover_replication_object(420)
 clients may see old data
 Dec 16 15:10:16  ERROR [rw 30554] recover_replication_object(427) can
 not recover oid fd32fc000b
 Dec 16 15:10:16  ERROR [rw 30554] recover_object_work(600) failed to
 recover object fd32fc000b
 Dec 16 15:10:16  ERROR [rw 30553] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.4:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ERROR [rw 30552] sheep_exec_req(1170) failed Network
 error between sheep, remote address: 192.168.10.5:7000, op name:
 READ_PEER
 Dec 16 15:10:16  ALERT [rw 30553] recover_replication_object(419)
 cannot access any replicas of fd32fc0012 at epoch 2
 Dec 16 15:10:16  ALERT [rw 30553] recover_replication_object(420)
 clients may see old data
 Dec 16 15:10:16  ERROR [rw 30553] recover_replication_object(427) can
 not recover oid fd32fc0012
 Dec 16 15:10:16  ERROR [rw 30553] recover_object_work(600) failed to
 recover object fd32fc0012
 
 Notice I'm not using the option --enable-diskvnodes.
 
 Thank you.

To be honest, the design of the current md should be refined
completely. I'll work on the md-related issues in the future.

Thanks,
Hitoshi

 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 http://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


[sheepdog] [PATCH] sheep, dog: check cluster is formatted or not during vdi creation

2014-12-16 Thread Hitoshi Mitake
Current dog prints an odd error message in the case of vdi creation
before cluster formatting, like below:
$ dog/dog vdi create test 16M
VDI size is larger than 1.0 MB bytes, please use '-y' to create a hyper volume 
with size up to 16 PB bytes or use '-z' to create larger object size volume

This patch revives previous behavior.
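
With this patch applied, the same command should instead fail with an
explicit error, roughly like the following (the exact wording of the error
string is an assumption; it comes from sd_strerror() for
SD_RES_WAIT_FOR_FORMAT):

$ dog/dog vdi create test 16M
Failed to create VDI test: Waiting for cluster to be formatted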

Cc: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
---
 dog/vdi.c   | 7 +++
 sheep/ops.c | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/dog/vdi.c b/dog/vdi.c
index 22d6c83..effed17 100644
--- a/dog/vdi.c
+++ b/dog/vdi.c
@@ -478,6 +478,13 @@ static int vdi_create(int argc, char **argv)
ret = EXIT_FAILURE;
goto out;
}
+
+	if (rsp->result == SD_RES_WAIT_FOR_FORMAT) {
+		sd_err("Failed to create VDI %s: %s", vdiname,
+		       sd_strerror(rsp->result));
+		return EXIT_FAILURE;
+	}
+
	if (rsp->result != SD_RES_SUCCESS) {
		sd_err("%s", sd_strerror(rsp->result));
		ret = EXIT_FAILURE;
diff --git a/sheep/ops.c b/sheep/ops.c
index 448fd8e..3fb34aa 100644
--- a/sheep/ops.c
+++ b/sheep/ops.c
@@ -1125,6 +1125,9 @@ static int local_oids_exist(const struct sd_req *req, struct sd_rsp *rsp,
 static int local_cluster_info(const struct sd_req *req, struct sd_rsp *rsp,
  void *data, const struct sd_node *sender)
 {
+	if (sys->cinfo.ctime == 0)
+		return SD_RES_WAIT_FOR_FORMAT;
+
	memcpy(data, &sys->cinfo, sizeof(sys->cinfo));
	rsp->data_length = sizeof(sys->cinfo);
	return SD_RES_SUCCESS;
-- 
1.8.3.2

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] Fwd: [PATCH 2/2] sheep: forbid revival of orphan objects

2014-12-16 Thread Hitoshi Mitake
At Tue, 16 Dec 2014 15:11:49 +0100,
Valerio Pachera wrote:
 
 2014-12-11 8:00 GMT+01:00 Hitoshi Mitake mitake.hito...@lab.ntt.co.jp:
  Current recovery process can cause revival of orphan objects. This
  patch solves this problem.
 
 sheep -v
 Sheepdog daemon version 0.9.0_18_g7215788
 
 It works fine!
 
 Dec 16 15:00:20   INFO [main] main(966) shutdown
 Dec 16 15:00:20   INFO [main] zk_leave(989) leaving from cluster
 Dec 16 15:00:37   INFO [main] md_add_disk(343) /mnt/sheep/0, vdisk nr
 206, total disk 1
 Dec 16 15:00:37   INFO [main] md_add_disk(343) /mnt/sheep/1, vdisk nr
 279, total disk 2
 Dec 16 15:00:37 NOTICE [main] get_local_addr(522) found IPv4 address
 Dec 16 15:00:37   INFO [main] send_join_request(1016) IPv4
 ip:192.168.10.7 port:7000 going to join the cluster
 Dec 16 15:00:37 NOTICE [main] nfs_init(611) nfs server service is not compiled
 Dec 16 15:00:37   INFO [main] main(958) sheepdog daemon (version
 0.9.0_18_g7215788) started
 Dec 16 15:00:38   INFO [rw 30221] prepare_object_list(1100) skipping
 object list reading from IPv4 ip:192.168.10.7 port:7000 becauseit is
 marked as excluded node
 Dec 16 15:00:38   INFO [main] recover_object_main(930) object recovery
 progress   2%
 Dec 16 15:00:38   INFO [main] recover_object_main(930) object recovery
 progress   3%

Thanks for your checking, applied this series.

 
 There's only this corner case to fix:
 all vdi are removed then and the disconnected node joins back the cluster

Do you mean the problem is the below error messages?

Thanks,
Hitoshi

 
 Dec 16 14:55:09   INFO [main] zk_leave(989) leaving from cluster
 Dec 16 14:55:40   INFO [main] md_add_disk(343) /mnt/sheep/0, vdisk nr
 206, total disk 1
 Dec 16 14:55:40   INFO [main] md_add_disk(343) /mnt/sheep/1, vdisk nr
 279, total disk 2
 Dec 16 14:55:40 NOTICE [main] get_local_addr(522) found IPv4 address
 Dec 16 14:55:40   INFO [main] send_join_request(1016) IPv4
 ip:192.168.10.7 port:7000 going to join the cluster
 Dec 16 14:55:40 NOTICE [main] nfs_init(611) nfs server service is not compiled
 Dec 16 14:55:40   INFO [main] main(958) sheepdog daemon (version
 0.9.0_18_g7215788) started
 Dec 16 14:55:41   INFO [rw 30049] prepare_object_list(1100) skipping
 object list reading from IPv4 ip:192.168.10.5 port:7000 becauseit is
 marked as excluded node
 Dec 16 14:55:41  ERROR [rw 30049] sheep_exec_req(1170) failed No
 object found, remote address: 192.168.10.5:7000, op name: GET_HASH
 Dec 16 14:55:41  ERROR [rw 30068] sheep_exec_req(1170) failed No
 object found, remote address: 192.168.10.5:7000, op name: GET_HASH
 Dec 16 14:55:41  ERROR [rw 30068] sheep_exec_req(1170) failed No
 object found, remote address: 192.168.10.6:7000, op name: GET_HASH
 Dec 16 14:55:41  ERROR [rw 30072] sheep_exec_req(1170) failed No
 object found, remote address: 192.168.10.5:7000, op name: GET_HASH
 Dec 16 14:55:41   INFO [main] recover_object_main(930) object recovery
 progress   1%
 cut
 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 http://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH] sheep, dog: check cluster is formatted or not during vdi creation

2014-12-16 Thread Teruaki Ishizaki
(2014/12/17 10:48), Hitoshi Mitake wrote:
 Current dog prints an odd error message in a case of vdi creation
 before cluster formatting like below:
 $ dog/dog vdi create test 16M
 VDI size is larger than 1.0 MB bytes, please use '-y' to create a hyper 
 volume with size up to 16 PB bytes or use '-z' to create larger object size 
 volume
 
 This patch revives previous behavior.
 
 Cc: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
 Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
 ---
   dog/vdi.c   | 7 +++
   sheep/ops.c | 3 +++
   2 files changed, 10 insertions(+)

I've tested and it looks good to me.
Reviewed-by: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp

Best Regards,
Teruaki

 
 diff --git a/dog/vdi.c b/dog/vdi.c
 index 22d6c83..effed17 100644
 --- a/dog/vdi.c
 +++ b/dog/vdi.c
 @@ -478,6 +478,13 @@ static int vdi_create(int argc, char **argv)
   ret = EXIT_FAILURE;
   goto out;
   }
 +
 + if (rsp-result == SD_RES_WAIT_FOR_FORMAT) {
 + sd_err(Failed to create VDI %s: %s, vdiname,
 +sd_strerror(rsp-result));
 + return EXIT_FAILURE;
 + }
 +
   if (rsp-result != SD_RES_SUCCESS) {
   sd_err(%s, sd_strerror(rsp-result));
   ret = EXIT_FAILURE;
 diff --git a/sheep/ops.c b/sheep/ops.c
 index 448fd8e..3fb34aa 100644
 --- a/sheep/ops.c
 +++ b/sheep/ops.c
 @@ -1125,6 +1125,9 @@ static int local_oids_exist(const struct sd_req *req, 
 struct sd_rsp *rsp,
   static int local_cluster_info(const struct sd_req *req, struct sd_rsp *rsp,
 void *data, const struct sd_node *sender)
   {
 + if (sys-cinfo.ctime == 0)
 + return SD_RES_WAIT_FOR_FORMAT;
 +
   memcpy(data, sys-cinfo, sizeof(sys-cinfo));
   rsp-data_length = sizeof(sys-cinfo);
   return SD_RES_SUCCESS;
 


-- 
NTT Software Innovation Center
Distributed Processing Platform Technology Project
Teruaki Ishizaki
Tel: 0422-59-3488 Fax: 0422-59-2965
Email: ishizaki.teru...@lab.ntt.co.jp
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH] sheep, dog: check cluster is formatted or not during vdi creation

2014-12-16 Thread Hitoshi Mitake
At Wed, 17 Dec 2014 12:32:08 +0900,
Teruaki Ishizaki wrote:
 
 (2014/12/17 10:48), Hitoshi Mitake wrote:
  Current dog prints an odd error message in a case of vdi creation
  before cluster formatting like below:
  $ dog/dog vdi create test 16M
  VDI size is larger than 1.0 MB bytes, please use '-y' to create a hyper 
  volume with size up to 16 PB bytes or use '-z' to create larger object size 
  volume
  
  This patch revives previous behavior.
  
  Cc: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
  Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
  ---
dog/vdi.c   | 7 +++
sheep/ops.c | 3 +++
2 files changed, 10 insertions(+)
 
 I've tested and it looks good to me.
 Reviewed-by: Teruaki Ishizaki ishizaki.teru...@lab.ntt.co.jp
 
 Best Regards,
 Teruaki

Applied.

Thanks,
Hitoshi

 
  
  diff --git a/dog/vdi.c b/dog/vdi.c
  index 22d6c83..effed17 100644
  --- a/dog/vdi.c
  +++ b/dog/vdi.c
  @@ -478,6 +478,13 @@ static int vdi_create(int argc, char **argv)
  ret = EXIT_FAILURE;
  goto out;
  }
  +
  +   if (rsp-result == SD_RES_WAIT_FOR_FORMAT) {
  +   sd_err(Failed to create VDI %s: %s, vdiname,
  +  sd_strerror(rsp-result));
  +   return EXIT_FAILURE;
  +   }
  +
  if (rsp-result != SD_RES_SUCCESS) {
  sd_err(%s, sd_strerror(rsp-result));
  ret = EXIT_FAILURE;
  diff --git a/sheep/ops.c b/sheep/ops.c
  index 448fd8e..3fb34aa 100644
  --- a/sheep/ops.c
  +++ b/sheep/ops.c
  @@ -1125,6 +1125,9 @@ static int local_oids_exist(const struct sd_req *req, 
  struct sd_rsp *rsp,
static int local_cluster_info(const struct sd_req *req, struct sd_rsp 
  *rsp,
void *data, const struct sd_node *sender)
{
  +   if (sys-cinfo.ctime == 0)
  +   return SD_RES_WAIT_FOR_FORMAT;
  +
  memcpy(data, sys-cinfo, sizeof(sys-cinfo));
  rsp-data_length = sizeof(sys-cinfo);
  return SD_RES_SUCCESS;
  
 
 
 -- 
 NTT ソフトウェアイノベーションセンタ
 分散処理基盤技術P(I分P)
 石崎 晃朗
 Tel: 0422-59-3488 Fax: 0422-59-2965
 Email: ishizaki.teru...@lab.ntt.co.jp
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH v2] sheep: let gateway node exit in a case of gateway only cluster

2014-12-16 Thread Hitoshi Mitake
At Mon, 15 Dec 2014 23:14:55 +0900,
Hitoshi Mitake wrote:
 
When a cluster has gateway nodes only, it means the gateway nodes
don't contribute to I/O of VMs. So this patch simply lets them exit
and avoids the recovery issue below.
 
 Related issue:
 https://bugs.launchpad.net/sheepdog-project/+bug/1327037
 
 Cc: duron...@qq.com
 Cc: Yang Zhang 3100100...@zju.edu.cn
 Cc: long nxtxiaol...@gmail.com
 Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
 ---
  sheep/group.c | 17 +
  1 file changed, 17 insertions(+)

Yang, long, when you have time, could you test this patch?

Thanks,
Hitoshi

 
 v2: remove needless logging
 
 diff --git a/sheep/group.c b/sheep/group.c
 index 095b7c5..5dc3284 100644
 --- a/sheep/group.c
 +++ b/sheep/group.c
 @@ -1151,6 +1151,18 @@ main_fn void sd_accept_handler(const struct sd_node 
 *joined,
   }
  }
  
 +static bool is_gateway_only_cluster(const struct rb_root *nroot)
 +{
 +	struct sd_node *n;
 +
 +	rb_for_each_entry(n, nroot, rb) {
 +		if (n->space)
 +			return false;
 +	}
 +
 +	return true;
 +}
 +
  main_fn void sd_leave_handler(const struct sd_node *left,
 			      const struct rb_root *nroot, size_t nr_nodes)
  {
 @@ -1177,6 +1189,11 @@ main_fn void sd_leave_handler(const struct sd_node *left,
 	old_vnode_info = main_thread_get(current_vnode_info);
 	main_thread_set(current_vnode_info, alloc_vnode_info(nroot));
 	if (sys->cinfo.status == SD_STATUS_OK) {
 +		if (is_gateway_only_cluster(nroot)) {
 +			sd_info("only gateway nodes are remaining, exiting");
 +			exit(0);
 +		}
 +
 		ret = inc_and_log_epoch();
 		if (ret != 0)
 			panic("cannot log current epoch %d", sys->cinfo.epoch);
 -- 
 1.9.1
 
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH RFT 0/4] garbage collect needless VIDs and inode objects

2014-12-16 Thread Hitoshi Mitake
At Tue, 16 Dec 2014 12:28:29 +0100,
Valerio Pachera wrote:
 
 2014-12-15 10:36 GMT+01:00 Hitoshi Mitake mitake.hito...@lab.ntt.co.jp:
  Current sheepdog never recycles VIDs. But it will cause problems
  e.g. VID space exhaustion, too much garbage inode objects.
 
 I've been testing this branch and it seem to work.
 I use a script that creates 3 vdi, 3 snapshot for each (writing 10M of
 data), then removes them and look for objects with name starting with
 80*.
 
 With all snap active
 /mnt/sheep/1/80fd3663
 /mnt/sheep/0/80fd3818
 /mnt/sheep/0/80fd32fc
 /mnt/sheep/0/80fd32fd
 /mnt/sheep/0/80fd32fe
 
 After removing all snap
 /mnt/sheep/1/80fd3663
 /mnt/sheep/0/80fd3818
 /mnt/sheep/0/80fd32fc
 /mnt/sheep/0/80fd32fd
 /mnt/sheep/0/80fd32fe
 
 After removing all vdi
 empty
 
 sheep -v
 Sheepdog daemon version 0.9.0_25_g24ef77f
 
 But I found a repeatable sheepdog crash!
 I notice that happening if I was running the script a second time.
 The crash occur after when I recreate a vdi with the same name and
 then I take a snapshot of it.
 
 Dec 16 12:12:42   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40067, op=DEL_VDI, result=00
 Dec 16 12:12:47   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40069, op=DEL_VDI, data=(not string)
 Dec 16 12:12:47   INFO [main] run_vid_gc(2106) all members of the
 family (root: fd3662) are deleted
 Dec 16 12:12:47   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40069, op=DEL_VDI, result=00
 Dec 16 12:13:57   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40072, op=NEW_VDI, data=(not string)
 Dec 16 12:13:57   INFO [main] post_cluster_new_vdi(133)
 req-vdi.base_vdi_id: 0, rsp-vdi.vdi_id: fd32fc
 Dec 16 12:13:57   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40072, op=NEW_VDI, result=00
 Dec 16 12:14:12   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40074, op=NEW_VDI, data=(not string)
 Dec 16 12:14:13   INFO [main] post_cluster_new_vdi(133)
 req-vdi.base_vdi_id: 0, rsp-vdi.vdi_id: fd3815
 Dec 16 12:14:13   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40074, op=NEW_VDI, result=00
 Dec 16 12:14:23   INFO [main] rx_main(830) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40076, op=NEW_VDI, data=(not string)
 Dec 16 12:14:23   INFO [main] post_cluster_new_vdi(133)
 req-vdi.base_vdi_id: 0, rsp-vdi.vdi_id: fd3662
 Dec 16 12:14:23   INFO [main] tx_main(882) req=0x7f314400e5a0, fd=26,
 client=127.0.0.1:40076, op=NEW_VDI, result=00
 Dec 16 12:14:34   INFO [main] rx_main(830) req=0x7f314400d310, fd=26,
 client=127.0.0.1:40078, op=NEW_VDI, data=(not string)
 Dec 16 12:14:34  EMERG [main] crash_handler(268) sheep exits
 unexpectedly (Segmentation fault).
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) sheep.c:270: crash_handler
 Dec 16 12:14:34  EMERG [main] sd_backtrace(847)
 /lib/x86_64-linux-gnu/libpthread.so.0(+0xf02f) [0x7f31515cc02f]
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) vdi.c:64:
 lookup_vdi_family_member
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) vdi.c:109: update_vdi_family
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) vdi.c:396: add_vdi_state
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) ops.c:674:
 cluster_notify_vdi_add
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) group.c:948: sd_notify_handler
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) zookeeper.c:1252:
 zk_event_handler
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) event.c:210: do_event_loop
 Dec 16 12:14:34  EMERG [main] sd_backtrace(833) sheep.c:963: main
 Dec 16 12:14:34  EMERG [main] sd_backtrace(847)
 /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfc)
 [0x7f3150badeac]
 Dec 16 12:14:34  EMERG [main] sd_backtrace(847) sheep() [0x405fa8]
 
 How to reproduce:
 
 dog cluster format -c 2
 dog vdi create -P  test 1G
 dog vdi snapshot test
 dd if=/dev/urandom bs=1M count=10 | dog vdi write test
 dog vdi delete -s 1 test
 dog vdi delete test
 echo 'Recreating vdi test'
 dog vdi create -P  test 1G
 dog vdi snapshot test   -- at this point, sheep crashes
 dog vdi list

Thanks for your report, I've fixed the problem and updated the gc-vid branch.

Thanks,
Hitoshi

 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 http://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH v2] sheep: let gateway node exit in a case of gateway only cluster

2014-12-16 Thread 徐小龙
hi,Hitoshi
   we've tested the patch. Our test method is:

We attached a 20G sheepdog VDI to a VM held by openstack, and created
in the VDI a 2G file whose md5 we had in hand.
We killed the non-gateway nodes in the middle of the process, then
restarted the cluster. The process resumed and the content of the file is
correct (same md5); the check is sketched below.
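
A sketch of that consistency check (mount point and file name are
illustrative):

dd if=/dev/urandom of=/mnt/vdi/testfile bs=1M count=2048
md5sum /mnt/vdi/testfile > before.md5
# kill the non-gateway sheep daemons in the middle of the process,
# then restart the cluster and wait for recovery
md5sum -c before.md5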

Thanks,
Yang,Long

On Wed, Dec 17, 2014 at 1:09 PM, Hitoshi Mitake 
mitake.hito...@lab.ntt.co.jp wrote:

 At Mon, 15 Dec 2014 23:14:55 +0900,
 Hitoshi Mitake wrote:
 
  When a cluster has gateway nodes only, it means the gateway nodes
  doesn't contribute to I/O of VMs. So this patch simply let them exit
  and avoid the below recovery issue.
 
  Related issue:
  https://bugs.launchpad.net/sheepdog-project/+bug/1327037
 
  Cc: duron...@qq.com
  Cc: Yang Zhang 3100100...@zju.edu.cn
  Cc: long nxtxiaol...@gmail.com
  Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
  ---
   sheep/group.c | 17 +
   1 file changed, 17 insertions(+)

 Yang, long, when you have time, could you test this patch?

 Thanks,
 Hitoshi

 
  v2: remove needless logging
 
  diff --git a/sheep/group.c b/sheep/group.c
  index 095b7c5..5dc3284 100644
  --- a/sheep/group.c
  +++ b/sheep/group.c
  @@ -1151,6 +1151,18 @@ main_fn void sd_accept_handler(const struct
 sd_node *joined,
}
   }
 
  +static bool is_gateway_only_cluster(const struct rb_root *nroot)
  +{
  + struct sd_node *n;
  +
  + rb_for_each_entry(n, nroot, rb) {
  + if (n-space)
  + return false;
  + }
  +
  + return true;
  +}
  +
   main_fn void sd_leave_handler(const struct sd_node *left,
  const struct rb_root *nroot, size_t nr_nodes)
   {
  @@ -1177,6 +1189,11 @@ main_fn void sd_leave_handler(const struct
 sd_node *left,
old_vnode_info = main_thread_get(current_vnode_info);
main_thread_set(current_vnode_info, alloc_vnode_info(nroot));
if (sys-cinfo.status == SD_STATUS_OK) {
  + if (is_gateway_only_cluster(nroot)) {
  + sd_info(only gateway nodes are remaining,
 exiting);
  + exit(0);
  + }
  +
ret = inc_and_log_epoch();
if (ret != 0)
panic(cannot log current epoch %d,
 sys-cinfo.epoch);
  --
  1.9.1
 

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH v2] sheep: let gateway node exit in a case of gateway only cluster

2014-12-16 Thread Hitoshi Mitake
At Wed, 17 Dec 2014 15:40:35 +0800,
徐小龙 wrote:
 
 [1  text/plain; UTF-8 (7bit)]
 hi,Hitoshi
we've tested the patch. Our test method is:
 
 We attached a 20G sheepdog VDI to a VM holded by openstack. And we created
 a 2G file which we have it's md5 in hand in the VDI.
 We killed the non-gateway nodes in the middle of the process, then
 restarted the cluster. The process resumed and the content of the file is
 right(same md5)
 
 Thanks,
 Yang,Long

Thanks a lot for testing! Could you give me your Tested-by: tags?
(e.g. Tested-by: Yang Zhang 3100100...@zju.edu.cn, Tested-by: Long 
nxtxiaol...@gmail.com)

Thanks,
Hitoshi

 
 On Wed, Dec 17, 2014 at 1:09 PM, Hitoshi Mitake 
 mitake.hito...@lab.ntt.co.jp wrote:
 
  At Mon, 15 Dec 2014 23:14:55 +0900,
  Hitoshi Mitake wrote:
  
   When a cluster has gateway nodes only, it means the gateway nodes
   doesn't contribute to I/O of VMs. So this patch simply let them exit
   and avoid the below recovery issue.
  
   Related issue:
   https://bugs.launchpad.net/sheepdog-project/+bug/1327037
  
   Cc: duron...@qq.com
   Cc: Yang Zhang 3100100...@zju.edu.cn
   Cc: long nxtxiaol...@gmail.com
   Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
   ---
sheep/group.c | 17 +
1 file changed, 17 insertions(+)
 
  Yang, long, when you have time, could you test this patch?
 
  Thanks,
  Hitoshi
 
  
   v2: remove needless logging
  
   diff --git a/sheep/group.c b/sheep/group.c
   index 095b7c5..5dc3284 100644
   --- a/sheep/group.c
   +++ b/sheep/group.c
   @@ -1151,6 +1151,18 @@ main_fn void sd_accept_handler(const struct
  sd_node *joined,
 }
}
  
   +static bool is_gateway_only_cluster(const struct rb_root *nroot)
   +{
   + struct sd_node *n;
   +
   + rb_for_each_entry(n, nroot, rb) {
   + if (n-space)
   + return false;
   + }
   +
   + return true;
   +}
   +
main_fn void sd_leave_handler(const struct sd_node *left,
   const struct rb_root *nroot, size_t nr_nodes)
{
   @@ -1177,6 +1189,11 @@ main_fn void sd_leave_handler(const struct
  sd_node *left,
 old_vnode_info = main_thread_get(current_vnode_info);
 main_thread_set(current_vnode_info, alloc_vnode_info(nroot));
 if (sys-cinfo.status == SD_STATUS_OK) {
   + if (is_gateway_only_cluster(nroot)) {
   + sd_info(only gateway nodes are remaining,
  exiting);
   + exit(0);
   + }
   +
 ret = inc_and_log_epoch();
 if (ret != 0)
 panic(cannot log current epoch %d,
  sys-cinfo.epoch);
   --
   1.9.1
  
 
 [2  text/html; UTF-8 (quoted-printable)]
 
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] Fwd: [PATCH 2/2] sheep: forbid revival of orphan objects

2014-12-16 Thread Valerio Pachera
2014-12-17 3:48 GMT+01:00 Hitoshi Mitake mitake.hito...@lab.ntt.co.jp:
 There's only this corner case to fix:
 all vdi are removed then and the disconnected node joins back the cluster

 Do you mean the problem is the below error messages?

The problem is that:

node 1, 2, 3, 4
create vdi
disconnect node 4
remove *all* vdi
reconnect node 4

Nodes 1, 2 and 3 are empty, but node 4 doesn't remove the objects once
it has rejoined the cluster,
and it prints the error messages below.
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] [PATCH v2] sheep: let gateway node exit in a case of gateway only cluster

2014-12-16 Thread Hitoshi Mitake
At Wed, 17 Dec 2014 16:42:27 +0900,
Hitoshi Mitake wrote:
 
 At Wed, 17 Dec 2014 15:40:35 +0800,
 徐小龙 wrote:
  
  [1  text/plain; UTF-8 (7bit)]
  hi,Hitoshi
 we've tested the patch. Our test method is:
  
  We attached a 20G sheepdog VDI to a VM holded by openstack. And we created
  a 2G file which we have it's md5 in hand in the VDI.
  We killed the non-gateway nodes in the middle of the process, then
  restarted the cluster. The process resumed and the content of the file is
  right(same md5)
  
  Thanks,
  Yang,Long
 
 Thanks a lot for testing! Could you give me your Tested-by: tags?
 (e.g. Tested-by: Yang Zhang 3100100...@zju.edu.cn, Tested-by: Long 
 nxtxiaol...@gmail.com)

Applied this one.

Thanks,
Hitoshi

 
 Thanks,
 Hitoshi
 
  
  On Wed, Dec 17, 2014 at 1:09 PM, Hitoshi Mitake 
  mitake.hito...@lab.ntt.co.jp wrote:
  
   At Mon, 15 Dec 2014 23:14:55 +0900,
   Hitoshi Mitake wrote:
   
When a cluster has gateway nodes only, it means the gateway nodes
doesn't contribute to I/O of VMs. So this patch simply let them exit
and avoid the below recovery issue.
   
Related issue:
https://bugs.launchpad.net/sheepdog-project/+bug/1327037
   
Cc: duron...@qq.com
Cc: Yang Zhang 3100100...@zju.edu.cn
Cc: long nxtxiaol...@gmail.com
Signed-off-by: Hitoshi Mitake mitake.hito...@lab.ntt.co.jp
---
 sheep/group.c | 17 +
 1 file changed, 17 insertions(+)
  
   Yang, long, when you have time, could you test this patch?
  
   Thanks,
   Hitoshi
  
   
v2: remove needless logging
   
diff --git a/sheep/group.c b/sheep/group.c
index 095b7c5..5dc3284 100644
--- a/sheep/group.c
+++ b/sheep/group.c
@@ -1151,6 +1151,18 @@ main_fn void sd_accept_handler(const struct
   sd_node *joined,
  }
 }
   
+static bool is_gateway_only_cluster(const struct rb_root *nroot)
+{
+ struct sd_node *n;
+
+ rb_for_each_entry(n, nroot, rb) {
+ if (n-space)
+ return false;
+ }
+
+ return true;
+}
+
 main_fn void sd_leave_handler(const struct sd_node *left,
const struct rb_root *nroot, size_t 
nr_nodes)
 {
@@ -1177,6 +1189,11 @@ main_fn void sd_leave_handler(const struct
   sd_node *left,
  old_vnode_info = main_thread_get(current_vnode_info);
  main_thread_set(current_vnode_info, alloc_vnode_info(nroot));
  if (sys-cinfo.status == SD_STATUS_OK) {
+ if (is_gateway_only_cluster(nroot)) {
+ sd_info(only gateway nodes are remaining,
   exiting);
+ exit(0);
+ }
+
  ret = inc_and_log_epoch();
  if (ret != 0)
  panic(cannot log current epoch %d,
   sys-cinfo.epoch);
--
1.9.1
   
  
  [2  text/html; UTF-8 (quoted-printable)]
  
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog


Re: [sheepdog] Fwd: [PATCH 2/2] sheep: forbid revival of orphan objects

2014-12-16 Thread Hitoshi Mitake
At Wed, 17 Dec 2014 08:50:35 +0100,
Valerio Pachera wrote:
 
 2014-12-17 3:48 GMT+01:00 Hitoshi Mitake mitake.hito...@lab.ntt.co.jp:
  There's only this corner case to fix:
  all vdi are removed then and the disconnected node joins back the cluster
 
  Do you mean the problem is the below error messages?
 
 The problem is that:
 
 node 1, 2, 3, 4
 create vdi
 disconnect node 4
 remove *all* vdi
 reconnect node 4
 
 Node 1,2 and 3 are empty but node 4 doesn't remove the objects once
 rejoined the cluster.
 And it prints the below error messages.

Ah, I see. I'll work on it later.

Thanks,
Hitoshi

 -- 
 sheepdog mailing list
 sheepdog@lists.wpkg.org
 http://lists.wpkg.org/mailman/listinfo/sheepdog
-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog