On Fri, 6 Sep 2013, Nigel Williams wrote:
> I appreciate CephFS is not a high priority, but this is a
> user-experience test case that can be a source of stability bugs for
> Ceph developers to investigate (and hopefully resolve):
>
> CephFS test case
>
> 1. Create two clusters, each 3 nodes with 4 OSDs each
>
> 2. I used Ubuntu 13.04 followed by update/upgrade
>
> 3. Install Ceph version 0.61 on Cluster A
>
> 4. Install the newer release on Cluster B with ceph-deploy
>
> 5. Fill Cluster A (version 0.61) with about one million files (all sizes)
>
> 6. rsync Cluster A to Cluster B
>
> 7. In about 12 hours one or two OSDs on Cluster B will crash; restart
> the OSDs and restart the rsync
>
> 8. At around 75% full, the OSDs on Cluster B become unbalanced (some
> more full than others), and one or more OSDs will then crash.
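For steps 5 and 6 above, a minimal sketch of the fill-and-copy workload,
assuming both CephFS filesystems are mounted with the kernel client (the
monitor names, mount points, and ISO source directory below are
placeholders, not taken from the original report):

    # mount both filesystems with the CephFS kernel client
    sudo mkdir -p /mnt/cluster-a /mnt/cluster-b /mnt/iso
    sudo mount -t ceph mon-a:6789:/ /mnt/cluster-a -o name=admin,secretfile=/etc/ceph/a.secret
    sudo mount -t ceph mon-b:6789:/ /mnt/cluster-b -o name=admin,secretfile=/etc/ceph/b.secret

    # step 5: fill cluster A with a large tree of mixed-size files,
    # e.g. by unpacking old user-group CDROM ISOs (see the .ISO note
    # quoted further down)
    for iso in /data/isos/*.iso; do
        dest=/mnt/cluster-a/$(basename "$iso" .iso)
        sudo mkdir -p "$dest"
        sudo mount -o loop,ro "$iso" /mnt/iso
        sudo cp -a /mnt/iso/. "$dest"/
        sudo umount /mnt/iso
    done

    # step 6: copy everything to cluster B; rerunning the same command
    # after an OSD crash/restart resumes roughly where it left off
    sudo rsync -aH --partial /mnt/cluster-a/ /mnt/cluster-b/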
It sounds like the problem is that cluster B's pools have too few PGs,
making the data distribution get all out of whack.  What does
ceph osd dump | grep ^pool say, and how many OSDs do you have?

sage

> For (4) it is possible to use freely available .ISOs of old user-group
> CDROMs that are floating around the web; they are a good source of
> varied content size, directory size, and filename lengths.
>
> My impression is that 0.61 was relatively stable, but subsequent
> versions such as 0.67.2 are less stable in this particular scenario
> with CephFS.
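To see what Sage is asking about, and to raise the PG count if it does
turn out to be too low, something along these lines should work. The pool
names are the default data/metadata pools of a CephFS cluster of that era,
and the pg_num targets are only examples for a 12-OSD cluster (roughly 100
PGs per OSD, counting replicas, is the usual guidance); note that pg_num
can be raised but not lowered:

    # count the OSDs and show pg_num / pgp_num per pool
    ceph osd tree
    ceph osd dump | grep ^pool

    # raise the placement group count on the busy pool, and bump
    # pgp_num as well so the new PGs are actually used for placement
    ceph osd pool set data pg_num 512
    ceph osd pool set data pgp_num 512
    ceph osd pool set metadata pg_num 128
    ceph osd pool set metadata pgp_num 128

    # watch the cluster split and rebalance the PGs
    ceph -w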