Re: bad CRC in data error on ARM
That always happens; every test run has such errors. Our cluster and client running on x86 work fine and have never shown a bad CRC error.

2015-05-16 17:30 GMT+08:00 Haomai Wang haomaiw...@gmail.com:

Does this always happen, or only occasionally?

On Sat, May 16, 2015 at 10:10 AM, huang jun hjwsm1...@gmail.com wrote:

hi, steve

2015-05-15 16:36 GMT+08:00 Steve Capper steve.cap...@linaro.org:

On 15 May 2015 at 00:51, huang jun hjwsm1...@gmail.com wrote:

hi, all

Hi HuangJun,

We run a Ceph cluster on an ARM platform (arm64, Linux kernel 3.14, Ubuntu 14.10) and use

dd if=/dev/zero of=/mnt/test bs=4M count=125

to write data. On the OSD side, we got a bad data CRC error.

The kclient log (tid=6):

May 14 17:21:08 node103 kernel: [ 180.194312] CPU[0] libceph: send_request ffc8d252f000 tid-6 to osd0 flags 36 pg 1.9aae829f req data size is 4194304
May 14 17:21:08 node103 kernel: [ 180.194316] CPU[0] libceph: tid-6 - ffc0702f66c8 to osd0 42=osd_op len 197+0+4194304 - libceph: tid-6 front_crc is 388648745 middle_crc is 0 data_crc is 3036014994

The OSD-0 log:

2015-05-13 08:12:50.049345 7f378d8d8700 0 seq 3 tid 6 front_len 197 mid_len 0 data_len 4194304
2015-05-13 08:12:50.049348 7f378d8d8700 0 crc in front 388648745 exp 388648745
2015-05-13 08:12:50.049395 7f378d8d8700 0 crc in middle 0 exp 0
2015-05-13 08:12:50.049964 7f378d8d8700 0 crc in data 0 exp 3036014994
2015-05-13 08:12:50.050234 7f378d8d8700 0 bad crc in data 0 != exp 3036014994

Some considerations:

1) We use the ceph 0.80.7 release and compiled it on ARM. Does this work? Or does Ceph's code have an ARM branch?

We did run a Ceph version close to that for 64-bit ARM; I'm checking out 0.80.7 now to test. In v9.0.0, there is some code to use the optional ARM crc32c instructions, but this isn't in 0.80.7.

2) We wrote 125 objects, and only a few of them report a CRC error. For the correct objects, data_crc is 0 on both the OSD and the kclient. For the wrong objects, data_crc is non-zero on the kclient, but the OSD calculates 0. The object data came from /dev/zero, so I think data_crc should be 0. Am I right?

If the initial CRC seed value is non-zero, then the CRC of a buffer full of zeros won't be zero. So ceph_crc32c(somethingnonzero, zerofilledbuffer, len) will be non-zero.

I would like to reproduce this problem here. What steps did you take before this error occurred? Is this a cephfs filesystem or something on top of an RBD image? Which kernel are you running? Is it the one that comes with Ubuntu? (If so, which package version is it?)

We use Linux kernel version 3.14, we have only tested on Ubuntu, and the Ceph version is v0.80.7. Both cephfs and RBD images have CRC problems. I'm not sure whether it's related to memory, since we tested many times but only a few reported a CRC error. As I mentioned, I suspect a memory fault changed the data, because we wrote 125 objects and every data_crc is 0 except that of the bad-CRC object.

Any tips are welcome.

Cheers,
--
Steve

--
thanks
huangjun

--
Best Regards,
Wheat

--
thanks
huangjun

--
To unsubscribe from this list: send the line unsubscribe ceph-devel in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: bad CRC in data error on ARM
Even if the data comes from /dev/zero, the data CRC shouldn't be 0. My guess is that the OSD (ARM) isn't computing the CRC at all. But from the code, CRC on ARM should be fine.

On Sat, May 16, 2015 at 6:21 PM, huang jun hjwsm1...@gmail.com wrote:

That always happens; every test run has such errors. Our cluster and client running on x86 work fine and have never shown a bad CRC error.

2015-05-16 17:30 GMT+08:00 Haomai Wang haomaiw...@gmail.com:

Does this always happen, or only occasionally?

--
Best Regards,
Wheat
Re: make check, src/test/ceph-disk.sh fails on Mint
Well, I do run it on Linux Mint, but the rest of the tests pass without any problems, so I was wondering whether there is a simple way to fix this one as well.

On Sat, May 16, 2015 at 10:30 PM, David Zafman dzaf...@redhat.com wrote:

Is something really broken? Or are you just on an unsupported platform?

David

On 5/16/15 8:49 AM, Michal Jarzabek wrote:

ceph_detect_init.exc.UnsupportedPlatform: Platform is not supported.: