Re: bad CRC in data error on ARM

2015-05-16 Thread huang jun
That always happens; every test produces such errors. Our cluster and
client running on x86 work fine and have never shown a bad CRC error.


2015-05-16 17:30 GMT+08:00 Haomai Wang haomaiw...@gmail.com:
 Does this always happen, or only occasionally?

 On Sat, May 16, 2015 at 10:10 AM, huang jun hjwsm1...@gmail.com wrote:
 hi,steve

 2015-05-15 16:36 GMT+08:00 Steve Capper steve.cap...@linaro.org:
 On 15 May 2015 at 00:51, huang jun hjwsm1...@gmail.com wrote:
 hi,all

 Hi HuangJun,


 We run a Ceph cluster on an ARM platform (arm64, Linux kernel 3.14,
 Ubuntu 14.10) and use dd if=/dev/zero of=/mnt/test bs=4M count=125
 to write data. On the OSD side, we get a bad data CRC error.

 The kclient log: (tid=6)
 May 14 17:21:08 node103 kernel: [  180.194312] CPU[0] libceph: send_request ffc8d252f000 tid-6 to osd0 flags 36 pg 1.9aae829f req data size is 4194304
 May 14 17:21:08 node103 kernel: [  180.194316] CPU[0] libceph: tid-6 - ffc0702f66c8 to osd0 42=osd_op len 197+0+4194304 -
 libceph: tid-6 front_crc is 388648745 middle_crc is 0 data_crc is 3036014994

 The OSD-0 log:
 2015-05-13 08:12:50.049345 7f378d8d8700  0 seq  3 tid 6 front_len 197 mid_len 0 data_len 4194304
 2015-05-13 08:12:50.049348 7f378d8d8700  0 crc in front 388648745 exp 388648745
 2015-05-13 08:12:50.049395 7f378d8d8700  0 crc in middle 0 exp 0
 2015-05-13 08:12:50.049964 7f378d8d8700  0 crc in data 0 exp 3036014994
 2015-05-13 08:12:50.050234 7f378d8d8700  0 bad crc in data 0 != exp 3036014994

 Some considerations:
 1) We use the Ceph 0.80.7 release and compiled it on ARM. Should this
 work? Or does Ceph's code have an ARM branch?

 We did run a Ceph version close to that for 64-bit ARM, I'm checking
 out 0.80.7 now to test.
 In v9.0.0, there is some code to use the ARM optional crc32c
 instructions, but this isn't in 0.80.7.


 2) We wrote 125 objects, and only a few of them report a CRC error.
 For the good objects, data_crc is 0 on both the OSD and the kclient.
 For the bad objects, data_crc is non-zero on the kclient, but the OSD
 calculates 0. The object data came from /dev/zero, so I think the
 data_crc should be 0. Am I right?


 If the initial CRC seed value is non-zero, then the CRC of a buffer
 full of zeros won't be zero.
 So ceph_crc32c(somethingnonzero, zerofilledbuffer, len), will be non-zero.
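
 The seed behaviour described above can be checked with a small sketch.
 This is not Ceph's implementation: crc32c_update is a hypothetical
 helper, written here as a plain bitwise CRC-32C (Castagnoli polynomial,
 reflected form 0x82F63B78) on the assumption that the update applies no
 initial or final inversion, so that the seed is used exactly as passed
 in. A 4 KiB buffer stands in for the 4 MiB object to keep it quick.

```python
def crc32c_update(crc, data):
    """Raw CRC-32C update over data, starting from seed crc.

    Illustrative helper only: no initial or final bit inversion is
    applied, which is an assumption about how the seed is used.
    """
    poly = 0x82F63B78  # CRC-32C (Castagnoli) polynomial, reflected
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if it was set.
            crc = (crc >> 1) ^ (poly if crc & 1 else 0)
    return crc & 0xFFFFFFFF


zeros = bytes(4096)  # zero-filled buffer, as if read from /dev/zero

# Seed 0 over all-zero data never leaves the zero state.
print(crc32c_update(0, zeros))

# Any non-zero seed over the same data yields a non-zero CRC.
print(crc32c_update(0xFFFFFFFF, zeros) != 0)
```

 With a zero seed the register never leaves zero, so a data_crc of 0
 for zero-filled data is only expected when the seed itself is 0.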

 I would like to reproduce this problem here.
 What steps did you take before this error occurred?
 Is this a cephfs filesystem or something on top of an RBD image?
 Which kernel are you running? Is it the one that comes with Ubuntu?
 (If so which package version is it?)

 We use Linux kernel version 3.14, tested only on Ubuntu, with Ceph
 v0.80.7. Both CephFS and RBD images show the CRC problem.
 I'm not sure whether it is related to memory: we tested many times,
 but only a few writes reported a CRC error.
 As I mentioned, I suspect a memory fault changed the data, because we
 wrote 125 objects and every data_crc is 0 except that of the object
 with the bad CRC. Any tips are welcome.

 Cheers,
 --
 Steve



 --
 thanks
 huangjun
 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html



 --
 Best Regards,

 Wheat



-- 
thanks
huangjun


[no subject]

2015-05-16 Thread Haomai Wang
Even if the data comes from /dev/zero, the data CRC shouldn't be 0.

I guess the OSD (ARM) isn't computing the CRC. But judging from the code,
CRC on ARM should be fine.

-- 
Best Regards,

Wheat


Re: make check, src/test/ceph-disk.sh fails on Mint

2015-05-16 Thread Michal Jarzabek
Well, I do run it on Linux Mint, but the rest of the tests pass without
any problems, so I was wondering whether there is a simple way to fix
this one as well.

On Sat, May 16, 2015 at 10:30 PM, David Zafman dzaf...@redhat.com wrote:

 Is something really broken?  Or are you just on an unsupported platform?

 David


 On 5/16/15 8:49 AM, Michal Jarzabek wrote:

 ceph_detect_init.exc.UnsupportedPlatform: Platform is not supported.:

