Re: [radosgw] Race condition corrupting data on COPY ?
Hi,

> Can't make much out of it, will probably need rgw logs (and preferably
> also with 'debug ms = 1') for this issue.

Well, the problem is that I can't make it happen again. It happened 4 times during an import of ~3000 files. I'm trying to reproduce this on a test cluster, but so far no luck. I'll give it another shot tomorrow.

And I can't enable debug logging on prod for long periods: the log space is limited and would be filled in minutes with all the requests.

I've also disabled the use of copy in production anyway, because I can't have it corrupt random customer files.

Cheers,

    Sylvain
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [radosgw] Race condition corrupting data on COPY ?
On Mon, Mar 18, 2013 at 7:40 AM, Sylvain Munaut wrote:
> Hi,
>
>> What version are you using? Do you have logs?
>
> I'm running a custom build 0.56.3 + some patches (basically up to
> 7889c5412 + fixes for #4150 and #4177).
>
> I don't have any radosgw log (debug level is set to 0 and it didn't
> output anything).
> I have the HTTP logs:
>
> 10.0.0.253 s3.svc - [14/Mar/2013:09:23:14 +] "PUT /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34 HTTP/1.1" 200 0 "-" "Boto/2.6.0 (linux2)"
> 10.0.0.74 s3.svc - [14/Mar/2013:09:23:14 +] "GET /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34?Signature=XXX%3D&Expires=1363256594&AWSAccessKeyId=XXX HTTP/1.1" 200 622080 "-" "python-requests"
> 10.0.0.253 s3.svc - [14/Mar/2013:09:23:14 +] "PUT /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34 HTTP/1.1" 200 146 "-" "Boto/2.6.0 (linux2)"
> 10.0.0.74 s3.svc - [14/Mar/2013:10:14:53 +] "GET /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34?Signature=XXX%3D&Expires=1363258236&AWSAccessKeyId=XXX HTTP/1.1" 200 461220 "-" "python-requests"

Can't make much out of it, will probably need rgw logs (and preferably also with 'debug ms = 1') for this issue.

Yehuda
Re: [radosgw] Race condition corrupting data on COPY ?
Hi,

> What version are you using? Do you have logs?

I'm running a custom build 0.56.3 + some patches (basically up to 7889c5412 + fixes for #4150 and #4177).

I don't have any radosgw log (debug level is set to 0 and it didn't output anything).
I have the HTTP logs:

10.0.0.253 s3.svc - [14/Mar/2013:09:23:14 +] "PUT /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34 HTTP/1.1" 200 0 "-" "Boto/2.6.0 (linux2)"
10.0.0.74 s3.svc - [14/Mar/2013:09:23:14 +] "GET /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34?Signature=XXX%3D&Expires=1363256594&AWSAccessKeyId=XXX HTTP/1.1" 200 622080 "-" "python-requests"
10.0.0.253 s3.svc - [14/Mar/2013:09:23:14 +] "PUT /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34 HTTP/1.1" 200 146 "-" "Boto/2.6.0 (linux2)"
10.0.0.74 s3.svc - [14/Mar/2013:10:14:53 +] "GET /rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34?Signature=XXX%3D&Expires=1363258236&AWSAccessKeyId=XXX HTTP/1.1" 200 461220 "-" "python-requests"

Cheers,

    Sylvain
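For anyone wanting to scan a larger HTTP log for the same symptom, the access-log lines in this message can be checked mechanically. The following is only a rough sketch: the field layout in the regular expression is inferred from the four lines quoted here, and the helper names (`parse_line`, `size_changes`) are made up for illustration. It reports objects whose successful GETs returned different byte counts.

```python
import re
from collections import defaultdict

# Layout inferred from the quoted lines (an assumption, not a documented format):
# client vhost - [timestamp] "METHOD path HTTP/x.y" status bytes ...
LOG_RE = re.compile(
    r'^(?P<client>\S+) (?P<vhost>\S+) - \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def parse_line(line):
    """Parse one access-log line into a dict, or return None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    return {
        "client": m.group("client"),
        "ts": m.group("ts"),
        "method": m.group("method"),
        # Drop the query string so signed URLs map to the same object.
        "path": m.group("path").split("?", 1)[0],
        "status": int(m.group("status")),
        "size": int(m.group("size")),
    }


def size_changes(lines):
    """Return {path: [sizes]} for objects whose 200 GETs differ in size."""
    sizes = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec and rec["method"] == "GET" and rec["status"] == 200:
            sizes[rec["path"]].append(rec["size"])
    return {path: s for path, s in sizes.items() if len(set(s)) > 1}


KEY = "/rb/138e6898a8039db16df2146398626f0303ae3e97427fdad33c95b6034f690b34"
SAMPLE = [
    '10.0.0.253 s3.svc - [14/Mar/2013:09:23:14 +] "PUT ' + KEY
    + ' HTTP/1.1" 200 0 "-" "Boto/2.6.0 (linux2)"',
    '10.0.0.74 s3.svc - [14/Mar/2013:09:23:14 +] "GET ' + KEY
    + '?Signature=XXX%3D&Expires=1363256594&AWSAccessKeyId=XXX'
    + ' HTTP/1.1" 200 622080 "-" "python-requests"',
    '10.0.0.253 s3.svc - [14/Mar/2013:09:23:14 +] "PUT ' + KEY
    + ' HTTP/1.1" 200 146 "-" "Boto/2.6.0 (linux2)"',
    '10.0.0.74 s3.svc - [14/Mar/2013:10:14:53 +] "GET ' + KEY
    + '?Signature=XXX%3D&Expires=1363258236&AWSAccessKeyId=XXX'
    + ' HTTP/1.1" 200 461220 "-" "python-requests"',
]

if __name__ == "__main__":
    for path, seen in size_changes(SAMPLE).items():
        print(path, seen)
```

On the four lines above this flags the one object, with 622080 bytes served before the second PUT and 461220 bytes afterwards. A size difference alone does not prove corruption (an object can be legitimately overwritten), so treat hits as candidates to verify.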
Re: [radosgw] Race condition corrupting data on COPY ?
On Mon, Mar 18, 2013 at 2:50 AM, Sylvain Munaut wrote:
> Hi,
>
> I've just noticed something rather worrying on our cluster.
>
> Some files are apparently truncated. From the first look I had at it,
> it happened on files where there was a metadata update right after the
> file was stored. The exact sequence was:
>
>  - PUT to store the file
>  - GET to get the file (which at that point is still correct and has
>    the proper length)
>  - PUT using a 'copy source' over itself to update the metadata
>
> all of these happening sequentially in the same second, very quickly.
>
> Then subsequent GETs return a truncated file.
>
> I'm looking into it to narrow down the issue but I wanted to know if
> anyone had seen something similar ?

What version are you using? Do you have logs?

Thanks,
Yehuda
[radosgw] Race condition corrupting data on COPY ?
Hi,

I've just noticed something rather worrying on our cluster.

Some files are apparently truncated. From the first look I had at it, it happened on files where there was a metadata update right after the file was stored. The exact sequence was:

 - PUT to store the file
 - GET to get the file (which at that point is still correct and has the proper length)
 - PUT using a 'copy source' over itself to update the metadata

all of these happening sequentially in the same second, very quickly.

Then subsequent GETs return a truncated file.

I'm looking into it to narrow down the issue but I wanted to know if anyone had seen something similar ?

Cheers,

    Sylvain
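The PUT / GET / copy-over-itself sequence described in this message could be turned into a repeatable check along the following lines. This is a hedged sketch, not the script used on the cluster: the store interface (`put`, `get`, `copy_onto_self`) is hypothetical, and `InMemoryStore` is only an offline stand-in so the example runs without a gateway. Against a real radosgw, the same three methods would be backed by an S3 client.

```python
import hashlib
import os


class InMemoryStore:
    """Hypothetical stand-in for an S3-compatible client (assumption, runs offline)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        self._objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        return self._objects[key][0]

    def copy_onto_self(self, key, metadata):
        # Models a PUT whose copy source is the destination itself:
        # the data must stay untouched, only the metadata is replaced.
        data, _ = self._objects[key]
        self._objects[key] = (data, dict(metadata))


def check_once(store, key, payload):
    """Run PUT, GET, copy-over-itself, GET back to back and compare the data."""
    expected = hashlib.sha256(payload).hexdigest()
    store.put(key, payload)
    before = store.get(key)
    store.copy_onto_self(key, {"updated": "yes"})
    after = store.get(key)
    return {
        "key": key,
        "expected_len": len(payload),
        "len_before_copy": len(before),
        "len_after_copy": len(after),
        "intact": hashlib.sha256(after).hexdigest() == expected,
    }


def run_trials(store, count, size):
    """Repeat the sequence with random payloads; return only the failing trials."""
    failures = []
    for i in range(count):
        payload = os.urandom(size)
        result = check_once(store, "trial-%06d" % i, payload)
        if not result["intact"]:
            failures.append(result)
    return failures


if __name__ == "__main__":
    failures = run_trials(InMemoryStore(), count=100, size=622080)
    print("%d corrupted objects" % len(failures))
```

Since the problem showed up only a handful of times in a few thousand files, a real run would likely need a large trial count, and possibly several clients hitting the gateway concurrently, before anything turns up.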