(And the inverse would also be interesting to know.)

-John

On Nov 8, 2013, at 6:41 PM, John Daily <[email protected]> wrote:

> If you upload the files from Windows, and download them to the Ubuntu VM, do
> inconsistencies ever appear?
>
> -John
>
> On Nov 8, 2013, at 4:58 PM, Engel Sanchez <[email protected]> wrote:
>
>> Hello there,
>>
>> This looks puzzling. Just from looking at the code we haven't found anything
>> suspicious. Would you mind posting a pair of those files that failed to
>> match somewhere so we can look at the differences?
>>
>> Thanks for reporting this.
>>
>> Engel@Basho
>>
>>
>> On Fri, Nov 8, 2013 at 2:41 PM, finkle mcgraw <[email protected]> wrote:
>> Fellow Riak users,
>>
>> I've noticed that when I upload binary files with sizes of >~1 MB to Riak
>> from my Windows 7 (64 bit) machine, then read the same data back again,
>> it often has a few corrupted bytes, while maintaining the correct total
>> data length.
>>
>> Here's the Python script I use to provoke and detect the situation:
>> https://gist.github.com/anonymous/7376084
>>
>> Notice that I included the typical output when running the script at the
>> bottom of the gist. As you can see, for that particular run, half of the
>> dummy-data files were corrupted. The returned data from Riak has the exact
>> same length as the source, but not the exact same content. I've only done a
>> brief analysis of how the corruptions appear within the files that are
>> detected as corrupted, but it looks like typically between 1 and 5 bytes
>> are altered, evenly distributed within the file.
>>
>> I get no exceptions or warnings from the Riak Python client. Everything
>> appears to be in order.
>>
>> So far I've tested this on two different Windows machines against two
>> different Riak clusters (a five-node Amazon cluster with a load balancer in
>> front, and a local devcluster running inside an Ubuntu 12.04 Virtual
>> Machine). The problems appear in all four possible combinations.
>>
>> However, if I run the script from within an Ubuntu VM, on one of the said
>> Windows machines, against either of the two Riak clusters, the problems do
>> NOT appear.
>>
>> Another observation: If I generate 50 sample files, upload them, then
>> repeatedly try to download them over and over again, the script will detect
>> corruptions in different files on each repetition of downloading. E.g., on
>> round one it might say that files 1, 5, and 19 were corrupted, but on round
>> two it might say 3, 8, and 19.
>>
>> Here is the riak stats-view from the Amazon cluster we're running (that I
>> tested the script against):
>> https://gist.github.com/anonymous/7376379
>>
>> But as I said, the corruptions also appear when working locally between a
>> Win7 machine and a cluster running on a virtual Ubuntu 12.04 machine.
>>
>> Here are my local package versions, running on Python 2.7.5 64 bit on
>> Windows 7 64 bit:
>> protobuf==2.4.1
>> riak==2.0.1
>> riak-pb==1.4.1.1
>>
>> Any ideas? This seems relatively serious, unless it's some kind of brutal
>> oversight on my part.
>>
>> Finkle
>>
>>
>>
>> _______________________________________________
>> riak-users mailing list
>> [email protected]
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
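
For looking at the differences in a mismatched pair, something along these lines might help. It is only a rough sketch and not the script from the gist: the function name `diff_offsets` and the sample byte strings are made up for illustration, and it assumes both payloads are already in memory as byte strings of equal length (as reported above).

```python
def diff_offsets(original, returned):
    """Return (offset, original_byte, returned_byte) for each differing position.

    Hypothetical helper, not part of the Riak client or the gist.
    """
    if len(original) != len(returned):
        raise ValueError("lengths differ: %d vs %d" % (len(original), len(returned)))
    # bytearray yields integers when iterated on both Python 2 and Python 3.
    a = bytearray(original)
    b = bytearray(returned)
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    # Made-up sample data standing in for an uploaded file and what came back.
    source = b"\x00\x01\x02\x03\x04"
    fetched = b"\x00\x01\xff\x03\x04"
    for offset, sent, got in diff_offsets(source, fetched):
        print("offset %d: sent 0x%02x, got 0x%02x" % (offset, sent, got))
```

Listing the offsets and byte values for a few corrupted files may show whether the altered bytes follow a pattern (for example, particular values or positions), which the "different" flag alone does not reveal.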
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
