Josh, how big are all the logs?

On Tue, Sep 23, 2014 at 9:43 PM, Josh Elser <[email protected]> wrote:
> Well, color me shocked -- the verify found some bad data. It looks
> like two keys have bad checksums (which I assume is what created the
> UNDEFINEDs, too?).
>
> CORRUPT 2
> REFERENCED 2199999908
> UNDEFINED 2
> UNREFERENCED 874770
>
> I ran two tabletservers on my desktop, turned on hflush instead of
> hsync, switched from GZ to snappy and upped the splits threshold for
> 4g and let CI run for ~5 hours. I killed the tservers about a dozen
> times by hand throughout the day (kill -9), and the master once or
> twice. The datanode was left alone. This was running on 2.6.0-SNAPSHOT
> from around 9/14/2014.
>
> The offending keys are:
>
> 389a85668b6ebf8e 2ff6:4a78 [] 1411499115242
>
> 3a10885b-d481-4d00-be00-0477e231ey65:000000008576b169:0cd98965c9ccc1d0:ba15529e
>
> and
>
> 7e56b58a0c7df128 5fa0:6249 [] 1411499311578
>
> 3a10885b-d481-4d00-be00-0477e231e965:0000p000872d60eb:499fa72752d82a7c:5c5f19e8
>
> which both happened a little after 3:00pm eastern (I stopped CI around
> 3:30pm eastern). I don't see anything immediately wrong in the tserver
> logs (nor does it appear that I had restarted either of them around
> the timestamp of the above keys). I see no errors in the DN logs
> either around that time window.
>
> I don't have a clue how to even start looking at this to figure out if
> something indeed went wrong, or if it's some other sort of issue. To
> be clear, this as it stands isn't sufficient to make me change my
> vote.
>
> On Tue, Sep 23, 2014 at 3:04 PM, Josh Elser <[email protected]> wrote:
> > +1
> >
> > * Verified checksums+sigs
> > * Build from source tarball and ran all unit+functional tests against
> > Apache Hadoop 2.5.1 and 2.6.0-SNAPSHOT
> > * Ingested 2B records w/ CI + clean verify with single tserver (Apache
> > Hadoop 2.6.0-SNAPSHOT + Apache ZooKeeper 3.4.5)
> > * Ingested ~2.5B records w/ CI with 2 tservers and some manual
> > agitation (Apache Hadoop 2.6.0-SNAPSHOT + Apache ZooKeeper 3.4.5)
> >   - Currently running verify, will report if I get a failed verify
> > * Ran some Hive queries (w/ Apache Hive-0.14.0-SNAPSHOT & Apache Tez
> > 0.6.0-SNAPSHOT)
> > * Ran some Pig queries (w/ Apache Pig-0.13.0)
> >
> > Thanks for organizing this, Corey!!
> >
> > On Fri, Sep 19, 2014 at 10:49 PM, Corey Nolet <[email protected]> wrote:
> >> Devs,
> >>
> >> Please consider the following candidate for Apache Accumulo 1.6.1
> >>
> >> Branch: 1.6.1-rc1
> >> SHA1: 88c5473b3b49d797d3dabebd12fe517e9b248ba2
> >> Staging Repository:
> >> https://repository.apache.org/content/repositories/orgapacheaccumulo-1017/
> >>
> >> Source tarball:
> >> http://repository.apache.org/content/repositories/orgapacheaccumulo-1017/org/apache/accumulo/accumulo/1.6.1/accumulo-1.6.1-src.tar.gz
> >> Binary tarball:
> >> http://repository.apache.org/content/repositories/orgapacheaccumulo-1017/org/apache/accumulo/accumulo/1.6.1/accumulo-1.6.1-bin.tar.gz
> >> (Append ".sha1", ".md5" or ".asc" to download the signature/hash for a
> >> given artifact.)
> >>
> >> Signing keys available at: https://www.apache.org/dist/accumulo/KEYS
> >>
> >> Over 1.6.1, we have 188 issues resolved
> >> https://git-wip-us.apache.org/repos/asf?p=accumulo.git;a=blob;f=CHANGES;h=91b9d31e3b9dc53f1a576cc49bbc061919eb0070;hb=1.6.1-rc1
> >>
> >> Testing: All unit and functional tests are passing.
> >>
> >> Vote will be open until Thursday, September 25th 12:00AM UTC (9/24 8:00PM
> >> ET, 9/24 5:00PM PT)
>

--
Sean
