[ https://issues.apache.org/jira/browse/ACCUMULO-368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Keith Turner resolved ACCUMULO-368.
-----------------------------------

    Resolution: Fixed

> tablet had location but was not loaded
> --------------------------------------
>
>                 Key: ACCUMULO-368
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-368
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.3.5
>         Environment: Running random walktest against 1.4.0-SNAP on 10 node cluster
>            Reporter: Keith Turner
>            Assignee: Keith Turner
>             Fix For: 1.4.0
>
>
> While running the random walk test a delete range operation got hung because it could not split a tablet. The tablet in question failed to load because the tablet server thought it was already serving it.
> {noformat}
> 03 11:19:18,249 [tabletserver.Tablet] TABLET_HIST: 3nq;77cd1e415c4547a4< split 3nq;133660072804a502< 3nq;77cd1e415c4547a4;133660072804a502
> 03 11:19:18,249 [tabletserver.Tablet] TABLET_HIST: 3nq;133660072804a502< opened
> 03 11:19:26,236 [tabletserver.Tablet] TABLET_HIST: 3nq;133660072804a502< import /b-0005t8f/I0005t8g.rf 388308 0
> 03 11:19:45,672 [tabletserver.Tablet] TABLET_HIST: 3nq;133660072804a502< MinC [memory] -> /t-0005typ/F0005tz4.rf
> 03 11:19:45,686 [tabletserver.Tablet] TABLET_HIST: 3nq;133660072804a502< closed
> 03 11:19:45,840 [tabletserver.Tablet] TABLET_HIST: 3nq;133660072804a502< opened
> 03 11:19:45,987 [tabletserver.Tablet] TABLET_HIST: 3nq;133660072804a502< closed
> 03 11:19:46,142 [tabletserver.TabletServer] INFO : Loading tablet 3nq;133660072804a502<
> 03 11:19:46,144 [tabletserver.TabletServer] ERROR: Tablet seems to be already assigned to xxx.xxx.xxx.9:9997[135396fb18d3fb0]
> 03 11:19:46,144 [tabletserver.TabletServer] INFO : Reporting tablet 3nq;133660072804a502< assignment failure: unable to verify Tablet Information
> {noformat}
> Looking at the walogs below it seems that the data mutations for the last successful open and close were written in reverse order.
> {noformat}
> 1 mutations:
>   3nq;133660072804a502
>     ~tab:~pr [system]:959756 [] ^@
>     srv:dir [system]:959756 [] /t-0005typ
>     srv:time [system]:959756 [] M1328267935757
>     loc:135396fb18d3fb0 [system]:959756 [] xxx.xxx.xxx.9:9997
>     future:135396fb18d3fb0 [system]:959756 [] <deleted>
>     srv:lock [system]:959756 [] tservers/xxx.xxx.xxx.9:9997/zlock-0000000000$135396fb18d3fb0
> MUTATION 6462 5
> 1 mutations:
>   3nq;133660072804a502
>     file:/b-0005t8f/I0005t8g.rf [system]:959986 [] 388308,0
>     loaded:/b-0005t8f/I0005t8g.rf [system]:959986 [] 1681970597222144296
>     srv:time [system]:959986 [] M1328267935757
>     srv:lock [system]:959986 [] tservers/xxx.xxx.xxx.9:9997/zlock-0000000000$135396fb18d3fb0
> MUTATION 6462 5
> 1 mutations:
>   3nq;133660072804a502
>     file:/t-0005typ/F0005tz4.rf [system]:960298 [] 185156,44330
>     srv:time [system]:960298 [] M1328267963158
>     last:135396fb18d3fb0 [system]:960298 [] xxx.xxx.xxx.9:9997
>     log:xxx.xxx.xxx.12:11224/cad1617c-5fb2-4057-abec-8edd46d0cf7a [system]:960298 [] <deleted>
>     log:xxx.xxx.xxx.5:11224/50611604-8e6c-48a8-8e16-eb739a991721 [system]:960298 [] <deleted>
>     srv:flush [system]:960298 [] 0
>     srv:lock [system]:960298 [] tservers/xxx.xxx.xxx.9:9997/zlock-0000000000$135396fb18d3fb0
> MANY_MUTATIONS 6462 5
> 1 mutations:
>   3nq;133660072804a502
>     loc:135396fb18d3fb0 [system]:960302 [] <deleted>
> MANY_MUTATIONS 6462 5
> 1 mutations:
>   3nq;133660072804a502
>     future:135396fb18d3fb0 [system]:960321 [] xxx.xxx.xxx.9:9997
> MANY_MUTATIONS 6462 5
> 1 mutations:
>   3nq;133660072804a502
>     loc:135396fb18d3fb0 [system]:960326 [] <deleted>
> MANY_MUTATIONS 6462 5
> 1 mutations:
>   3nq;133660072804a502
>     loc:135396fb18d3fb0 [system]:960332 [] xxx.xxx.xxx.9:9997
>     future:135396fb18d3fb0 [system]:960332 [] <deleted>
> {noformat}
> Looking at the tablet server code, a tablet is put in online tablets and then the location is written to the metadata table. Since the tablet is in online tablets it could be unloaded.
> I think that is what happened here. In the short period of time between putting the tablet in onlinetablets and writing the location to the metadata table, the tablet was unloaded.
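
The suspected interleaving can be replayed in a small, single-threaded toy model. This is only a sketch under stated assumptions: the class and every method name in it (TabletLoadOrderSketch, putOnline, writeLocation, unload) are hypothetical and are not Accumulo code, the online set and metadata "loc" entry are reduced to a plain set and a string, and the "location first" ordering is shown purely for contrast, not as the change that resolved this issue.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: every name here is hypothetical; none of this is Accumulo code. */
final class TabletLoadOrderSketch {
    static final String EXTENT = "3nq;133660072804a502<";
    static final String SERVER = "xxx.xxx.xxx.9:9997";

    // Stand-in for the tablet server's in-memory set of online tablets.
    private final Set<String> onlineTablets = new HashSet<>();
    // Stand-in for the tablet's "loc" entry in the metadata table; null means no location.
    private String metadataLoc = null;
    // Order in which "loc" writes reached the (simulated) metadata table.
    private final List<String> metadataWrites = new ArrayList<>();

    void putOnline(String extent) {
        onlineTablets.add(extent);
    }

    void writeLocation(String server) {
        metadataLoc = server;
        metadataWrites.add("loc set");
    }

    /** Unload succeeds only if the tablet is currently in the online set. */
    boolean unload(String extent) {
        if (!onlineTablets.remove(extent)) {
            return false;
        }
        metadataLoc = null;
        metadataWrites.add("loc deleted");
        return true;
    }

    /** The inconsistent state from the issue title. */
    boolean hasLocationButNotLoaded(String extent) {
        return metadataLoc != null && !onlineTablets.contains(extent);
    }

    List<String> metadataWrites() {
        return metadataWrites;
    }

    /** Load puts the tablet online first; an unload slips in before the location write. */
    static TabletLoadOrderSketch onlineFirst() {
        TabletLoadOrderSketch s = new TabletLoadOrderSketch();
        s.putOnline(EXTENT);
        s.unload(EXTENT);        // succeeds: the tablet is already visible as online
        s.writeLocation(SERVER); // late write leaves a location for an unloaded tablet
        return s;
    }

    /** Contrast case: the location is written before the tablet becomes visible. */
    static TabletLoadOrderSketch locationFirst() {
        TabletLoadOrderSketch s = new TabletLoadOrderSketch();
        s.writeLocation(SERVER);
        s.unload(EXTENT);        // no-op: the tablet is not online yet
        s.putOnline(EXTENT);
        return s;
    }
}
```

In the onlineFirst() replay the simulated metadata table sees "loc deleted" before "loc set", which mirrors the reversed order of the last two loc mutations in the walog dump (960326 deleted, then 960332 set) and ends in the "had location but was not loaded" state. A real tablet server would need proper synchronization between load and unload rather than a simple reordering; the contrast case only shows that the window disappears when unload cannot observe the tablet before its location is recorded.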