[ https://issues.apache.org/jira/browse/ACCUMULO-4065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15032470#comment-15032470 ]

ASF GitHub Bot commented on ACCUMULO-4065:
------------------------------------------

Github user joshelser commented on the pull request:

    https://github.com/apache/accumulo/pull/54#issuecomment-160762876
  
    > Can you expand on this? I am really curious about the behavior of Thrift when a oneway throws an exception on the server side.
    
    As best I understand it, the oneway modifier just means that the generated code does not call the method that reads the response for an RPC off the wire. This has multiple implications:
    
    1. The caller does not know whether the application received the message (only that the network connection succeeded).
    2. The caller does still receive a response object (though this is how void works, not oneway).
    3. If you try to run another RPC that isn't oneway on the same connection, it's possible that the synchronous call will get goofed up by that oneway's void response (this might be a Thrift bug; it's certainly a pain to work around).
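
    Points 1 and 3 can be sketched with a toy model. This is not real Thrift code: the class `OnewayDesyncSketch`, its `Frame` type, and the `serverWritesReply` flag are hypothetical names made up for illustration, and the "wire" is just an in-memory queue. It assumes the server writes something back for a oneway call that the client never reads, and that a two-way call reads the oldest unread frame and compares sequence ids. The error strings only mimic the messages in the logs quoted below.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a single client connection; none of this is real Thrift code. */
final class OnewayDesyncSketch {

  /** A reply frame sitting on the wire, waiting to be read by the client. */
  static final class Frame {
    final String method;
    final int seqId;
    final boolean isException;

    Frame(String method, int seqId, boolean isException) {
      this.method = method;
      this.seqId = seqId;
      this.isException = isException;
    }
  }

  // Replies the (simulated) server has written that the client has not read yet.
  private final Deque<Frame> wire = new ArrayDeque<>();
  private int seqId = 0;

  /** oneway: send and return immediately; the client never reads a reply. */
  void sendOneway(String method, boolean serverWritesReply) {
    int id = ++seqId;
    if (serverWritesReply) {
      // Assumption: the server writes an error reply even though the call is oneway.
      wire.add(new Frame(method, id, true));
    }
    // Deliberately no read here, so any reply stays on the wire.
  }

  /** Two-way call: send, let the server reply, then read the oldest unread frame. */
  String call(String method) {
    int id = ++seqId;
    wire.add(new Frame(method, id, false)); // the server's reply to this call
    Frame reply = wire.poll();              // may be a stale frame from an earlier call
    if (reply.isException) {
      return "error: Internal error processing " + reply.method;
    }
    if (reply.seqId != id) {
      return "error: " + method + " failed: out of sequence response";
    }
    return "ok: " + method;
  }

  /** One oneway call followed by two synchronous calls on the same connection. */
  static String[] demo(boolean serverWritesReply) {
    OnewayDesyncSketch conn = new OnewayDesyncSketch();
    conn.sendOneway("flush", serverWritesReply);
    return new String[] {conn.call("startScan"), conn.call("closeMultiScan")};
  }

  public static void main(String[] args) {
    for (boolean reply : new boolean[] {false, true}) {
      for (String outcome : demo(reply)) {
        System.out.println("serverWritesReply=" + reply + " -> " + outcome);
      }
    }
  }
}
```

    In this model, once a single unread frame is left behind, every later synchronous call on that connection reads its predecessor's reply, so the connection stays out of step until it is thrown away.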
    
    Let me test what actually happens in the client code when the oneway throws an exception. I have a little test harness that I used to play around with this.


> Strange temporary errors in Master after upgrade
> ------------------------------------------------
>
>                 Key: ACCUMULO-4065
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4065
>             Project: Accumulo
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.6.4, 1.7.0
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 1.6.5, 1.7.1, 1.8.0
>
>
> I'm running into a problem that I saw quite a while back in ACCUMULO-3653.
> I'm still trying to understand what happened, but what I understand so far is 
> that Accumulo was running, a newer version was installed beside the running 
> version, Accumulo was stopped, the symlink was changed, and the new version was 
> started. After this, we started seeing a number of errors in the Master. 
> Shortly after that, the cluster was restarted and the errors stopped 
> happening.
> This is what I can extract from the logs:
> {noformat}
> 2015-11-19 22:42:47,115 [rpc.TServerUtils] DEBUG: Instantiating default, 
> unsecure custom half-async Thrift server
> 2015-11-19 22:42:47,122 [master.Master] INFO : Started replication 
> coordinator service at host3:10001
> 2015-11-19 22:42:47,158 [master.Master] ERROR: Error processing table state 
> for store Normal Tablets
> java.lang.RuntimeException: java.lang.RuntimeException: Failed to create 
> iterator
>       at 
> org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:72)
>       at 
> org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:56)
>       at 
> org.apache.accumulo.server.master.state.MetaDataStateStore.iterator(MetaDataStateStore.java:62)
>       at 
> org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:172)
> Caused by: java.lang.RuntimeException: Failed to create iterator
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.<init>(TabletServerBatchReaderIterator.java:158)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReader.iterator(TabletServerBatchReader.java:115)
>       at 
> org.apache.accumulo.server.master.state.MetaDataTableScanner.<init>(MetaDataTableScanner.java:66)
>       ... 3 more
> Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: 
> Error on server host3:9997
>       at 
> org.apache.accumulo.core.client.impl.ThriftScanner.getBatchFromServer(ThriftScanner.java:116)
>       at 
> org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablet(MetadataLocationObtainer.java:95)
>       at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocation(TabletLocatorImpl.java:463)
>       at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocationAndCheckLock(TabletLocatorImpl.java:634)
>       at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:625)
>       at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.binRanges(TabletLocatorImpl.java:280)
>       at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.binRanges(TabletLocatorImpl.java:355)
>       at 
> org.apache.accumulo.core.client.impl.TimeoutTabletLocator.binRanges(TimeoutTabletLocator.java:100)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.binRanges(TabletServerBatchReaderIterator.java:233)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.lookup(TabletServerBatchReaderIterator.java:220)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.<init>(TabletServerBatchReaderIterator.java:154)
>       ... 5 more
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> flush
>       at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startScan(TabletClientService.java:232)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startScan(TabletClientService.java:208)
>       at 
> org.apache.accumulo.core.client.impl.ThriftScanner.getBatchFromServer(ThriftScanner.java:98)
>       ... 15 more
> 2015-11-19 22:42:47,178 [impl.ThriftScanner] DEBUG: Scan failed, not serving 
> tablet (+r<<,host4:9997,35121a475360010)
> 2015-11-19 22:42:47,202 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : NotServingTabletException(extent:TKeyExtent(table:2B 72, 
> endRow:null, prevEndRow:null))
> 2015-11-19 22:42:47,283 [impl.ThriftScanner] DEBUG: Scan failed, not serving 
> tablet (+r<<,host4:9997,35121a475360010)
> 2015-11-19 22:42:47,372 [impl.TabletServerBatchReaderIterator] DEBUG: Server 
> : host4:9997 msg : startMultiScan failed: unknown result
> org.apache.thrift.TApplicationException: startMultiScan failed: unknown result
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:324)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:42:47,373 [impl.TabletServerBatchReaderIterator] WARN : Error 
> on server host4:9997
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server 
> host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:695)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: startMultiScan failed: 
> unknown result
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:324)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       ... 6 more
> 2015-11-19 22:42:47,376 [master.Master] ERROR: Error processing table state 
> for store Metadata Tablets
> java.lang.RuntimeException: 
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server 
> host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.hasNext(TabletServerBatchReaderIterator.java:181)
>       at 
> org.apache.accumulo.server.master.state.MetaDataTableScanner.hasNext(MetaDataTableScanner.java:121)
>       at 
> org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:173)
> Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: 
> Error on server host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:695)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: startMultiScan failed: 
> unknown result
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:324)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       ... 6 more
> {noformat}
> A bit later:
> {noformat}
> 2015-11-19 22:43:04,572 [recovery.RecoveryManager] DEBUG: Recovering 
> hdfs://mycluster/apps/accumulo/data/wal/host4+9997/a2831ffa-c980-47bf-9f33-14716a0df6ec
>  to 
> hdfs://mycluster/apps/accumulo/data/recovery/a2831ffa-c980-47bf-9f33-14716a0df6ec
> 2015-11-19 22:43:04,575 [impl.TabletServerBatchReaderIterator] DEBUG: Server 
> : host4:9997 msg : closeMultiScan failed: out of sequence response
> org.apache.thrift.TApplicationException: closeMultiScan failed: out of 
> sequence response
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_closeMultiScan(TabletClientService.java:371)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.closeMultiScan(TabletClientService.java:357)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:681)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:04,575 [impl.TabletServerBatchReaderIterator] WARN : Error 
> on server host4:9997
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server 
> host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:695)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: closeMultiScan failed: 
> out of sequence response
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_closeMultiScan(TabletClientService.java:371)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.closeMultiScan(TabletClientService.java:357)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:681)
>       ... 6 more
> 2015-11-19 22:43:04,576 [master.Master] ERROR: Error processing table state 
> for store Metadata Tablets
> java.lang.RuntimeException: 
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server 
> host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.hasNext(TabletServerBatchReaderIterator.java:181)
>       at 
> org.apache.accumulo.server.master.state.MetaDataTableScanner.hasNext(MetaDataTableScanner.java:121)
>       at 
> org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:173)
> Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: 
> Error on server host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:695)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: closeMultiScan failed: 
> out of sequence response
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_closeMultiScan(TabletClientService.java:371)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.closeMultiScan(TabletClientService.java:357)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:681)
>       ... 6 more
> 2015-11-19 22:43:04,882 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got c
> 2015-11-19 22:43:04,985 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got 0
> 2015-11-19 22:43:05,089 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got 16
> 2015-11-19 22:43:05,192 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffffd6
> 2015-11-19 22:43:05,296 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got fffffff1
> 2015-11-19 22:43:05,399 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffffb7
> 2015-11-19 22:43:05,502 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffffe4
> 2015-11-19 22:43:05,605 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffff98
> 2015-11-19 22:43:05,687 [impl.TabletServerBatchReaderIterator] DEBUG: Server 
> : host4:9997 msg : Expected protocol id ffffff82 but got fffffff7
> org.apache.thrift.protocol.TProtocolException: Expected protocol id ffffff82 
> but got fffffff7
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:472)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:317)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:05,688 [impl.TabletServerBatchReaderIterator] DEBUG: 
> org.apache.thrift.protocol.TProtocolException: Expected protocol id ffffff82 
> but got fffffff7
> java.io.IOException: org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got fffffff7
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:702)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.protocol.TProtocolException: Expected protocol 
> id ffffff82 but got fffffff7
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:472)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:317)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       ... 6 more
> 2015-11-19 22:43:05,708 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffffcf
> 2015-11-19 22:43:05,793 [impl.TabletServerBatchReaderIterator] DEBUG: Server 
> : host4:9997 msg : Expected protocol id ffffff82 but got ffffffc6
> org.apache.thrift.protocol.TProtocolException: Expected protocol id ffffff82 
> but got ffffffc6
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:472)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:317)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:05,794 [impl.TabletServerBatchReaderIterator] DEBUG: 
> org.apache.thrift.protocol.TProtocolException: Expected protocol id ffffff82 
> but got ffffffc6
> java.io.IOException: org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffffc6
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:702)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.protocol.TProtocolException: Expected protocol 
> id ffffff82 but got ffffffc6
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:472)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:317)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       ... 6 more
> 2015-11-19 22:43:05,810 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got ffffffd4
> 2015-11-19 22:43:05,913 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got 1
> 2015-11-19 22:43:05,960 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got 1c
> 2015-11-19 22:43:05,997 [impl.TabletServerBatchReaderIterator] DEBUG: Server 
> : host4:9997 msg : Expected protocol id ffffff82 but got 19
> org.apache.thrift.protocol.TProtocolException: Expected protocol id ffffff82 
> but got 19
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:472)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:317)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:05,998 [impl.TabletServerBatchReaderIterator] DEBUG: 
> org.apache.thrift.protocol.TProtocolException: Expected protocol id ffffff82 
> but got 19
> java.io.IOException: org.apache.thrift.protocol.TProtocolException: Expected 
> protocol id ffffff82 but got 19
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:702)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.protocol.TProtocolException: Expected protocol 
> id ffffff82 but got 19
>       at 
> org.apache.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:472)
>       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startMultiScan(TabletClientService.java:317)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startMultiScan(TabletClientService.java:297)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:634)
>       ... 6 more
> 2015-11-19 22:43:06,006 [master.Master] WARN : Lost servers 
> [host5:9997[25121a475480008]]
> {noformat}
> And even later:
> {noformat}
> 2015-11-19 22:43:41,810 [tracer.ZooTraceClient] DEBUG: Processing event for 
> trace server zk watch
> 2015-11-19 22:43:41,812 [tracer.ZooTraceClient] DEBUG: Scanning trace hosts 
> in zookeeper: /tracers
> 2015-11-19 22:43:41,813 [tracer.ZooTraceClient] DEBUG: Trace hosts: 
> [10.240.0.76:12234, 10.240.0.76:12234]
> 2015-11-19 22:43:42,066 [impl.TabletServerBatchReaderIterator] WARN : null 
> column family
> java.lang.IllegalArgumentException: null column family
>       at org.apache.accumulo.core.data.Key.<init>(Key.java:391)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:647)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:42,070 [master.Master] ERROR: Error processing table state 
> for store Metadata Tablets
> java.lang.IllegalArgumentException: null column family
>       at org.apache.accumulo.core.data.Key.<init>(Key.java:391)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:647)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:43,178 [impl.TabletServerBatchReaderIterator] WARN : null 
> column family
> java.lang.IllegalArgumentException: null column family
>       at org.apache.accumulo.core.data.Key.<init>(Key.java:391)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:647)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:43,178 [master.Master] ERROR: Error processing table state 
> for store Metadata Tablets
> java.lang.IllegalArgumentException: null column family
>       at org.apache.accumulo.core.data.Key.<init>(Key.java:391)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:647)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:43:44,284 [impl.TabletServerBatchReaderIterator] WARN : null 
> column family
> java.lang.IllegalArgumentException: null column family
>       at org.apache.accumulo.core.data.Key.<init>(Key.java:391)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator.doLookup(TabletServerBatchReaderIterator.java:647)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchReaderIterator$QueryTask.run(TabletServerBatchReaderIterator.java:349)
>       at org.apache.htrace.wrappers.TraceRunnable.run(TraceRunnable.java:57)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> {noformat}
> And even more:
> {noformat}
> 2015-11-19 22:44:05,375 [recovery.RecoveryManager] DEBUG: Recovering 
> hdfs://mycluster/apps/accumulo/data/wal/host4+9997/a2831ffa-c980-47bf-9f33-14716a0df6ec
>  to 
> hdfs://mycluster/apps/accumulo/data/recovery/a2831ffa-c980-47bf-9f33-14716a0df6ec
> 2015-11-19 22:44:05,385 [master.Master] DEBUG: 2 assigned to dead servers: 
> [!0;~<@(null,host4:9997[35121a475360010],host4:9997[35121a475360010]), 
> !0<;~@(null,host5:9997[25121a475480008],host5:9997[25121a475480008])]...
> 2015-11-19 22:44:05,405 [impl.TabletServerBatchWriter] ERROR: Server side 
> error on host4:9997: org.apache.thrift.TApplicationException: startUpdate 
> failed: unknown result
> 2015-11-19 22:44:05,405 [master.Master] ERROR: Error processing table state 
> for store Metadata Tablets
> org.apache.accumulo.server.master.state.DistributedStoreException: 
> org.apache.accumulo.core.client.MutationsRejectedException: # constraint 
> violations : 0  security codes: {}  # server errors 1 # exceptions 0
>       at 
> org.apache.accumulo.server.master.state.MetaDataStateStore.unassign(MetaDataStateStore.java:139)
>       at 
> org.apache.accumulo.master.TabletGroupWatcher.flushChanges(TabletGroupWatcher.java:738)
>       at 
> org.apache.accumulo.master.TabletGroupWatcher.run(TabletGroupWatcher.java:295)
> Caused by: org.apache.accumulo.core.client.MutationsRejectedException: # 
> constraint violations : 0  security codes: {}  # server errors 1 # exceptions 0
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter.checkForFailures(TabletServerBatchWriter.java:550)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter.close(TabletServerBatchWriter.java:361)
>       at 
> org.apache.accumulo.core.client.impl.BatchWriterImpl.close(BatchWriterImpl.java:54)
>       at 
> org.apache.accumulo.server.master.state.MetaDataStateStore.unassign(MetaDataStateStore.java:137)
>       ... 2 more
> 2015-11-19 22:44:05,406 [impl.TabletServerBatchWriter] ERROR: Failed to send 
> tablet server host4:9997 its batch : Error on server host4:9997
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server 
> host4:9997
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.sendMutationsToTabletServer(TabletServerBatchWriter.java:950)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.access$1900(TabletServerBatchWriter.java:629)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.send(TabletServerBatchWriter.java:816)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter$SendTask.run(TabletServerBatchWriter.java:780)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.TApplicationException: startUpdate failed: 
> unknown result
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startUpdate(TabletClientService.java:403)
>       at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startUpdate(TabletClientService.java:381)
>       at 
> org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter.sendMutationsToTabletServer(TabletServerBatchWriter.java:893)
>       ... 9 more
> {noformat}
> And, curiously, after this exception, things seem to get happy:
> {noformat}
> 2015-11-19 22:46:35,247 [transport.TIOStreamTransport] WARN : Error closing 
> output stream.
> java.io.IOException: The stream is closed
>         at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:118)
>         at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>         at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>         at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
>         at 
> org.apache.thrift.transport.TIOStreamTransport.close(TIOStreamTransport.java:110)
>         at 
> org.apache.thrift.transport.TFramedTransport.close(TFramedTransport.java:89)
>         at 
> org.apache.accumulo.core.client.impl.ThriftTransportPool$CachedTTransport.close(ThriftTransportPool.java:309)
>         at 
> org.apache.accumulo.core.client.impl.ThriftTransportPool.returnTransport(ThriftTransportPool.java:571)
>         at 
> org.apache.accumulo.core.rpc.ThriftUtil.returnClient(ThriftUtil.java:147)
>         at 
> org.apache.accumulo.core.client.impl.ThriftScanner.getBatchFromServer(ThriftScanner.java:113)
>         at 
> org.apache.accumulo.core.metadata.MetadataLocationObtainer.lookupTablet(MetadataLocationObtainer.java:95)
>         at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocation(TabletLocatorImpl.java:463)
>         at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.lookupTabletLocationAndCheckLock(TabletLocatorImpl.java:634)
>         at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl._locateTablet(TabletLocatorImpl.java:620)
>         at 
> org.apache.accumulo.core.client.impl.TabletLocatorImpl.locateTablet(TabletLocatorImpl.java:439)
>         at org.apache.accumulo.core.client.impl.Writer.update(Writer.java:88)
>         at 
> org.apache.accumulo.server.util.MetadataTableUtil.update(MetadataTableUtil.java:153)
>         at 
> org.apache.accumulo.server.util.MetadataTableUtil.update(MetadataTableUtil.java:145)
>         at 
> org.apache.accumulo.server.util.MetadataTableUtil.addTablet(MetadataTableUtil.java:211)
>         at 
> org.apache.accumulo.master.tableOps.PopulateMetadata.call(PopulateMetadata.java:43)
>         at 
> org.apache.accumulo.master.tableOps.PopulateMetadata.call(PopulateMetadata.java:25)
>         at 
> org.apache.accumulo.master.tableOps.TraceRepo.call(TraceRepo.java:57)
>         at org.apache.accumulo.fate.Fate$TransactionRunner.run(Fate.java:72)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at 
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
>         at java.lang.Thread.run(Thread.java:745)
> 2015-11-19 22:46:35,249 [impl.ThriftScanner] DEBUG: Error getting transport 
> to host4:9997 : org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: 120000 millis timeout while waiting 
> for channel to be ready for write. ch : 
> java.nio.channels.SocketChannel[connected local=/10.240.0.76:40610 
> remote=host4/10.240.0.77:9997]
> 2015-11-19 22:46:35,258 [replication.ReplicationDriver] ERROR: Caught 
> Exception trying to create Replication status records
> java.lang.RuntimeException: 
> org.apache.accumulo.core.client.impl.AccumuloServerException: Error on server 
> host5:9997
>         at 
> org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:161)
>         at 
> org.apache.accumulo.master.replication.StatusMaker.run(StatusMaker.java:94)
>         at 
> org.apache.accumulo.master.replication.ReplicationDriver.run(ReplicationDriver.java:87)
> Caused by: org.apache.accumulo.core.client.impl.AccumuloServerException: 
> Error on server host5:9997
>         at 
> org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:293)
>         at 
> org.apache.accumulo.core.client.impl.ScannerIterator$Reader.run(ScannerIterator.java:80)
>         at 
> org.apache.accumulo.core.client.impl.ScannerIterator.hasNext(ScannerIterator.java:151)
>         ... 2 more
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> flush
>         at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>         at 
> org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
>         at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startScan(TabletClientService.java:232)
>         at 
> org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startScan(TabletClientService.java:208)
>         at 
> org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:410)
>         at 
> org.apache.accumulo.core.client.impl.ThriftScanner.scan(ThriftScanner.java:285)
>         ... 4 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
