Hi,

We are running HBase 1.4.0 on an AWS EMR cluster.
We have started seeing the following stack trace when trying to take a snapshot of a table with a very large number of files (12,000 regions and roughly 360,000 - 400,000 files). The file count should go down, as we haven't been compacting for a while for other operational reasons and are now running compactions again. But I'd like to understand why our snapshots are failing with the following:

2018-03-19 16:05:56,948 ERROR [MASTER_TABLE_OPERATIONS-ip-10-194-208-6:16000-0] snapshot.TakeSnapshotHandler: Failed taking snapshot { ss=pgs-device.03-19-2018-15 table=pgs-device type=SKIPFLUSH } due to exception:unable to parse data manifest Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: unable to parse data manifest Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
	at org.apache.hadoop.hbase.snapshot.SnapshotManifest.readDataManifest(SnapshotManifest.java:468)
	at org.apache.hadoop.hbase.snapshot.SnapshotManifest.load(SnapshotManifest.java:297)
	at org.apache.hadoop.hbase.snapshot.SnapshotManifest.open(SnapshotManifest.java:129)
	at org.apache.hadoop.hbase.master.snapshot.MasterSnapshotVerifier.verifySnapshot(MasterSnapshotVerifier.java:108)
	at org.apache.hadoop.hbase.master.snapshot.TakeSnapshotHandler.process(TakeSnapshotHandler.java:203)
	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
	at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
	at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
	at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:811)
	at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:329)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile.<init>(SnapshotProtos.java:1313)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile.<init>(SnapshotProtos.java:1263)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile$1.parsePartialFrom(SnapshotProtos.java:1364)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$StoreFile$1.parsePartialFrom(SnapshotProtos.java:1359)
	at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles.<init>(SnapshotProtos.java:2161)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles.<init>(SnapshotProtos.java:2103)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles$1.parsePartialFrom(SnapshotProtos.java:2197)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$FamilyFiles$1.parsePartialFrom(SnapshotProtos.java:2192)
	at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1165)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest.<init>(SnapshotProtos.java:1094)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1201)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotRegionManifest$1.parsePartialFrom(SnapshotProtos.java:1196)
	at com.google.protobuf.CodedInputStream.readMessage(CodedInputStream.java:309)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.<init>(SnapshotProtos.java:3858)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.<init>(SnapshotProtos.java:3792)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3894)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest$1.parsePartialFrom(SnapshotProtos.java:3889)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:89)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:95)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:49)
	at org.apache.hadoop.hbase.protobuf.generated.SnapshotProtos$SnapshotDataManifest.parseFrom(SnapshotProtos.java:4115)
	at org.apache.hadoop.hbase.snapshot.SnapshotManifest.readDataManifest(SnapshotManifest.java:464)
	... 8 more

Thanks,
Saad
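In case it helps frame the question: from our (possibly wrong) reading of SnapshotManifest and HBASE-15430, the manifest parse appears to be bounded by the snapshot.manifest.size.limit setting, which defaults to 64 MB, and with ~400,000 store files our consolidated data.manifest may simply be bigger than that. The workaround we are considering is raising that limit in hbase-site.xml on the master, along these lines:

```xml
<!-- hbase-site.xml on the HMaster (and region servers, to be safe).
     Key name is our reading of HBASE-15430 -- please correct us if wrong. -->
<property>
  <name>snapshot.manifest.size.limit</name>
  <!-- raise the snapshot manifest size limit from the 64 MB default to 128 MB -->
  <value>134217728</value>
</property>
```

Is bumping this limit the right approach here, or is the real fix just to get the store file count back down with compactions before snapshotting?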