[ https://issues.apache.org/jira/browse/PHOENIX-3028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Samarth Jain updated PHOENIX-3028:
----------------------------------
    Attachment: PHOENIX-3028.patch

Patch that moves the commit stats time call so that it runs asynchronously. FYI, [~lhofhansl], this should solve the problem that we saw in our scale testing. The commit stats time call can be a cross region server call, which could fail under network partitions and other failure conditions.

It doesn't, however, completely solve the problem of stats collection possibly causing compaction to run indefinitely. For example, this call in StatisticsScanner, which we use for collecting stats, can fail and keep retrying:

{code}
@Override
public boolean next(List<Cell> result, int limit) throws IOException {
    boolean ret = delegate.next(result, limit);
    updateStats(result);
    return ret;
}
{code}

We would need to guard every such call in the Phoenix compaction hooks, or have a wrapper scanner that does so, so that they can't cause compaction to fail. We should combine this with retry counts smaller than the default HBase retry count. We could possibly do that here in ServerUtil.java by passing our own config to HTablePool:

{code}
private static HTableInterface getTableFromSingletonPool(RegionCoprocessorEnvironment env, byte[] tableName) throws IOException {
    HTablePool pool = new HTablePool(env.getConfiguration(),1); // pass our own config here
    try {
        return pool.getTable(tableName);
{code}

[~jamestaylor] - WDYT?

> StatisticsWriter shouldn't fail compaction in case of errors
> ------------------------------------------------------------
>
>                 Key: PHOENIX-3028
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3028
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.8.0
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>         Attachments: PHOENIX-3028.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
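
As a rough illustration of the wrapper scanner idea in the comment above, here is a minimal sketch that isolates stats failures from the scan itself. It is not the attached patch and does not use the HBase API: `RowScanner`, `StatsCollector`, and `GuardedStatsScanner` are hypothetical stand-ins (for the scanner interface and for `updateStats`), reduced so the example compiles with only the JDK. It also assumes that losing stats for one compaction is acceptable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the HBase scanner interface, reduced to one method
// so the sketch compiles without HBase on the classpath.
interface RowScanner {
    boolean next(List<String> result) throws IOException;
}

// Hypothetical stand-in for the stats step (updateStats in the comment above).
interface StatsCollector {
    void update(List<String> rows) throws IOException;
}

// Wraps a delegate scanner so that a failure in stats collection
// never propagates to the caller driving the scan.
final class GuardedStatsScanner implements RowScanner {
    private final RowScanner delegate;
    private final StatsCollector stats;
    private boolean statsDisabled = false;

    GuardedStatsScanner(RowScanner delegate, StatsCollector stats) {
        this.delegate = delegate;
        this.stats = stats;
    }

    @Override
    public boolean next(List<String> result) throws IOException {
        // Errors from the delegate still propagate: those are real scan failures.
        boolean more = delegate.next(result);
        if (!statsDisabled) {
            try {
                stats.update(result);
            } catch (Exception e) {
                // Assumption: dropping stats for this run is acceptable, so stop
                // collecting after the first failure instead of failing on every call.
                statsDisabled = true;
            }
        }
        return more;
    }

    boolean statsDisabled() {
        return statsDisabled;
    }
}
```

With a collector that always throws, the scan still returns every row and the collector is invoked only once. A real version would also log the failure and would need the same guard on the other compaction hooks.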