[ https://issues.apache.org/jira/browse/ACCUMULO-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825954#comment-13825954 ]
ASF subversion and git services commented on ACCUMULO-1833: ----------------------------------------------------------- Commit 515cd9dfd97845df2c3720548c2a5c53440c0654 in branch refs/heads/master from [~elserj] [ https://git-wip-us.apache.org/repos/asf?p=accumulo.git;h=515cd9d ] ACCUMULO-1833 Fix merge to 1.6.0 regarding TCredentials->Credentials > MultiTableBatchWriterImpl.getBatchWriter() is not performant for multiple > threads > --------------------------------------------------------------------------------- > > Key: ACCUMULO-1833 > URL: https://issues.apache.org/jira/browse/ACCUMULO-1833 > Project: Accumulo > Issue Type: Improvement > Affects Versions: 1.5.0, 1.6.0 > Reporter: Chris McCubbin > Assignee: Josh Elser > Fix For: 1.5.1, 1.6.0 > > Attachments: ACCUMULO-1833-test.patch, ZooKeeperThreadUtilization.png > > > This issue comes from profiling our application. We have a > MultiTableBatchWriter created by normal means. I am attempting to write to it > with multiple threads by doing things like the following: > {code} > batchWriter.getBatchWriter(table).addMutations(mutations); > {code} > In my test with 4 threads writing to one table, this call is quite > inefficient and results in a large performance degradation over a single > BatchWriter. > I believe the culprit is the fact that the call is synchronized. Also there > is the possibility that the zookeeper call to Tables.getTableState on every > call is negatively affecting performance: > {code} > @Override > public synchronized BatchWriter getBatchWriter(String tableName) throws > AccumuloException, AccumuloSecurityException, TableNotFoundException { > ArgumentChecker.notNull(tableName); > String tableId = Tables.getNameToIdMap(instance).get(tableName); > if (tableId == null) > throw new TableNotFoundException(tableId, tableName, null); > > if (Tables.getTableState(instance, tableId) == TableState.OFFLINE) > throw new TableOfflineException(instance, tableId); > > BatchWriter tbw = tableWriters.get(tableId); > if (tbw == null) { > tbw = new TableBatchWriter(tableId); > tableWriters.put(tableId, tbw); > } > return tbw; > } > {code} > I recommend moving the synchronized block to happen only if the batchwriter > is not present, and also only checking if the table is online at that time: > {code} > @Override > public BatchWriter getBatchWriter(String tableName) throws > AccumuloException, AccumuloSecurityException, TableNotFoundException { > ArgumentChecker.notNull(tableName); > String tableId = Tables.getNameToIdMap(instance).get(tableName); > if (tableId == null) > throw new TableNotFoundException(tableId, tableName, null); > BatchWriter tbw = tableWriters.get(tableId); > if (tbw == null) { > if (Tables.getTableState(instance, tableId) == TableState.OFFLINE) > throw new TableOfflineException(instance, tableId); > tbw = new TableBatchWriter(tableId); > synchronized(tableWriters){ > //only create a new table writer if we haven't been beaten to it. > if (tableWriters.get(tableId) == null) > tableWriters.put(tableId, tbw); > } > } > return tbw; > } > {code} -- This message was sent by Atlassian JIRA (v6.1#6144)