[ https://issues.apache.org/jira/browse/PHOENIX-2783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209304#comment-15209304 ]
Sergey Soldatov commented on PHOENIX-2783:
------------------------------------------

Yep. That's the test I used:
{noformat}
@Benchmark
public void listMultimapTest() {
    ListMultimap<String, Pair> columnsByName = ArrayListMultimap.create(NUM, 1);
    for (Pair column : l) {
        String familyName = column.getS1();
        String columnName = column.getS2();
        if (columnsByName.put(columnName, column)) {
            int count = 0;
            for (Pair dupColumn : columnsByName.get(columnName)) {
                if (Objects.equal(familyName, dupColumn.getS1())) {
                    count++;
                    if (count > 1) {
                        System.out.println("Found duplicate");
                        break;
                    }
                }
            }
        }
    }
}

@Benchmark
public void hashSetTest() {
    HashSet<String> set = new HashSet<String>(NUM);
    for (Pair column : l) {
        String familyName = column.getS1();
        String columnName = column.getS2();
        if (!set.add(familyName + "." + columnName)) {
            System.out.println("Found duplicate");
            break;
        }
    }
}
{noformat}
Values for the pairs were UUID-generated, so the loops ran over all values without ever breaking on a duplicate (the worst case). I agree that using a HashSet looks cleaner, and I have no objection to doing it that way. By the way, should I refactor the check in PTableImpl in the same way?

> Creating secondary index with duplicated columns makes the catalog corrupted
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-2783
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2783
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>         Attachments: PHOENIX-2783-1.patch, PHOENIX-2783-2.patch
>
> Simple example
> {noformat}
> create table x (t1 varchar primary key, t2 varchar, t3 varchar);
> create index idx on x (t2) include (t1,t3,t3);
> {noformat}
> causes an exception saying that a duplicated column was detected, but the client
> updates the catalog before throwing it, leaving the catalog unusable. All subsequent
> attempts to use table x cause an ArrayIndexOutOfBounds exception.
> This problem was discussed on the user list recently.
> The cause of the problem is that the check for duplicated columns happens in
> PTableImpl after MetaDataClient completes the server-side createTable.
> The simple way to fix it is to add a similar check in MetaDataClient before
> createTable is called.
> Possibly someone can suggest a more elegant way to fix it?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
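To illustrate the HashSet approach discussed in the comment above, here is a minimal, self-contained sketch of a single-pass duplicate-column check. `ColumnDef` and `hasDuplicate` are hypothetical stand-ins for illustration only, not Phoenix's actual classes; the real fix would perform an equivalent check in MetaDataClient before createTable is called.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateColumnCheck {

    // Hypothetical stand-in for a (family name, column name) pair;
    // not the actual Phoenix column definition class.
    static final class ColumnDef {
        final String familyName;  // may be null for the default family
        final String columnName;

        ColumnDef(String familyName, String columnName) {
            this.familyName = familyName;
            this.columnName = columnName;
        }
    }

    /**
     * Returns true if the list contains two columns with the same
     * (family, column) pair, using a single HashSet pass instead of
     * the nested ListMultimap scan.
     */
    static boolean hasDuplicate(List<ColumnDef> columns) {
        Set<String> seen = new HashSet<>(columns.size());
        for (ColumnDef c : columns) {
            // Compose a combined key; a null family stays distinct
            // from any real family name.
            String key = c.familyName + "." + c.columnName;
            if (!seen.add(key)) {
                return true;  // add() returns false when the key already exists
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<ColumnDef> ok = Arrays.asList(
            new ColumnDef(null, "t1"),
            new ColumnDef(null, "t3"));
        // Mirrors the bug report: t3 is included twice.
        List<ColumnDef> dup = Arrays.asList(
            new ColumnDef(null, "t1"),
            new ColumnDef(null, "t3"),
            new ColumnDef(null, "t3"));
        System.out.println(hasDuplicate(ok));   // false
        System.out.println(hasDuplicate(dup));  // true
    }
}
```

Because `HashSet.add` reports membership and inserts in one O(1) operation, the whole check is a single linear pass, which matches the benchmark's conclusion that it is both cleaner and no slower than the ListMultimap variant.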