[ https://issues.apache.org/jira/browse/PHOENIX-2783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15209304#comment-15209304 ]

Sergey Soldatov commented on PHOENIX-2783:
------------------------------------------

Yep. That's the test I used:
{noformat}
    @Benchmark
    public void listMultimapTest() {
        // The check as it exists in PTableImpl: index each column by its
        // name, then scan the columns already stored under that name for
        // one with the same family.
        ListMultimap<String, Pair> columnsByName = ArrayListMultimap.create(NUM, 1);
        for (Pair column : l) {
            String familyName = column.getS1();
            String columnName = column.getS2();
            // ArrayListMultimap.put always returns true, so the scan
            // below runs for every column.
            if (columnsByName.put(columnName, column)) {
                int count = 0;
                for (Pair dupColumn : columnsByName.get(columnName)) {
                    if (Objects.equal(familyName, dupColumn.getS1())) {
                        count++;
                        if (count > 1) {
                            System.out.println("Found duplicate");
                            break;
                        }
                    }
                }
            }
        }
    }

    @Benchmark
    public void hashSetTest() {
        // The proposed alternative: a single HashSet keyed on
        // "family.column"; add() returns false on the first duplicate.
        HashSet<String> set = new HashSet<String>(NUM);
        for (Pair column : l) {
            String familyName = column.getS1();
            String columnName = column.getS2();
            if (!set.add(familyName + "." + columnName)) {
                System.out.println("Found duplicate");
                break;
            }
        }
    }
{noformat}
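Both methods run against shared state along these lines. This is only a minimal sketch: the Pair fixture, the field names l and NUM, the class name, and the list size are reconstructed from the snippet above, not taken from actual code:
{noformat}
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class DuplicateColumnCheckBenchmark {
    // Illustrative size only; the real run used whatever NUM was set to.
    static final int NUM = 1000;

    // Simple two-string holder standing in for a column:
    // s1 = family name, s2 = column name.
    static class Pair {
        private final String s1, s2;
        Pair(String s1, String s2) { this.s1 = s1; this.s2 = s2; }
        String getS1() { return s1; }
        String getS2() { return s2; }
    }

    // Input list scanned by both benchmark methods.
    List<Pair> l;

    @Setup
    public void setup() {
        // Populate with UUID-generated values (see note below).
        l = new ArrayList<Pair>(NUM);
        for (int i = 0; i < NUM; i++) {
            l.add(new Pair(UUID.randomUUID().toString(),
                    UUID.randomUUID().toString()));
        }
    }

    // listMultimapTest() and hashSetTest() above belong to this class.
}
{noformat}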
Values for the pairs were UUID-generated, so the loops ran over all values 
without ever breaking on a duplicate (the worst case). I agree that using a 
HashSet looks cleaner, and I have no objection to doing it that way. By the 
way, should I refactor the check in PTableImpl in the same way?
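
For reference, the MetaDataClient-side pre-check being discussed would look 
roughly like the sketch below; the helper name, the ColumnDef accessors, and 
the exception constructor are illustrative assumptions, not the actual patch:
{noformat}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.phoenix.parse.ColumnDef;
import org.apache.phoenix.schema.ColumnAlreadyExistsException;

// Illustrative sketch only -- not the actual patch. Run in MetaDataClient
// before createTable is sent to the server, so a duplicate fails fast and
// the catalog is never touched.
static void checkNoDuplicateColumns(List<ColumnDef> columnDefs,
        String schemaName, String tableName) throws ColumnAlreadyExistsException {
    Set<String> seen = new HashSet<String>(columnDefs.size());
    for (ColumnDef def : columnDefs) {
        // Key on "family.column", exactly like hashSetTest above.
        String familyName = def.getColumnDefName().getFamilyName();
        String columnName = def.getColumnDefName().getColumnName();
        if (!seen.add(familyName + "." + columnName)) {
            throw new ColumnAlreadyExistsException(schemaName, tableName, columnName);
        }
    }
}
{noformat}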

> Creating secondary index with duplicated columns makes the catalog corrupted
> ----------------------------------------------------------------------------
>
>                 Key: PHOENIX-2783
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2783
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.7.0
>            Reporter: Sergey Soldatov
>            Assignee: Sergey Soldatov
>         Attachments: PHOENIX-2783-1.patch, PHOENIX-2783-2.patch
>
>
> Simple example
> {noformat}
> create table x (t1 varchar primary key, t2 varchar, t3 varchar);
> create index idx on x (t2) include (t1,t3,t3);
> {noformat}
> causes an exception saying that a duplicate column was detected, but the 
> client updates the catalog before throwing it and makes the table unusable. 
> Every subsequent attempt to use table x throws an 
> ArrayIndexOutOfBoundsException. This problem was discussed on the user list 
> recently. 
> The cause of the problem is that the check for duplicated columns happens in 
> PTableImpl after MetaDataClient has completed the server-side createTable. 
> The simple way to fix it is to add a similar check in MetaDataClient before 
> createTable is called. 
> Perhaps someone can suggest a more elegant way to fix it? 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
