[GitHub] cloudstack pull request: CLOUDSTACK-9340: General DB Optimization

serg38 Sun, 10 Apr 2016 19:19:54 -0700

Github user serg38 commented on the pull request:

    https://github.com/apache/cloudstack/pull/1466#issuecomment-208121230
  
    @rafaelweingartner 
    >>>>>>>>
     Here you change this table "ovs_tunnel_network", you dropped the primary 
key "id"; Will that remove only the index or the field too?
        Also, this change on database structures should be reflected on the 
pseudo "JPA" mapping that we use right?
        For instance, the class "com.cloud.network.ovs.dao.OvsTunnelNetworkVO" 
that represents the "ovs_tunnel_network" table, it is annotated as the "id" 
field being the id, and not the "ovs_tunnel_network"
    >>>>>>>>>
    
    We don't change any table structure but indexes so no changes in the 
mapping should  be required. For ovs_tunnel_network it was the only table in 
ACS DB that had primary key  not based on ID whereas ID was an unique index. In 
create-schema.sql
    CREATE TABLE `cloud`.`ovs_tunnel_network`(
      `id` bigint unsigned NOT NULL UNIQUE AUTO_INCREMENT,
      `from` bigint unsigned COMMENT 'from host id',
      `to` bigint unsigned COMMENT 'to host id',
      `network_id` bigint unsigned COMMENT 'network identifier',
      `key` int unsigned COMMENT 'gre key',
      `port_name` varchar(32) COMMENT 'in port on open vswitch',
      `state` varchar(16) default 'FAILED' COMMENT 'result of tunnel 
creatation',
      PRIMARY KEY(`from`, `to`, `network_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
    
    What we are proposing is to swap them by making ID a primary key and 
creating a new unique index based on 3 columns `from`, `to` and `network_id`.
    
    >>>>>>>>>
    Here you remove duplicated primary keys.
        You are removing indexes and not primary keys per se, right?
        I see you removing two indexes, does that mean that the table has 
another one? A third one?
        Because if you remove the ID field index, if we use a select by that 
field, it would cause a full table scan, right?
    >>>>>>>>
    That's correct. We are removing indexes that are duplicate to the primary 
key. In some tables there  are 2 such indexes.
    
    >>>>>>>>>
     did you execute some evaluation to see which fields were being most used 
to create an index for them? 
        For instance, why did you create an index for the field called "type" 
of table "op_it_work"
    >>>>>>>>>
    Yes we did. Our prod DB has 1 mil+ VMs and volumes and 20K+ accounts. We 
started seeing degradation in simple list calls. We reviewed and analyzed all 
MySQL slow queries to see if any of them use full table scans. So far we found 
7 cases where extra index could help. Altogether these changes reduced the load 
on Mysql by 30% (it doesn't spend time any more constantly doing full table 
scans) and improved some API calls by up to 200%. 
    
    >>>>>>>>>>
      This will remove from the projection the VMs already removed. Aren't they 
needed
    >>>>>>>>>
    We analyzed both the code and other DB tables. account_vmstats_view is only 
used in account_view and only in the join using these clauses
            LEFT JOIN `account_vmstats_view` `runningvm` ON (((`account`.`id` = 
`runningvm`.`account_id`)
                AND (`runningvm`.`state` = 'Running'))))
            LEFT JOIN `account_vmstats_view` `stoppedvm` ON (((`account`.`id` = 
`stoppedvm`.`account_id`)  AND (`stoppedvm`.`state` = 'Stopped'))))
    Since only Running and Stopped VMs used there is no need to bring Expunged 
one into the vmstats view. This changes alone improve listAccounts call 
retrieval from 11 sec to 3 ( x4 improvement).




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] cloudstack pull request: CLOUDSTACK-9340: General DB Optimization

Reply via email to