Wei Zhou created CLOUDSTACK-685:
-----------------------------------
Summary: CloudStack 4.0 Network Usage is ZERO
Key: CLOUDSTACK-685
URL: https://issues.apache.org/jira/browse/CLOUDSTACK-685
Project: CloudStack
Issue Type: Bug
Security Level: Public (Anyone can view this level - this is the default.)
Components: Usage
Affects Versions: 4.0.0
Reporter: Wei Zhou
Problem description:
The usage server can give the system usage of each virtual machine (such as
running time, ServiceOffering, IPAddress, Volume, Template, ISO, Port
Forwarding Rule, Network offering), except the network bytes sent/received.
This problem only exists in CloudStack 4.0. In CloudStack 3.0, the usage
server works well.
We looked at iptables on the VR, and it is working fine (see the output below,
taken after a 10g download). This is just a sample.
root@r-17-VM:/# iptables -nvx -L NETWORK_STATS
Chain NETWORK_STATS (3 references)
    pkts      bytes target     prot opt in     out     source               destination
  246477   12943171            all  --  eth0   eth2    0.0.0.0/0            0.0.0.0/0
  125789 1008395759            all  --  eth2   eth0    0.0.0.0/0            0.0.0.0/0
       0          0            tcp  --  !eth0  eth2    0.0.0.0/0            0.0.0.0/0
       0          0            tcp  --  eth2   !eth0   0.0.0.0/0            0.0.0.0/0
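The byte counters in this output can also be totalled per direction in code. The following is a minimal Java sketch, not CloudStack code: the class NetworkStatsParser and its methods are hypothetical names, and it assumes that eth0 is the guest-facing interface and eth2 the public-facing one, which the output itself does not state.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of CloudStack) that sums NETWORK_STATS byte counters. */
final class NetworkStatsParser {

    /** Byte totals for the two directions across the router. */
    static final class Totals {
        final long guestToPublic;
        final long publicToGuest;

        Totals(long guestToPublic, long publicToGuest) {
            this.guestToPublic = guestToPublic;
            this.publicToGuest = publicToGuest;
        }
    }

    /**
     * Parses lines of `iptables -nvx -L NETWORK_STATS` output.
     * Rule lines start with a numeric packet count; the last four tokens are
     * in-interface, out-interface, source and destination.
     */
    static Totals parse(List<String> lines, String guestIf, String publicIf) {
        long guestToPublic = 0;
        long publicToGuest = 0;
        for (String line : lines) {
            String[] t = line.trim().split("\\s+");
            // Skip the chain header, the column header and blank lines.
            if (t.length < 8 || !t[0].matches("\\d+")) {
                continue;
            }
            long bytes = Long.parseLong(t[1]);
            String inIf = t[t.length - 4];
            String outIf = t[t.length - 3];
            if (inIf.equals(guestIf) && outIf.equals(publicIf)) {
                guestToPublic += bytes;
            } else if (inIf.equals(publicIf) && outIf.equals(guestIf)) {
                publicToGuest += bytes;
            }
            // Negated matches such as "!eth0" are ignored in this sketch.
        }
        return new Totals(guestToPublic, publicToGuest);
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "Chain NETWORK_STATS (3 references)",
                "pkts bytes target prot opt in out source destination",
                "246477 12943171 all -- eth0 eth2 0.0.0.0/0 0.0.0.0/0",
                "125789 1008395759 all -- eth2 eth0 0.0.0.0/0 0.0.0.0/0",
                "0 0 tcp -- !eth0 eth2 0.0.0.0/0 0.0.0.0/0",
                "0 0 tcp -- eth2 !eth0 0.0.0.0/0 0.0.0.0/0");
        Totals totals = parse(sample, "eth0", "eth2");
        System.out.println("guest -> public bytes: " + totals.guestToPublic);
        System.out.println("public -> guest bytes: " + totals.publicToGuest);
    }
}
```

Under the stated interface assumption, the larger counter (eth2 to eth0) is the download traffic, so the VR itself is counting bytes correctly.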
We tried to debug it further & found the following:
(1) Checked the CloudStack logs /var/log/cloud/management/management-server.log
2012-12-18 15:05:51,130 DEBUG [agent.transport.Request] (AgentManager-Handler-8:null) Seq 1-1158742136: Processing: { Ans: , MgmtId: 345051509349, via: 1, Ver: v1, Flags: 10, [{"NetworkUsageAnswer":{"routerName":"r-4-VM","bytesSent":5928,"bytesReceived":6188,"result":true,"details":"","wait":0}}] }
2012-12-18 15:05:51,130 DEBUG [agent.transport.Request] (RouterMonitor-1:null) Seq 1-1158742136: Received: { Ans: , MgmtId: 345051509349, via: 1, Ver: v1, Flags: 10, { NetworkUsageAnswer } }
2012-12-18 15:05:51,130 DEBUG [agent.manager.AgentManagerImpl] (RouterMonitor-1:null) Details from executing class com.cloud.agent.api.NetworkUsageCommand:
2012-12-18 15:05:51,131 WARN [network.router.VirtualNetworkApplianceManagerImpl] (RouterMonitor-1:null) unable to find stats for account: 2
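We can see that AgentManager works well: it receives the network usage
(bytesSent and bytesReceived) from the router. The WARN line, however, shows
that no stats record is found for the account.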
(2) Checked the database.
Four tables are involved: cloud.user_statistics, cloud_usage.user_statistics,
cloud_usage.usage_network and cloud_usage.cloud_usage. The last one is used by
the API, and we filter it for network usage with 'usage_type' 4 or 5.
When we add a new network (VR), a row is added to cloud.user_statistics, and it
eventually ends up (possibly because of the generateUsageRecords API command)
in two other tables (cloud_usage.user_statistics and cloud_usage.usage_network).
mysql> select * from cloud.user_statistics;
+----+----------------+------------+-------------------+-----------+--------------+------------+--------------------+----------------+------------------------+--------
| id | data_center_id | account_id | public_ip_address | device_id | device_type  | network_id | net_bytes_received | net_bytes_sent | current_bytes_received | current
+----+----------------+------------+-------------------+-----------+--------------+------------+--------------------+----------------+------------------------+--------
|  1 |              1 |          2 | NULL              |         4 | DomainRouter |        204 |                  0 |              0 |                      0 |
|  2 |              1 |          2 | NULL              |         6 | DomainRouter |        205 |                  0 |              0 |                      0 |
+----+----------------+------------+-------------------+-----------+--------------+------------+--------------------+----------------+------------------------+--------
(3) Checked the source code.
(a) /cloud-server/src/com/cloud/network/router/VirtualNetworkApplianceManagerImpl.java
(I have deleted some unrelated source code.)
public void run() {
    for (DomainRouterVO router : routers) {
        // From the domain_router table (id=4, name=r-4-VM,
        // vm_instance.private_ip_address=169.254.1.17).
        String privateIP = router.getPrivateIpAddress();
        if (privateIP != null) {
            // From the nics table (instance_id=4). Found 3 records,
            // with network_id 204/202/200.
            List<? extends Nic> routerNics = _nicDao.listByVmId(router.getId());
            for (Nic routerNic : routerNics) {
                // The network ID comes from the nics table.
                Network network = _networkMgr.getNetwork(routerNic.getNetworkId());
                // traffic_type from the networks table: Guest/Control/Public.
                if (network.getTrafficType() == TrafficType.Public) {
                    UserStatisticsVO previousStats = _statsDao.findBy(router.getAccountId(),
                            router.getDataCenterIdToDeployIn(), network.getId(), null,
                            router.getId(), router.getType().toString());
                    if (answer != null) {
                        Transaction txn = Transaction.open(Transaction.CLOUD_DB);
                        try {
                            txn.start();
                            // The parameters are "2, 1, 200, 10.11.102.168, 4".
                            // This is inconsistent with the database.
                            UserStatisticsVO stats = _statsDao.lock(router.getAccountId(),
                                    router.getDataCenterIdToDeployIn(), network.getId(),
                                    routerNic.getIp4Address(), router.getId(),
                                    router.getType().toString());
                            if (stats == null) {
                                s_logger.warn("unable to find stats for account: " + router.getAccountId());
                                continue;
                            }
                            ………………..
}
(b) /cloud-server/src/com/cloud/vm/dao/DomainRouterDaoImpl.java
public void addRouterToGuestNetwork(VirtualRouter router, Network guestNetwork) {
    if (stats == null) {
        // The parameters are "2, 1, null, 204, 4".
        // This is consistent with the database.
        stats = new UserStatisticsVO(router.getAccountId(),
                router.getDataCenterIdToDeployIn(), null, router.getId(),
                router.getType().toString(), guestNetwork.getId());
        _userStatsDao.persist(stats);
    }
}
(4) Changed the database and tested.
Since the IP address is NULL and the network_id is not that of the public
network in the cloud.user_statistics table, nothing related to traffic is
recorded. As a result, all the traffic statistics columns remain 0.
But if we explicitly add a row to cloud.user_statistics with the network's
(VR) source NAT IP address and the public network_id, it works!
| id | data_center_id | account_id | public_ip_address | device_id | device_type  | network_id |
|  3 |              1 |          2 | 10.11.102.168     |         4 | DomainRouter |        200 |
After 5 minutes, we can get the VR data traffic. It looks like this:
| id | data_center_id | account_id | public_ip_address | device_id | device_type  | network_id | net_bytes_received | net_bytes_sent | current_bytes_received | current_bytes_sent | agg_bytes_received | agg_bytes_sent |
|  3 |              1 |          2 | 10.11.102.168     |         4 | DomainRouter |        200 |                  0 |              0 |             2284239008 |           33613977 |         2284239008 |       33613977 |
From steps 1 to 4, we can see that the inconsistency between (3.a) and (3.b)
may be the root cause of this problem.
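To make the suspected mismatch concrete, here is a minimal Java sketch. It is a toy in-memory model, not CloudStack code: the class StatsLookupSketch and its persist/lock methods are hypothetical stand-ins for the DAO calls quoted in (3.a) and (3.b), and it assumes a row is found only when every key column matches exactly.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the user_statistics lookup; hypothetical, not CloudStack code. */
final class StatsLookupSketch {

    // Rows are keyed by every column the lookup matches on (assumption of this
    // sketch); the value holds {bytesReceived, bytesSent}.
    private final Map<List<Object>, long[]> rows = new HashMap<>();

    private static List<Object> key(long accountId, long dataCenterId, long networkId,
                                    String publicIp, long deviceId, String deviceType) {
        return Arrays.<Object>asList(accountId, dataCenterId, networkId,
                publicIp, deviceId, deviceType);
    }

    /** Creates a zeroed row, standing in for the persist() call in (3.b). */
    void persist(long accountId, long dataCenterId, long networkId,
                 String publicIp, long deviceId, String deviceType) {
        rows.put(key(accountId, dataCenterId, networkId, publicIp, deviceId, deviceType),
                new long[2]);
    }

    /** Returns the matching row or null, standing in for the lock() call in (3.a). */
    long[] lock(long accountId, long dataCenterId, long networkId,
                String publicIp, long deviceId, String deviceType) {
        return rows.get(key(accountId, dataCenterId, networkId, publicIp, deviceId, deviceType));
    }

    public static void main(String[] args) {
        StatsLookupSketch table = new StatsLookupSketch();

        // Row as created in (3.b): guest network 204, no public IP.
        table.persist(2, 1, 204, null, 4, "DomainRouter");

        // Lookup as done in (3.a): public network 200 and the source NAT IP.
        long[] stats = table.lock(2, 1, 200, "10.11.102.168", 4, "DomainRouter");
        System.out.println("row found with the guest-network row only: " + (stats != null));

        // Manual workaround from step (4): add a row for the public network and IP.
        table.persist(2, 1, 200, "10.11.102.168", 4, "DomainRouter");
        stats = table.lock(2, 1, 200, "10.11.102.168", 4, "DomainRouter");
        System.out.println("row found after adding the public row: " + (stats != null));
    }
}
```

In this model the first lookup returns null, which corresponds to the "unable to find stats for account" warning, and the second finds the row, which corresponds to the manual fix in step (4).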
We notice that there are some changes to
VirtualNetworkApplianceManagerImpl.java between CloudStack 3.0.2 and CloudStack
4.0.0. This would explain why the problem only exists in CloudStack 4.0.0.
In CloudStack 4.0.0 :
https://github.com/CloudStack-extras/CloudStack-archive/blob/master/server/src/com/cloud/network/router/VirtualNetworkApplianceManagerImpl.java
In CloudStack 3.0.2 :
https://github.com/CloudStack-extras/CloudStack-archive/blob/3.0.2.maintenance/server/src/com/cloud/network/router/VirtualNetworkApplianceManagerImpl.java
Why did CloudStack make this change, which causes this problem?
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira