Take a look at: http://www.fpaste.org/86434/51637081/


2014-03-18 14:15 GMT-03:00 Alena Prokharchyk <alena.prokharc...@citrix.com>:

>  Diego,
>
>  Check for these messages in the CS log:
>
>
> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-3-VM","bytesSen
> t":0,"bytesReceived":0,"result":true,"details":"success","wait":0}}] }
>
>
>  And see if you are seeing non-0 bytesReceived/bytesSent reported from
> the backend for any day after 2013-11-08
>
>  -Alena.
>
>   From: Diego Spinola Castro <spinolacas...@gmail.com>
> Date: Tuesday, March 18, 2014 at 10:12 AM
>
> To: Alena Prokharchyk <alena.prokharc...@citrix.com>
> Cc: "users@cloudstack.apache.org" <users@cloudstack.apache.org>, Kishan
> Kavala <kishan.kav...@citrix.com>
> Subject: Re: NetworkUsage
>
>   Hi Alena, there's no records after 2013-11-08 (when i did the upgrade
> to CS 4.2).
>
>
>  If there are no records for Feb2 in the DB, can you check if CS reported
> any usage during this day? How do i do that?
>
>
> 2014-03-18 13:45 GMT-03:00 Alena Prokharchyk <alena.prokharc...@citrix.com
> >:
>
>>  Ok, so having 0 for some router records is actually fine as either the
>> Vrs are stopped, or there are no vms. Good you've confirmed that's not the
>> problem.
>>
>>  Now we have to debug something different - API not returning certain
>> types of usage. For that, you have to:
>>
>>
>>    1. go to cloud_usage DB
>>    2. Select * from cloudusage where usage_type in (4,5)
>>
>> And see if there are any records for the Feb 2 are present. If there are,
>> then its an API problem - when data is present in the DB, but API doesn't
>> return it.
>>
>>  If there are no records for Feb2 in the DB, can you check if CS
>> reported any usage during this day?
>>
>>  Thanks,
>> Alena.
>>
>>   From: Diego Spinola Castro <spinolacas...@gmail.com>
>>  Date: Tuesday, March 18, 2014 at 8:46 AM
>> To: Alena Prokharchyk <alena.prokharc...@citrix.com>
>> Cc: "users@cloudstack.apache.org" <users@cloudstack.apache.org>, Kishan
>> Kavala <kishan.kav...@citrix.com>
>> Subject: Re: NetworkUsage
>>
>>   HI Alena, thanks for the answer. Let me explain better my case.
>> Regarding those accounts , i have a lot of them without any instance
>> running, there's also some vrouters of that list which are even stopped.
>> I'm trying get the traffic count from usage api, i don't know why but all
>> usage type 4 is zero, there's the commands i'm running:
>>
>>  Trying get UsageType=4
>> curl "
>> http://localhost:8096/client/api?command=listUsageRecords&startdate=2014-02-01&enddate=2014-02-01&type=4
>> "
>>  Neither Type 4 nor 5 are returning.
>>
>>  This used to work before upgrade to 4.2.
>> BTW i have only one management server.
>>
>>
>>
>>
>> 2014-03-17 20:54 GMT-03:00 Alena Prokharchyk <
>> alena.prokharc...@citrix.com>:
>>
>>> Diego, spent some more time on debugging your issue. Might not be related
>>> to transaction refactoring as it went in after 4.2. And records for some
>>> accounts actually got non-0 network stat:
>>>
>>> 0 network stats:
>>>
>>> mysql> select count(*) from user_statistics where net_bytes_received=0 or
>>> net_bytes_sent=0 order by account_id;
>>> +----------+
>>> | count(*) |
>>> +----------+
>>> |       26 |
>>> +----------+
>>> 1 row in set (0.00 sec)
>>>
>>> Non-0 network stats
>>>
>>> mysql> select count(*) from user_statistics where net_bytes_received!=0
>>> or
>>> net_bytes_sent!=0 order by account_id;
>>> +----------+
>>> | count(*) |
>>> +----------+
>>> |      207 |
>>> +----------+
>>> 1 row in set (0.00 sec)
>>>
>>>
>>>
>>> Adding Kishan to the thread as he has more expertise in Network usage
>>> calculation and he might be able to help us.
>>>
>>>
>>> I would divide problematic accounts (Networks) that are not getting usage
>>> stats update, into 2 categories:
>>>
>>> CATEGORY #1 Networks having non-0 current_bytes* and 0 net_bytes received
>>>
>>> This situation is totally normal as net_bytes_received get updated only
>>> when VR gets stopped. Yet on Diego's setup, those entries are not counted
>>> when do network usage calculation. Here is the account having problems:
>>>
>>> mysql> select * from user_statistics where account_id=176;
>>>
>>> +-----+----------------+------------+-------------------+-----------+------
>>>
>>> --------+------------+--------------------+----------------+---------------
>>> ---------+--------------------+--------------------+----------------+
>>> | id  | data_center_id | account_id | public_ip_address | device_id |
>>> device_type  | network_id | net_bytes_received | net_bytes_sent |
>>> current_bytes_received | current_bytes_sent | agg_bytes_received |
>>> agg_bytes_sent |
>>>
>>> +-----+----------------+------------+-------------------+-----------+------
>>>
>>> --------+------------+--------------------+----------------+---------------
>>> ---------+--------------------+--------------------+----------------+
>>> | 228 |              1 |        176 | NULL              |      1193 |
>>> DomainRouter |        366 |                  0 |              0 |
>>>    4271398427 |         2014478889 |                  0 |
>>>  0 |
>>>
>>> +-----+----------------+------------+-------------------+-----------+------
>>>
>>> --------+------------+--------------------+----------------+---------------
>>> ---------+--------------------+--------------------+----------------+
>>> 1 row in set (0.00 sec)
>>>
>>>
>>>
>>> And here is the Usage log:
>>>
>>>
>>> 1. 2014-03-17 03:15:02,779 DEBUG [usage.parser.NetworkUsageParser]
>>> (Usage-Job-1:null) Parsing all Network usage events for account: 176
>>>
>>> 2. 2014-03-17 03:15:02,779 DEBUG [usage.parser.NetworkUsageParser]
>>> (Usage-Job-1:null) No usage record (0 bytes used) generated for account:
>>> 176
>>>
>>>
>>>
>>>
>>> Kishan, is it by design not to count current_bytes in network usage? Can
>>> you please double check the code, because it doesn't look right to me as
>>> the VR may not go into Stopped state for a while, yet we have to
>>> calculate
>>> the usage
>>>
>>>
>>> CATEGORY #2
>>>
>>> Accounts (networks) having all 0s for
>>>
>>> net_bytes_received/net_bytes_sent/current_bytes_received/current_bytes_sent
>>>
>>> Diego, for those accounts we have to figure out whether the backend
>>> reporting any usage for the Vrs. To debug that, you have to:
>>>
>>> A) get all the VR names
>>>
>>> mysql> select name from vm_instance where removed is null and
>>> vm_type='DomainRouter' and account_id in (select account_id from
>>> user_statistics where net_bytes_received = 0 and net_bytes_sent=0 and
>>> current_bytes_received=0 and curre
>>> +-----------+
>>> | name      |
>>> +-----------+
>>> | r-621-VM  |
>>> | r-651-VM  |
>>> | r-658-VM  |
>>> | r-671-VM  |
>>> | r-682-VM  |
>>> | r-886-VM  |
>>> | r-892-VM  |
>>> | r-915-VM  |
>>> | r-934-VM  |
>>> | r-935-VM  |
>>> | r-986-VM  |
>>> | r-1002-VM |
>>> | r-1004-VM |
>>> | r-1030-VM |
>>> | r-1148-VM |
>>> | r-1205-VM |
>>> | r-1221-VM |
>>> +-----------+
>>>
>>>
>>> B) Search for NetworkUsageAnswer and routerName in the log to locate
>>> statements like that. Do it on your today's log and the log(s) from
>>> couple
>>> of days before. You might also want to run some traffic
>>>
>>>
>>> 2014-03-17 14:41:12,661 DEBUG [agent.transport.Request]
>>> (DirectAgent-15:null) Seq 1-281411626: Processing:  { Ans: , MgmtId:
>>> 4278190080, via: 1, Ver: v1, Flags: 10,
>>>
>>> [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-3-VM","bytesSen
>>> t":0,"bytesReceived":0,"result":true,"details":"success","wait":0}}] }
>>>
>>>
>>> If you see that the backend reports non-0 value, it means that the
>>> Management server fails to save it. If the backend reports 0, and yet you
>>> are sure that there is a network traffic generated on the routers above,
>>> there might be some error in a way we collect the traffic on the VR.
>>>
>>>
>>> Also how many management servers do you have in your setup?
>>>
>>>
>>> -Alena.
>>>
>>>
>>>
>>> On 3/17/14, 12:05 PM, "Alena Prokharchyk" <alena.prokharc...@citrix.com>
>>> wrote:
>>>
>>> >Diego, please go ahead and file a bug with Priority Critical against CS.
>>> >Looks like there is some bug in a way to commit db statement in
>>> >transaction resulting in the DB not being updated:
>>> >
>>> >2014-03-17 13:14:15,431 TRACE [db.Transaction.Transaction]
>>> >(RouterMonitor-1:null) txn: Not committing because transaction started
>>> >elsewhere:
>>>
>>> >-GenericDaoBase.update:814-ComponentInstantiationPostProcessor$Interceptor
>>>
>>> >Dispatcher.intercept:125-GenericDaoBase.update:774-ComponentInstantiationP
>>>
>>> >ostProcessor$InterceptorDispatcher.intercept:125-GenericDaoBase.update:126
>>>
>>> >2-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:125-
>>>
>>> >VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:948-Executors$Runn
>>>
>>> >ableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAn
>>>
>>> >dReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-
>>>  >ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267 /
>>>
>>> >-VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:900-Executors$Run
>>>
>>> >nableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runA
>>> >ndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165
>>>
>>> >-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecuto
>>> >r.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:701 :
>>> >update, update, update,
>>>
>>> >-VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:900-Executors$Run
>>>
>>> >nableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runA
>>> >ndReset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165
>>>
>>> >-ScheduledThreadPoolExecutor$ScheduledFutureTask.run:267-ThreadPoolExecuto
>>> >r.runWorker:1146-ThreadPoolExecutor$Worker.run:615-Thread.run:701,
>>> >
>>> >
>>> > Attach the latest log from this email to the bug.
>>> >
>>> >Unfortunately, there is no way to workaround it on your side as its a
>>> >code issue.
>>> >
>>> >-Alena.
>>> >
>>> >From: Diego Spinola Castro
>>>  ><spinolacas...@gmail.com<mailto:spinolacas...@gmail.com>>
>>> >Date: Monday, March 17, 2014 at 11:25 AM
>>> >To: Alena Prokharchyk
>>>  ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>>
>>> >Cc: "users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>"
>>> ><users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>>
>>> >Subject: Re: NetworkUsage
>>> >
>>> >Hi, sorry.  i went lost with this huge log.
>>> >
>>> >
>>> >
>>> >2014-03-17 15:10 GMT-03:00 Alena Prokharchyk
>>>  ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>>:
>>> >Diego, the log you've provided, doesn't contain "Rolling back the
>>> >transaction" statement. You should wait for this particular error to
>>> >happen (while logging on TRACE), and then take a log snippet.
>>> >
>>> >From: Diego Spinola Castro
>>>  ><spinolacas...@gmail.com<mailto:spinolacas...@gmail.com>>
>>> >Date: Monday, March 17, 2014 at 11:00 AM
>>> >To: Alena Prokharchyk
>>>  ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>>
>>> >Cc: "users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>"
>>> ><users@cloudstack.apache.org<mailto:users@cloudstack.apache.org>>
>>> >Subject: Re: NetworkUsage
>>> >
>>> >Hi, Alena. thanks for the tips, i got some logs:
>>> >http://www.fpaste.org/86141/13950791/
>>> >
>>> >
>>> >2014-03-17 14:27 GMT-03:00 Alena Prokharchyk
>>>  ><alena.prokharc...@citrix.com<mailto:alena.prokharc...@citrix.com>>:
>>>  >Diego,
>>> >
>>> >Unfortunately CS lock lack the needed info for debugging the problem. I
>>> >see only transaction roll back statement:
>>> >
>>> >2014-03-17 06:01:22,363 DEBUG [db.Transaction.Transaction]
>>> >(RouterMonitor-1:null) Rolling back the transaction: Time = 2 Name =
>>>
>>> >-VirtualNetworkApplianceManagerImpl$NetworkUsageTask.run:900-Executors$Run
>>> >n
>>>
>>> >ableAdapter.call:471-FutureTask$Sync.innerRunAndReset:351-FutureTask.runAn
>>> >d
>>>
>>> >Reset:178-ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201:165-
>>> >
>>> >
>>> >To debug it further, you need to change the log level from DEBUG to
>>> TRACE
>>> >for a certain class by modifying log4 properties (no MS restart is
>>> >required)
>>> >
>>> >
>>> >Here is the link on how to do it:
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/CLOUDSTACK/Useful+Tips+and+Tri
>>> >c
>>> >ks, "How to enable trace logging for a particular class in CloudStack?²
>>> >section. Replace category name with:
>>> >
>>> ><category name=³com.cloud.utils.db">
>>> >     <priority value=³TRACE"/>
>>> >   </category>
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >After the change is done, wait till you see TRACE statements in the log,
>>> >and till the error "Rolling back the transaction" occurs again. Take the
>>> >log snippet and send it out. After you are done with that, you will need
>>> >to remove TRACE config from log4 properties file, otherwise your logs
>>> will
>>> >grow huge.
>>> >
>>> >-Alena.
>>> >
>>> >
>>> >On 3/17/14, 9:48 AM, "Diego Spinola Castro"
>>>  ><spinolacas...@gmail.com<mailto:spinolacas...@gmail.com>>
>>>  >wrote:
>>> >
>>> >>Hi guys, i've been running into a issue when getting network traffic
>>> from
>>> >>usage server. All records are Zero, i have a CS 4.2.0 install.
>>> >>
>>> >>Here the usage.logs http://www.fpaste.org/86116/13950748/
>>> >>And management-server.log http://www.fpaste.org/86118/07486613/
>>> >>
>>> >>Does anyone know what is wrong ?
>>> >
>>> >
>>> >
>>>
>>>
>>
>

Reply via email to