Hi Patrice,

I have another question. I launched a test this morning, and it ended
with an error (don't worry, it is not related to any timer or VM
configuration).

In the terminal of the manager I see:

14:53:50.228| sailfin_uac 00 (S+F)=   18607 F=      39 IHS= 0.20960%
14:53:50.228|    *IHS ALL*  (S+F)=   18607 F=      39 IHS= 0.20960%
14:53:50.303|CPU0 18439557ms: 45.4545 MT: 3631924 MF: 2486792
14:53:51.198|<TS1> ERROR_REPORT 0xfff
14:53:51.198|<TS1> ERROR_REPORT 0xfff
14:53:51.199|<TS1> ERROR_REPORT 0xfff
...
14:53:51.218|<TS1> ERROR_REPORT 0xfff
14:53:51.218|<TS1> ERROR_REPORT 0xfff
14:53:51.313|CPU0 18440567ms: 70.5882 MT: 3631924 MF: 2486792
14:53:52.313|CPU0 18441567ms: 35.0000 MT: 3631924 MF: 2486792
14:53:52.595|TS1 deregistered
14:53:52.595|shutdown
14:53:52.595|Closing CPU0 connection... 
14:53:52.596|please wait...
14:53:54.597|Manager exit with rc=0

It is not the first time a test has ended this way. Do you have any idea
where the error could come from? Is it a network error or something like
that? The IHS is not high, the SUT is not overloaded at all, ...

Regards,

A.





On Friday 28 May 2010 at 14:37 +0100, Buriez, Patrice wrote:
> Hi Antoine,
> 
> There is obviously something wrong with the clock on your setup.
> Just take a look at the leftmost timestamp in manager.log: it shows that the 
> manager ran from 10:31:37.295 to 10:39:26.425. However, it also shows on 
> lines 472, 730 and 1912 that the clock suddenly and temporarily jumped 4398 
> seconds (+01:13:18) into the future. This problem is further confirmed in 
> report.xml, for example:
> <!-- update step at 12  2010-05-28 10:33:22.823 [4481546ms]-->
> <!-- update step at 18  2010-05-28 10:33:22.823 [83500ms]-->
> For your information, the YMDHMS and the ms timestamps are printed from 
> different variables, and the ms timestamp is sometimes too high by 4398046ms. 
> This is the same difference as in your previous report.xml files, so the 
> problem seems to be very reproducible.
> Checking in the source code, I tracked this problem down to getmilliseconds() 
> in utils.cpp and to the gettimeofday() system call. The root cause of the 
> problem is most probably the system clock that transiently jumps into the 
> future for some reason... When it happens, the manager thinks that the step 
> is completed and increases the load to the next step.
> 
> It is very likely that your system clock problem originates from a 
> disagreement between the hypervisor and the virtual machine about what time 
> it is. By the way, what hypervisor are you using?
> I will let you further investigate the HV and VM clock mis-configuration. You 
> already killed ntpd and ptpd on the VM, but there is probably a 
> synchronization feature in the HV that regularly adjusts the clock on the 
> VM... Or maybe yet another time-related daemon on the VM incorrectly 
> moves the clock ahead, and the HV (almost) immediately restores it to the 
> correct time...
> To demonstrate the problem and to confirm that it has been solved, you can write 
> a simple C program that calls gettimeofday() in a loop, prints the tv_sec 
> field of the timeval structure, and checks whether it has suddenly decreased 
> (showing that it was incorrectly increased in the previous loop iteration). 
> In order to avoid 100% CPU usage, you could add a sleep() inside the loop, 
> but since this system call is also time-based, it might actually prevent your 
> program from showing the problem!
> 
> I don't mean that IMS Bench SIPp is not supported in a VM, but we never 
> tested this setup. In fact, we usually do the opposite: in order to benchmark 
> our SUTs, we use a stack of at least 4 physical servers on which we run at 
> least 4 SIPp TS instances. Anyway, it might run in a VM, provided the 
> generated load is not too high, but you will probably lose some precision on 
> the results. The major prerequisite, on a VM or on a physical server, is that 
> the clock is linear and monotonic, which is not your case at the moment.
> Regarding the clock precision, the installation guide at 
> http://sipp.sourceforge.net/ims_bench/reference.html#Pre-requisites 
> recommends recompiling the kernel with the timer frequency set to 1000Hz. 
> Some distributions come with a kernel already configured that way, and 
> otherwise maybe you recompiled your kernel accordingly on the VM. But 
> whatever the configuration of the VM, it still depends on the configuration 
> of the underlying HV, which might possibly be out of your control.
> 
> I guess that your current goal is to validate whether IMS Bench SIPp can be 
> used to benchmark SailFin, and to check that the generated reports contain 
> the information that you expect. For that purpose, I understand your choice 
> of running in a VM with a low load, because you plan to validate features 
> rather than full performance. I also guess that, once you have validated the 
> tools, you will deploy a real test setup using one or more dedicated 
> physical servers in order to benchmark the real performance of your SailFin 
> SUT.
> If that's the case, and unless the VM/HV clock mis-configuration issue is 
> really obvious, then there is no real value in spending your time to make it 
> work in a VM, because the final setup would use physical servers anyway. 
> Instead, I would suggest that you temporarily use a real physical machine, 
> because a low end server or even a desktop should be good enough to validate 
> the features under a low load.
> 
> Regarding the "segmentation fault" problem, I think that it is related to the 
> clock issue, because we obviously assume that the clock is monotonic and we 
> make decisions based on the amount of time spent. If the time difference is 
> sometimes negative, then we might make wrong decisions, such as deleting an 
> object which could later be accessed at another point in the code when the 
> time difference is correct again... So I wouldn't worry too much about it, 
> until the clock issue is solved.
> 
> Regards,
> Patrice
> 
> -----Original Message-----
> From: Antoine Roly [mailto:[email protected]] 
> Sent: Friday, May 28, 2010 10:47 AM
> To: Buriez, Patrice
> Cc: [email protected]
> Subject: RE: sipp ims bench
> 
> Hi Patrice,
> 
> I've tried some tests with initialSAPS set to an even value, but the
> results are still strange, and the SAPS increases more than expected.
> The files you asked for (report and manager.log) are attached.
> 
> If the problem could come from the clock, I'm going to investigate in
> this direction. The TS and manager are running in a virtual machine,
> maybe something is wrong... It should not, and I've never had a problem
> with it but we never know...
> 
> Regards,
> 
> A.
> 
> 
> 
> On Thursday 27 May 2010 at 18:35 +0100, Buriez, Patrice wrote:
> > Hi Antoine,
> > 
> > This is really weird, the [ms] timestamp in the report.xml still moves back 
> > and forth, while the "YMD HMS.ms" seems correct!
> > Because of that transient wrong time reference, the load is increased too 
> > often. That's why you got 60 instead of 5.
> > 
> > Can you try once more, with InitialSAPS set to an even value, or to any 
> > multiple of (StirSteps+1)?
> > Please also attach the manager.log file.
> > It's OK to run the manager and TS on the same computer.
> > 
> > Regards,
> > Patrice
> > 
> > -----Original Message-----
> > From: Antoine Roly [mailto:[email protected]] 
> > Sent: Thursday, May 27, 2010 6:14 PM
> > To: Buriez, Patrice
> > Cc: [email protected]
> > Subject: RE: sipp ims bench
> > 
> > Hi Patrice,
> > 
> > I've "svn co" revision 587 and killed ptpd and ntpd. 
> > I didn't see anything weird when I compiled the software (make rmtl, ossl
> > and mgr as in the doc). 
> > 
> > The manager and the TS are on the same host, so I suppose it's OK to run
> > the test without both ntpd and ptpd, but I had to set MaxTimeOffset
> > to 0. 
> > I don't know if this can have a significant negative impact on the test
> > (other than on the times in the report, of course).
> > 
> > I've run several tests today, and the results are strange. Almost all tests
> > end correctly (i.e. without a seg fault, though the results are weird); some
> > tests end with a seg fault like in a previous mail.
> > 
> > Here are the 3 files from the latest test. In this one, the SAPS
> > increased more than expected and overloaded SailFin. As you can see in
> > the report, the requested load of the first step was 5, but the mean
> > value is 60! I don't understand why the SAPS increases so much. GSL is
> > working, and I think the software uses it to generate traffic, so...
> > 
> > Obviously there's something wrong, maybe in the way I'm using the bench,
> > I don't know... Is it possible that it's not working as expected because of
> > the very low values I'm using (for initialSAPS, SAPSincreaseAmount, ...)? I
> > suppose not, but... Or because I have only a single TS running on the same
> > host as the manager, without ntpd or ptpd?
> > 
> > Regards,
> > 
> > A.
> > 
> > On Wednesday 26 May 2010 at 17:54 +0100, Buriez, Patrice wrote:
> > > Hi Antoine,
> > > 
> > > I investigated the files you sent.
> > > The report.xml file suggests that the time reference is moving back and 
> > > forth.
> > > I see several possible reasons for that:
> > > 
> > > - Are you running ntpd and ptpd at the same time?
> > > If that's the case, kill at least one of them, or even both, and try 
> > > again.
> > > 
> > > - The "Segmentation fault" suggests that something is going really wrong. 
> > > Maybe the stack got corrupted...
> > > Try a "make clean", then "make", and check for errors and warnings. 
> > > Anything weird there?
> > > 
> > > - We might have a regression in IMS Bench SIPp.
> > > Get revision 587 and try again with this first version that supports 
> > > SailFin:
> > >   svn co -r 587 https://sipp.svn.sourceforge.net/svnroot/sipp/sipp/branches/ims_bench ims_bench-587
> > > 
> > > Regards,
> > > Patrice
> > > 
> > > -----Original Message-----
> > > From: Antoine Roly [mailto:[email protected]] 
> > > Sent: Wednesday, May 26, 2010 2:12 PM
> > > To: Buriez, Patrice
> > > Subject: sipp ims bench
> > > 
> > > Hi Patrice,
> > > 
> > > Here are the files you asked for.
> > >  
> > > For this test, only one instance of SIPp was running, on the same host
> > > as the manager. I suppose this is not a problem. Of course the SUT was
> > > another host.
> > > 
> > > Thanks in advance
> > > 
> > > Regards,
> > > 
> > > Antoine
> > > 



------------------------------------------------------------------------------

_______________________________________________
Sipp-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sipp-users
