Hi Patrice,

I have another question. I launched a test this morning, and it ended with an error (don't worry, it is not related to any timer or VM configuration, ...).
In the terminal of the manager I see:

14:53:50.228| sailfin_uac 00 (S+F)= 18607 F= 39 IHS= 0.20960%
14:53:50.228| *IHS ALL* (S+F)= 18607 F= 39 IHS= 0.20960%
14:53:50.303|CPU0 18439557ms: 45.4545 MT: 3631924 MF: 2486792
14:53:51.198|<TS1> ERROR_REPORT 0xfff
14:53:51.198|<TS1> ERROR_REPORT 0xfff
14:53:51.199|<TS1> ERROR_REPORT 0xfff
...
14:53:51.218|<TS1> ERROR_REPORT 0xfff
14:53:51.218|<TS1> ERROR_REPORT 0xfff
14:53:51.313|CPU0 18440567ms: 70.5882 MT: 3631924 MF: 2486792
14:53:52.313|CPU0 18441567ms: 35.0000 MT: 3631924 MF: 2486792
14:53:52.595|TS1 deregistered
14:53:52.595|shutdown
14:53:52.595|Closing CPU0 connection...
14:53:52.596|please wait...
14:53:54.597|Manager exit with rc=0

It is not the first time a test ends this way. Do you have an idea where the error could come from? Is it a network error or something like that? The IHS is not big, the SUT is not overloaded at all, ...

Regards,

A.

On Friday 28 May 2010 at 14:37 +0100, Buriez, Patrice wrote:
> Hi Antoine,
>
> There is obviously something wrong with the clock on your setup.
> Just take a look at the leftmost timestamp in manager.log: it shows that the
> manager ran from 10:31:37.295 to 10:39:26.425. However, it also shows on
> lines 472, 730 and 1912 that the clock suddenly and temporarily jumped 4398
> seconds (+01:13:18) into the future. This problem is further confirmed in
> report.xml, for example:
> <!-- update step at 12 2010-05-28 10:33:22.823 [4481546ms]-->
> <!-- update step at 18 2010-05-28 10:33:22.823 [83500ms]-->
> For your information, the YMDHMS and the ms timestamps are printed from
> different variables, and the ms timestamp is sometimes too high by 4398046ms.
> This is the same difference as in your previous report.xml files, so the
> problem seems to be very reproducible.
> Checking in the source code, I tracked this problem down to getmilliseconds()
> in utils.cpp and to the gettimeofday() system call.
> The root cause of the
> problem is most probably the system clock that transiently jumps into the
> future for some reason... When it happens, the manager thinks that the step
> is completed and increases the load to the next step.
>
> It is very likely that your system clock problem originates from a
> disagreement between the hypervisor and the virtual machine about what time
> it is. By the way, what hypervisor are you using?
> I will let you further investigate the HV and VM clock misconfiguration. You
> already killed ntpd and ptpd on the VM, but there is probably a
> synchronization feature in the HV that regularly adjusts the clock on the
> VM... Or maybe yet another time-related daemon on the VM incorrectly
> moves the clock ahead and the HV (almost) immediately restores it to the
> correct time...
> To demonstrate the problem and to confirm that it has been solved, you can
> write a simple C program that calls gettimeofday() in a loop, prints the
> tv_sec field of the timeval structure, and checks whether it has suddenly
> decreased (showing that it was incorrectly increased in the previous loop
> iteration).
> In order to avoid 100% CPU usage, you could add a sleep() inside the loop,
> but since this system call is also time-based, it might actually prevent your
> program from showing the problem!
>
> I don't mean that IMS Bench SIPp is not supported in a VM, but we never
> tested this setup. In fact, we usually do the opposite: in order to benchmark
> our SUTs, we use a stack of at least 4 physical servers on which we run at
> least 4 SIPp TS instances. Anyway, it might run in a VM, provided the
> generated load is not too high, but you will probably lose some precision on
> the results. The major prerequisite, on a VM or on a physical server, is that
> the clock is linear and monotonic, which is not your case at the moment.
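
A minimal sketch of the clock-check program described above, assuming a POSIX system. The function names (`timeval_to_ms`, `clock_went_backwards`, `watch_clock`) are illustrative and do not come from the IMS Bench SIPp sources. It busy-loops on gettimeofday() without any sleep(), following the caveat above, and reports a backward step instead of printing tv_sec on every iteration to keep the output readable.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Convert a timeval to milliseconds since the epoch. */
long long timeval_to_ms(const struct timeval *tv)
{
    assert(tv != NULL);
    return (long long)tv->tv_sec * 1000LL + tv->tv_usec / 1000;
}

/* Return 1 when the new reading is earlier than the previous one. */
int clock_went_backwards(long long prev_ms, long long now_ms)
{
    return now_ms < prev_ms;
}

/*
 * Poll gettimeofday() `iterations` times (forever when iterations is 0)
 * and print a line each time the clock steps backwards, which means the
 * previous reading was too far in the future.
 * Returns the number of backward steps observed.
 * Deliberately a busy loop: no sleep(), since sleeping is time-based too.
 */
long watch_clock(unsigned long iterations)
{
    struct timeval tv;
    long long prev_ms, now_ms;
    long backward_steps = 0;
    unsigned long i;

    gettimeofday(&tv, NULL);
    prev_ms = timeval_to_ms(&tv);

    for (i = 0; iterations == 0 || i < iterations; i++) {
        gettimeofday(&tv, NULL);
        now_ms = timeval_to_ms(&tv);
        if (clock_went_backwards(prev_ms, now_ms)) {
            printf("clock went back %lld ms at tv_sec=%ld\n",
                   prev_ms - now_ms, (long)tv.tv_sec);
            backward_steps++;
        }
        prev_ms = now_ms;
    }
    return backward_steps;
}
```

Calling `watch_clock(0)` from `main()` runs until interrupted; on a healthy clock it should print nothing. With the jump reported in this thread, the expected symptom would be a line showing a backward step of roughly 4398046 ms.
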
> Regarding the clock precision, the installation guide at
> http://sipp.sourceforge.net/ims_bench/reference.html#Pre-requisites
> recommends recompiling the kernel with the timer frequency set to 1000Hz.
> Some distributions come with a kernel already configured that way, and
> otherwise maybe you recompiled your kernel accordingly on the VM. But
> whatever the configuration of the VM, it still depends on the configuration
> of the underlying HV, which might possibly be out of your control.
>
> I guess that your current goal is to validate whether IMS Bench SIPp can be
> used to benchmark SailFin, and to check that the generated reports contain
> the information that you expect. For that purpose, I understand your choice
> of running in a VM with a low load, because you plan to validate features
> rather than full performance. I also guess that, once you have validated the
> tools, you would deploy a real test setup using one or more dedicated
> physical servers in order to benchmark the real performance of your SailFin
> SUT.
> If that's the case, and unless the VM/HV clock misconfiguration issue is
> really obvious, then there is no real value in spending your time to make it
> work in a VM, because the final setup would use physical servers anyway.
> Instead, I would suggest that you temporarily use a real physical machine,
> because a low end server or even a desktop should be good enough to validate
> the features under a low load.
>
> Regarding the "segmentation fault" problem, I think that it is related to the
> clock issue, because we obviously assume that the clock is monotonic and we
> make decisions based on the amount of time spent. If the time difference is
> sometimes negative, then we might make wrong decisions, such as deleting an
> object which could later be accessed at another point in the code when the
> time difference is correct again... So I wouldn't worry too much about it,
> until the clock issue is solved.
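
To illustrate the monotonic-clock assumption mentioned above, here is a small sketch of an elapsed-time helper that refuses negative differences. It is not how IMS Bench SIPp measures time (the thread only says it relies on gettimeofday()); the names `monotonic_ms` and `checked_elapsed_ms` are hypothetical, and the use of POSIX `clock_gettime(CLOCK_MONOTONIC)` is a swapped-in technique. On a VM whose underlying clock source is faulty, CLOCK_MONOTONIC may misbehave as well, so this is only a partial safeguard.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Milliseconds from a clock that is not moved by wall-clock adjustments. */
long long monotonic_ms(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc; /* keeps the build warning-free when NDEBUG is defined */
    return (long long)ts.tv_sec * 1000LL + ts.tv_nsec / 1000000L;
}

/*
 * Elapsed time between two readings, or -1 when the second reading is
 * earlier than the first. The caller can then skip the time-based
 * decision instead of acting on a negative duration.
 */
long long checked_elapsed_ms(long long start_ms, long long now_ms)
{
    if (now_ms < start_ms)
        return -1;
    return now_ms - start_ms;
}
```

With the two [ms] values quoted from report.xml earlier in the thread, `checked_elapsed_ms(4481546, 83500)` flags the backward step, while the forward direction yields the 4398046 ms offset.
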
>
> Regards,
> Patrice
>
> -----Original Message-----
> From: Antoine Roly [mailto:[email protected]]
> Sent: Friday, May 28, 2010 10:47 AM
> To: Buriez, Patrice
> Cc: [email protected]
> Subject: RE: sipp ims bench
>
> Hi Patrice,
>
> I've tried some tests with the initialSAPS set to an even value, but the
> results are still strange, and the SAPS increases more than expected.
> The files you asked for (report and manager.log) are attached.
>
> If the problem could come from the clock, I'm going to investigate in
> this direction. The TS and manager are running in a virtual machine,
> maybe something is wrong... It should not, and I've never had a problem
> with it but we never know...
>
> Regards,
>
> A.
>
> On Thursday 27 May 2010 at 18:35 +0100, Buriez, Patrice wrote:
> > Hi Antoine,
> >
> > This is really weird, the [ms] timestamp in the report.xml still moves back
> > and forth, while the "YMD HMS.ms" seems correct!
> > Because of that transient wrong time reference, the load is increased too
> > often. That's why you got 60 instead of 5.
> >
> > Can you do one more try, with InitialSAPS set to an even value, or to any
> > multiple of (StirSteps+1)?
> > Please also attach the manager.log file.
> > It's OK to run the manager and TS on the same computer.
> >
> > Regards,
> > Patrice
> >
> > -----Original Message-----
> > From: Antoine Roly [mailto:[email protected]]
> > Sent: Thursday, May 27, 2010 6:14 PM
> > To: Buriez, Patrice
> > Cc: [email protected]
> > Subject: RE: sipp ims bench
> >
> > Hi Patrice,
> >
> > I've "svn co" revision 587 and killed ptpd and ntpd.
> > I haven't seen anything weird when I compiled the soft (make rmtl, ossl
> > and mgr as in the doc).
> >
> > The manager and the TS are on the same host, so I suppose it's ok to run
> > the test without both ntpd and ptpd, but I had to set the MaxTimeOffset
> > to 0.
> > I don't know if this can have an important negative impact on the test
> > (other than for the time in the report of course).
> >
> > I've made several tests today, the results are strange. Almost all tests
> > end correctly (i.e. without seg fault, but the results are weird), some
> > tests end with a seg fault like in a previous mail.
> >
> > Here are the 3 files from the latest test... In this one, the SAPS
> > increased more than expected and overloaded sailfin. As you can see in
> > the report, the requested load of the first step was 5, but the mean
> > value is 60!!! I don't understand why the SAPS increases so much. Gsl is
> > working, I think the soft uses that to generate traffic so...
> >
> > Obviously there's something wrong, maybe in the way I'm using the bench,
> > I don't know... Is it possible it's not working as expected due to the
> > very low values I'm using (for initialSAPS, SAPSincreaseAmount,...)? I
> > suppose not but... Or because I have only a single TS running on the same
> > host as the manager, and without ntpd or ptpd?
> >
> > Regards,
> >
> > A.
> >
> > On Wednesday 26 May 2010 at 17:54 +0100, Buriez, Patrice wrote:
> > > Hi Antoine,
> > >
> > > I investigated the files you sent.
> > > The report.xml file suggests that the time reference is moving back and
> > > forth.
> > > I see several possible reasons for that:
> > >
> > > - Are you running ntpd and ptpd at the same time?
> > >   If that's the case, kill at least one of them, or even both, and try
> > >   again.
> > >
> > > - The "Segmentation fault" suggests that something is going really bad.
> > >   Maybe the stack got corrupted...
> > >   Try a "make clean", then "make", and check for errors and warnings.
> > >   Anything weird there?
> > >
> > > - We might have a regression in IMS Bench SIPp.
> > >   Get revision 587 and try again with this first version that supports
> > >   SailFin:
> > >   svn co -r 587 https://sipp.svn.sourceforge.net/svnroot/sipp/sipp/branches/ims_bench ims_bench-587
> > >
> > > Regards,
> > > Patrice
> > >
> > > -----Original Message-----
> > > From: Antoine Roly [mailto:[email protected]]
> > > Sent: Wednesday, May 26, 2010 2:12 PM
> > > To: Buriez, Patrice
> > > Subject: sipp ims bench
> > >
> > > Hi Patrice,
> > >
> > > Here are the files you asked for.
> > >
> > > For this test, only one instance of SIPp was running, on the same host
> > > as the manager. I suppose this is not a problem. Of course the SUT was
> > > another host.
> > >
> > > Thanks in advance
> > >
> > > Regards,
> > >
> > > Antoine

_______________________________________________
Sipp-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sipp-users
