[Qemu-devel] [PATCH 1/1] migration: calculate expected_downtime considering redirtied ram

2019-01-22 Thread bala24
From: Balamuruhan S 

currently we calculate expected_downtime as the time taken to transfer
the remaining ram, but while the remaining ram is being transferred a
few pages might be redirtied and would need to be retransferred, so it
is better to account for them when calculating expected_downtime to get
a more accurate value.

Total ram to be transferred = remaining ram
                              + (ram redirtied while the remaining
                                 ram gets transferred)

redirtied ram = dirty_pages_rate * (time taken to transfer remaining ram)
              = dirty_pages_rate * (remaining ram / bandwidth)

expected_downtime = (remaining ram + redirtied ram) / bandwidth
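
To make the arithmetic concrete, here is a minimal standalone C sketch
of the estimate; it is illustrative only, not part of the patch, and
the variable names and numbers are made up:

#include <stdio.h>

int main(void)
{
    /* made-up example values, all in consistent units */
    double remaining = 512.0 * 1024 * 1024;   /* bytes left to transfer */
    double dirty_rate = 100.0 * 1024 * 1024;  /* bytes redirtied per second */
    double bandwidth = 1000.0 * 1024 * 1024;  /* bytes transferred per second */

    /* time needed to push the currently remaining ram */
    double transfer_time = remaining / bandwidth;

    /* ram that gets redirtied while that transfer is in flight */
    double redirtied = dirty_rate * transfer_time;

    /* the old estimate ignores the redirtied ram, the new one includes it */
    printf("old estimate: %.3f s\n", remaining / bandwidth);
    printf("new estimate: %.3f s\n", (remaining + redirtied) / bandwidth);
    return 0;
}

With these values the old estimate is 0.512 s while the new one is
0.563 s; the gap grows as the dirty rate approaches the bandwidth.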

Suggested-by: David Gibson 
Suggested-by: Dr. David Alan Gilbert 
Signed-off-by: Balamuruhan S 
---
 migration/migration.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index ffc4d9e556..dc38e9a380 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2903,7 +2903,15 @@ static void migration_update_counters(MigrationState *s,
      * recalculate. 1 is a small enough number for our purposes
      */
     if (ram_counters.dirty_pages_rate && transferred > 1) {
-        s->expected_downtime = ram_counters.remaining / bandwidth;
+        /* Time required to transfer the remaining ram */
+        double remaining_ram_transfer_time = ram_counters.remaining / bandwidth;
+
+        /* ram redirtied while the remaining ram gets transferred */
+        double newly_dirtied_ram = ram_counters.dirty_pages_rate *
+                                   remaining_ram_transfer_time;
+
+        s->expected_downtime = (ram_counters.remaining + newly_dirtied_ram) /
+                               bandwidth;
     }
 
     qemu_file_reset_rate_limit(s->to_dst_file);
-- 
2.14.5




[Qemu-devel] [PATCH 0/1] migration: calculate expected_downtime considering redirtied ram

2019-01-22 Thread bala24
From: Balamuruhan S 

Based on the discussion with Dave and David Gibson earlier with respect
to expected_downtime calculation, 

https://lists.gnu.org/archive/html/qemu-devel/2018-04/msg02418.html

got suggestions that the calculation is not accurate, and that we need
to consider the ram that gets redirtied during the time when we would
actually have transferred ram in the current iteration.

So I have come up with a calculation that considers the ram that could
get redirtied while the remaining ram of the current iteration is being
transferred. This way, the total ram to be transferred is remaining ram
+ redirtied ram, and dividing that by the bandwidth yields a better
expected_downtime value.

Please help to review this approach and suggest improvements.

Balamuruhan S (1):
  migration: calculate expected_downtime considering redirtied ram

 migration/migration.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.14.5




Re: [Qemu-devel] [PATCH] migration: calculate expected_downtime with ram_bytes_remaining()

2018-04-03 Thread bala24

On 2018-04-03 11:40, Peter Xu wrote:

On Sun, Apr 01, 2018 at 12:25:36AM +0530, Balamuruhan S wrote:
expected_downtime value is not accurate with dirty_pages_rate *
page_size; using ram_bytes_remaining() would yield it correctly.

Signed-off-by: Balamuruhan S 
---
 migration/migration.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 58bd382730..4e43dc4f92 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2245,8 +2245,7 @@ static void migration_update_counters(MigrationState *s,
      * recalculate. 1 is a small enough number for our purposes
      */
     if (ram_counters.dirty_pages_rate && transferred > 1) {
-        s->expected_downtime = ram_counters.dirty_pages_rate *
-            qemu_target_page_size() / bandwidth;
+        s->expected_downtime = ram_bytes_remaining() / bandwidth;


This field was removed in e4ed1541ac ("savevm: New save live migration
method: pending", 2012-12-20), in which remaining RAM was used.

And it was added back in 90f8ae724a ("migration: calculate
expected_downtime", 2013-02-22), in which dirty rate was used.

However I didn't find a clue on why we changed from using remaining
RAM to using dirty rate...  So I'll leave this question to Juan.

Besides, I'm a bit confused about when we'd want such a value.  AFAIU
precopy is mostly used by setting up the target downtime beforehand,
so we should already know the downtime beforehand.  Then why would we
want to observe such a thing?


Thanks Peter Xu for reviewing,

I tested precopy migration with a ppc guest backed by 16M hugepages.
The page granularity used by migration is 4K, so any dirtied page
results in a whole 16M hugepage (4096 x 4K pages) being transmitted
again; this caused the migration to continue endlessly.

default migrate_parameters:
downtime-limit: 300 milliseconds

info migrate:
expected downtime: 1475 milliseconds

Migration status: active
total time: 130874 milliseconds
expected downtime: 1475 milliseconds
setup: 3475 milliseconds
transferred ram: 18197383 kbytes
throughput: 866.83 mbps
remaining ram: 376892 kbytes
total ram: 8388864 kbytes
duplicate: 1678265 pages
skipped: 0 pages
normal: 4536795 pages
normal bytes: 18147180 kbytes
dirty sync count: 6
page size: 4 kbytes
dirty pages rate: 39044 pages

In order to complete the migration I configured downtime-limit to 1475
milliseconds, but the migration was still endless. Calculating the
expected downtime from the remaining ram instead, 376892 kbytes /
866.83 mbps yielded 3478.34 milliseconds, and configuring that as
downtime-limit allowed the migration to complete. This led to the
conclusion that the current expected downtime is not accurate.
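
As a sanity check, here is a small standalone C program (mine, not part
of any patch) that reproduces the arithmetic above from the quoted
info migrate numbers:

#include <stdio.h>

int main(void)
{
    double remaining_kbytes = 376892.0; /* remaining ram from info migrate */
    double throughput_mbps = 866.83;    /* megabits per second */

    /* convert throughput to kbytes per second (decimal units) */
    double bandwidth_kbytes = throughput_mbps * 1000.0 / 8.0;

    /* expected downtime = remaining ram / bandwidth */
    printf("expected downtime: %.2f ms\n",
           1000.0 * remaining_kbytes / bandwidth_kbytes);
    return 0;
}

This prints 3478.34 ms, the value that finally let the migration
converge, versus the 1475 ms reported by the current
dirty_pages_rate-based estimate.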

Regards,
Balamuruhan S


