Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
On Thu, Jul 16, 2015 at 2:58 AM, Chen, Tiejun wrote: >> I think I would say: >> >> -- >> Now use the hypervisor-supplied memory map to build our final e820 table: >> * Add regions for BIOS ranges and other special mappings not in the >> hypervisor map >> * Add in the hypervisor regions >> * Adjust the lowmem and highmem regions if we've had to relocate >> memory (adding a highmem region if necessary) >> * Sort all the ranges so that they appear in memory order. >> -- > > > I'll update this and thanks a lot. > >> >>> >>> CC: Keir Fraser >>> CC: Jan Beulich >>> CC: Andrew Cooper >>> CC: Ian Jackson >>> CC: Stefano Stabellini >>> CC: Ian Campbell >>> CC: Wei Liu >>> Signed-off-by: Tiejun Chen >>> --- > > > [snip] > >>> +/* Low RAM goes here. Reserve space for special pages. */ >>> +BUG_ON(low_mem_end < (2u << 20)); >> >> >> Won't this BUG if the guest was actually given less than 2GiB of RAM? > > > 2u << 20 = 0x20, so this is 2M, not 2G :) Oh, right. :-) -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
I think I would say: -- Now use the hypervisor-supplied memory map to build our final e820 table: * Add regions for BIOS ranges and other special mappings not in the hypervisor map * Add in the hypervisor regions * Adjust the lowmem and highmem regions if we've had to relocate memory (adding a highmem region if necessary) * Sort all the ranges so that they appear in memory order. -- I'll update this and thanks a lot. CC: Keir Fraser CC: Jan Beulich CC: Andrew Cooper CC: Ian Jackson CC: Stefano Stabellini CC: Ian Campbell CC: Wei Liu Signed-off-by: Tiejun Chen --- [snip] +/* Low RAM goes here. Reserve space for special pages. */ +BUG_ON(low_mem_end < (2u << 20)); Won't this BUG if the guest was actually given less than 2GiB of RAM? 2u << 20 = 0x20, so this is 2M, not 2G :) + +/* + * We may need to adjust real lowmem end since we may + * populate RAM to get enough MMIO previously. + */ [snip] + +/* + * And then we also need to adjust highmem. + */ +if ( add_high_mem ) +{ +for ( i = 0; i < memory_map.nr_map; i++ ) +{ +if ( e820[i].type == E820_RAM && + e820[i].addr > (1ull << 32)) +e820[i].size += add_high_mem; +} +} What if there was originally no high memory, but resizing the pci hole meant we had to create a high memory region? You're right. We need to consider this case. Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
On Thu, Jul 9, 2015 at 6:33 AM, Tiejun Chen wrote: > Now we can use that memory map to build our final > e820 table but it may need to reorder all e820 > entries. I think I would say: -- Now use the hypervisor-supplied memory map to build our final e820 table: * Add regions for BIOS ranges and other special mappings not in the hypervisor map * Add in the hypervisor regions * Adjust the lowmem and highmem regions if we've had to relocate memory (adding a highmem region if necessary) * Sort all the ranges so that they appear in memory order. -- > > CC: Keir Fraser > CC: Jan Beulich > CC: Andrew Cooper > CC: Ian Jackson > CC: Stefano Stabellini > CC: Ian Campbell > CC: Wei Liu > Signed-off-by: Tiejun Chen > --- > v5 ~ v7: > > * Nothing is changed. > > v4: > > * Rename local variable, low_mem_pgend, to low_mem_end. > > * Improve some code comments > > * Adjust highmem after lowmem is changed. > > tools/firmware/hvmloader/e820.c | 80 > + > 1 file changed, 66 insertions(+), 14 deletions(-) > > diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c > index 3e53c47..aa2569f 100644 > --- a/tools/firmware/hvmloader/e820.c > +++ b/tools/firmware/hvmloader/e820.c > @@ -108,7 +108,9 @@ int build_e820_table(struct e820entry *e820, > unsigned int lowmem_reserved_base, > unsigned int bios_image_base) > { > -unsigned int nr = 0; > +unsigned int nr = 0, i, j; > +uint64_t add_high_mem = 0; > +uint64_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; > > if ( !lowmem_reserved_base ) > lowmem_reserved_base = 0xA; > @@ -152,13 +154,6 @@ int build_e820_table(struct e820entry *e820, > e820[nr].type = E820_RESERVED; > nr++; > > -/* Low RAM goes here. Reserve space for special pages. */ > -BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20)); > -e820[nr].addr = 0x10; > -e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr; > -e820[nr].type = E820_RAM; > -nr++; > - > /* > * Explicitly reserve space for special pages. > * This space starts at RESERVED_MEMBASE an extends to cover various > @@ -194,16 +189,73 @@ int build_e820_table(struct e820entry *e820, > nr++; > } > > - > -if ( hvm_info->high_mem_pgend ) > +/* > + * Construct E820 table according to recorded memory map. > + * > + * The memory map created by toolstack may include, > + * > + * #1. Low memory region > + * > + * Low RAM starts at least from 1M to make sure all standard regions > + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios, > + * have enough space. > + * > + * #2. Reserved regions if they exist > + * > + * #3. High memory region if it exists > + */ > +for ( i = 0; i < memory_map.nr_map; i++ ) > { > -e820[nr].addr = ((uint64_t)1 << 32); > -e820[nr].size = > -((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - > e820[nr].addr; > -e820[nr].type = E820_RAM; > +e820[nr] = memory_map.map[i]; > nr++; > } > > +/* Low RAM goes here. Reserve space for special pages. */ > +BUG_ON(low_mem_end < (2u << 20)); Won't this BUG if the guest was actually given less than 2GiB of RAM? > + > +/* > + * We may need to adjust real lowmem end since we may > + * populate RAM to get enough MMIO previously. > + */ > +for ( i = 0; i < memory_map.nr_map; i++ ) > +{ > +uint64_t end = e820[i].addr + e820[i].size; > +if ( e820[i].type == E820_RAM && > + low_mem_end > e820[i].addr && low_mem_end < end ) > +{ > +add_high_mem = end - low_mem_end; > +e820[i].size = low_mem_end - e820[i].addr; > +} > +} > + > +/* > + * And then we also need to adjust highmem. > + */ > +if ( add_high_mem ) > +{ > +for ( i = 0; i < memory_map.nr_map; i++ ) > +{ > +if ( e820[i].type == E820_RAM && > + e820[i].addr > (1ull << 32)) > +e820[i].size += add_high_mem; > +} > +} What if there was originally no high memory, but resizing the pci hole meant we had to create a high memory region? Other than those things, looks good. -George ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
>>> On 14.07.15 at 12:22, wrote: > On 2015/7/14 17:32, Jan Beulich wrote: > On 14.07.15 at 07:22, wrote: > +for ( i = 0; i < memory_map.nr_map; i++ ) > +{ > +uint64_t end = e820[i].addr + e820[i].size; Either loop index/boundary or used array are wrong here: In the earlier loop you copied memory_map[0...nr_map-1] to e820[n...n+nr_map-1], but here you're looping looking at e820[0...nr_map-1] >>> >>> You're right. I should lookup all e820[] like this, >>> >>> for ( i = 0; i < nr; i++ ) >> >> Hmm, I would have thought you only care about either part of >> the just glued together map. >> > +if ( e820[i].type == E820_RAM && > + low_mem_end > e820[i].addr && low_mem_end < end ) Assuming you mean to look at the RDM e820[] entries here, this is not a correct check: You don't care about partly or fully contained, all you care about is whether low_mem_end extends beyond the start of the region. >>> >>> Here I'm looking at the e820 entry indicating low memory. Because >>> >>> low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; >>> >>> and when we allocate MMIO in pci.c, its possible to populate RAM so >>> hvm_info->low_mem_pgend would be changed over there. So we need to >>> compensate this loss with high memory. Here memory_map[] also records >>> the original low/high memory, so if low_mem_end is less-than the >>> original we need this compensation. >> >> And I'm not disputing your intentions - I'm merely pointing out that >> afaics the code above doesn't match these intentions. In particular >> (as said) I don't see why you need to check low_mem_end < end. >> > > Before we probably relocate RAM, > > low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT > > and the e820 entry specific to low memory, > > [e820[X].addr, end] > > Here, end = e820[X].addr + e820[X].size; > > Now low_mem_end = end. > > After that, low_mem_end < end. so if > > (low_mem_end > e820[X].addr && low_mem_end < end) is true, this means > that associated RAM entry is hitting, right? Then we need to revise this > entry as [e820[X].addr, low_mem_end], and compensate [end - low_mem_end] > to high memory. Anything I'm still wrong here? Ah, I think I see now what I misunderstood. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
On 2015/7/14 17:32, Jan Beulich wrote: On 14.07.15 at 07:22, wrote: +for ( i = 0; i < memory_map.nr_map; i++ ) +{ +uint64_t end = e820[i].addr + e820[i].size; Either loop index/boundary or used array are wrong here: In the earlier loop you copied memory_map[0...nr_map-1] to e820[n...n+nr_map-1], but here you're looping looking at e820[0...nr_map-1] You're right. I should lookup all e820[] like this, for ( i = 0; i < nr; i++ ) Hmm, I would have thought you only care about either part of the just glued together map. +if ( e820[i].type == E820_RAM && + low_mem_end > e820[i].addr && low_mem_end < end ) Assuming you mean to look at the RDM e820[] entries here, this is not a correct check: You don't care about partly or fully contained, all you care about is whether low_mem_end extends beyond the start of the region. Here I'm looking at the e820 entry indicating low memory. Because low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; and when we allocate MMIO in pci.c, its possible to populate RAM so hvm_info->low_mem_pgend would be changed over there. So we need to compensate this loss with high memory. Here memory_map[] also records the original low/high memory, so if low_mem_end is less-than the original we need this compensation. And I'm not disputing your intentions - I'm merely pointing out that afaics the code above doesn't match these intentions. In particular (as said) I don't see why you need to check low_mem_end < end. Before we probably relocate RAM, low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT and the e820 entry specific to low memory, [e820[X].addr, end] Here, end = e820[X].addr + e820[X].size; Now low_mem_end = end. After that, low_mem_end < end. so if (low_mem_end > e820[X].addr && low_mem_end < end) is true, this means that associated RAM entry is hitting, right? Then we need to revise this entry as [e820[X].addr, low_mem_end], and compensate [end - low_mem_end] to high memory. Anything I'm still wrong here? Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
>>> On 14.07.15 at 07:22, wrote: >>> +for ( i = 0; i < memory_map.nr_map; i++ ) >>> +{ >>> +uint64_t end = e820[i].addr + e820[i].size; >> >> Either loop index/boundary or used array are wrong here: In the >> earlier loop you copied memory_map[0...nr_map-1] to >> e820[n...n+nr_map-1], but here you're looping looking at >> e820[0...nr_map-1] > > You're right. I should lookup all e820[] like this, > > for ( i = 0; i < nr; i++ ) Hmm, I would have thought you only care about either part of the just glued together map. >>> +if ( e820[i].type == E820_RAM && >>> + low_mem_end > e820[i].addr && low_mem_end < end ) >> >> Assuming you mean to look at the RDM e820[] entries here, this >> is not a correct check: You don't care about partly or fully >> contained, all you care about is whether low_mem_end extends >> beyond the start of the region. > > Here I'm looking at the e820 entry indicating low memory. Because > > low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; > > and when we allocate MMIO in pci.c, its possible to populate RAM so > hvm_info->low_mem_pgend would be changed over there. So we need to > compensate this loss with high memory. Here memory_map[] also records > the original low/high memory, so if low_mem_end is less-than the > original we need this compensation. And I'm not disputing your intentions - I'm merely pointing out that afaics the code above doesn't match these intentions. In particular (as said) I don't see why you need to check low_mem_end < end. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
Now we can use that memory map to build our final e820 table but it may need to reorder all e820 entries. "it" being what? I'm afraid I can't really make sense of the second half of the sentence... I hope the following can work for you, ... but finally we should sort them into an increasing order since we shouldn't assume the original order is always good. --- a/tools/firmware/hvmloader/e820.c +++ b/tools/firmware/hvmloader/e820.c @@ -108,7 +108,9 @@ int build_e820_table(struct e820entry *e820, unsigned int lowmem_reserved_base, unsigned int bios_image_base) [snip] + */ +for ( i = 0; i < memory_map.nr_map; i++ ) +{ +uint64_t end = e820[i].addr + e820[i].size; Either loop index/boundary or used array are wrong here: In the earlier loop you copied memory_map[0...nr_map-1] to e820[n...n+nr_map-1], but here you're looping looking at e820[0...nr_map-1] You're right. I should lookup all e820[] like this, for ( i = 0; i < nr; i++ ) +if ( e820[i].type == E820_RAM && + low_mem_end > e820[i].addr && low_mem_end < end ) Assuming you mean to look at the RDM e820[] entries here, this is not a correct check: You don't care about partly or fully contained, all you care about is whether low_mem_end extends beyond the start of the region. Here I'm looking at the e820 entry indicating low memory. Because low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; and when we allocate MMIO in pci.c, its possible to populate RAM so hvm_info->low_mem_pgend would be changed over there. So we need to compensate this loss with high memory. Here memory_map[] also records the original low/high memory, so if low_mem_end is less-than the original we need this compensation. So here we have two steps to address this issue, #1. Calculate the loss +{ +add_high_mem = end - low_mem_end; +e820[i].size = low_mem_end - e820[i].addr; +} +} + +/* + * And then we also need to adjust highmem. + */ A single line comment should use the respective comment style. #2. Compensate the loss +if ( add_high_mem ) +{ +for ( i = 0; i < memory_map.nr_map; i++ ) s/memory_map.nr_map/nr +{ +if ( e820[i].type == E820_RAM && + e820[i].addr > (1ull << 32)) +e820[i].size += add_high_mem; +} +} But looking at the code I think the comment should be extended to state that we currently expect there to be exactly one such RAM region. I can add this at the beginning of #1 loop, Its possible to relocate RAM to allocate sufficient MMIO previously so low_mem_pgend would be changed over there. And here memory_map[] records the original low/high memory, so if low_mem_end is less than the original we need to revise low/high memory range in e820. Thanks Tiejun ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
>>> On 09.07.15 at 07:33, wrote: > Now we can use that memory map to build our final > e820 table but it may need to reorder all e820 > entries. "it" being what? I'm afraid I can't really make sense of the second half of the sentence... > --- a/tools/firmware/hvmloader/e820.c > +++ b/tools/firmware/hvmloader/e820.c > @@ -108,7 +108,9 @@ int build_e820_table(struct e820entry *e820, > unsigned int lowmem_reserved_base, > unsigned int bios_image_base) > { > -unsigned int nr = 0; > +unsigned int nr = 0, i, j; > +uint64_t add_high_mem = 0; > +uint64_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; For the last one I don't see why uint64_t; uint32_t should be just fine and less (binary) code. > @@ -194,16 +189,73 @@ int build_e820_table(struct e820entry *e820, > nr++; > } > > - > -if ( hvm_info->high_mem_pgend ) > +/* > + * Construct E820 table according to recorded memory map. > + * > + * The memory map created by toolstack may include, > + * > + * #1. Low memory region > + * > + * Low RAM starts at least from 1M to make sure all standard regions > + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios, > + * have enough space. > + * > + * #2. Reserved regions if they exist > + * > + * #3. High memory region if it exists > + */ > +for ( i = 0; i < memory_map.nr_map; i++ ) > { > -e820[nr].addr = ((uint64_t)1 << 32); > -e820[nr].size = > -((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - > e820[nr].addr; > -e820[nr].type = E820_RAM; > +e820[nr] = memory_map.map[i]; > nr++; > } > > +/* Low RAM goes here. Reserve space for special pages. */ > +BUG_ON(low_mem_end < (2u << 20)); > + > +/* > + * We may need to adjust real lowmem end since we may > + * populate RAM to get enough MMIO previously. "populate"? Don't you mean "relocate"? > + */ > +for ( i = 0; i < memory_map.nr_map; i++ ) > +{ > +uint64_t end = e820[i].addr + e820[i].size; Either loop index/boundary or used array are wrong here: In the earlier loop you copied memory_map[0...nr_map-1] to e820[n...n+nr_map-1], but here you're looping looking at e820[0...nr_map-1] > +if ( e820[i].type == E820_RAM && > + low_mem_end > e820[i].addr && low_mem_end < end ) Assuming you mean to look at the RDM e820[] entries here, this is not a correct check: You don't care about partly or fully contained, all you care about is whether low_mem_end extends beyond the start of the region. > +{ > +add_high_mem = end - low_mem_end; > +e820[i].size = low_mem_end - e820[i].addr; > +} > +} > + > +/* > + * And then we also need to adjust highmem. > + */ A single line comment should use the respective comment style. > +if ( add_high_mem ) > +{ > +for ( i = 0; i < memory_map.nr_map; i++ ) > +{ > +if ( e820[i].type == E820_RAM && > + e820[i].addr > (1ull << 32)) > +e820[i].size += add_high_mem; > +} > +} But looking at the code I think the comment should be extended to state that we currently expect there to be exactly one such RAM region. > +/* Finally we need to reorder all e820 entries. */ "reorder"? Perhaps "sort"? But despite the many comments - the patch looks a lot better now than earlier versions. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] [v7][PATCH 07/16] hvmloader/e820: construct guest e820 table
Now we can use that memory map to build our final e820 table but it may need to reorder all e820 entries. CC: Keir Fraser CC: Jan Beulich CC: Andrew Cooper CC: Ian Jackson CC: Stefano Stabellini CC: Ian Campbell CC: Wei Liu Signed-off-by: Tiejun Chen --- v5 ~ v7: * Nothing is changed. v4: * Rename local variable, low_mem_pgend, to low_mem_end. * Improve some code comments * Adjust highmem after lowmem is changed. tools/firmware/hvmloader/e820.c | 80 + 1 file changed, 66 insertions(+), 14 deletions(-) diff --git a/tools/firmware/hvmloader/e820.c b/tools/firmware/hvmloader/e820.c index 3e53c47..aa2569f 100644 --- a/tools/firmware/hvmloader/e820.c +++ b/tools/firmware/hvmloader/e820.c @@ -108,7 +108,9 @@ int build_e820_table(struct e820entry *e820, unsigned int lowmem_reserved_base, unsigned int bios_image_base) { -unsigned int nr = 0; +unsigned int nr = 0, i, j; +uint64_t add_high_mem = 0; +uint64_t low_mem_end = hvm_info->low_mem_pgend << PAGE_SHIFT; if ( !lowmem_reserved_base ) lowmem_reserved_base = 0xA; @@ -152,13 +154,6 @@ int build_e820_table(struct e820entry *e820, e820[nr].type = E820_RESERVED; nr++; -/* Low RAM goes here. Reserve space for special pages. */ -BUG_ON((hvm_info->low_mem_pgend << PAGE_SHIFT) < (2u << 20)); -e820[nr].addr = 0x10; -e820[nr].size = (hvm_info->low_mem_pgend << PAGE_SHIFT) - e820[nr].addr; -e820[nr].type = E820_RAM; -nr++; - /* * Explicitly reserve space for special pages. * This space starts at RESERVED_MEMBASE an extends to cover various @@ -194,16 +189,73 @@ int build_e820_table(struct e820entry *e820, nr++; } - -if ( hvm_info->high_mem_pgend ) +/* + * Construct E820 table according to recorded memory map. + * + * The memory map created by toolstack may include, + * + * #1. Low memory region + * + * Low RAM starts at least from 1M to make sure all standard regions + * of the PC memory map, like BIOS, VGA memory-mapped I/O and vgabios, + * have enough space. + * + * #2. Reserved regions if they exist + * + * #3. High memory region if it exists + */ +for ( i = 0; i < memory_map.nr_map; i++ ) { -e820[nr].addr = ((uint64_t)1 << 32); -e820[nr].size = -((uint64_t)hvm_info->high_mem_pgend << PAGE_SHIFT) - e820[nr].addr; -e820[nr].type = E820_RAM; +e820[nr] = memory_map.map[i]; nr++; } +/* Low RAM goes here. Reserve space for special pages. */ +BUG_ON(low_mem_end < (2u << 20)); + +/* + * We may need to adjust real lowmem end since we may + * populate RAM to get enough MMIO previously. + */ +for ( i = 0; i < memory_map.nr_map; i++ ) +{ +uint64_t end = e820[i].addr + e820[i].size; +if ( e820[i].type == E820_RAM && + low_mem_end > e820[i].addr && low_mem_end < end ) +{ +add_high_mem = end - low_mem_end; +e820[i].size = low_mem_end - e820[i].addr; +} +} + +/* + * And then we also need to adjust highmem. + */ +if ( add_high_mem ) +{ +for ( i = 0; i < memory_map.nr_map; i++ ) +{ +if ( e820[i].type == E820_RAM && + e820[i].addr > (1ull << 32)) +e820[i].size += add_high_mem; +} +} + +/* Finally we need to reorder all e820 entries. */ +for ( j = 0; j < nr-1; j++ ) +{ +for ( i = j+1; i < nr; i++ ) +{ +if ( e820[j].addr > e820[i].addr ) +{ +struct e820entry tmp; +tmp = e820[j]; +e820[j] = e820[i]; +e820[i] = tmp; +} +} +} + return nr; } -- 1.9.1 ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel