Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/12/2015 03:08 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 2:46 PM

On 02/12/2015 02:25 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 10:35 AM

On 02/11/2015 09:13 PM, Jan Beulich wrote:

On 11.02.15 at 12:52,  wrote:

On 11/02/15 08:28, Kai Huang wrote:

With PML, we don't have to use write protection but just clear D-bit
of EPT entry of guest memory to do dirty logging, with an additional
PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
reduce hypervisor overhead when guest is in dirty logging mode, and
therefore more CPU cycles can be allocated to guest, so it's expected
benchmarks in guest will have better performance comparing to non-PML.

One issue with basic EPT A/D tracking was the scan of the EPT tables.
Here, hardware will give us a list of affected gfns, but how is Xen
supposed to efficiently clear the dirty bits again?  Using EPT
misconfiguration is no better than the existing fault path.

Why not? The misconfiguration exit ought to clear the D bit for all
511 entries in the L1 table (and set it for the one entry that is
currently serving the access). All further D bit handling will then
be PML based.

Indeed, we clear D-bit in EPT misconfiguration. In my understanding, the
sequences are as follows:

1) PML enabled for the domain.
2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
3) Guest accesses specific GPA (which has been invalidated by step 2),
and EPT misconfig is triggered.
4) Then resolve_misconfig is called, which fixes up GFN (above GPA >>
12) to p2m_ram_logdirty, and calls ept_p2m_type_to_flags, in which we
clear D-bit of EPT entry (instead of clear W-bit) if p2m type is
p2m_ram_logdirty. Then dirty logging of this GFN will be handled by PML.

The above 2) ~ 4) will be repeated when log-dirty radix tree is cleared.
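
A minimal sketch of the flag choice in step 4), for illustration only. The names (sk_type_to_flags, SK_EPT_W, SK_EPT_D) and the bit positions are assumptions made for this example; this is not Xen's ept_p2m_type_to_flags nor its EPT entry layout. It only shows the decision described above: for a log-dirty page, clear the D-bit when PML is in use, otherwise clear the W-bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions; assumptions, not taken from Xen headers. */
#define SK_EPT_W (UINT64_C(1) << 1)  /* write permission */
#define SK_EPT_D (UINT64_C(1) << 9)  /* dirty flag */

enum sk_p2m_type { SK_RAM_RW, SK_RAM_LOGDIRTY };

/* Return the entry with W/D adjusted for the given (hypothetical) p2m type. */
static uint64_t sk_type_to_flags(uint64_t entry, enum sk_p2m_type type,
                                 bool pml_enabled)
{
    /* Traditional log-dirty: write-protect so the first write faults. */
    if (type == SK_RAM_LOGDIRTY && !pml_enabled)
        return entry & ~SK_EPT_W;

    /* Normal RAM, or log-dirty with PML: the page stays writable. */
    entry |= SK_EPT_W;

    /* With PML, a clear D-bit makes the next write get logged. */
    if (type == SK_RAM_LOGDIRTY)
        entry &= ~SK_EPT_D;

    return entry;
}

/* Small self-check of the two log-dirty variants. */
static void sk_demo_flags(void)
{
    uint64_t e = SK_EPT_W | SK_EPT_D;
    uint64_t pml = sk_type_to_flags(e, SK_RAM_LOGDIRTY, true);
    uint64_t wp = sk_type_to_flags(e, SK_RAM_LOGDIRTY, false);

    assert((pml & SK_EPT_W) != 0 && (pml & SK_EPT_D) == 0);
    assert((wp & SK_EPT_W) == 0);
}
```

In the PML case the permission never changes, only the D-bit does.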

is ept_invalidate_emt required by existing logdirty mode or by PML enable?

It's in existing logdirty code.

can we clear D bits directly when log-dirty radix tree is cleared to reduce
EPT misconfig exits for repeatedly dirtied pages?

Theoretically we can, and looks logdirty for video ram is done in this
way (logdirty for the page is re-enabled while it is reported to
dirty_bitmap). One thing is looks video ram logdirty only exists for HAP
mode.
But in current log dirty implementation for global logdirty, at common
paging layer, the log-dirty radix tree is cleaned in single step after
reporting all dirty pages to userspace. And it just calls
ept_invalidate_emt essentially. Therefore we need to modify logdirty
common code at paging layer to achieve this, which is more like logdirty
enhancement but not related to PML enabling directly. And any change of
interface in paging layer requires modification in shadow mode
accordingly, so currently I just choose not to do it.


for general log dirty, ept_invalidate_emt is required because there is
access permission change (dirtied page becomes rw after 1st fault,
so need to change them back to ro again for the new dirty tracking
round). But for PML, there's no permission change at all (always rw),
so such behavior should be noted by general logdirty layer for better
optimization. I'm OK not doing so for initial enabling patch, but it's
something you can think about later. :-)

Yes thanks for the point :)

Thanks,
-Kai


Thanks
Kevin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/12/2015 03:09 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 2:57 PM

On 02/12/2015 02:54 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 10:39 AM

PML needs to be enabled (allocate PML buffer, initialize PML index,
PML base address, turn PML on VMCS, etc) for all vcpus of the domain,
as PML buffer and PML index are per-vcpu, but EPT table may be shared
by vcpus. Enabling PML on partial vcpus of the domain won't work. Also
PML will only be enabled for the domain when it is switched to dirty
logging mode, and it will be disabled when domain is switched back to
normal mode. As looks vcpu number won't be changed dynamically during
guest is running (correct me if I am wrong here), so we don't have to
consider enabling PML for new created vcpu when guest is in dirty
logging mode.
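
A rough sketch of the "all vcpus or none" rule above, assuming a simplified per-vcpu structure. Every name here (sk_vcpu, sk_domain_enable_pml, the fail_at test hook) is hypothetical and not Xen code. It only illustrates that a failure on any vcpu should roll back the vcpus already enabled, so the domain never ends up with PML on a subset of its vcpus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_NO_FAILURE ((size_t)-1)

struct sk_vcpu {
    uint64_t *pml_buf;  /* per-vcpu PML buffer, 512 entries */
    bool pml_on;
};

/* Enable PML on one vcpu; 'fail' is a test hook simulating an error. */
static bool sk_vcpu_enable_pml(struct sk_vcpu *v, bool fail)
{
    assert(v != NULL);
    if (fail)
        return false;
    v->pml_buf = calloc(512, sizeof(*v->pml_buf));
    if (v->pml_buf == NULL)
        return false;
    v->pml_on = true;
    return true;
}

static void sk_vcpu_disable_pml(struct sk_vcpu *v)
{
    free(v->pml_buf);
    v->pml_buf = NULL;
    v->pml_on = false;
}

/* Enable PML for every vcpu, or for none if any single vcpu fails. */
static bool sk_domain_enable_pml(struct sk_vcpu *vcpus, size_t n, size_t fail_at)
{
    for (size_t i = 0; i < n; i++) {
        if (!sk_vcpu_enable_pml(&vcpus[i], i == fail_at)) {
            while (i-- > 0)  /* roll back the vcpus already enabled */
                sk_vcpu_disable_pml(&vcpus[i]);
            return false;
        }
    }
    return true;
}

/* Returns the number of PML-enabled vcpus after a successful enable. */
static int sk_demo_enable(void)
{
    struct sk_vcpu v[4] = { { NULL, false } };
    int on = 0;

    if (sk_domain_enable_pml(v, 4, 2))  /* vcpu 2 fails: expect rollback */
        return -1;
    for (size_t i = 0; i < 4; i++)
        on += v[i].pml_on;
    if (on != 0)
        return -2;

    if (!sk_domain_enable_pml(v, 4, SK_NO_FAILURE))
        return -3;
    for (size_t i = 0; i < 4; i++)
        on += v[i].pml_on;
    for (size_t i = 0; i < 4; i++)
        sk_vcpu_disable_pml(&v[i]);
    return on;
}
```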

There are exactly d->max_vcpus worth of struct vcpus (and therefore
VMCSes) for a domain after creation, and will exist for the lifetime of
the domain.  There is no dynamic adjustment of numbers of vcpus during
runtime.

Good to know.

could we at least detect and warn vcpu changes when PML is enabled?
dirty logging happens out of guest's knowledge and there could be the
case where user might online/offline a vcpu within that window.

Why is the warning necessary? There's no harm leaving PML enabled when
vcpu becomes offline.

what about online? you need enable PML for newly-online vcpu since
meaningful work may be scheduled to it within logdirty window.

Do you mean the vcpu number (those offline + those online) can be changed
during guest's runtime, e.g., a new vcpu is created and becomes online
after PML is enabled for the domain? Otherwise, I don't see a problem.


As long as the total number of vcpus remains constant, it's not a 
problem, as we only enable PML after all vcpus are created (and it 
remains constant), and the vcpu status is irrelevant.


Thanks,
-Kai



Also we will not disable PML for that vcpu when it becomes offline, in
which case we don't need to re-enable PML, which can fail, when vcpu
becomes online again. It simplifies the logic.

offline is not a problem

Thanks
Kevin






Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/12/2015 03:02 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 10:50 AM


- PML buffer flush

There are two places we need to flush PML buffer. The first place is PML
buffer full VMEXIT handler (apparently), and the second place is in
paging_log_dirty_op (either peek or clean), as vcpus are running
asynchronously along with paging_log_dirty_op is called from userspace via
hypercall, and it's possible there are dirty GPAs logged in vcpus' PML
buffers but not full. Therefore we'd better to flush all vcpus' PML buffers
before reporting dirty GPAs to userspace.

We handle above two cases by flushing PML buffer at the beginning of all
VMEXITs. This solves the first case above, and it also solves the second
case, as prior to paging_log_dirty_op, domain_pause is called, which kicks
vcpus (that are in guest mode) out of guest mode via sending IPI, which cause
VMEXIT, to them.

This also makes log-dirty radix tree more updated as PML buffer is flushed
on basis of all VMEXITs but not only PML buffer full VMEXIT.
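
A small simulation of the flush described above, as a sketch only. All names (sk_vcpu, sk_flush_pml, sk_hw_log) are hypothetical and nothing here is Xen code; a plain byte array stands in for the log-dirty radix tree. It also assumes, as I understand the hardware to behave, that the PML index starts at 511 and counts down, so a partially filled buffer holds its valid entries in slots index+1..511.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_PML_ENTRIES 512u
#define SK_PAGE_SHIFT  12
#define SK_MAX_GFNS    4096u

struct sk_vcpu {
    uint64_t pml_buf[SK_PML_ENTRIES];
    uint16_t pml_index;  /* next slot the "hardware" would write */
};

struct sk_domain {
    uint8_t dirty[SK_MAX_GFNS / 8];  /* stand-in for the log-dirty tree */
};

static void sk_vcpu_init(struct sk_vcpu *v)
{
    memset(v->pml_buf, 0, sizeof(v->pml_buf));
    v->pml_index = SK_PML_ENTRIES - 1;
}

/* Simulate the CPU logging one dirty GPA (assumed countdown index). */
static void sk_hw_log(struct sk_vcpu *v, uint64_t gpa)
{
    assert(v->pml_index < SK_PML_ENTRIES);  /* buffer must not be full */
    v->pml_buf[v->pml_index] = gpa & ~UINT64_C(0xfff);
    v->pml_index--;  /* wraps to 0xffff once the last slot is used */
}

static void sk_mark_dirty(struct sk_domain *d, uint64_t gfn)
{
    assert(gfn < SK_MAX_GFNS);
    d->dirty[gfn / 8] |= (uint8_t)(1u << (gfn % 8));
}

static int sk_is_dirty(const struct sk_domain *d, uint64_t gfn)
{
    return (d->dirty[gfn / 8] >> (gfn % 8)) & 1;
}

/* Drain logged GPAs into the dirty bitmap and reset the index. */
static unsigned sk_flush_pml(struct sk_vcpu *v, struct sk_domain *d)
{
    unsigned first, n = 0;

    if (v->pml_index == SK_PML_ENTRIES - 1)
        return 0;  /* nothing logged since the last flush */

    /* An out-of-range index means the buffer is completely full. */
    first = (v->pml_index >= SK_PML_ENTRIES) ? 0u : v->pml_index + 1u;
    for (unsigned i = first; i < SK_PML_ENTRIES; i++, n++)
        sk_mark_dirty(d, v->pml_buf[i] >> SK_PAGE_SHIFT);

    v->pml_index = SK_PML_ENTRIES - 1;
    return n;
}

/* Log 'writes' distinct pages, flush, and return the flushed count. */
static unsigned sk_demo_flush(unsigned writes)
{
    static struct sk_vcpu v;
    static struct sk_domain d;
    unsigned n;

    sk_vcpu_init(&v);
    memset(&d, 0, sizeof(d));
    for (unsigned i = 0; i < writes; i++)
        sk_hw_log(&v, (uint64_t)i << SK_PAGE_SHIFT);
    n = sk_flush_pml(&v, &d);
    assert(writes == 0 || sk_is_dirty(&d, 0));
    assert(v.pml_index == SK_PML_ENTRIES - 1);
    return n;
}
```

The same flush routine serves both cases in the text: a full buffer and a partially filled one drained before reporting to userspace.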

Is that really efficient? Flushing the buffer only as needed doesn't
seem to be a major problem (apart from the usual preemption issue
when dealing with guests with very many vCPU-s, but you certainly
recall that at this point HVM is still limited to 128).

Apart from these two remarks, the design looks okay to me.

While keeping log-dirty radix tree more updated is probably irrelevant,
I do think we'd better to flush PML buffers in paging_log_dirty_op (both
peek and clear) before reporting dirty pages to userspace, in which case
I think flushing PML buffer at beginning of VMEXIT is a good idea, as
domain_pause does the job automatically. I am not sure how much cycles
will flushing PML buffer contribute but I think it should be relatively
small comparing to VMEXIT itself, therefore it can be ignored.

it's not intuitive to add overhead (one extra vmread) to every vmexit
just for utilizing the side-effect of one specific exit due to domain_pause.

What's the cost of one vmread? It's reasonable to avoid it if it's heavy.




An optimized way probably is we only flush PML buffer for external
interrupt VMEXIT, which domain_pause really triggers, but not at
beginning of all VMEXITs. But as long as the overhead of flush PML buffer
is negligible, this optimization is also unnecessary.


this optimization is not real optimization as you still stick to implementation
detail of other operations.

Would you give me some possible hints? To me above is the most optimized
way I can figure :)

If you really want to take use of domain_pause,
piggyback PML flush explicitly in that path make things clearer.

domain_pause is called in many code path, looks it's not as optimized as
my above one.


Thanks,
-Kai


Thanks
Kevin



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Tian, Kevin
> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> Sent: Thursday, February 12, 2015 2:57 PM
> 
> On 02/12/2015 02:54 PM, Tian, Kevin wrote:
> >> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> >> Sent: Thursday, February 12, 2015 10:39 AM
>  PML needs to be enabled (allocate PML buffer, initialize PML index,
>  PML base address, turn PML on VMCS, etc) for all vcpus of the domain,
>  as PML buffer and PML index are per-vcpu, but EPT table may be shared
>  by vcpus. Enabling PML on partial vcpus of the domain won't work. Also
>  PML will only be enabled for the domain when it is switched to dirty
>  logging mode, and it will be disabled when domain is switched back to
>  normal mode. As looks vcpu number won't be changed dynamically during
>  guest is running (correct me if I am wrong here), so we don't have to
>  consider enabling PML for new created vcpu when guest is in dirty
>  logging mode.
> >>> There are exactly d->max_vcpus worth of struct vcpus (and therefore
> >>> VMCSes) for a domain after creation, and will exist for the lifetime of
> >>> the domain.  There is no dynamic adjustment of numbers of vcpus during
> >>> runtime.
> >> Good to know.
> > could we at least detect and warn vcpu changes when PML is enabled?
> > dirty logging happens out of guest's knowledge and there could be the
> > case where user might online/offline a vcpu within that window.
> Why is the warning necessary? There's no harm leaving PML enabled when
> vcpu becomes offline.

what about online? you need enable PML for newly-online vcpu since
meaningful work may be scheduled to it within logdirty window.

> 
> Also we will not disable PML for that vcpu when it becomes offline, in
> which case we don't need to re-enable PML, which can fail, when vcpu
> becomes online again. It simplifies the logic.

offline is not a problem

Thanks
Kevin




Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Tian, Kevin
> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> Sent: Thursday, February 12, 2015 2:46 PM
> 
> On 02/12/2015 02:25 PM, Tian, Kevin wrote:
> >> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> >> Sent: Thursday, February 12, 2015 10:35 AM
> >>
> >> On 02/11/2015 09:13 PM, Jan Beulich wrote:
> >> On 11.02.15 at 12:52,  wrote:
>  On 11/02/15 08:28, Kai Huang wrote:
> > With PML, we don't have to use write protection but just clear D-bit
> > of EPT entry of guest memory to do dirty logging, with an additional
> > PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
> > reduce hypervisor overhead when guest is in dirty logging mode, and
> > therefore more CPU cycles can be allocated to guest, so it's expected
> > > benchmarks in guest will have better performance comparing to non-PML.
>  One issue with basic EPT A/D tracking was the scan of the EPT tables.
>  Here, hardware will give us a list of affected gfns, but how is Xen
>  supposed to efficiently clear the dirty bits again?  Using EPT
>  misconfiguration is no better than the existing fault path.
> >>> Why not? The misconfiguration exit ought to clear the D bit for all
> >>> 511 entries in the L1 table (and set it for the one entry that is
> >>> currently serving the access). All further D bit handling will then
> >>> be PML based.
> >> Indeed, we clear D-bit in EPT misconfiguration. In my understanding, the
> >> sequences are as follows:
> >>
> >> 1) PML enabled for the domain.
> >> 2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
> >> 3) Guest accesses specific GPA (which has been invalidated by step 2),
> >> and EPT misconfig is triggered.
> >> 4) Then resolve_misconfig is called, which fixes up GFN (above GPA >>
> >> 12) to p2m_ram_logdirty, and calls ept_p2m_type_to_flags, in which we
> >> clear D-bit of EPT entry (instead of clear W-bit) if p2m type is
> >> p2m_ram_logdirty. Then dirty logging of this GFN will be handled by PML.
> >>
> >> The above 2) ~ 4) will be repeated when log-dirty radix tree is cleared.
> > is ept_invalidate_emt required by existing logdirty mode or by PML enable?
> It's in existing logdirty code.
> > can we clear D bits directly when log-dirty radix tree is cleared to reduce
> > EPT misconfig exits for repeatedly dirtied pages?
> Theoretically we can, and looks logdirty for video ram is done in this
> way (logdirty for the page is re-enabled while it is reported to
> dirty_bitmap). One thing is looks video ram logdirty only exists for HAP
> mode.
> But in current log dirty implementation for global logdirty, at common
> paging layer, the log-dirty radix tree is cleaned in single step after
> reporting all dirty pages to userspace. And it just calls
> ept_invalidate_emt essentially. Therefore we need to modify logdirty
> common code at paging layer to achieve this, which is more like logdirty
> enhancement but not related to PML enabling directly. And any change of
> interface in paging layer requires modification in shadow mode
> accordingly, so currently I just choose not to do it.
> 

for general log dirty, ept_invalidate_emt is required because there is 
access permission change (dirtied page becomes rw after 1st fault,
so need to change them back to ro again for the new dirty tracking
round). But for PML, there's no permission change at all (always rw),
so such behavior should be noted by general logdirty layer for better
optimization. I'm OK not doing so for initial enabling patch, but it's
something you can think about later. :-)

Thanks
Kevin



Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Jeremy Fitzhardinge

On 02/11/2015 03:28 PM, Linus Torvalds wrote:
>
>
> On Feb 11, 2015 3:15 PM, "Jeremy Fitzhardinge" wrote:
> >
> > Right now it needs to be a locked operation to prevent read-reordering.
> > x86 memory ordering rules state that all writes are seen in a globally
> > consistent order, and are globally ordered wrt reads *on the same
> > addresses*, but reads to different addresses can be reordered wrt
> > writes.
>
> The modern x86 rules are actually much tighter than that.
>
> Every store is a release, and every load is an acquire. So a
> non-atomic store is actually a perfectly fine unlock. All preceding
> stores will be seen by other cpu's before the unlock, and while reads
> can pass stores, they only pass *earlier* stores.
>

Right, so in this particular instance, the read of the SLOWPATH flag
*can* pass the previous unlock store, hence the need for an atomic
unlock or some other mechanism to prevent the read from being reordered.
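
To make the ordering concern concrete, here is a sketch using C11 atomics. The structure and function names (sk_ticketlock, sk_unlock_plain, sk_unlock_locked) are made up for this example; it is not the kernel's ticket-lock code, and it keeps the slowpath flag in a separate field purely for illustration. The first variant is the pattern at risk: a plain store followed by a load of a different location. The second uses an atomic read-modify-write, which on x86 is normally a lock-prefixed instruction and so keeps the later load from being satisfied early.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sk_ticketlock {
    atomic_uint head;      /* ticket currently being served */
    atomic_bool slowpath;  /* set by a waiter that needs a kick */
};

/*
 * Plain unlock: release store, then load of the flag. The load may be
 * performed before the store becomes visible to other CPUs, so a waiter
 * that sets the flag concurrently can be missed.
 */
static bool sk_unlock_plain(struct sk_ticketlock *l)
{
    unsigned h = atomic_load_explicit(&l->head, memory_order_relaxed);

    atomic_store_explicit(&l->head, h + 1, memory_order_release);
    return atomic_load_explicit(&l->slowpath, memory_order_relaxed);
}

/*
 * Atomic unlock: the read-modify-write is sequentially consistent, so
 * the following load of the flag is ordered after it.
 */
static bool sk_unlock_locked(struct sk_ticketlock *l)
{
    atomic_fetch_add_explicit(&l->head, 1, memory_order_seq_cst);
    return atomic_load_explicit(&l->slowpath, memory_order_seq_cst);
}

/* Single-threaded check that both variants advance head and see the flag. */
static unsigned sk_demo_unlock(void)
{
    struct sk_ticketlock l;

    atomic_init(&l.head, 0);
    atomic_init(&l.slowpath, true);
    assert(sk_unlock_plain(&l));
    assert(sk_unlock_locked(&l));
    return atomic_load(&l.head);
}
```

The demo runs on one thread, so it cannot show the race itself; it only checks that both variants behave the same when nothing runs concurrently.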

J



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/12/2015 02:54 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 10:39 AM

PML needs to be enabled (allocate PML buffer, initialize PML index,
PML base address, turn PML on VMCS, etc) for all vcpus of the domain,
as PML buffer and PML index are per-vcpu, but EPT table may be shared
by vcpus. Enabling PML on partial vcpus of the domain won't work. Also
PML will only be enabled for the domain when it is switched to dirty
logging mode, and it will be disabled when domain is switched back to
normal mode. As looks vcpu number won't be changed dynamically during
guest is running (correct me if I am wrong here), so we don't have to
consider enabling PML for new created vcpu when guest is in dirty
logging mode.

There are exactly d->max_vcpus worth of struct vcpus (and therefore
VMCSes) for a domain after creation, and will exist for the lifetime of
the domain.  There is no dynamic adjustment of numbers of vcpus during
runtime.

Good to know.

could we at least detect and warn vcpu changes when PML is enabled?
dirty logging happens out of guest's knowledge and there could be the
case where user might online/offline a vcpu within that window.

Why is the warning necessary? There's no harm leaving PML enabled when
vcpu becomes offline.


Also we will not disable PML for that vcpu when it becomes offline, in 
which case we don't need to re-enable PML, which can fail, when vcpu 
becomes online again. It simplifies the logic.


Thanks,
-Kai



which presumably
means that the PML buffer flush needs to be aware of which gfns are
mapped by superpages to be able to correctly set a block of bits in the
logdirty bitmap.


Unfortunately PML itself can't tell us if the logged GPA comes from
superpage or not, but even in PML we still need to split superpages to
4K page, just like traditional write protection approach does. I think
this is because live migration should be based on 4K page granularity.
Marking all 512 bits of a 2M page to be dirty by a single write doesn't
make sense in both write protection and PML cases.


agree. extending one write to superpage enlarges dirty set unnecessarily.
since spec doesn't say superpage logging is not supported, I'd think a
4k-aligned entry is logged if the write is within a superpage.
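
The arithmetic behind the 4K granularity point, as a sketch. The helper names are hypothetical and chosen only for this example. It shows that a logged GPA identifies exactly one 4K gfn even when it lies inside a 2M region, which is why one write should set one bit and not all 512.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT   12  /* 4K page */
#define SK_SP_SHIFT     21  /* 2M superpage */
#define SK_GFNS_PER_2M  (1u << (SK_SP_SHIFT - SK_PAGE_SHIFT))  /* 512 */

/* The 4K frame number a logged GPA falls into. */
static uint64_t sk_gpa_to_gfn(uint64_t gpa)
{
    return gpa >> SK_PAGE_SHIFT;
}

/* First gfn of the 2M-aligned region containing the GPA. */
static uint64_t sk_superpage_base_gfn(uint64_t gpa)
{
    return (gpa >> SK_SP_SHIFT) << (SK_SP_SHIFT - SK_PAGE_SHIFT);
}

/* Position (0..511) of the GPA's 4K page within its 2M region. */
static unsigned sk_index_in_superpage(uint64_t gpa)
{
    return (unsigned)(sk_gpa_to_gfn(gpa) & (SK_GFNS_PER_2M - 1));
}

static void sk_demo_superpage(void)
{
    uint64_t gpa = UINT64_C(0x203456);  /* inside the second 2M region */

    assert(SK_GFNS_PER_2M == 512);
    assert(sk_gpa_to_gfn(gpa) == 0x203);
    assert(sk_superpage_base_gfn(gpa) == 0x200);
    assert(sk_index_in_superpage(gpa) == 3);
}
```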

Thanks
Kevin






Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Tian, Kevin
> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> Sent: Thursday, February 12, 2015 10:50 AM
> 
> >> - PML buffer flush
> >>
> >> There are two places we need to flush PML buffer. The first place is PML
> >> buffer full VMEXIT handler (apparently), and the second place is in
> >> paging_log_dirty_op (either peek or clean), as vcpus are running
> >> asynchronously along with paging_log_dirty_op is called from userspace via
> >> hypercall, and it's possible there are dirty GPAs logged in vcpus' PML
> >> buffers but not full. Therefore we'd better to flush all vcpus' PML buffers
> >> before reporting dirty GPAs to userspace.
> >>
> >> We handle above two cases by flushing PML buffer at the beginning of all
> >> VMEXITs. This solves the first case above, and it also solves the second
> >> case, as prior to paging_log_dirty_op, domain_pause is called, which kicks
> >> vcpus (that are in guest mode) out of guest mode via sending IPI, which cause
> >> VMEXIT, to them.
> >>
> >> This also makes log-dirty radix tree more updated as PML buffer is flushed
> >> on basis of all VMEXITs but not only PML buffer full VMEXIT.
> > Is that really efficient? Flushing the buffer only as needed doesn't
> > seem to be a major problem (apart from the usual preemption issue
> > when dealing with guests with very many vCPU-s, but you certainly
> > recall that at this point HVM is still limited to 128).
> >
> > Apart from these two remarks, the design looks okay to me.
> While keeping log-dirty radix tree more updated is probably irrelevant,
> I do think we'd better to flush PML buffers in paging_log_dirty_op (both
> peek and clear) before reporting dirty pages to userspace, in which case
> I think flushing PML buffer at beginning of VMEXIT is a good idea, as
> domain_pause does the job automatically. I am not sure how much cycles
> will flushing PML buffer contribute but I think it should be relatively
> small comparing to VMEXIT itself, therefore it can be ignored.

it's not intuitive to add overhead (one extra vmread) to every vmexit
just for utilizing the side-effect of one specific exit due to domain_pause.

> 
> An optimized way probably is we only flush PML buffer for external
> interrupt VMEXIT, which domain_pause really triggers, but not at
> beginning of all VMEXITs. But as long as the overhead of flush PML buffer
> is negligible, this optimization is also unnecessary.
> 

this optimization is not real optimization as you still stick to implementation
detail of other operations. If you really want to take use of domain_pause,
piggyback PML flush explicitly in that path make things clearer.

Thanks
Kevin



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Tian, Kevin
> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> Sent: Thursday, February 12, 2015 10:39 AM
> >
> >> PML needs to be enabled (allocate PML buffer, initialize PML index,
> >> PML base address, turn PML on VMCS, etc) for all vcpus of the domain,
> >> as PML buffer and PML index are per-vcpu, but EPT table may be shared
> >> by vcpus. Enabling PML on partial vcpus of the domain won't work. Also
> >> PML will only be enabled for the domain when it is switched to dirty
> >> logging mode, and it will be disabled when domain is switched back to
> >> normal mode. As looks vcpu number won't be changed dynamically during
> >> guest is running (correct me if I am wrong here), so we don't have to
> >> consider enabling PML for new created vcpu when guest is in dirty
> >> logging mode.
> > There are exactly d->max_vcpus worth of struct vcpus (and therefore
> > VMCSes) for a domain after creation, and will exist for the lifetime of
> > the domain.  There is no dynamic adjustment of numbers of vcpus during
> > runtime.
> Good to know.

could we at least detect and warn vcpu changes when PML is enabled?
dirty logging happens out of guest's knowledge and there could be the
case where user might online/offline a vcpu within that window.

> > which presumably
> > means that the PML buffer flush needs to be aware of which gfns are
> > mapped by superpages to be able to correctly set a block of bits in the
> > logdirty bitmap.
> >
> Unfortunately PML itself can't tell us if the logged GPA comes from
> superpage or not, but even in PML we still need to split superpages to
> 4K page, just like traditional write protection approach does. I think
> this is because live migration should be based on 4K page granularity.
> Marking all 512 bits of a 2M page to be dirty by a single write doesn't
> make sense in both write protection and PML cases.
> 

agree. extending one write to superpage enlarges dirty set unnecessarily.
since spec doesn't say superpage logging is not supported, I'd think a
4k-aligned entry is logged if the write is within a superpage.

Thanks
Kevin




Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/12/2015 02:25 PM, Tian, Kevin wrote:

From: Kai Huang [mailto:kai.hu...@linux.intel.com]
Sent: Thursday, February 12, 2015 10:35 AM

On 02/11/2015 09:13 PM, Jan Beulich wrote:

On 11.02.15 at 12:52,  wrote:

On 11/02/15 08:28, Kai Huang wrote:

With PML, we don't have to use write protection but just clear D-bit
of EPT entry of guest memory to do dirty logging, with an additional
PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
reduce hypervisor overhead when guest is in dirty logging mode, and
therefore more CPU cycles can be allocated to guest, so it's expected
benchmarks in guest will have better performance comparing to

non-PML.

One issue with basic EPT A/D tracking was the scan of the EPT tables.
Here, hardware will give us a list of affected gfns, but how is Xen
supposed to efficiently clear the dirty bits again?  Using EPT
misconfiguration is no better than the existing fault path.

Why not? The misconfiguration exit ought to clear the D bit for all
511 entries in the L1 table (and set it for the one entry that is
currently serving the access). All further D bit handling will then
be PML based.

Indeed, we clear D-bit in EPT misconfiguration. In my understanding, the
sequences are as follows:

1) PML enabled for the domain.
2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
3) Guest accesses specific GPA (which has been invalidated by step 2),
and EPT misconfig is triggered.
4) Then resolve_misconfig is called, which fixes up GFN (above GPA >>
12) to p2m_ram_logdirty, and calls ept_p2m_type_to_flags, in which we
clear D-bit of EPT entry (instead of clear W-bit) if p2m type is
p2m_ram_logdirty. Then dirty logging of this GFN will be handled by PML.

The above 2) ~ 4) will be repeated when log-dirty radix tree is cleared.

is ept_invalidate_emt required by existing logdirty mode or by PML enable?

It's in existing logdirty code.

can we clear D bits directly when log-dirty radix tree is cleared to reduce
EPT misconfig exits for repeatedly dirtied pages?

Theoretically we can, and looks logdirty for video ram is done in this
way (logdirty for the page is re-enabled while it is reported to 
dirty_bitmap). One thing is looks video ram logdirty only exists for HAP 
mode.
But in current log dirty implementation for global logdirty, at common 
paging layer, the log-dirty radix tree is cleaned in single step after 
reporting all dirty pages to userspace. And it just calls 
ept_invalidate_emt essentially. Therefore we need to modify logdirty 
common code at paging layer to achieve this, which is more like logdirty 
enhancement but not related to PML enabling directly. And any change of 
interface in paging layer requires modification in shadow mode 
accordingly, so currently I just choose not to do it.


Thanks,
-Kai


Thanks
Kevin






Re: [Xen-devel] [PATCH RFC 31/35] arm : acpi map status override table to dom0

2015-02-11 Thread Stefano Stabellini
On Wed, 11 Feb 2015, Julien Grall wrote:
> Hi Ian,
> 
> On 05/02/2015 19:47, Ian Campbell wrote:
> > On Thu, 2015-02-05 at 16:27 +0530, Parth Dixit wrote:
> > > > > +stao->header.length = sizeof(struct acpi_table_header) + 1;
> > > > > +stao->header.checksum = 0;
> > > > > +ACPI_MEMCPY(stao->header.oem_id, "LINARO", 6);
> > > > > +ACPI_MEMCPY(stao->header.oem_table_id, "RTSMVEV8", 8);
> > > > 
> > > > 
> > > > > I thought the plan was to use a Xen OEM ID?
> > > > yes, but it's not clear what should be used as xen oem id is not finalized
> > > yet.
> > 
> > Are these IDs the ones defined for x86 in
> > tools/firmware/hvmloader/acpi/acpi2_0.h:
> >  #define ACPI_OEM_ID "Xen"
> >  #define ACPI_OEM_TABLE_ID   "HVM"
> >  #define ACPI_OEM_REVISION   0
> > 
> >  #define ACPI_CREATOR_ID ASCII32('H','V','M','L') /* HVMLoader */
> >  #define ACPI_CREATOR_REVISION   0
> > 
> > ? If so we should reuse them, although maybe not OEM_TABLE_ID and
> > CREATOR_ID since those are x86/HVM specific.
> 
> I didn't know that HVMLoader was using one.
> 
> "XenVMM" was decided for ARM (see
> http://wiki.xenproject.org/mediawiki/images/c/c4/Xen-environment-table.pdf).
> 
> Although, it would be good to have a single OEM ID for Xen project.
> 
> > What is the process for assigning those? Given our unique OEM_ID are we
> > allowed to just declare them ourselves?
> 
> Stefano sent an email to the ACPI guys to know the process. I guess the x86
> one has not been declared?

I don't know the process but on x86 we are already using "Xen" as
OEM_ID, see tools/firmware/hvmloader/acpi/acpi2_0.h



[Xen-devel] [PATCH 2/2] arm, arm64/xen: move Xen initialization earlier

2015-02-11 Thread Julien Grall
From: Stefano Stabellini 

Currently, Xen is initialized/discovered in an initcall. This doesn't
allow us to support earlyprintk or choosing the preferred console when
running on Xen.

The current function xen_guest_init is now split in 2 parts:
- xen_early_init: Check if there is a Xen node in the device tree
and setup domain type
- xen_guest_init: Retrieve the information from the device node and
initialize Xen (grant table, shared page...)

The former is called in setup_arch, while the latter is an initcall.

Signed-off-by: Stefano Stabellini 
Signed-off-by: Julien Grall 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 

---
It's based on a patch sent by Stefano nearly 2 years ago [1].

[1] http://lists.xen.org/archives/html/xen-devel/2013-08/msg02960.html
---
 arch/arm/include/asm/xen/hypervisor.h |  8 +
 arch/arm/kernel/setup.c   |  2 ++
 arch/arm/xen/enlighten.c  | 58 ---
 arch/arm64/kernel/setup.c |  2 ++
 4 files changed, 46 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/xen/hypervisor.h b/arch/arm/include/asm/xen/hypervisor.h
index 1317ee4..04ff8e7 100644
--- a/arch/arm/include/asm/xen/hypervisor.h
+++ b/arch/arm/include/asm/xen/hypervisor.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_ARM_XEN_HYPERVISOR_H
 #define _ASM_ARM_XEN_HYPERVISOR_H
 
+#include 
+
 extern struct shared_info *HYPERVISOR_shared_info;
 extern struct start_info *xen_start_info;
 
@@ -18,4 +20,10 @@ static inline enum paravirt_lazy_mode paravirt_get_lazy_mode(void)
 
 extern struct dma_map_ops *xen_dma_ops;
 
+#ifdef CONFIG_XEN
+void __init xen_early_init(void);
+#else
+static inline void xen_early_init(void) { return; }
+#endif
+
 #endif /* _ASM_ARM_XEN_HYPERVISOR_H */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index e55408e..8b59d0d 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -936,6 +937,7 @@ void __init setup_arch(char **cmdline_p)
 
arm_dt_init_cpu_maps();
psci_init();
+   xen_early_init();
 #ifdef CONFIG_SMP
if (is_smp()) {
if (!mdesc->smp_init || !mdesc->smp_init()) {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 90101c8..0abeefa 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -53,6 +53,8 @@ EXPORT_SYMBOL_GPL(xen_platform_pci_unplug);
 
 static unsigned int xen_events_irq;
 
+static __initdata struct device_node *xen_node;
+
 /* map fgmfn of domid to lpfn in the current domain */
 static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
unsigned int domid)
@@ -222,42 +224,28 @@ static irqreturn_t xen_arm_callback(int irq, void *arg)
  * documentation of the Xen Device Tree format.
  */
 #define GRANT_TABLE_PHYSADDR 0
-static int __init xen_guest_init(void)
+void __init xen_early_init(void)
 {
-   struct xen_add_to_physmap xatp;
-   static struct shared_info *shared_info_page = 0;
-   struct device_node *node;
int len;
const char *s = NULL;
const char *version = NULL;
const char *xen_prefix = "xen,xen-";
-   struct resource res;
-   phys_addr_t grant_frames;
 
-   node = of_find_compatible_node(NULL, NULL, "xen,xen");
-   if (!node) {
+   xen_node = of_find_compatible_node(NULL, NULL, "xen,xen");
+   if (!xen_node) {
pr_debug("No Xen support\n");
-   return 0;
+   return;
}
-   s = of_get_property(node, "compatible", &len);
+   s = of_get_property(xen_node, "compatible", &len);
if (strlen(xen_prefix) + 3  < len &&
!strncmp(xen_prefix, s, strlen(xen_prefix)))
version = s + strlen(xen_prefix);
if (version == NULL) {
pr_debug("Xen version not found\n");
-   return 0;
-   }
-   if (of_address_to_resource(node, GRANT_TABLE_PHYSADDR, &res))
-   return 0;
-   grant_frames = res.start;
-   xen_events_irq = irq_of_parse_and_map(node, 0);
-   if (!xen_events_irq) {
-   pr_debug("Xen event channel interrupt not found\n");
-   return -ENODEV;
+   return;
}
 
-   pr_info("Xen %s support found, events_irq=%d gnttab_frame=%pa\n",
-   version, xen_events_irq, &grant_frames);
+   pr_info("Xen %s support found\n", version);
 
xen_domain_type = XEN_HVM_DOMAIN;
 
@@ -267,10 +255,32 @@ static int __init xen_guest_init(void)
xen_start_info->flags |= SIF_INITDOMAIN|SIF_PRIVILEGED;
else
xen_start_info->flags &= ~(SIF_INITDOMAIN|SIF_PRIVILEGED);
+}
+
+static int __init xen_guest_init(void)
+{
+   struct xen_add_to_physmap xatp;
+   struct shared_info *shared_info_page = NULL;
+   struct resource res;
+   phys_ad

[Xen-devel] [PATCH 0/2] arm/arm64: Detect Xen support earlier

2015-02-11 Thread Julien Grall
Hello,

This small patch series moves the detection of running on Xen earlier. This is
required in order to support earlyprintk via Xen and selecting the preferred
console.

Ard, the patch to move the call earlier (see #2) differed from the one I sent
you privately mostly because it's not possible to translate an IRQ before the
GIC has been initialized.

Let me know if it works for you.

Sincerely yours,

Julien Grall (1):
  arm/xen: Correctly check if the event channel interrupt is present

Stefano Stabellini (1):
  arm,arm64/xen: move Xen initialization earlier

 arch/arm/include/asm/xen/hypervisor.h |  8 +
 arch/arm/kernel/setup.c   |  2 ++
 arch/arm/xen/enlighten.c  | 58 +--
 arch/arm64/kernel/setup.c |  2 ++
 4 files changed, 47 insertions(+), 23 deletions(-)

-- 
2.1.0


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] [PATCH 1/2] arm/xen: Correctly check if the event channel interrupt is present

2015-02-11 Thread Julien Grall
The function irq_of_parse_and_map returns 0 when the IRQ is not found.

Furthermore xen_events_irq is only read when the CPU is brought up, so
it's not necessary to use the attribute __read_mostly.

Lastly, move the check before notifying the user that we are running on
Xen.

Signed-off-by: Julien Grall 
---
 arch/arm/xen/enlighten.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 263a204..90101c8 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -51,7 +51,7 @@ EXPORT_SYMBOL_GPL(xen_have_vector_callback);
 int xen_platform_pci_unplug = XEN_UNPLUG_ALL;
 EXPORT_SYMBOL_GPL(xen_platform_pci_unplug);
 
-static __read_mostly int xen_events_irq = -1;
+static unsigned int xen_events_irq;
 
 /* map fgmfn of domid to lpfn in the current domain */
 static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
@@ -251,12 +251,14 @@ static int __init xen_guest_init(void)
return 0;
grant_frames = res.start;
xen_events_irq = irq_of_parse_and_map(node, 0);
+   if (!xen_events_irq) {
+   pr_debug("Xen event channel interrupt not found\n");
+   return -ENODEV;
+   }
+
pr_info("Xen %s support found, events_irq=%d gnttab_frame=%pa\n",
version, xen_events_irq, &grant_frames);
 
-   if (xen_events_irq < 0)
-   return -ENODEV;
-
xen_domain_type = XEN_HVM_DOMAIN;
 
xen_setup_features();
-- 
2.1.0
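
The first point of the commit message can be reproduced outside the kernel. The
following is a minimal standalone sketch, assuming a hypothetical
fake_irq_parse_and_map() that, like irq_of_parse_and_map as described above,
returns 0 when no IRQ is found. The helper names and the IRQ number 31 are made
up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for irq_of_parse_and_map(): a non-zero IRQ number
 * on success, 0 when the interrupt is not found. */
static unsigned int fake_irq_parse_and_map(bool irq_present)
{
    return irq_present ? 31u : 0u;
}

/* Old check: the result is stored in an int and tested with "< 0",
 * so the "not found" value 0 slips through. */
static bool old_check_detects_missing(bool irq_present)
{
    int irq = (int)fake_irq_parse_and_map(irq_present);
    return irq < 0;
}

/* New check, as in the patch: 0 means "not found". */
static bool new_check_detects_missing(bool irq_present)
{
    unsigned int irq = fake_irq_parse_and_map(irq_present);
    return !irq;
}

/* Usage: only the new check notices the missing interrupt. */
void check_irq_logic(void)
{
    assert(!old_check_detects_missing(false));
    assert(new_check_detects_missing(false));
    assert(!new_check_detects_missing(true));
}
```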




Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Tian, Kevin
> From: Kai Huang [mailto:kai.hu...@linux.intel.com]
> Sent: Thursday, February 12, 2015 10:35 AM
> 
> On 02/11/2015 09:13 PM, Jan Beulich wrote:
>  On 11.02.15 at 12:52,  wrote:
> >> On 11/02/15 08:28, Kai Huang wrote:
> >>> With PML, we don't have to use write protection but just clear D-bit
> >>> of EPT entry of guest memory to do dirty logging, with an additional
> >>> PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
> >>> reduce hypervisor overhead when guest is in dirty logging mode, and
> >>> therefore more CPU cycles can be allocated to guest, so it's expected
> >>> benchmarks in guest will have better performance comparing to
> non-PML.
> >> One issue with basic EPT A/D tracking was the scan of the EPT tables.
> >> Here, hardware will give us a list of affected gfns, but how is Xen
> >> supposed to efficiently clear the dirty bits again?  Using EPT
> >> misconfiguration is no better than the existing fault path.
> > Why not? The misconfiguration exit ought to clear the D bit for all
> > 511 entries in the L1 table (and set it for the one entry that is
> > currently serving the access). All further D bit handling will then
> > be PML based.
> Indeed, we clear D-bit in EPT misconfiguration. In my understanding, the
> sequences are as follows:
> 
> 1) PML enabled for the domain.
> 2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
> 3) Guest accesses specific GPA (which has been invalidated by step 2),
> and EPT misconfig is triggered.
> 4) Then resolve_misconfig is called, which fixes up GFN (above GPA >>
> 12) to p2m_ram_logdirty, and calls ept_p2m_type_to_flags, in which we
> clear D-bit of EPT entry (instead of clear W-bit) if p2m type is
> p2m_ram_logdirty. Then dirty logging of this GFN will be handled by PML.
> 
> The above 2) ~ 4) will be repeated when log-dirty radix tree is cleared.

Is ept_invalidate_emt required by the existing logdirty mode or by PML enable?
Can we clear D bits directly when the log-dirty radix tree is cleared, to reduce
EPT misconfig exits for repeatedly dirtied pages?

Thanks
Kevin
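
For readers following step 4) of the quoted sequence, the choice between
clearing the D-bit and clearing the W-bit can be sketched in standalone C. The
bit positions follow the EPT entry layout (bit 1 is write access, bit 9 is the
dirty flag); the type and function names are hypothetical and this is not Xen's
ept_p2m_type_to_flags. Presetting D for plain RAM is an assumption of the
sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EPT entry bits used here: bit 1 is write access, bit 9 is the dirty flag. */
#define SKETCH_EPT_W (1ull << 1)
#define SKETCH_EPT_D (1ull << 9)

enum sketch_p2m_type { SKETCH_RAM_RW, SKETCH_RAM_LOGDIRTY };

/* Hypothetical helper: set up W and D of an entry for the given type. */
uint64_t sketch_apply_type(uint64_t entry, enum sketch_p2m_type t, bool pml)
{
    assert(t == SKETCH_RAM_RW || t == SKETCH_RAM_LOGDIRTY);

    /* Assumption: plain RAM is writable with D preset, so nothing is logged. */
    entry |= SKETCH_EPT_W | SKETCH_EPT_D;

    if (t == SKETCH_RAM_LOGDIRTY) {
        if (pml)
            entry &= ~SKETCH_EPT_D;  /* write succeeds; hardware logs the GPA */
        else
            entry &= ~SKETCH_EPT_W;  /* write faults; software logs the page */
    }
    return entry;
}
```

With PML the guest write does not exit at all until the buffer fills, which is
where the expected saving over write protection comes from.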




[Xen-devel] [RFC v1 8/8] xen: x86: remove CONFIG_XEN dependency PARAVIRT and PARAVIRT_CLOCK

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Now that the respective PV modes have the specific requirements
selectable, just remove this from CONFIG_XEN. This is as per the
agreed upon Xen Kconfig changes [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 9298eb3..19b2d3d 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -4,8 +4,6 @@
 
 config XEN
bool "Xen guest support"
-   depends on PARAVIRT
-   select PARAVIRT_CLOCK
select PARAVIRT_MMU
depends on X86_64 || (X86_32 && X86_PAE)
depends on X86_TSC
-- 
2.2.2




[Xen-devel] [RFC v1 7/8] xen: unwrap XEN_BACKEND from XEN_DOM0

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This unwraps XEN_BACKEND from its dependency on XEN_DOM0. Instead,
it now depends on the possible x86 backends and on the
scenarios in which it is allowed on ARM. This is as per
the agreed upon Xen Kconfig changes [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 2 ++
 drivers/xen/Kconfig  | 3 ++-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 50e2fb4..9298eb3 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -16,8 +16,10 @@ config XEN
 
 config XEN_DOM0
def_bool y
+   select XEN_BACKEND
depends on XEN && PCI_XEN && SWIOTLB_XEN
depends on X86_LOCAL_APIC && X86_IO_APIC && ACPI && PCI
+   depends on XEN_PV || XEN_PVH
 
 config XEN_PVHVM
def_bool y
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 31391bc..d8bd3f6 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -77,7 +77,8 @@ config XEN_DEV_EVTCHN
 
 config XEN_BACKEND
bool "Backend driver support"
-   depends on XEN_DOM0
+   depends on ARM || ARM64 || (X86 && (XEN_PV || XEN_PVH || XEN_PVHVM))
+   select SWIOTLB_XEN if ARM || ARM64
default y
help
  Support for backend device drivers that provide I/O services
-- 
2.2.2




[Xen-devel] [RFC v1 6/8] xen: x86: make XEN_PV* stuff depend on PARAVIRT and PARAVIRT_CLOCK

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This will later more easily let us unfold PARAVIRT and PARAVIRT_CLOCK
from under CONFIG_XEN. All the XEN_PV* stuff is under the x86 universe.
This is as per the agreed upon Xen Kconfig changes [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 9e0442f..50e2fb4 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -21,6 +21,8 @@ config XEN_DOM0
 
 config XEN_PVHVM
def_bool y
+   select PARAVIRT
+   select PARAVIRT_CLOCK
depends on XEN && PCI && X86_LOCAL_APIC
 
 config XEN_MAX_DOMAIN_MEMORY
@@ -49,11 +51,15 @@ config XEN_DEBUG_FS
 config XEN_PVH
bool "Support for running as a PVH guest"
depends on X86_64 && XEN
+   select PARAVIRT
+   select PARAVIRT_CLOCK
select XEN_PVHVM
def_bool n
 
 config XEN_PV
bool "Support for running as a PV guest"
depends on XEN && X86
+   select PARAVIRT
+   select PARAVIRT_CLOCK
select XEN_HAVE_PVMMU
def_bool n
-- 
2.2.2




[Xen-devel] [RFC v1 5/8] xen: x86: add XEN_PV

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This lets us rip the XEN_HAVE_PVMMU dependency out of the
general XEN config. It only exists in the x86
universe. This is as per the agreed upon Xen Kconfig
changes [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 7 ++-
 drivers/xen/Kconfig  | 3 ++-
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index b675e14..9e0442f 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -7,7 +7,6 @@ config XEN
depends on PARAVIRT
select PARAVIRT_CLOCK
select PARAVIRT_MMU
-   select XEN_HAVE_PVMMU
depends on X86_64 || (X86_32 && X86_PAE)
depends on X86_TSC
help
@@ -52,3 +51,9 @@ config XEN_PVH
depends on X86_64 && XEN
select XEN_PVHVM
def_bool n
+
+config XEN_PV
+   bool "Support for running as a PV guest"
+   depends on XEN && X86
+   select XEN_HAVE_PVMMU
+   def_bool n
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 2af6f69..31391bc 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -257,7 +257,8 @@ config XEN_MCE_LOG
  converting it into Linux mcelog format for mcelog tools
 
 config XEN_HAVE_PVMMU
-   bool
+   bool
+   depends on XEN_PV
 
 config XEN_EFI
def_bool y
-- 
2.2.2




[Xen-devel] [RFC v1 4/8] xen: x86: make XEN_PVH select XEN_PVHVM

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

This lets us expose XEN_PVH and set what is required for it.
This only exists on the x86 universe. This is as per the agreed
upon Xen Kconfig changes [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 4d3db19..b675e14 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -49,5 +49,6 @@ config XEN_DEBUG_FS
 
 config XEN_PVH
bool "Support for running as a PVH guest"
-   depends on X86_64 && XEN && XEN_PVHVM
+   depends on X86_64 && XEN
+   select XEN_PVHVM
def_bool n
-- 
2.2.2




Re: [Xen-devel] [PATCH] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-11 Thread Stefano Stabellini
On Thu, 12 Feb 2015, Julien Grall wrote:
> On 12/02/2015 12:54, Ian Campbell wrote:
> > On Thu, 2015-02-12 at 04:35 +, Stefano Stabellini wrote:
> > > On Tue, 10 Feb 2015, Ian Campbell wrote:
> > > > On Tue, 2015-02-10 at 15:51 +0800, Ard Biesheuvel wrote:
> > > > > > FWIW on x86 this doesn't depend on console_set_on_cmdline, does it
> > > > > > need
> > > > > > to here?
> > > > > > 
> > > > > 
> > > > > I didn't check the code, but it seems inappropriate to add a preferred
> > > > > console implicitly if the user has set 'console=' on the command line.
> > > > 
> > > > I had been thinking that add_preferred_console would DTRT, but it seems
> > > > not. Seems strange that most calls to it do not check if the console is
> > > > already set, but it does seem like the right thing in this case.
> > > > 
> > > > > > On x86 it does depend on !xen_initial_domain. I suppose on the
> > > > > > principle
> > > > > > that a VT is normally available there. I suppose that doesn't apply
> > > > > > to
> > > > > > ARM so much, although it could.
> > > > > > 
> > > > > 
> > > > > OK, I got confused by the xen_guest_init(). So do you mean if if
> > > > > (!xen_initial_domain) should be added?
> > > > 
> > > > (dom0 is "Just A Guest" too ;-))
> > > > 
> > > > Adding it would be consistent with x86, I'm not precisely sure if that
> > > > is important or desirable in this case. I'd be inclined to start with
> > > > the if there.
> > > 
> > > The reasoning is that dom0 command line arguments come from its old
> > > native grub stanza, therefore the console parameter is incorrect, right?
> > > As opposed to regular domUs, that being freshly installed, are supposed
> > > to have the correct console parameter?
> > 
> > Other way around I think, dom0 has the correct stuff from grub.cfg
> > whereas the guest may not.
> 
> Yes, currently for ARM guest you have to add 'extra="console=hvc0"' in the
> configuration file.

I see. If this is the concern that we are trying to address, then yes,
doing the same that we are already doing on x86 might be best.



[Xen-devel] [RFC v1 3/8] xen: drivers: add XEN_FRONTEND and fold front end drivers under them

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Fold Xen front end drivers under their own Kconfig entry.
You may want to for example only enable domU guests with
pv-drivers.

While at it make HVC_XEN_FRONTEND select HVC_XEN.

This is as per the agreed upon Kconfig changes for Xen [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 drivers/block/Kconfig   |  3 +--
 drivers/input/misc/Kconfig  |  3 +--
 drivers/net/Kconfig |  3 +--
 drivers/pci/Kconfig |  3 +--
 drivers/scsi/Kconfig|  3 +--
 drivers/tty/hvc/Kconfig |  4 ++--
 drivers/video/fbdev/Kconfig |  3 +--
 drivers/xen/Kconfig | 10 ++
 8 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 014a1cf..8259879 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -482,9 +482,8 @@ config XILINX_SYSACE
 
 config XEN_BLKDEV_FRONTEND
tristate "Xen virtual block device support"
-   depends on XEN
+   depends on XEN_FRONTEND
default y
-   select XEN_XENBUS_FRONTEND
help
  This driver implements the front-end of the Xen virtual
  block device driver.  It communicates with a back-end driver
diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
index 23297ab..71a736b 100644
--- a/drivers/input/misc/Kconfig
+++ b/drivers/input/misc/Kconfig
@@ -656,9 +656,8 @@ config INPUT_CMA3000_I2C
 
 config INPUT_XEN_KBDDEV_FRONTEND
tristate "Xen virtual keyboard and mouse support"
-   depends on XEN
+   depends on XEN_FRONTEND
default y
-   select XEN_XENBUS_FRONTEND
help
  This driver implements the front-end of the Xen virtual
  keyboard and mouse device driver.  It communicates with a back-end
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index d6607ee..0ae5cbc 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -333,8 +333,7 @@ source "drivers/net/ieee802154/Kconfig"
 
 config XEN_NETDEV_FRONTEND
tristate "Xen network device frontend driver"
-   depends on XEN
-   select XEN_XENBUS_FRONTEND
+   depends on XEN_FRONTEND
default y
help
  This driver provides support for Xen paravirtual network
diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig
index 7a8f1c5..0120499 100644
--- a/drivers/pci/Kconfig
+++ b/drivers/pci/Kconfig
@@ -56,9 +56,8 @@ config PCI_STUB
 
 config XEN_PCIDEV_FRONTEND
 tristate "Xen PCI Frontend"
-depends on PCI && X86 && XEN
+depends on PCI && X86 && XEN_FRONTEND
 select PCI_XEN
-   select XEN_XENBUS_FRONTEND
 default y
 help
   The PCI device frontend driver allows the kernel to import arbitrary
diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
index 9c92f41..e369f0b 100644
--- a/drivers/scsi/Kconfig
+++ b/drivers/scsi/Kconfig
@@ -588,8 +588,7 @@ config VMWARE_PVSCSI
 
 config XEN_SCSI_FRONTEND
tristate "XEN SCSI frontend driver"
-   depends on SCSI && XEN
-   select XEN_XENBUS_FRONTEND
+   depends on SCSI && XEN_FRONTEND
help
  The XEN SCSI frontend driver allows the kernel to access SCSI Devices
  within another guest OS (usually Dom0).
diff --git a/drivers/tty/hvc/Kconfig b/drivers/tty/hvc/Kconfig
index 8902f9b..b9dec5f 100644
--- a/drivers/tty/hvc/Kconfig
+++ b/drivers/tty/hvc/Kconfig
@@ -70,8 +70,8 @@ config HVC_XEN
 
 config HVC_XEN_FRONTEND
bool "Xen Hypervisor Multiple Consoles support"
-   depends on HVC_XEN
-   select XEN_XENBUS_FRONTEND
+   depends on XEN_FRONTEND
+   select HVC_XEN
default y
help
  Xen driver for secondary virtual consoles
diff --git a/drivers/video/fbdev/Kconfig b/drivers/video/fbdev/Kconfig
index 4916c97..e0149d1 100644
--- a/drivers/video/fbdev/Kconfig
+++ b/drivers/video/fbdev/Kconfig
@@ -2243,14 +2243,13 @@ config FB_VIRTUAL
 
 config XEN_FBDEV_FRONTEND
tristate "Xen virtual frame buffer support"
-   depends on FB && XEN
+   depends on FB && XEN_FRONTEND
select FB_SYS_FILLRECT
select FB_SYS_COPYAREA
select FB_SYS_IMAGEBLIT
select FB_SYS_FOPS
select FB_DEFERRED_IO
select INPUT_XEN_KBDDEV_FRONTEND if INPUT_MISC
-   select XEN_XENBUS_FRONTEND
default y
help
  This driver implements the front-end of the Xen virtual
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 9350de02..2af6f69 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -83,6 +83,16 @@ config XEN_BACKEND
  Support for backend device drivers that provide I/O services
  to other virtual machines.
 
+config XEN_FRONTEND
+   bool "Frontend driver support"
+   select XEN
+   select XEN_XENBUS_FRONTEND
+   default y
+   help
+ Support for frontend device drivers for Xen. You want to enable
+ this if you want t

[Xen-devel] [RFC v1 2/8] xen: x86: make XEN_MAX_DOMAIN_MEMORY depend on XEN_HAVE_PVMMU

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Although XEN currently selects XEN_HAVE_PVMMU, that will not
be the case in the near future, so add this dependency
explicitly as per the agreed upon Kconfig changes [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 490e43e..4d3db19 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -28,7 +28,7 @@ config XEN_MAX_DOMAIN_MEMORY
int
default 500 if X86_64
default 64 if X86_32
-   depends on XEN
+   depends on XEN && XEN_HAVE_PVMMU
help
  This only affects the sizing of some bss arrays, the unused
  portions of which are freed.
-- 
2.2.2




[Xen-devel] [RFC v1 1/8] xen: make dom0 specific changes depend on XEN_DOM0

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

These are Kconfig options which are known to only make
sense with Xen dom0 support. This is as per the agreed
upon changes to Xen's Kconfig [0].

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Signed-off-by: Luis R. Rodriguez 
---
 arch/x86/xen/Kconfig | 4 ++--
 drivers/watchdog/Kconfig | 2 +-
 drivers/xen/Kconfig  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index 6f615a3..490e43e 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -35,13 +35,13 @@ config XEN_MAX_DOMAIN_MEMORY
 
 config XEN_SAVE_RESTORE
bool
-   depends on XEN
+   depends on XEN_DOM0
select HIBERNATE_CALLBACKS
default y
 
 config XEN_DEBUG_FS
bool "Enable Xen debug and tuning parameters in debugfs"
-   depends on XEN && DEBUG_FS
+   depends on XEN_DOM0 && DEBUG_FS
default n
help
  Enable statistics output and various tuning options in debugfs.
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 08f41ad..34af197 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1409,7 +1409,7 @@ config WATCHDOG_RIO
 
 config XEN_WDT
tristate "Xen Watchdog support"
-   depends on XEN
+   depends on XEN_DOM0
help
  Say Y here to support the hypervisor watchdog capability provided
  by Xen 4.0 and newer.  The watchdog timeout period is normally one
diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index b812462..9350de02 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -222,7 +222,7 @@ config XEN_ACPI_HOTPLUG_CPU
 
 config XEN_ACPI_PROCESSOR
tristate "Xen ACPI processor"
-   depends on XEN && X86 && ACPI_PROCESSOR && CPU_FREQ
+   depends on XEN_DOM0 && X86 && ACPI_PROCESSOR && CPU_FREQ
default m
help
   This ACPI processor uploads Power Management information to the Xen
-- 
2.2.2




[Xen-devel] [RFC v1 0/8] xen: kconfig changes

2015-02-11 Thread Luis R. Rodriguez
From: "Luis R. Rodriguez" 

Here's the first shot at the Kconfig changes for Xen as discussed
on the mailing list a little while ago [0]. Let me know if you spot
any issues or if you'd like things split differently. I tried to
make things as atomic as possible without being too ridiculous
about the atomicity of the changes; for instance, the HVC changes
were reasonable to just fold into the other change they touched.

Haven't gone to war with testing the Kconfig changes yet given this
is just the first RFC. If things look good please look for major
issues and let me know.

[0] http://comments.gmane.org/gmane.comp.emulators.xen.devel/231579

Luis R. Rodriguez (8):
  xen: make dom0 specific changes depend on XEN_DOM0
  xen: x86: make XEN_MAX_DOMAIN_MEMORY depend on XEN_HAVE_PVMMU
  xen: drivers: add XEN_FRONTEND and fold front end drivers under them
  xen: x86: make XEN_PVH select XEN_PVHVM
  xen: x86: add XEN_PV
  xen: x86: make XEN_PV* stuff depend on PARAVIRT and PARAVIRT_CLOCK
  xen: unwrap XEN_BACKEND from XEN_DOM0
  xen: x86: remove CONFIG_XEN dependency PARAVIRT and PARAVIRT_CLOCK

 arch/x86/xen/Kconfig| 26 +++---
 drivers/block/Kconfig   |  3 +--
 drivers/input/misc/Kconfig  |  3 +--
 drivers/net/Kconfig |  3 +--
 drivers/pci/Kconfig |  3 +--
 drivers/scsi/Kconfig|  3 +--
 drivers/tty/hvc/Kconfig |  4 ++--
 drivers/video/fbdev/Kconfig |  3 +--
 drivers/watchdog/Kconfig|  2 +-
 drivers/xen/Kconfig | 18 +++---
 10 files changed, 43 insertions(+), 25 deletions(-)

-- 
2.2.2




Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/12/2015 10:49 AM, Kai Huang wrote:


On 02/11/2015 09:06 PM, Jan Beulich wrote:

On 11.02.15 at 09:28,  wrote:

- PML enable/disable for particular Domain

PML needs to be enabled (allocate PML buffer, initialize PML index, PML base
address, turn PML on in VMCS, etc) for all vcpus of the domain, as PML buffer
and PML index are per-vcpu, but the EPT table may be shared by vcpus. Enabling
PML on only some vcpus of the domain won't work. Also, PML will only be enabled
for the domain when it is switched to dirty logging mode, and it will be
disabled when the domain is switched back to normal mode. As it looks like the
vcpu number won't change dynamically while the guest is running (correct me if
I am wrong here), we don't have to consider enabling PML for a newly created
vcpu when the guest is in dirty logging mode.

After PML is enabled for the domain, we only need to clear the EPT entry's
D-bit for guest memory in dirty logging mode. We achieve this by checking if
PML is enabled for the domain when p2m_ram_rx is changed to p2m_ram_logdirty,
and updating the EPT entry accordingly. However, for super pages, we still
write protect them in case of PML, as we still need to split a super page into
4K pages in dirty logging mode.

While it doesn't matter much for our immediate needs, the
documentation isn't really clear about the behavior when a 2M or
1G page gets its D bit set: Wouldn't it be rather useful to the
consumer to know of that fact (e.g. by setting some of the lower
bits of the PML entry to indicate so)?
This is a good point. The documentation only tells us the GPA will be logged
with the last 12 bits cleared. Whether hardware just clears the last 12 bits or
performs 2M alignment (in case of a 2M page) is not certain. I will confirm
this with the hardware guys. But as you said, it's not related to our immediate
needs.

Forgot to say: to me it is currently certain that the lower 12 bits are
cleared, as the specification says the GPA is written to the log 4K aligned.
But it should be possible to push the hardware guys to modify this if
necessary, though I am not 100% sure.


Thanks,
-Kai



- PML buffer flush

There are two places we need to flush the PML buffer. The first place is the
PML buffer full VMEXIT handler (apparently), and the second place is in
paging_log_dirty_op (either peek or clean), as vcpus are running asynchronously
while paging_log_dirty_op is called from userspace via hypercall, and it's
possible there are dirty GPAs logged in vcpus' PML buffers that are not yet
full. Therefore we'd better flush all vcpus' PML buffers before reporting dirty
GPAs to userspace.

We handle the above two cases by flushing the PML buffer at the beginning of
all VMEXITs. This solves the first case above, and it also solves the second
case, as prior to paging_log_dirty_op, domain_pause is called, which kicks
vcpus (that are in guest mode) out of guest mode by sending them an IPI, which
causes a VMEXIT.

This also keeps the log-dirty radix tree more up to date, as the PML buffer is
flushed on every VMEXIT and not only on the PML buffer full VMEXIT.

Is that really efficient? Flushing the buffer only as needed doesn't
seem to be a major problem (apart from the usual preemption issue
when dealing with guests with very many vCPU-s, but you certainly
recall that at this point HVM is still limited to 128).

Apart from these two remarks, the design looks okay to me.
While keeping the log-dirty radix tree more up to date is probably irrelevant,
I do think we'd better flush PML buffers in paging_log_dirty_op (both peek and
clear) before reporting dirty pages to userspace, in which case I think
flushing the PML buffer at the beginning of VMEXIT is a good idea, as
domain_pause does the job automatically. I am not sure how many cycles flushing
the PML buffer will cost, but I think it should be relatively small compared to
the VMEXIT itself, and therefore it can be ignored.

An optimized way is probably to flush the PML buffer only for the external
interrupt VMEXIT, which domain_pause really triggers, and not at the beginning
of all VMEXITs. But as long as the overhead of flushing the PML buffer is
negligible, this optimization is also unnecessary.
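
To make the cost of a flush more concrete, here is a rough standalone model of
a per-vcpu PML buffer and its flush. It assumes the hardware behaviour of a
512-entry buffer whose index starts at 511 and counts down, with logged GPAs
4K aligned. All sketch_* names are hypothetical; in a real implementation the
index lives in the VMCS and the flush would mark pages dirty in the log-dirty
radix tree instead of filling an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PML_ENTRIES 512u   /* one 4K page of 8-byte GPAs */
#define PAGE_SHIFT  12

/* Hypothetical per-vcpu state. */
struct sketch_vcpu {
    uint64_t pml_buf[PML_ENTRIES];
    uint16_t pml_index;    /* next slot to fill, counts down from 511 */
};

void sketch_pml_init(struct sketch_vcpu *v)
{
    v->pml_index = PML_ENTRIES - 1;
}

/* Model of hardware logging one guest write. Returns 1 when the buffer is
 * already full (where a PML buffer full VMEXIT would happen), else 0. */
int sketch_pml_log(struct sketch_vcpu *v, uint64_t gpa)
{
    if (v->pml_index >= PML_ENTRIES)
        return 1;
    v->pml_buf[v->pml_index] = gpa & ~((1ull << PAGE_SHIFT) - 1);
    v->pml_index--;        /* wraps to 0xFFFF once slot 0 has been used */
    return 0;
}

/* Drain the logged entries as GFNs into out_gfns and reset the index.
 * Returns the number of entries drained. */
size_t sketch_pml_flush(struct sketch_vcpu *v, uint64_t *out_gfns)
{
    size_t n = 0;
    unsigned int first, i;

    assert(v && out_gfns);

    /* Index 511 means empty; an out-of-range index means all slots are valid. */
    first = (v->pml_index >= PML_ENTRIES) ? 0 : v->pml_index + 1u;
    for (i = first; i < PML_ENTRIES; i++)
        out_gfns[n++] = v->pml_buf[i] >> PAGE_SHIFT;

    v->pml_index = PML_ENTRIES - 1;
    return n;
}
```

A flush of an empty buffer is a single index comparison, so doing it on every
VMEXIT should be cheap in the common case; that matches the argument above but
has not been measured here.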


Thanks,
-Kai


Jan




Re: [Xen-devel] [PATCH] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-11 Thread Julien Grall



On 12/02/2015 12:54, Ian Campbell wrote:

On Thu, 2015-02-12 at 04:35 +, Stefano Stabellini wrote:

On Tue, 10 Feb 2015, Ian Campbell wrote:

On Tue, 2015-02-10 at 15:51 +0800, Ard Biesheuvel wrote:

FWIW on x86 this doesn't depend on console_set_on_cmdline, does it need
to here?



I didn't check the code, but it seems inappropriate to add a preferred
console implicitly if the user has set 'console=' on the command line.


I had been thinking that add_preferred_console would DTRT, but it seems
not. Seems strange that most calls to it do not check if the console is
already set, but it does seem like the right thing in this case.


On x86 it does depend on !xen_initial_domain. I suppose on the principle
that a VT is normally available there. I suppose that doesn't apply to
ARM so much, although it could.



OK, I got confused by the xen_guest_init(). So do you mean if if
(!xen_initial_domain) should be added?


(dom0 is "Just A Guest" too ;-))

Adding it would be consistent with x86, I'm not precisely sure if that
is important or desirable in this case. I'd be inclined to start with
the if there.


The reasoning is that dom0 command line arguments come from its old
native grub stanza, therefore the console parameter is incorrect, right?
As opposed to regular domUs, that being freshly installed, are supposed
to have the correct console parameter?


Other way around I think, dom0 has the correct stuff from grub.cfg
whereas the guest may not.


Yes, currently for ARM guest you have to add 'extra="console=hvc0"' in 
the configuration file.


Regards,

--
Julien Grall



Re: [Xen-devel] [PATCH] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-11 Thread Ian Campbell
On Thu, 2015-02-12 at 04:35 +, Stefano Stabellini wrote:
> On Tue, 10 Feb 2015, Ian Campbell wrote:
> > On Tue, 2015-02-10 at 15:51 +0800, Ard Biesheuvel wrote:
> > > > FWIW on x86 this doesn't depend on console_set_on_cmdline, does it need
> > > > to here?
> > > >
> > > 
> > > I didn't check the code, but it seems inappropriate to add a preferred
> > > console implicitly if the user has set 'console=' on the command line.
> > 
> > I had been thinking that add_preferred_console would DTRT, but it seems
> > not. Seems strange that most calls to it do not check if the console is
> > already set, but it does seem like the right thing in this case.
> > 
> > > > On x86 it does depend on !xen_initial_domain. I suppose on the principle
> > > > that a VT is normally available there. I suppose that doesn't apply to
> > > > ARM so much, although it could.
> > > >
> > > 
> > > OK, I got confused by the xen_guest_init(). So do you mean if if
> > > (!xen_initial_domain) should be added?
> > 
> > (dom0 is "Just A Guest" too ;-))
> > 
> > Adding it would be consistent with x86, I'm not precisely sure if that
> > is important or desirable in this case. I'd be inclined to start with
> > the if there.
> 
> The reasoning is that dom0 command line arguments come from its old
> native grub stanza, therefore the console parameter is incorrect, right?
> As opposed to regular domUs, that being freshly installed, are supposed
> to have the correct console parameter?

Other way around I think, dom0 has the correct stuff from grub.cfg
whereas the guest may not.





Re: [Xen-devel] [PATCH] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-11 Thread Stefano Stabellini
On Thu, 12 Feb 2015, Stefano Stabellini wrote:
> On Tue, 10 Feb 2015, Ian Campbell wrote:
> > On Tue, 2015-02-10 at 15:51 +0800, Ard Biesheuvel wrote:
> > > > FWIW on x86 this doesn't depend on console_set_on_cmdline, does it need
> > > > to here?
> > > >
> > > 
> > > I didn't check the code, but it seems inappropriate to add a preferred
> > > console implicitly if the user has set 'console=' on the command line.
> > 
> > I had been thinking that add_preferred_console would DTRT, but it seems
> > not. Seems strange that most calls to it do not check if the console is
> > already set, but it does seem like the right thing in this case.
> > 
> > > > On x86 it does depend on !xen_initial_domain. I suppose on the principle
> > > > that a VT is normally available there. I suppose that doesn't apply to
> > > > ARM so much, although it could.
> > > >
> > > 
> > > OK, I got confused by the xen_guest_init(). So do you mean if if
> > > (!xen_initial_domain) should be added?
> > 
> > (dom0 is "Just A Guest" too ;-))
> > 
> > Adding it would be consistent with x86, I'm not precisely sure if that
> > is important or desirable in this case. I'd be inclined to start with
> > the if there.
> 
> The reasoning is that dom0 command line arguments come from its old
> native grub stanza, therefore the console parameter is incorrect, right?
> As opposed to regular domUs, that being freshly installed, are supposed
> to have the correct console parameter?

I got it the other way around, but the question remains: why the
difference between dom0 and domUs? I guess that on x86 might make sense
because dom0 has access to vga there.

It is also interesting to note that the actual code is:

	if (!xen_initial_domain()) {
		add_preferred_console("xenboot", 0, NULL);
		add_preferred_console("tty", 0, NULL);
		add_preferred_console("hvc", 0, NULL);

that I am guessing it would prioritize a possible graphical console if
present, like fbcon on pvfb.
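
For reference, the condition being discussed for ARM (only add hvc0
implicitly when the user gave no 'console=' and, if we follow x86, when
we are not dom0) could be sketched roughly as below. This is a toy model,
not the patch itself: should_default_to_hvc0() and its two parameters are
hypothetical names standing in for console_set_on_cmdline and
xen_initial_domain().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether a default hvc0 console should be
 * added implicitly. Both inputs are stand-ins for the kernel's
 * console_set_on_cmdline and xen_initial_domain().
 */
static bool should_default_to_hvc0(bool console_set_on_cmdline,
                                   bool is_initial_domain)
{
    /* An explicit console= from the user always wins. */
    if (console_set_on_cmdline)
        return false;

    /* Mirror the x86 behaviour: leave dom0 alone, default domUs to hvc0. */
    return !is_initial_domain;
}
```

Whether the dom0 check is wanted on ARM at all is exactly the open
question in this thread, so the second test may well be dropped.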



Re: [Xen-devel] [PATCH] xen/arm: allow console=hvc0 to be omitted for guests

2015-02-11 Thread Stefano Stabellini
On Tue, 10 Feb 2015, Ian Campbell wrote:
> On Tue, 2015-02-10 at 15:51 +0800, Ard Biesheuvel wrote:
> > > FWIW on x86 this doesn't depend on console_set_on_cmdline, does it need
> > > to here?
> > >
> > 
> > I didn't check the code, but it seems inappropriate to add a preferred
> > console implicitly if the user has set 'console=' on the command line.
> 
> I had been thinking that add_preferred_console would DTRT, but it seems
> not. Seems strange that most calls to it do not check if the console is
> already set, but it does seem like the right thing in this case.
> 
> > > On x86 it does depend on !xen_initial_domain. I suppose on the principle
> > > that a VT is normally available there. I suppose that doesn't apply to
> > > ARM so much, although it could.
> > >
> > 
> > OK, I got confused by the xen_guest_init(). So do you mean if if
> > (!xen_initial_domain) should be added?
> 
> (dom0 is "Just A Guest" too ;-))
> 
> Adding it would be consistent with x86, I'm not precisely sure if that
> is important or desirable in this case. I'd be inclined to start with
> the if there.

The reasoning is that dom0 command line arguments come from its old
native grub stanza, therefore the console parameter is incorrect, right?
As opposed to regular domUs, that being freshly installed, are supposed
to have the correct console parameter?



Re: [Xen-devel] Xen's Linux kernel config options V2

2015-02-11 Thread Luis R. Rodriguez
On Fri, Feb 6, 2015 at 2:51 PM, Luis R. Rodriguez
 wrote:
>>> >> >   XEN_PLATFORM_PCI
>>> >
>>> > definitely x86 only
>>>
>>> All?
>>
>> only XEN_PLATFORM_PCI
>
> Updated.

Then again commit 5fbdc10395cd500d6ff844825a918c4e6f38de37 removed
this, so it's no longer relevant as it's all folded under XEN_PVHVM.

 Luis



Re: [Xen-devel] [PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive

2015-02-11 Thread Ian Campbell
On Wed, 2015-02-11 at 14:44 +, Ian Jackson wrote:
> Robert Ho writes ("[PATCH OSSTEST 01/12] Add support of parsing grub which 
> has 'submenu' primitive"):
> >  From a hvm kernel build from Linux stable Kernel tree,
> >  the auto generated grub2 menu will have 'submenu' primitive, upon the
> >  'menuentry' items. Xen boot entries will be grouped into a submenu. This
> >  patch adds capability to support such grub formats. Also, this patch adjust
> >  some indent alignments.
> 
> Thanks for this submission.  Dealing with submenus is definitely
> something we want to do.
> 
> I haven't looked at the code in detail yet but I have a general
> question: we currently count menu entries and eventually write
> GRUB_DEFAULT=  into /etc/default/grub.

FWIW at some point (possibly coinciding with the addition of submenus,
I'm not sure) grub gained the ability to specify the title of the menu
item you wish to boot as the default, which would simplify things if it
could be used universally (i.e. was supported in Wheezy already).





Re: [Xen-devel] [OSSTEST PATCH v2 10/10] rump kernel tests: Repeat the xenstorels test 50 times

2015-02-11 Thread Ian Campbell
On Wed, 2015-02-11 at 14:37 +, Ian Jackson wrote:
> Ian Campbell writes ("Re: [OSSTEST PATCH 10/10] rump kernel tests: Repeat the 
> xenstorels test 50 times"):
> > On Fri, 2015-02-06 at 19:17 +, Ian Jackson wrote:
> > > Add a new step which uses repeat-ts to run
> > > ts-rumpuserxen-demo-xenstorels many times.
> > 
> > Acked-by: Ian Campbell 
> 
> Thanks.
> 
> After reviewing the output of my full-flight adhoc test, I decided
> that the order of steps ought to be different, so here is a v2 of this
> patch.

Still looks fine. Ack.





Re: [Xen-devel] Usage of efi_enabled - Was: Re: [PATCH RFC 33/35] arm : acpi enable efi for acpi

2015-02-11 Thread Ian Campbell
On Wed, 2015-02-11 at 11:22 +, Jan Beulich wrote:
> > Does that also imply that some code which is using it to signal
> > availability of Runtime Services should be switch to some other (new?)
> > variable?
> 
> I hope not - we already have efi_rs_enable,

Good, I was just misled by the nearby comment then (which sort of
implied that efi_enable == 0 due to no RTS support).

Ian.




[Xen-devel] [rumpuserxen test] 34486: regressions - FAIL

2015-02-11 Thread xen . org
flight 34486 rumpuserxen real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34486/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-rumpuserxen    6 xen-build fail REGR. vs. 33866
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a

version targeted for testing:
 rumpuserxen  8af836e751ed191f3e2918668649710dd307e0b5
baseline version:
 rumpuserxen  30d72f3fc5e35cd53afd82c8179cc0e0b11146ad


People who touched revisions under test:
  Ian Jackson 
  Martin Lucina 


jobs:
 build-amd64  pass
 build-i386   pass
 build-amd64-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  fail
 build-i386-rumpuserxen   fail
 test-amd64-amd64-rumpuserxen-amd64   blocked 
 test-amd64-i386-rumpuserxen-i386 blocked 



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.


commit 8af836e751ed191f3e2918668649710dd307e0b5
Author: Martin Lucina 
Date:   Tue Feb 10 10:49:23 2015 +0100

Pull in latest buildrump.sh and src-netbsd

For fixes to libc compat symbols missing (issue #21)

commit e28e2b9daf7ab2922913889d90ec438b9bee3d56
Author: Ian Jackson 
Date:   Wed Feb 4 16:29:26 2015 +

app-tools: Support old -D__RUMPUSER_XEN__ for now

Released versions of Xen (Xen 4.5) rely on __RUMPUSER_XEN__ being
defined.

A patch to change this in Xen upstream exists and will be backported,
but until that makes it through to a stable point release of Xen 4.5,
we should support both #defines.

This commit partially reverts 91d56232d987
   Renaming platform macros, app-tools and autoconf target string

Signed-off-by: Ian Jackson 
CC: Martin Lucina 
CC: Ian Campbell 
CC: Wei Liu 

commit 05e06b0fe52918d6575e33b7d7551d85c93f7aff
Author: Martin Lucina 
Date:   Mon Feb 2 18:01:52 2015 +0100

Sync Travis CI configuration with app-tools rename

Signed-off-by: Martin Lucina 

commit 3b36d1f55a08e1849ccd5424afb0fbe29647bd6c
Author: Martin Lucina 
Date:   Mon Feb 2 18:00:36 2015 +0100

Remove even older rumpxen-app-* variants of app-tools

Signed-off-by: Martin Lucina 

commit 91d56232d987f5df594723ed46b9000b4d43e21a
Author: Martin Lucina 
Date:   Mon Feb 2 17:52:41 2015 +0100

Renaming platform macros, app-tools and autoconf target string

As discussed at: http://thread.gmane.org/gmane.comp.rumpkernel.user/739

This commit renames the platform macros, app-tools and autoconf target
string to be consistent with current naming of the entire stack:

app-tools/rumpapp-xen-* -> app-tools/rumprun-xen-*
$ARCH-rumpxen-netbsd -> $ARCH-rumprun-netbsd
-D__RUMPUSER_XEN__ -D__RUMPAPP__ -> -D__RUMPRUN__

Signed-off-by: Martin Lucina 



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/11/2015 09:06 PM, Jan Beulich wrote:

On 11.02.15 at 09:28,  wrote:

- PML enable/disable for particular Domain

PML needs to be enabled (allocate PML buffer, initialize PML index, PML base
address, turn PML on VMCS, etc) for all vcpus of the domain, as PML buffer
and PML index are per-vcpu, but EPT table may be shared by vcpus. Enabling
PML on partial vcpus of the domain won't work. Also PML will only be enabled
for the domain when it is switched to dirty logging mode, and it will be
disabled when domain is switched back to normal mode. As looks vcpu number
won't be changed dynamically during guest is running (correct me if I am
wrong here), so we don't have to consider enabling PML for new created vcpu
when guest is in dirty logging mode.

After PML is enabled for the domain, we only need to clear EPT entry's D-bit
for guest memory in dirty logging mode. We achieve this by checking if PML is
enabled for the domain when p2m_ram_rx changed to p2m_ram_logdirty, and
updating EPT entry accordingly. However, for super pages, we still write
protect them in case of PML as we still need to split super page to 4K page
in dirty logging mode.

While it doesn't matter much for our immediate needs, the
documentation isn't really clear about the behavior when a 2M or
1G page gets its D bit set: Wouldn't it be rather useful to the
consumer to know of that fact (e.g. by setting some of the lower
bits of the PML entry to indicate so)?
This is good point. The documentation only tells us the GPA will be 
logged with last 12 bits cleared. Whether hardware just clears last 12 
bits or performs 2M alignment (in case of 2M page) is not certain. I 
will confirm this with hardware guys. But as you said, it's not related 
to our immediate needs.



- PML buffer flush

There are two places we need to flush PML buffer. The first place is PML
buffer full VMEXIT handler (apparently), and the second place is in
paging_log_dirty_op (either peek or clean), as vcpus are running
asynchronously along with paging_log_dirty_op is called from userspace via
hypercall, and it's possible there are dirty GPAs logged in vcpus' PML
buffers but not full. Therefore we'd better to flush all vcpus' PML buffers
before reporting dirty GPAs to userspace.

We handle above two cases by flushing PML buffer at the beginning of all
VMEXITs. This solves the first case above, and it also solves the second
case, as prior to paging_log_dirty_op, domain_pause is called, which kicks
vcpus (that are in guest mode) out of guest mode via sending IPI, which cause
VMEXIT, to them.

This also makes log-dirty radix tree more updated as PML buffer is flushed
on basis of all VMEXITs but not only PML buffer full VMEXIT.

Is that really efficient? Flushing the buffer only as needed doesn't
seem to be a major problem (apart from the usual preemption issue
when dealing with guests with very many vCPU-s, but you certainly
recall that at this point HVM is still limited to 128).

Apart from these two remarks, the design looks okay to me.
While keeping the log-dirty radix tree more up to date is probably irrelevant, 
I do think we should flush PML buffers in paging_log_dirty_op (both 
peek and clean) before reporting dirty pages to userspace, in which case 
I think flushing the PML buffer at the beginning of VMEXIT is a good idea, as 
domain_pause does the job automatically. I am not sure how many cycles 
flushing the PML buffer will cost, but I think it should be relatively 
small compared to the VMEXIT itself, and can therefore be ignored.


A more optimized approach would probably be to flush the PML buffer only on the 
external interrupt VMEXIT, which is what domain_pause really triggers, rather 
than at the beginning of all VMEXITs. But as long as the overhead of flushing 
the PML buffer is negligible, this optimization is unnecessary.


Thanks,
-Kai


Jan




Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/11/2015 07:52 PM, Andrew Cooper wrote:

On 11/02/15 08:28, Kai Huang wrote:

Hi all,

PML (Page Modification Logging) is a new feature on Intel's Broadwell
server platform targeted to reduce overhead of dirty logging
mechanism. Below is the design for Xen. Would you help to review and
give comments?

Thank you for this design.  It is a very good starting point!

Thanks!




Background
==========

Currently, dirty logging is done via write protection, which basically
sets guest memory we want to log to be read-only, then when guest
performs write to that memory, write fault (EPT violation in case of
EPT is used) happens, in which we are able to log the dirty GFN. This
mechanism works but at cost of one write fault for each write from the
guest.

Strictly speaking, repeated writes to the same gfn after the first fault
are amortised until the logdirty is next queried, which makes typical
access patterns far less costly than a fault for every single write.

Indeed. I do mean first fault here.




PML Introduction
================

PML is a hardware-assisted efficient way, based on EPT mechanism, for
dirty logging. Briefly, PML logs dirty GPA automatically to a 4K PML
buffer when CPU changes EPT table's D-bit from 0 to 1. To accomplish
this, a new PML buffer base address (machine address), a PML index,
and a new PML buffer full VMEXIT were added to VMCS. Initially PML
index can be set to 511 (8 bytes for each GPA) to indicate the buffer
is empty, and CPU decreases PML index by 1 after logging GPA. Before
performing GPA logging, PML checks PML index to see if PML buffer has
been fully logged, in which case a PML buffer full VMEXIT happens, and
VMM should flush logged GPAs (to data structure keeps dirty GPAs) and
reset PML index so that further GPAs can be logged again.

The specification of PML can be found at:
http://www.intel.com/content/www/us/en/processors/page-modification-logging-vmm-white-paper.html


With PML, we don't have to use write protection but just clear D-bit
of EPT entry of guest memory to do dirty logging, with an additional
PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
reduce hypervisor overhead when guest is in dirty logging mode, and
therefore more CPU cycles can be allocated to guest, so it's expected
benchmarks in guest will have better performance comparing to non-PML.

One issue with basic EPT A/D tracking was the scan of the EPT tables.
Here, hardware will give us a list of affected gfns, but how is Xen
supposed to efficiently clear the dirty bits again?  Using EPT
misconfiguration is no better than the existing fault path.

See my reply to Jan's email.


Design
======

- PML feature is used globally

A new Xen boot parameter, say 'opt_enable_pml', will be introduced to
control PML feature detection, and PML feature will only be detected
if opt_enable_pml = 1. Once PML feature is detected, it will be used
for dirty logging for all domains globally. Currently we don't support
to use PML on basis of per-domain as it will require additional
control from XL tool.

Rather than adding in a new top level command line option for an ept
subfeature, it would be preferable to add an "ept=" option which has
"pml" as a sub boolean.

That is fine with me, if Jan agrees.

Jan, which do you prefer here?
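
Just to illustrate the shape of the "ept=" suggestion, a sub-option parser
could look roughly like the sketch below. Everything here is hypothetical:
parse_ept_param() and opt_pml_enabled are invented names, and the "no-pml"
spelling for the negative form is an assumption of mine, not something
agreed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical flag; the thread only proposes the option's shape. */
static bool opt_pml_enabled = false;

/*
 * Parse the value of a hypothetical "ept=" option: a comma-separated
 * list in which "pml" enables the feature and "no-pml" disables it.
 * Returns 0 on success, -1 on an unknown sub-option.
 */
static int parse_ept_param(const char *s)
{
    assert(s != NULL);

    while (*s) {
        size_t len = strcspn(s, ",");   /* length of the current token */

        if (len == 3 && !strncmp(s, "pml", 3))
            opt_pml_enabled = true;
        else if (len == 6 && !strncmp(s, "no-pml", 6))
            opt_pml_enabled = false;
        else if (len != 0)
            return -1;

        s += len;
        if (*s == ',')
            s++;                        /* skip the separator */
    }
    return 0;
}
```

With this, "ept=pml" on the command line would set the flag, and further
EPT sub-features could be added as extra tokens later.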


- PML enable/disable for particular Domain

I do not believe that this is an interesting use case at the moment.
Currently, PML would be an implementation detail of how Xen manages to
provide the logdirty bitmap to the toolstack or device model, and need
not be exposed at all.

If in the future, a toolstack component wishes to use the pml for other
purposes, there is more infrastructure which needs adjusting than just
per-domain PML.
I didn't mean to expose PML to the toolstack here. In fact, this is what I want to 
avoid for now; PML should be hidden in the Xen hypervisor completely, as you 
said, as just another mechanism to provide the logdirty bitmap to userspace.
Here I mean we need to enable PML for the domain (which means allocate 
PML buffer, initialize PML index, and turn PML on in VMCS) manually, as 
it's not turned on automatically after the PML feature detection.

Sorry for the confusion.




PML needs to be enabled (allocate PML buffer, initialize PML index,
PML base address, turn PML on VMCS, etc) for all vcpus of the domain,
as PML buffer and PML index are per-vcpu, but EPT table may be shared
by vcpus. Enabling PML on partial vcpus of the domain won't work. Also
PML will only be enabled for the domain when it is switched to dirty
logging mode, and it will be disabled when domain is switched back to
normal mode. As looks vcpu number won't be changed dynamically during
guest is running (correct me if I am wrong here), so we don't have to
consider enabling PML for new created vcpu when guest is in dirty
logging mode.

There are exactly d->max_vcpus worth of struct vcpus (and therefore
VMCSes) for a domain after creation, and will exist for the lifetime of
the domain.  There is no dynam

[Xen-devel] [libvirt test] 34464: tolerable all pass - PUSHED

2015-02-11 Thread xen . org
flight 34464 libvirt real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34464/

Failures :-/ but no regressions.

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt  10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail never pass
 test-amd64-amd64-libvirt 10 migrate-support-check fail never pass

version targeted for testing:
 libvirt  a7c9c7a6abfb18db1e6b0da0bd9ee680d915c992
baseline version:
 libvirt  633053af672e906f17ede49dc96e671b46da15bd


People who touched revisions under test:
  Ján Tomko 
  Luyao Huang 
  Martin Kletzander 
  Peter Krempa 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-libvirt pass
 test-armhf-armhf-libvirt pass
 test-amd64-i386-libvirt  pass



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Pushing revision :

+ branch=libvirt
+ revision=a7c9c7a6abfb18db1e6b0da0bd9ee680d915c992
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
++ repos=/export/home/osstest/repos
++ repos_lock=/export/home/osstest/repos/lock
++ '[' x '!=' x/export/home/osstest/repos/lock ']'
++ OSSTEST_REPOS_LOCK_LOCKED=/export/home/osstest/repos/lock
++ exec with-lock-ex -w /export/home/osstest/repos/lock ./ap-push libvirt 
a7c9c7a6abfb18db1e6b0da0bd9ee680d915c992
+ branch=libvirt
+ revision=a7c9c7a6abfb18db1e6b0da0bd9ee680d915c992
+ . cri-lock-repos
++ . cri-common
+++ . cri-getconfig
+++ umask 002
+++ getconfig Repos
+++ perl -e '
use Osstest;
readglobalconfig();
print $c{"Repos"} or die $!;
'
++ repos=/export/home/osstest/repos
++ repos_lock=/export/home/osstest/repos/lock
++ '[' x/export/home/osstest/repos/lock '!=' x/export/home/osstest/repos/lock 
']'
+ . cri-common
++ . cri-getconfig
++ umask 002
+ select_xenbranch
+ case "$branch" in
+ tree=libvirt
+ xenbranch=xen-unstable
+ '[' xlibvirt = xlinux ']'
+ linuxbranch=
+ '[' x = x ']'
+ qemuubranch=qemu-upstream-unstable
+ : tested/2.6.39.x
+ . ap-common
++ : osst...@xenbits.xensource.com
++ : git://xenbits.xen.org/xen.git
++ : osst...@xenbits.xensource.com:/home/xen/git/xen.git
++ : git://xenbits.xen.org/staging/qemu-xen-unstable.git
++ : git://git.kernel.org
++ : git://git.kernel.org/pub/scm/linux/kernel/git
++ : git
++ : git://libvirt.org/libvirt.git
++ : osst...@xenbits.xensource.com:/home/xen/git/libvirt.git
++ : git://xenbits.xen.org/libvirt.git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : git
++ : git://xenbits.xen.org/rumpuser-xen.git
++ : osst...@xenbits.xensource.com:/home/xen/git/rumpuser-xen.git
+++ besteffort_repo https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ cached_repo https://github.com/rumpkernel/rumpkernel-netbsd-src 
'[fetch=try]'
+++ local repo=https://github.com/rumpkernel/rumpkernel-netbsd-src
+++ local 'options=[fetch=try]'
 getconfig GitCacheProxy
 perl -e '
use Osstest;
readglobalconfig();
print $c{"GitCacheProxy"} or die $!;
'
+++ local cache=git://drall.uk.xensource.com:9419/
+++ '[' xgit://drall.uk.xensource.com:9419/ '!=' x ']'
+++ echo 
'git://drall.uk.xensource.com:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : 
'git://drall.uk.xensource.com:9419/https://github.com/rumpkernel/rumpkernel-netbsd-src%20[fetch=try]'
++ : git
++ : git://git.seabios.org/seabios.git
++ : osst...@xenbits.xensource.com:/home/xen/git/osstest/seabios.git
++ : git://xenbits.xen.org/osstest/seabios.git
++ : https://github.com/tianocore/edk2.git
++ : osst...@xenbits.xensource.com:/home/xen/git/osstest/ovmf.git
++ : git

Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Kai Huang


On 02/11/2015 09:13 PM, Jan Beulich wrote:

On 11.02.15 at 12:52,  wrote:

On 11/02/15 08:28, Kai Huang wrote:

With PML, we don't have to use write protection but just clear D-bit
of EPT entry of guest memory to do dirty logging, with an additional
PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
reduce hypervisor overhead when guest is in dirty logging mode, and
therefore more CPU cycles can be allocated to guest, so it's expected
benchmarks in guest will have better performance comparing to non-PML.

One issue with basic EPT A/D tracking was the scan of the EPT tables.
Here, hardware will give us a list of affected gfns, but how is Xen
supposed to efficiently clear the dirty bits again?  Using EPT
misconfiguration is no better than the existing fault path.

Why not? The misconfiguration exit ought to clear the D bit for all
511 entries in the L1 table (and set it for the one entry that is
currently serving the access). All further D bit handling will then
be PML based.
Indeed, we clear D-bit in EPT misconfiguration. In my understanding, the 
sequences are as follows:


1) PML enabled for the domain.
2) ept_invalidate_emt (or ept_invalidate_emt_range) is called.
3) Guest accesses specific GPA (which has been invalidated by step 2), 
and EPT misconfig is triggered.
4) Then resolve_misconfig is called, which fixes up GFN (above GPA >> 
12) to p2m_ram_logdirty, and calls ept_p2m_type_to_flags, in which we 
clear D-bit of EPT entry (instead of clear W-bit) if p2m type is 
p2m_ram_logdirty. Then dirty logging of this GFN will be handled by PML.


The above 2) ~ 4) will be repeated when log-dirty radix tree is cleared.
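
The decision made in step 4 can be sketched as follows. This is a toy
model only: the enum, struct ept_flags and type_to_flags() are invented
for illustration and do not match the real ept_p2m_type_to_flags()
signature. Presetting the D-bit for ordinary RAM so that it never gets
logged is also an assumption on my side.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for an EPT leaf entry and two p2m types. */
enum p2m_type { P2M_RAM_RW, P2M_RAM_LOGDIRTY };

struct ept_flags {
    bool r, w, x;   /* access permissions */
    bool d;         /* dirty bit */
};

static struct ept_flags type_to_flags(enum p2m_type t, bool pml_enabled,
                                      bool is_superpage)
{
    /* Assumption: normal RAM keeps D set, so no PML logging happens. */
    struct ept_flags f = { .r = true, .w = true, .x = true, .d = true };

    if (t == P2M_RAM_LOGDIRTY) {
        if (pml_enabled && !is_superpage)
            f.d = false;    /* stay writable; hardware logs the first write */
        else
            f.w = false;    /* classic write protection (also superpages) */
    }
    return f;
}
```

The superpage case stays write protected even with PML, because the
design still wants to split super pages to 4K in dirty logging mode.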




- PML buffer flush

There are two places we need to flush PML buffer. The first place is
PML buffer full VMEXIT handler (apparently), and the second place is
in paging_log_dirty_op (either peek or clean), as vcpus are running
asynchronously along with paging_log_dirty_op is called from userspace
via hypercall, and it's possible there are dirty GPAs logged in vcpus'
PML buffers but not full. Therefore we'd better to flush all vcpus'
PML buffers before reporting dirty GPAs to userspace.

Why apparently?  It would be quite easy for a guest to dirty 512 frames
without otherwise taking a vmexit.

I silently replaced apparently with obviously while reading...


We handle above two cases by flushing PML buffer at the beginning of
all VMEXITs. This solves the first case above, and it also solves the
second case, as prior to paging_log_dirty_op, domain_pause is called,
which kicks vcpus (that are in guest mode) out of guest mode via
sending IPI, which cause VMEXIT, to them.

This also makes log-dirty radix tree more updated as PML buffer is
flushed on basis of all VMEXITs but not only PML buffer full VMEXIT.

My gut feeling is that this is substantial overhead on a common path,
but this largely depends on how the dirty bits can be cleared efficiently.

I agree on the overhead part, but I don't see what relation this has
to the dirty bit clearing - a PML buffer flush doesn't involve any
alterations of D bits.
No, the flush is not related to the dirty bit clearing. The PML buffer 
flush just does the following (which I should have clarified in my design, 
sorry):

1) Read out the PML index.
2) Loop over all GPAs logged in the PML buffer according to the PML index, 
and add them to the log-dirty radix tree.
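
As a rough sketch of those two steps (not actual Xen code), the flush
could look like the function below. pml_flush() is a made-up name, and a
plain bitmap stands in for the log-dirty radix tree; I also assume that an
index outside 0..511 means the whole buffer is in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PML_ENTRIES 512
#define PAGE_SHIFT  12

/*
 * Flush one vcpu's PML buffer into a dirty bitmap (a stand-in for the
 * log-dirty radix tree). Entries live in buf[index + 1 .. 511] because
 * the index counts down. Returns the number of GFNs recorded and resets
 * *index so that the buffer is empty again.
 */
static size_t pml_flush(const uint64_t *buf, uint16_t *index,
                        uint8_t *dirty, size_t nr_gfns)
{
    assert(buf != NULL && index != NULL && dirty != NULL);

    /* Step 1: read the index; out of range means completely full. */
    size_t first = (*index >= PML_ENTRIES) ? 0 : (size_t)*index + 1;
    size_t n = 0;

    /* Step 2: walk every logged GPA and mark its GFN dirty. */
    for (size_t i = first; i < PML_ENTRIES; i++) {
        uint64_t gfn = buf[i] >> PAGE_SHIFT;

        if (gfn < nr_gfns) {
            dirty[gfn / 8] |= (uint8_t)(1u << (gfn % 8));
            n++;
        }
    }

    *index = PML_ENTRIES - 1;
    return n;
}
```

Note that nothing here touches EPT entries, which is the point above: the
flush only moves already-logged GPAs into the dirty log.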


I agree there's overhead on the VMEXIT common path, but the overhead should 
not be substantial compared to the overhead of the VMEXIT itself.


Thanks,
-Kai


Jan




Re: [Xen-devel] [PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive

2015-02-11 Thread Hu, Robert

> -Original Message-
> From: Ian Jackson [mailto:ian.jack...@eu.citrix.com]
> Sent: Wednesday, February 11, 2015 10:44 PM
> To: Hu, Robert
> Cc: xen-devel@lists.xen.org; jfeh...@suse.com; wei.l...@citrix.com;
> ian.campb...@citrix.com; Pang, LongtaoX
> Subject: Re: [PATCH OSSTEST 01/12] Add support of parsing grub which has
> 'submenu' primitive
> 
> Robert Ho writes ("[PATCH OSSTEST 01/12] Add support of parsing grub which
> has 'submenu' primitive"):
> >  From a hvm kernel build from Linux stable Kernel tree,
> >  the auto generated grub2 menu will have 'submenu' primitive, upon the
> >  'menuentry' items. Xen boot entries will be grouped into a submenu. This
> >  patch adds capability to support such grub formats. Also, this patch adjust
> >  some indent alignments.
> 
> Thanks for this submission.  Dealing with submenus is definitely
> something we want to do.
> 
> I haven't looked at the code in detail yet but I have a general
> question: we currently count menu entries and eventually write
> GRUB_DEFAULT=  into /etc/default/grub.
> 
> Does this work properly if the entry is in a submenu ?  I guess you
> have probably tested this but I thought I should ask...
> 
Yes, this minor change just gives the 'parsemenu' subroutine the capability 
to recognize 'submenu'.
The outer layer logic isn't affected.
Actually, the Xen boot menuentry we choose is inside a submenu. It works, and 
/etc/default/grub is assigned properly.
> Can you please not adjust the whitespace ?  osstest in general doesn't
> have a requirement for any particular whitespace use, and certainly if
> there are to be any whitespace changes they ought to be in a separate
> patch.
I adjusted those because someone replying to the last version told us that
osstest prefers spaces to tabs, traditionally 4 spaces per tab. (This
aligns with my previous coding experience as well.)
I also find that this hunk of code doesn't look good in my editor;
its misalignment makes it harder to read.
I would suggest adjusting this hunk's indentation and replacing tabs with
spaces so that it displays well across different editors.
> 
> Thanks,
> Ian.



[Xen-devel] [linux-3.10 test] 34436: regressions - FAIL

2015-02-11 Thread xen . org
flight 34436 linux-3.10 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34436/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-winxpsp3  7 windows-install fail REGR. vs. 26303

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair         17 guest-migrate/src_host/dst_host fail like 26303
 test-amd64-amd64-xl-winxpsp3  7 windows-install                 fail like 26303

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-credit2               5 xen-boot              fail never pass
 test-armhf-armhf-xl-midway                5 xen-boot              fail never pass
 test-armhf-armhf-libvirt                  5 xen-boot              fail never pass
 test-armhf-armhf-xl                       5 xen-boot              fail never pass
 test-armhf-armhf-xl-multivcpu             5 xen-boot              fail never pass
 test-armhf-armhf-xl-sedf                  5 xen-boot              fail never pass
 test-armhf-armhf-xl-sedf-pin              5 xen-boot              fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass

version targeted for testing:
 linux  87dc7c99c72e49461fba277c81871525700821fb
baseline version:
 linux  be67db109090b17b56eb8eb2190cd70700f107aa


904 people touched revisions under test,
not listing them all


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  fail
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qemuu-rhel6hvm-amd   pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64 pass
 test-amd64-i386-xl-qemut-debianhvm-amd64 pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64 pass
 test-amd64-i386-freebsd10-amd64  pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64 pass
 test-amd64-i386-xl-qemuu-ovmf-amd64  pass
 test-amd64-amd64-rumpuserxen-amd64   pass
 test-amd64-amd64-xl-qemut-win7-amd64 fail
 test-amd64-i386-xl-qemut-win7-amd64  fail
 test-amd64-amd64-xl-qemuu-win7-amd64 fail
 test-amd64-i386-xl-qemu

Re: [Xen-devel] [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in PciEnumeratorLight

2015-02-11 Thread Ni, Ruiyu
Wei,
No, you cannot install gEfiPciEnumerationCompleteProtocolGuid in 
PciEnumeratorLight().
For a real platform, the PCI bus is fully enumerated in PciEnumerator(), and 
later, if a reconnect happens, it is light-enumerated in PciEnumeratorLight(). 
The protocol should only be installed once, in PciEnumerator(). Your fix will 
cause this protocol to be installed every time a reconnect happens.
The protocol's meaning is that the PCI bus is fully enumerated. If the PCI bus 
is fully enumerated before the PciBus driver starts, light PCI enumeration is 
used.
For your OVMF/QEMU case, an alternative fix is to install this protocol in a 
platform driver when it detects that the PCI bus is fully enumerated.

Thanks,
Ray

-Original Message-
From: Wei Liu [mailto:wei.l...@citrix.com] 
Sent: Thursday, February 12, 2015 4:24 AM
To: edk2-de...@lists.sourceforge.net
Cc: xen-devel@lists.xen.org; Laszlo
Subject: [edk2] [PATCH] MdeModulePkg: mark completion of PCI enumeration in 
PciEnumeratorLight

I had an issue when trying to boot Xen HVM guest with latest OVMF
master. Guest crashed with memory violation, and the bisection pointed
to 66b280df2 ("OvmfPkg: AcpiPlatformDxe: make dependency on PCI
enumeration explicit"). That commit made AcpiPlatformDxe depend on PCI
enumeration using gEfiPciEnumerationCompleteProtocolGuid, which is a
very reasonable change.

The real culprit is that Xen HVM is using PciEnumeratorLight which
doesn't install gEfiPciEnumerationCompleteProtocolGuid. This, in
combination with 66b280df2, makes AcpiPlatformDxe not able to be loaded,
resulting in guest crash.

The fix is to install gEfiPciEnumerationCompleteProtocolGuid in
PciEnumeratorLight.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Wei Liu 
Cc: Feng Tian 
Cc: Anthony Perard 
Cc: Laszlo Ersek 
Cc: Jordan Justen 
---
 MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c 
b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
index 9e7ac74..7659585 100644
--- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
+++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
@@ -2256,6 +2256,7 @@ PciEnumeratorLight (
 {
 
   EFI_STATUSStatus;
+  EFI_HANDLEHostBridgeHandle;
   EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL   *PciRootBridgeIo;
   PCI_IO_DEVICE *RootBridgeDev;
   UINT16MinBus;
@@ -2288,6 +2289,11 @@ PciEnumeratorLight (
 return Status;
   }
 
+  //
+  // Get the host bridge handle
+  //
+  HostBridgeHandle = PciRootBridgeIo->ParentHandle;
+
   Status = PciRootBridgeIo->Configuration (PciRootBridgeIo, (VOID **) 
&Descriptors);
 
   if (EFI_ERROR (Status)) {
@@ -2348,7 +2354,14 @@ PciEnumeratorLight (
 Descriptors++;
   }
 
-  return EFI_SUCCESS;
+  Status = gBS->InstallProtocolInterface (
+  &HostBridgeHandle,
+  &gEfiPciEnumerationCompleteProtocolGuid,
+  EFI_NATIVE_INTERFACE,
+  NULL
+  );
+
+  return Status;
 }
 
 /**
-- 
1.9.1


___
edk2-devel mailing list
edk2-de...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH] modify the IO_TLB_SEGSIZE to io_tlb_segsize configurable as flexible requirement about SW-IOMMU.

2015-02-11 Thread Wang, Xiaoming
Dear Wilk:

> -Original Message-
> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> Sent: Thursday, February 12, 2015 4:49 AM
> To: Wang, Xiaoming
> Cc: David Vrabel; linux-m...@linux-mips.org; pebo...@tiscali.nl; Zhang,
> Dongxing; lau...@codeaurora.org; d.kasat...@samsung.com;
> heiko.carst...@de.ibm.com; linux-ker...@vger.kernel.org; ralf@linux-
> mips.org; ch...@chris-wilson.co.uk; takahiro.aka...@linaro.org;
> li...@horizon.com; xen-de...@lists.xenproject.org;
> boris.ostrov...@oracle.com; Liu, Chuansheng; a...@linux-foundation.org
> Subject: Re: [Xen-devel] [PATCH] modify the IO_TLB_SEGSIZE to
> io_tlb_segsize configurable as flexible requirement about SW-IOMMU.
> 
> On Wed, Feb 11, 2015 at 08:38:29AM +, Wang, Xiaoming wrote:
> > Dear David
> >
> > > -Original Message-
> > > From: David Vrabel [mailto:david.vra...@citrix.com]
> > > Sent: Tuesday, February 10, 2015 5:46 PM
> > > To: Wang, Xiaoming; Konrad Rzeszutek Wilk
> > > Cc: linux-m...@linux-mips.org; pebo...@tiscali.nl; Zhang, Dongxing;
> > > lau...@codeaurora.org; d.kasat...@samsung.com;
> > > heiko.carst...@de.ibm.com; linux-ker...@vger.kernel.org; ralf@linux-
> > > mips.org; ch...@chris-wilson.co.uk; takahiro.aka...@linaro.org;
> > > david.vra...@citrix.com; li...@horizon.com; xen-
> > > de...@lists.xenproject.org; boris.ostrov...@oracle.com; Liu,
> > > Chuansheng; a...@linux-foundation.org
> > > Subject: Re: [Xen-devel] [PATCH] modify the IO_TLB_SEGSIZE to
> > > io_tlb_segsize configurable as flexible requirement about SW-IOMMU.
> > >
> > > On 06/02/15 00:10, Wang, Xiaoming wrote:
> > > >
> > > >
> > > >> -Original Message-
> > > >> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> > > >> Sent: Friday, February 6, 2015 3:33 AM
> > > >> To: Wang, Xiaoming
> > > >> Cc: r...@linux-mips.org; boris.ostrov...@oracle.com;
> > > >> david.vra...@citrix.com; linux-m...@linux-mips.org; linux-
> > > >> ker...@vger.kernel.org; xen-de...@lists.xenproject.org;
> > > >> akpm@linux- foundation.org; li...@horizon.com;
> > > >> lau...@codeaurora.org; heiko.carst...@de.ibm.com;
> > > >> d.kasat...@samsung.com; takahiro.aka...@linaro.org;
> > > >> ch...@chris-wilson.co.uk; pebo...@tiscali.nl; Liu, Chuansheng;
> > > >> Zhang, Dongxing
> > > >> Subject: Re: [PATCH] modify the IO_TLB_SEGSIZE to io_tlb_segsize
> > > >> configurable as flexible requirement about SW-IOMMU.
> > > >>
> > > >> On Fri, Feb 06, 2015 at 07:01:14AM +0800, xiaomin1 wrote:
> > > > >>> The maximum of SW-IOMMU is limited to 2^11*128 = 256K.
> > > > >>> While in different platform and different requirements this
> > > > >>> seems improper.
> > > > >>> So modify the IO_TLB_SEGSIZE to io_tlb_segsize as configurable
> > > > >>> is make sense.
> > > >>
> > > >> More details please. What is the issue you are hitting?
> > > >>
> > > > Example:
> > > > > If 1M bytes are required, there is an error like.
> > >
> > > > Instead of allowing the bouncing of such large buffers, could the
> > > > gadget driver be modified to submit the buffers to the hardware in
> > > > smaller chunks?
> > >
> > > David
> >
> > Our target is to make IO_TLB_SEGSIZE configurable.
> > Neither 256K bytes nor 1M bytes seems a suitable value, I think.
> > It would be better to use a tactic like kmem_cache_create in the
> > kmalloc function.
> > But SW-IOMMU seems more lightweight.
> > So we chose a variable rather than a function.
> 
> Would it be possible to understand why the gadget needs such a large buffer?
> That is irrespective of the patchset you are proposing.
> 
> In regards to the patchset - I don't see anything fundamentally wrong with
> the patch. What I am afraid of is that this fixes the symptoms instead of the
> underlying problem. The problem, I think, is that with these large 1MB requests
> you risk using the SWIOTLB bounce buffer, which can result in poor
> performance.
> 
> So eventually somebody will have to figure out why the performance is poor
> and have a hard time figuring out what is wrong - as the symptoms have been
> removed.
> 
> Hence looking at potentially using a scatter-gather mechanism and chopping up
> the requests into smaller sizes might be a better option. But I don't know.
> Perhaps you are more familiar with the gadget and could tell me why it needs
> a 1MB size request?
> 
> 
The 1M size is requested during a fastboot flash, in 
system/core/fastbootd/commands/flash.c as defined by Google.
Here is a partial excerpt from flash.c:
#define BUFFER_SIZE 1024 * 1024
int current_size = MIN(size - written, BUFFER_SIZE);
(gpt_mmap(&input, written + skip, current_size, data_fd))
mapping->size = ALIGN(size + location_diff, PAGE_SIZE);

> >
> > Xiaoming.



Re: [Xen-devel] [PATCH] virtual: Documentation: simplify and generalize paravirt_ops.txt

2015-02-11 Thread Rusty Russell
"Luis R. Rodriguez"  writes:
> From: "Luis R. Rodriguez" 
>
> The general documentation we have for pv_ops is currently present
> on the IA64 docs, but since this documentation covers IA64 xen
> enablement and IA64 Xen support got ripped out a while ago
> through commit d52eefb47 present since v3.14-rc1 let's just
> simplify, generalize and move the pv_ops documentation to a
> shared place.

OK, I've applied this.

Thanks,
Rusty.



Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Linus Torvalds
On Feb 11, 2015 3:15 PM, "Jeremy Fitzhardinge"  wrote:
>
> Right now it needs to be a locked operation to prevent read-reordering.
> x86 memory ordering rules state that all writes are seen in a globally
> consistent order, and are globally ordered wrt reads *on the same
> addresses*, but reads to different addresses can be reordered wrt to
> writes.

The modern x86 rules are actually much tighter than that.

Every store is a release, and every load is an acquire. So a non-atomic
store is actually a perfectly fine unlock. All preceding stores will be
seen by other cpu's before the unlock, and while reads can pass stores,
they only pass *earlier* stores.

For *taking* a lock you need an atomic access, because otherwise loads
inside the locked region could bleed out to before the store that takes the
lock.

 Linus
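
The acquire/release rule described above can be sketched in portable C11. This is only an illustration under stated assumptions: struct ticket_lock and both function names are hypothetical, the lock is a plain ticket lock, and the paravirt slowpath flag is deliberately left out. Taking a ticket is an atomic read-modify-write, while the unlock is a release store, which on x86 is an ordinary store.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical minimal ticket lock; NOT the kernel's arch_spinlock_t. */
struct ticket_lock {
    _Atomic uint16_t head;   /* ticket currently being served */
    _Atomic uint16_t tail;   /* next ticket to hand out */
};

static void ticket_lock_acquire(struct ticket_lock *l)
{
    /* Taking a ticket has to be an atomic read-modify-write. */
    uint16_t me = atomic_fetch_add_explicit(&l->tail, 1,
                                            memory_order_relaxed);

    /* Acquire load: critical-section accesses cannot move above it. */
    while (atomic_load_explicit(&l->head, memory_order_acquire) != me)
        ;   /* spin until our ticket is served */
}

static void ticket_lock_release(struct ticket_lock *l)
{
    /* Only the lock owner writes head, so a relaxed load followed by a
     * release store is enough here; no locked instruction is needed. */
    uint16_t next = (uint16_t)(atomic_load_explicit(&l->head,
                                                    memory_order_relaxed) + 1);
    atomic_store_explicit(&l->head, next, memory_order_release);
}
```

The sketch omits the read of the slowpath flag that follows the unlock in the paravirt case; that later read is the part whose ordering is being debated in this thread.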


Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Jeremy Fitzhardinge

On 02/11/2015 09:24 AM, Oleg Nesterov wrote:
> I agree, and I have to admit I am not sure I fully understand why
> unlock uses the locked add. Except we need a barrier to avoid the race
> with the enter_slowpath() users, of course. Perhaps this is the only
> reason?

Right now it needs to be a locked operation to prevent read-reordering.
x86 memory ordering rules state that all writes are seen in a globally
consistent order, and are globally ordered wrt reads *on the same
addresses*, but reads to different addresses can be reordered wrt to writes.

So, if the unlocking add were not a locked operation:

__add(&lock->tickets.head, TICKET_LOCK_INC);/* not locked */

if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
__ticket_unlock_slowpath(lock, prev);

Then the read of lock->tickets.tail can be reordered before the unlock,
which introduces a race:

/* read reordered here */
if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) /* false */
/* ... */;

/* other CPU sets SLOWPATH and blocks */

__add(&lock->tickets.head, TICKET_LOCK_INC);/* not locked */

/* other CPU hung */

So it doesn't *have* to be a locked operation. This should also work:

__add(&lock->tickets.head, TICKET_LOCK_INC);/* not locked */

lfence();   /* prevent read reordering */
if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG))
__ticket_unlock_slowpath(lock, prev);

but in practice a locked add is cheaper than an lfence (or at least was).

This *might* be OK, but I think it's on dubious ground:

__add(&lock->tickets.head, TICKET_LOCK_INC);/* not locked */

/* read overlaps write, and so is ordered */
if (unlikely(lock->head_tail & (TICKET_SLOWPATH_FLAG << TICKET_SHIFT)))
__ticket_unlock_slowpath(lock, prev);

because I think Intel and AMD differed in interpretation about how
overlapping but different-sized reads & writes are ordered (or it simply
isn't architecturally defined).

If the slowpath flag is moved to head, then it would always have to be
locked anyway, because it needs to be atomic against other CPU's RMW
operations setting the flag.

J



[Xen-devel] [linux-3.16 test] 34434: regressions - FAIL

2015-02-11 Thread xen . org
flight 34434 linux-3.16 real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34434/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-credit2  15 guest-localmigrate/x10 fail REGR. vs. 34167

Tests which are failing intermittently (not blocking):
 test-amd64-amd64-xl-pvh-amd   5 xen-boot fail pass in 34363
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail pass in 34363
 test-amd64-amd64-libvirt 12 guest-start.2  fail in 34363 pass in 34434

Regressions which are regarded as allowable (not blocking):
 test-amd64-amd64-xl-sedf  5 xen-boot fail   like 34167
 test-armhf-armhf-libvirt 12 guest-start.2   fail in 34363 blocked in 34167
 test-armhf-armhf-xl-credit2   5 xen-boot  fail in 34363 like 34167

Tests which did not succeed, but are not blocking:
 test-amd64-i386-libvirt   9 guest-start  fail   never pass
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl  10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-sedf-pin 15 guest-localmigrate/x10   fail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail  never pass
 test-amd64-amd64-xl-multivcpu 15 guest-localmigrate/x10   fail  never pass
 test-amd64-i386-freebsd10-i386  7 freebsd-install  fail never pass
 test-armhf-armhf-xl-credit2  10 migrate-support-check fail   never pass
 test-amd64-amd64-libvirt 10 migrate-support-check fail   never pass
 test-amd64-i386-freebsd10-amd64  7 freebsd-install fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-amd64-xl-win7-amd64 14 guest-stop   fail never pass
 test-amd64-amd64-xl-winxpsp3 14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemuu-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-win7-amd64 14 guest-stop   fail  never pass
 test-amd64-i386-xl-qemuu-winxpsp3 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3  14 guest-stop   fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3  7 windows-install  fail never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start   fail in 34363 never pass

version targeted for testing:
 linux f9bbc2490930cfc28ec522ab55f5cb83cdd713a1
baseline version:
 linux 19583ca584d6f574384e17fe7613dfaeadcdc4a6


991 people touched revisions under test,
not listing them all


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops pass
 build-armhf-pvops pass
 build-i386-pvops pass
 build-amd64-rumpuserxen  pass
 build-i386-rumpuserxen   pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd64-i386-qem

[Xen-devel] [PATCH v2] vsprintf: Make sure argument to %pX specifier is valid

2015-02-11 Thread Boris Ostrovsky
If an invalid pointer (i.e. something smaller than HYPERVISOR_VIRT_START)
is passed for the %*ph/%pv/%ps/%pS format specifiers, print "(NULL)".

Signed-off-by: Boris Ostrovsky 
---
 xen/common/vsprintf.c |   23 ---
 1 files changed, 16 insertions(+), 7 deletions(-)

v2:
 * Print "(NULL)" instead of specifier-specific string
 * Consider all addresses under HYPERVISOR_VIRT_START as invalid. (I think
   this is true for both x86 and ARM but I don't have ARM platform to test).


diff --git a/xen/common/vsprintf.c b/xen/common/vsprintf.c
index 065cc42..b9542b5 100644
--- a/xen/common/vsprintf.c
+++ b/xen/common/vsprintf.c
@@ -270,6 +270,22 @@ static char *pointer(char *str, char *end, const char 
**fmt_ptr,
 const char *fmt = *fmt_ptr, *s;
 
 /* Custom %p suffixes. See XEN_ROOT/docs/misc/printk-formats.txt */
+
+switch ( fmt[1] )
+{
+case 'h':
+case 's':
+case 'S':
+case 'v':
+++*fmt_ptr;
+}
+
+if ( (unsigned long)arg < HYPERVISOR_VIRT_START )
+{
+char *s = "(NULL)";
+return string(str, end, s, -1, -1, 0);
+}
+
 switch ( fmt[1] )
 {
 case 'h': /* Raw buffer as hex string. */
@@ -277,9 +293,6 @@ static char *pointer(char *str, char *end, const char 
**fmt_ptr,
 const uint8_t *hex_buffer = arg;
 unsigned int i;
 
-/* Consumed 'h' from the format string. */
-++*fmt_ptr;
-
 /* Bound user count from %* to between 0 and 64 bytes. */
 if ( field_width <= 0 )
 return str;
@@ -306,9 +319,6 @@ static char *pointer(char *str, char *end, const char 
**fmt_ptr,
 unsigned long sym_size, sym_offset;
 char namebuf[KSYM_NAME_LEN+1];
 
-/* Advance parents fmt string, as we have consumed 's' or 'S' */
-++*fmt_ptr;
-
 s = symbols_lookup((unsigned long)arg, &sym_size, &sym_offset, 
namebuf);
 
 /* If the symbol is not found, fall back to printing the address */
@@ -335,7 +345,6 @@ static char *pointer(char *str, char *end, const char 
**fmt_ptr,
 {
 const struct vcpu *v = arg;
 
-++*fmt_ptr;
 if ( str < end )
 *str = 'd';
 str = number(str + 1, end, v->domain->domain_id, 10, -1, -1, 0);
-- 
1.7.1
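
The idea behind the patch, checking the argument before any specifier-specific code dereferences it, can be shown outside Xen with a small stand-alone sketch. Everything here is an assumption made for illustration: DEMO_VIRT_START is a made-up lower bound standing in for HYPERVISOR_VIRT_START, and format_checked_string() is a hypothetical helper, not part of xen/common/vsprintf.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical lower bound of valid addresses; stands in for
 * HYPERVISOR_VIRT_START in this user-space demo. */
#define DEMO_VIRT_START ((uintptr_t)0x1000)

/* Format a string argument, but never dereference a pointer that lies
 * below the assumed valid range: print "(NULL)" instead. */
static int format_checked_string(char *buf, size_t len, const char *arg)
{
    if ((uintptr_t)arg < DEMO_VIRT_START)
        return snprintf(buf, len, "%s", "(NULL)");

    return snprintf(buf, len, "%s", arg);
}
```

As in the patch, the range check runs once, ahead of the per-specifier handling, so each case no longer needs its own validity test.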




Re: [Xen-devel] [PATCH] modify the IO_TLB_SEGSIZE to io_tlb_segsize configurable as flexible requirement about SW-IOMMU.

2015-02-11 Thread Konrad Rzeszutek Wilk
On Wed, Feb 11, 2015 at 08:38:29AM +, Wang, Xiaoming wrote:
> Dear David
> 
> > -Original Message-
> > From: David Vrabel [mailto:david.vra...@citrix.com]
> > Sent: Tuesday, February 10, 2015 5:46 PM
> > To: Wang, Xiaoming; Konrad Rzeszutek Wilk
> > Cc: linux-m...@linux-mips.org; pebo...@tiscali.nl; Zhang, Dongxing;
> > lau...@codeaurora.org; d.kasat...@samsung.com;
> > heiko.carst...@de.ibm.com; linux-ker...@vger.kernel.org; ralf@linux-
> > mips.org; ch...@chris-wilson.co.uk; takahiro.aka...@linaro.org;
> > david.vra...@citrix.com; li...@horizon.com; xen-
> > de...@lists.xenproject.org; boris.ostrov...@oracle.com; Liu, Chuansheng;
> > a...@linux-foundation.org
> > Subject: Re: [Xen-devel] [PATCH] modify the IO_TLB_SEGSIZE to
> > io_tlb_segsize configurable as flexible requirement about SW-IOMMU.
> > 
> > On 06/02/15 00:10, Wang, Xiaoming wrote:
> > >
> > >
> > >> -Original Message-
> > >> From: Konrad Rzeszutek Wilk [mailto:konrad.w...@oracle.com]
> > >> Sent: Friday, February 6, 2015 3:33 AM
> > >> To: Wang, Xiaoming
> > >> Cc: r...@linux-mips.org; boris.ostrov...@oracle.com;
> > >> david.vra...@citrix.com; linux-m...@linux-mips.org; linux-
> > >> ker...@vger.kernel.org; xen-de...@lists.xenproject.org; akpm@linux-
> > >> foundation.org; li...@horizon.com; lau...@codeaurora.org;
> > >> heiko.carst...@de.ibm.com; d.kasat...@samsung.com;
> > >> takahiro.aka...@linaro.org; ch...@chris-wilson.co.uk;
> > >> pebo...@tiscali.nl; Liu, Chuansheng; Zhang, Dongxing
> > >> Subject: Re: [PATCH] modify the IO_TLB_SEGSIZE to io_tlb_segsize
> > >> configurable as flexible requirement about SW-IOMMU.
> > >>
> > >> On Fri, Feb 06, 2015 at 07:01:14AM +0800, xiaomin1 wrote:
> > > >>> The maximum of SW-IOMMU is limited to 2^11*128 = 256K.
> > > >>> While in different platform and different requirements this seems
> > > >>> improper.
> > > >>> So modify the IO_TLB_SEGSIZE to io_tlb_segsize as configurable is
> > > >>> make sense.
> > >>
> > >> More details please. What is the issue you are hitting?
> > >>
> > > Example:
> > > > If 1M bytes are required, there is an error like.
> > 
> > Instead of allowing the bouncing of such large buffers, could the gadget
> > driver be modified to submit the buffers to the hardware in smaller chunks?
> > 
> > David
> 
> Our target is to make IO_TLB_SEGSIZE configurable.
> Neither 256K bytes nor 1M bytes seems a suitable value, I think.
> It would be better to use a tactic like
> kmem_cache_create in the kmalloc function.
> But SW-IOMMU seems more lightweight.
> So we chose a variable rather than a function.

Would it be possible to understand why the gadget needs such a
large buffer? That is irrespective of the patchset you are proposing.

In regards to the patchset - I don't see anything fundamentally
wrong with the patch. What I am afraid of is that this fixes the
symptoms instead of the underlying problem. The problem, I think,
is that with these large 1MB requests you risk using the
SWIOTLB bounce buffer, which can result in poor performance.

So eventually somebody will have to figure out why the performance
is poor and have a hard time figuring out what is wrong - as the
symptoms have been removed.

Hence looking at potentially using a scatter-gather mechanism
and chopping up the requests into smaller sizes might be a better
option. But I don't know. Perhaps you are more familiar with the
gadget and could tell me why it needs a 1MB size request?
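
To put numbers on the chunking idea, here is a rough sketch of the arithmetic only. It assumes the limit quoted earlier in the thread (2^11 * 128 = 256K per contiguous bounce mapping); the helper names are hypothetical and nothing here touches the real swiotlb or gadget code.

```c
#include <stddef.h>

#define IO_TLB_SHIFT   11    /* each slab is 2^11 = 2048 bytes */
#define IO_TLB_SEGSIZE 128   /* slabs per contiguous segment */

/* Largest single bounce-buffer mapping: 128 * 2048 = 262144 bytes. */
static size_t swiotlb_max_mapping(void)
{
    return (size_t)IO_TLB_SEGSIZE << IO_TLB_SHIFT;
}

/* Number of chunks needed so that no chunk exceeds max_chunk bytes. */
static size_t chunks_needed(size_t total, size_t max_chunk)
{
    return (total + max_chunk - 1) / max_chunk;
}

/* Length of chunk number idx (0-based) when total is split greedily;
 * returns 0 once idx is past the last chunk. */
static size_t chunk_len(size_t total, size_t max_chunk, size_t idx)
{
    size_t off = idx * max_chunk;
    size_t left;

    if (off >= total)
        return 0;
    left = total - off;
    return left < max_chunk ? left : max_chunk;
}
```

With these values a 1MB request splits into exactly four 256K pieces, each of which fits under the existing limit without changing IO_TLB_SEGSIZE.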


> 
> Xiaoming.



[Xen-devel] [PATCH] MdeModulePkg: mark completion of PCI enumeration in PciEnumeratorLight

2015-02-11 Thread Wei Liu
I had an issue when trying to boot Xen HVM guest with latest OVMF
master. Guest crashed with memory violation, and the bisection pointed
to 66b280df2 ("OvmfPkg: AcpiPlatformDxe: make dependency on PCI
enumeration explicit"). That commit made AcpiPlatformDxe depend on PCI
enumeration using gEfiPciEnumerationCompleteProtocolGuid, which is a
very reasonable change.

The real culprit is that Xen HVM is using PciEnumeratorLight which
doesn't install gEfiPciEnumerationCompleteProtocolGuid. This, in
combination with 66b280df2, makes AcpiPlatformDxe not able to be loaded,
resulting in guest crash.

The fix is to install gEfiPciEnumerationCompleteProtocolGuid in
PciEnumeratorLight.

Contributed-under: TianoCore Contribution Agreement 1.0
Signed-off-by: Wei Liu 
Cc: Feng Tian 
Cc: Anthony Perard 
Cc: Laszlo Ersek 
Cc: Jordan Justen 
---
 MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c 
b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
index 9e7ac74..7659585 100644
--- a/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
+++ b/MdeModulePkg/Bus/Pci/PciBusDxe/PciEnumeratorSupport.c
@@ -2256,6 +2256,7 @@ PciEnumeratorLight (
 {
 
   EFI_STATUSStatus;
+  EFI_HANDLEHostBridgeHandle;
   EFI_PCI_ROOT_BRIDGE_IO_PROTOCOL   *PciRootBridgeIo;
   PCI_IO_DEVICE *RootBridgeDev;
   UINT16MinBus;
@@ -2288,6 +2289,11 @@ PciEnumeratorLight (
 return Status;
   }
 
+  //
+  // Get the host bridge handle
+  //
+  HostBridgeHandle = PciRootBridgeIo->ParentHandle;
+
   Status = PciRootBridgeIo->Configuration (PciRootBridgeIo, (VOID **) 
&Descriptors);
 
   if (EFI_ERROR (Status)) {
@@ -2348,7 +2354,14 @@ PciEnumeratorLight (
 Descriptors++;
   }
 
-  return EFI_SUCCESS;
+  Status = gBS->InstallProtocolInterface (
+  &HostBridgeHandle,
+  &gEfiPciEnumerationCompleteProtocolGuid,
+  EFI_NATIVE_INTERFACE,
+  NULL
+  );
+
+  return Status;
 }
 
 /**
-- 
1.9.1




[Xen-devel] [PATCH] x86/xen: Make sure X2APIC_ENABLE bit of MSR_IA32_APICBASE is not set

2015-02-11 Thread Boris Ostrovsky
Commit d524165cb8db ("x86/apic: Check x2apic early") tests X2APIC_ENABLE
bit of MSR_IA32_APICBASE when CONFIG_X86_X2APIC is off and panics
the kernel when this bit is set.

Xen's PV guests will pass this MSR read to the hypervisor which will
return its version of the MSR, where this bit might be set. Make sure
we clear it before returning MSR value to the caller.

Signed-off-by: Boris Ostrovsky 
---
 arch/x86/xen/enlighten.c | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 78a881b..c0b0cce 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1070,6 +1070,23 @@ static inline void xen_write_cr8(unsigned long val)
BUG_ON(val);
 }
 #endif
+
+static u64 xen_read_msr_safe(unsigned int msr, int *err)
+{
+   u64 val;
+
+   val = native_read_msr_safe(msr, err);
+   switch (msr) {
+   case MSR_IA32_APICBASE:
+#ifdef CONFIG_X86_X2APIC
+   if (!(cpuid_ecx(1) & (1 << (X86_FEATURE_X2APIC & 31))))
+#endif
+   val &= ~X2APIC_ENABLE;
+   break;
+   }
+   return val;
+}
+
 static int xen_write_msr_safe(unsigned int msr, unsigned low, unsigned high)
 {
int ret;
@@ -1240,7 +1257,7 @@ static const struct pv_cpu_ops xen_cpu_ops __initconst = {
 
.wbinvd = native_wbinvd,
 
-   .read_msr = native_read_msr_safe,
+   .read_msr = xen_read_msr_safe,
.write_msr = xen_write_msr_safe,
 
.read_tsc = native_read_tsc,
-- 
2.1.0
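
The filtering logic of the patch can be exercised on its own with plain integers. This is a sketch under assumptions: filter_apicbase() and its parameters are hypothetical names, the x2apic_built_in flag stands in for CONFIG_X86_X2APIC, and the bit positions used are bit 10 of IA32_APIC_BASE for the x2APIC enable bit and bit 21 of CPUID.1:ECX for the x2APIC feature flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define X2APIC_ENABLE     (UINT64_C(1) << 10)  /* enable bit in IA32_APIC_BASE */
#define CPUID1_ECX_X2APIC (UINT32_C(1) << 21)  /* CPUID.1:ECX x2APIC feature */

/* Return the MSR value the guest should see. The enable bit is kept only
 * when x2APIC support is built in AND CPUID advertises the feature;
 * in every other case it is cleared, mirroring the patch's #ifdef. */
static uint64_t filter_apicbase(uint64_t raw, uint32_t cpuid1_ecx,
                                bool x2apic_built_in)
{
    if (x2apic_built_in && (cpuid1_ecx & CPUID1_ECX_X2APIC))
        return raw;

    return raw & ~X2APIC_ENABLE;
}
```

For example, a raw value of 0xFEE00C00 (base 0xFEE00000, global enable and x2APIC enable both set) comes back as 0xFEE00800 whenever the kernel lacks x2APIC support.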




[Xen-devel] [qemu-mainline test] 34427: regressions - FAIL

2015-02-11 Thread xen . org
flight 34427 qemu-mainline real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34427/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 7 windows-install fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-debianhvm-amd64 7 debian-hvm-install fail REGR. vs. 33480
 test-amd64-i386-xl-win7-amd64  7 windows-install  fail REGR. vs. 33480
 test-amd64-i386-xl-winxpsp3   7 windows-install   fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-winxpsp3  7 windows-install  fail REGR. vs. 33480
 test-amd64-i386-qemuu-rhel6hvm-intel  7 redhat-install fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 debian-hvm-install fail REGR. vs. 33480
 test-amd64-i386-xl-winxpsp3-vcpus1  7 windows-install fail REGR. vs. 33480
 test-amd64-i386-xl-qemuu-win7-amd64  7 windows-install fail REGR. vs. 33480
 test-amd64-i386-freebsd10-amd64  8 guest-start fail REGR. vs. 33480
 test-amd64-i386-freebsd10-i386  8 guest-start fail REGR. vs. 33480
 test-amd64-i386-rhel6hvm-amd  7 redhat-install fail REGR. vs. 33480
 test-amd64-i386-qemuu-rhel6hvm-amd  7 redhat-install  fail REGR. vs. 33480
 test-amd64-i386-rhel6hvm-intel  7 redhat-install  fail REGR. vs. 33480
 build-amd64-pvops 5 kernel-build  fail REGR. vs. 33480

Regressions which are regarded as allowable (not blocking):
 test-armhf-armhf-libvirt 12 guest-start.2 fail blocked in 33480
 test-amd64-i386-libvirt   9 guest-start  fail   like 33480
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 33480

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-win7-amd64  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-winxpsp3  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-ovmf-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvh-intel  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  1 build-check(1) blocked n/a
 test-armhf-armhf-xl-sedf 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl  10 migrate-support-check fail   never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-qemuu-winxpsp3  1 build-check(1)   blocked n/a
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail  never pass
 test-amd64-amd64-libvirt  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemuu-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-pvh-amd   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-pcipt-intel  1 build-check(1)   blocked  n/a
 test-armhf-armhf-xl-credit2  10 migrate-support-check fail   never pass
 test-amd64-amd64-xl   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-multivcpu  1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-credit2   1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-sedf-pin  1 build-check(1)   blocked  n/a
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3  1 build-check(1)   blocked n/a
 test-amd64-amd64-pair 1 build-check(1)   blocked  n/a
 test-amd64-amd64-xl-qemut-win7-amd64  1 build-check(1) blocked n/a
 test-amd64-amd64-xl-qemut-debianhvm-amd64  1 build-check(1) blocked n/a
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass

version targeted for testing:
 qemuu 89db21771782fd6050335e73542064f1187c9ced
baseline version:
 qemuu 1e42c353469cb58ca4f3b450eea4211af7d0b147


People who touched revisions under test:
  Alberto Garcia 
  Alex Suykov 
  Alex Williamson 
  Alexander Graf 
  Alistair Francis 
  Amit Shah 
  Andreas Färber 
  Aurelien Jarno 
  Avi Kivity 
  Bastian Koppelmann 
  Ben Taylor 
  Benjamin Herrenschmidt 
  Bharata B Rao 
  Blue Swirl 
  Chen Fan 
  Christian Borntraeger 
  Christophe Lyon 
  Cornelia Huck 
  Denis V. Lunev 
  Dinar Valeev 
  Don Koch 
  Don Slutz 
  Dr. David Alan Gilbert 
  Ed Swierk 
  Eduardo Habkost 
  Eduardo Otubo 
  Fabrice Bellard 
  Fam Zheng 
  Felix Janda 
  Francesco Romani 
  Frank Blaschka 
  Gerd Hoffmann 
  Gonglei 
  Greg Bellows 
  Guan Xuetao 
  Igor Mammedov 
  Ildar Isaev 
  Jan Kiszka 
  Jason Wang 
  Jeff Cody 
  Jiri Slaby 
  John Arbu

Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Raghavendra K T

On 02/11/2015 11:08 PM, Oleg Nesterov wrote:

On 02/11, Raghavendra K T wrote:


On 02/10/2015 06:56 PM, Oleg Nesterov wrote:


In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg
the whole .head_tail. Plus obviously more boring changes. This needs a
separate patch even _if_ this can work.


Correct, but apart from this, before doing xadd in unlock,
we would have to make sure lsb bit is cleared so that we can live with 1
bit overflow to tail which is unused. now either or both of head,tail
lsb bit may be set after unlock.


Sorry, can't understand... could you spell?

If TICKET_SLOWPATH_FLAG lives in .head arch_spin_unlock() could simply do

head = xadd(&lock->tickets.head, TICKET_LOCK_INC);

if (head & TICKET_SLOWPATH_FLAG)
__ticket_unlock_kick(head);

so it can't overflow to .tail?



You are right.
I totally forgot we can get rid of tail operations :)



And if we do this, probably it makes sense to add something like

bool tickets_equal(__ticket_t one, __ticket_t two)
{
    return !((one ^ two) & ~TICKET_SLOWPATH_FLAG);
}



Very nice idea. I was tired of ~TICKET_SLOWPATH_FLAG usage all over in
the current (complex :)) implementation. These two suggestions help
a lot.


and change kvm_lock_spinning() to use tickets_equal(tickets.head, want), plus
it can have more users in asm/spinlock.h.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH] xen, apic: Setup our own APIC driver and validator for APIC IDs.

2015-02-11 Thread Konrad Rzeszutek Wilk
On Wed, Feb 11, 2015 at 09:53:26AM +, David Vrabel wrote:
> On 10/02/15 20:33, Konrad Rzeszutek Wilk wrote:
> > On Thu, Jan 22, 2015 at 10:00:55AM +, David Vrabel wrote:
> >> On 21/01/15 21:56, Konrad Rzeszutek Wilk wrote:
> >>> +static struct apic xen_apic = {
> >>> + .name = "Xen",
> >>> + .probe = probe_xen,
> >>> + /* The rest is copied from the default. */
> >>
> >> Explicitly initialize all required members here.  memcpy'ing from the
> >> default makes it far too unclear which ops this apic driver actually
> >> provides.
> > 
> > RFC (boots under PV, PVHVM, PV dom0):

And it boots under the 288 CPU machine (the original problem)
.. though it exposes two other issues:


(XEN) SMP: Allowing 288 CPUs (0 hotplug CPUs)
(XEN) Brought up 288 CPUs
..
(XEN) Dom0 has maximum 255 VCPUs
(XEN) xentrace: p157 mfn 225524 offset 35896
(XEN) xentrace: p255 mfn 21fe3e offset 58142
[0.00] smpboot: Allowing 288 CPUs, 0 hotplug CPUs
[0.00] xen_filter_cpu_maps: CPU255 is not up!
..
[0.00] xen_filter_cpu_maps: CPU287 is not up!
[0.00] xen_filter_cpu_maps: nr_cpu_ids: 288, subtract: 33
[0.00]  RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=255.

... with the result that we can't bring up the 256->287 CPUs.

It looks as if we are limiting Dom0 to 255. That seems to be due to:

> > 
> > From 27702ef618af068736d13aeadcbcacd2a6780e82 Mon Sep 17 00:00:00 2001
> > From: Konrad Rzeszutek Wilk 
> > Date: Fri, 9 Jan 2015 17:55:52 -0500
> > Subject: [PATCH] xen,apic: Setup our own APIC driver and validator for APIC
> >  IDs.
> > 
> > Via CPUID masking and the different apic-> overrides we
> > effectively make PV guests only but with the default APIC
> > driver. That is OK as an PV guest should never access any
> > APIC registers. However, the APIC is also used to limit the
> > amount of CPUs if the APIC IDs are incorrect - and since we
> > mask the x2APIC from the CPUID - any APIC IDs above 0xFF
> > are deemed incorrect by the default APIC routines.
> > 
> > As such add a new routine to check for APIC ID which will
> > be only used if the CPUID (native one) tells us the system
> > is using x2APIC.
> > 
> > This allows us to boot with more than 255 CPUs if running
> > as initial domain.
> 
> This looks quite reasonable to me.  What order are apic driver tried in?

No order. Or rather the order is based on how the compiler stashes
them in.

>  Do we need a mechanism to ensure that this one is tried before any for
> real hardware?

There are two probe mechanisms - the .probe and then later it is:
x86_platform.apic_post_init which we can also utilize to make sure
the APIC is set to Xen.

Let me add that in.

> 
> > +static struct apic xen_apic = {
> > +   .name   = "Xen PV",
> > +   .probe  = probe_xen,
> > +   .acpi_madt_oem_check= xen_madt_oem_check,
> > +   .apic_id_valid  = xen_id_always_valid,
> > +   .apic_id_registered = xen_id_always_registered,
> > +
> > +   .irq_delivery_mode  = 0xbeef, /* used in native_compose_msi_msg only */
> > +   .irq_dest_mode  = 0xbeef, /* used in native_compose_msi_msg only */
> 
> Omit members that are unused, leaving them as 0 or NULL.
> 
> > +   .target_cpus = default_target_cpus,
> > +   .disable_esr = 0,
> > +   .dest_logical   = 0, /* default_send_IPI_ use it but we use our own. */
> > +   .check_apicid_used  = default_check_apicid_used, /* Used on 32-bit */
> > +
> > +   .vector_allocation_domain   = flat_vector_allocation_domain,
> > +   .init_apic_ldr  = xen_noop, /* setup_local_APIC calls it */
> > +
> > +   .ioapic_phys_id_map = default_ioapic_phys_id_map, /* Used on 32-bit */
> > +   .setup_apic_routing = NULL,
> > +   .cpu_present_to_apicid  = default_cpu_present_to_apicid,
> > +   .apicid_to_cpu_present  = physid_set_mask_of_physid, /* Used on 32-bit */
> > +   .check_phys_apicid_present  = default_check_phys_apicid_present, /* smp_sanity_check needs it */
> > +   .phys_pkg_id = xen_phys_pkg_id, /* detect_ht */
> > +
> > +   .get_apic_id = xen_get_apic_id,
> > +   .set_apic_id = xen_set_apic_id, /* Can be NULL on 32-bit. */
> > +   .apic_id_mask   = 0xFF << 24, /* Used by verify_local_APIC. Match with what xen_get_apic_id does. */
> > +
> > +   .cpu_mask_to_apicid_and = flat_cpu_mask_to_apicid_and,
> > +
> > +   .send_IPI_mask  = xen_send_IPI_mask,
> > +   .send_IPI_mask_allbutself   = xen_send_IPI_mask_allbutself,
> > +   .send_IPI_allbutself= xen_send_IPI_allbutself,
> > +   .send_IPI_all   = xen_send_IPI_all,
> > +   .send_IPI_self  = xen_send_IPI_self,
> > +
> > +   .wait_for_init_deassert =

Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Oleg Nesterov
On 02/11, Raghavendra K T wrote:
>
> On 02/10/2015 06:56 PM, Oleg Nesterov wrote:
>
>> In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg
>> the whole .head_tail. Plus obviously more boring changes. This needs a
>> separate patch even _if_ this can work.
>
> Correct, but apart from this, before doing xadd in unlock,
> we would have to make sure lsb bit is cleared so that we can live with 1
> bit overflow to tail which is unused. now either or both of head,tail
> lsb bit may be set after unlock.

Sorry, can't understand... could you spell?

If TICKET_SLOWPATH_FLAG lives in .head arch_spin_unlock() could simply do

    head = xadd(&lock->tickets.head, TICKET_LOCK_INC);

    if (head & TICKET_SLOWPATH_FLAG)
        __ticket_unlock_kick(head);

so it can't overflow to .tail?

But probably I missed your concern.
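
The point about the carry can be checked with a small user-space model. This is only a sketch under stated assumptions: the 8-bit ticket width, the flag living in bit 0 of .head, the struct layout and the names ticket_lock, ticket_unlock() and unlock_kick() are illustrative and are not the kernel's definitions. C11 atomic_fetch_add() stands in for xadd, since it also returns the previous value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout, loosely modelled on the x86 ticket lock: two 8-bit halves. */
typedef uint8_t ticket_t;

#define TICKET_SLOWPATH_FLAG ((ticket_t)1)  /* assumed to live in bit 0 of .head */
#define TICKET_LOCK_INC      ((ticket_t)2)  /* tickets advance in steps of 2 */

/* The increment must never disturb the flag bit. */
static_assert((TICKET_LOCK_INC & TICKET_SLOWPATH_FLAG) == 0,
              "increment overlaps the slowpath flag");

struct ticket_lock {
    _Atomic ticket_t head;  /* owner's ticket, plus the slowpath flag */
    _Atomic ticket_t tail;  /* next ticket to hand out */
};

/* Counts wakeups; a hypothetical stand-in for __ticket_unlock_kick(). */
static int kicks;

static void unlock_kick(ticket_t old_head)
{
    (void)old_head;
    kicks++;
}

/*
 * Unlock by atomically adding to .head alone. The addition wraps inside the
 * 8-bit field, so nothing can carry into .tail. The old value is returned,
 * as xadd would do, so the caller can inspect the flag.
 */
static ticket_t ticket_unlock(struct ticket_lock *lock)
{
    ticket_t head = atomic_fetch_add(&lock->head, TICKET_LOCK_INC);

    if (head & TICKET_SLOWPATH_FLAG)
        unlock_kick(head);
    return head;
}
```

Note that in this model the flag bit stays set in .head after the add; clearing it is a separate step, which is the part of the discussion about __ticket_check_and_clear_slowpath().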



And if we do this, probably it makes sense to add something like

    bool tickets_equal(__ticket_t one, __ticket_t two)
    {
        return !((one ^ two) & ~TICKET_SLOWPATH_FLAG);
    }

and change kvm_lock_spinning() to use tickets_equal(tickets.head, want), plus
it can have more users in asm/spinlock.h.
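
A stand-alone sketch of such a helper, for experimenting outside the kernel. The 8-bit ticket_t and the flag value are assumptions made for the example. The XOR-and-mask expression is non-zero when the two tickets differ, so an equality predicate has to negate it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t ticket_t;                   /* assumed ticket width */

#define TICKET_SLOWPATH_FLAG ((ticket_t)1)  /* assumed flag bit */
#define TICKET_LOCK_INC      ((ticket_t)2)

static_assert((TICKET_LOCK_INC & TICKET_SLOWPATH_FLAG) == 0,
              "increment overlaps the slowpath flag");

/*
 * Compare two tickets while ignoring the slowpath flag. XOR leaves a 1 in
 * every bit position where the values differ; masking out the flag bit and
 * negating turns that into "equal apart from the flag".
 */
static bool tickets_equal(ticket_t one, ticket_t two)
{
    return !((one ^ two) & (ticket_t)~TICKET_SLOWPATH_FLAG);
}
```

With these values, tickets 4 and 5 compare equal because they differ only in the flag bit, while 4 and 6 do not.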

Oleg.




[Xen-devel] [ovmf test] 34431: regressions - FAIL

2015-02-11 Thread xen . org
flight 34431 ovmf real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34431/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-qemuu-ovmf-amd64 7 debian-hvm-install fail REGR. vs. 33686
 test-amd64-i386-xl-qemuu-ovmf-amd64  7 debian-hvm-install fail REGR. vs. 33686
 test-amd64-i386-pair   17 guest-migrate/src_host/dst_host fail REGR. vs. 33686

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt  12 guest-start.2 fail blocked in 33686

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel             9 guest-start           fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3        7 windows-install       fail never pass

version targeted for testing:
 ovmf 6cffee0cb04e0605126d9436e2acf073aa0679bf
baseline version:
 ovmf 447d264115c476142f884af0be287622cd244423


People who touched revisions under test:
  "Gao, Liming" 
  "Long, Qin" 
  "Yao, Jiewen" 
  Aaron Pop 
  Abner Chang 
  Alex Williamson 
  Anderw Fish 
  Andrew Fish 
  Anthony PERARD 
  Ard Biesheuvel 
  Ari Zigler 
  Brendan Jackman 
  Bruce Cran 
  Cecil Sheng 
  Chao Zhang 
  Chao, Zhang 
  Chen Fan 
  Chris Phillips 
  Chris Ruffin 
  Cinnamon Shia 
  Daryl McDaniel  
  Daryl McDaniel 
  daryl.mcdaniel 
  daryl.mcdan...@intel.com
  darylm503 
  David Wei 
  David Woodhouse 
  Deric Cole 
  Dong Eric 
  Dong Guo 
  Dong, Guo 
  Elvin Li 
  Eric Dong 
  Eugene Cohen 
  Feng Tian 
  Feng, Bob C 
  Fu Siyuan 
  Fu, Siyuan 
  Gabriel Somlo 
  Gao, Liming 
  Gao, Liming liming.gao 
  Gao, Liming liming@intel.com
  Garrett Kirkendall 
  Gary Lin 
  Grzegorz Milos 
  Hao Wu 
  Harry Liebel 
  Hess Chen 
  Hot Tian 
  isakov-sl 
  isakov...@bk.ru
  Jaben Carsey 
  jcarsey 
  jcarsey 
  Jeff Bobzin (jeff.bobzin 
  Jeff Bobzin (jeff.bob...@insyde.com)
  Jeff Fan 
  Jiewen Yao 
  Joe Peterson 
  Jordan Justen 
  jyao1 
  jyao1 
  Kinney, Michael D 
  Larry Cleeton 
  Laszlo Ersek 
  Leandro G. Biss Becker 
  Lee Leahy 
  Leif Lindholm 
  leroy.p.leahy 
  leroy.p.le...@intel.com
  lhauch 
  Li, Elvin 
  Liming Gao 
  Long Qin 
  Long, Qin  
  Long, Qin 
  lpleahy  leroy.p.leahy 
  lpleahy  leroy.p.le...@intel.com
  Mang Guo 
  Mark Salter 
  Matt Fleming 
  Mauro Faccenda 
  Michael Casadevall 
  Michael Kinney  
  Michael Kinney 
  Mike Maslenkin 
  Ni Ruiyu 
  Nikolai Saoukh 
  Olivier Martin 
  Olivier Martin olivier.martin 
  oliviermartin 
  Paolo Bonzini 
  Parmeshwr Prasad 
  Paulo Alcantara 
  Peter Jones 
  Qin Long 
  Qiu Shumin 
  Qiu, Shumin 
  qlong 
  Randy Pawell 
  Reece R. Pollack 
  Reza Jelveh 
  Ronald Cron 
  Roy Franz 
  Ruiyu Ni 
  Ruiyu Ni 
  Ryan Harkin 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud 
  Samer El-Haj-Mahmoud elhaj 
  Samer El-Haj-Mahmoud el...@hp.com
  Samuel Thibault 
  Scott Duplichan 
  Seiji Aguchi 
  Sergey Isakov 
  Shifei Lu 
  Shumin Qiu 
  Star Zeng 
  Stefan Kaeser 
  Steven Kinney 
  Steven Smith 
  Tapan Shah 
  Tian, Feng 
  Tian, Hot 
  Tim He 
  Tycho Nightingale 
  Victor Gouveia 
  Wang, Yu 
  Wu Jiaxin 
  Wu Jiaxin 
  Yao Jiewen 
  Yao, Jiewen 
  Ye Ting  
  Ye Ting 
  Yi Li 
  Yingke D Liu 
  Yingke Liu 
  Zeng, Star 


jobs:
 build-amd64  pass
 build-i386   pass
 build-amd64-li

Re: [Xen-devel] [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Oleg Nesterov
On 02/10, Jeremy Fitzhardinge wrote:
>
> On 02/10/2015 05:26 AM, Oleg Nesterov wrote:
> > On 02/10, Raghavendra K T wrote:
> >> Unfortunately xadd could result in head overflow as tail is high.
> >>
> >> The other option was repeated cmpxchg which is bad I believe.
> >> Any suggestions?
> > Stupid question... what if we simply move SLOWPATH from .tail to .head?
> > In this case arch_spin_unlock() could do xadd(tickets.head) and check
> > the result
>
> Well, right now, "tail" is manipulated by locked instructions by CPUs
> who are contending for the ticketlock, but head can be manipulated
> unlocked by the CPU which currently owns the ticketlock. If SLOWPATH
> moved into head, then non-owner CPUs would be touching head, requiring
> everyone to use locked instructions on it.
>
> That's the theory, but I don't see much (any?) code which depends on that.
>
> Ideally we could find a way so that pv ticketlocks could use a plain
> unlocked add for the unlock like the non-pv case, but I just don't see a
> way to do it.

I agree, and I have to admit I am not sure I fully understand why unlock
uses the locked add. Except we need a barrier to avoid the race with the
enter_slowpath() users, of course. Perhaps this is the only reason?

Anyway, I suggested this to avoid the overflow if we use xadd(), and I
guess we need the locked insn anyway if we want to eliminate the unsafe
read-after-unlock...

> > BTW. If we move "clear slowpath" into "lock" path, then probably trylock
> > should be changed too? Something like below, we just need to clear SLOWPATH
> > before cmpxchg.
>
> How important / widely used is trylock these days?

I am not saying this is that important. Just this looks more consistent imo
and we can do this for free.

Oleg.




Re: [Xen-devel] [PATCH 3/3] libxl: libxl__device_from_disk should retrieve backend from xenstore

2015-02-11 Thread Jim Fehlig
Wei Liu wrote:
> On Tue, Feb 10, 2015 at 11:01:46AM +, Ian Jackson wrote:
>   
>> Wei Liu writes ("[PATCH 3/3] libxl: libxl__device_from_disk should retrieve 
>> backend from xenstore"):
>> 
>>> ... if backend is not set by caller.
>>>   
>> Acked-by: Ian Jackson 
>>
>> as far as it goes, but I think you may want a more radical change -
>> see below.
>>
>> 
>>> Also change the function to use "goto" idiom while I was there.
>>>   
>> (Although usually it would be better to split this kind of thing into
>> a pre-patch, in this case it's small and easily reviewed.)
>>
>> Is the backend type the only missing or potentially-wrong
>> information ?  ISTM that perhaps the caller might not know the target,
>> either.
>>
>> What should happen if the caller specifies a different target in disk
>> to the one the device is actually using ?  The documentation should
>> specify which of the fields are important.
>>
>> 
>
> I'm not sure because it's not documented.
>
> We should take a step back to define the important fields first.
>
>   
>> Maybe libxl_device_disk_remove needs to call libxl_vdev_to_device_disk
>> and check that the supplied disk struct is plausible somehow.  In that
>> case it might be nice for the caller to be able to fill in only the
>> vdev.
>>
>> 
>
> If so we need to make clear in the documentation. I'm of course fine
> with this behaviour.
>
> Jim, does libvirt (as an example of libxl user) actually care about
> specifying every field in that struct? The other user (xl) doesn't seem
> to care that much.
>   

At minimum, libvirt will populate the pdev_path, vdev, backend, and
format fields. If backend and format (which, in libvirt-speak
correspond to the 'name' and 'type' attributes on the optional 
element) are not specified, they are set to LIBXL_DISK_BACKEND_UNKNOWN
and LIBXL_DISK_FORMAT_RAW respectively.

Regards,
Jim




Re: [Xen-devel] [PATCH OSSTEST 12/12] Changes to test step of xen install

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 12/12] Changes to test step of xen install"):
>  This patch accommodates ts-xen-install to nested L1 xen

Ah yes, here is the meat.  I have run out of time today but will reply
tomorrow with some design observations.

Thanks,
Ian.



Re: [Xen-devel] [PATCH OSSTEST 10/12] Compose the main body of test-nested test job.

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 10/12] Compose the main body of test-nested 
test job."):
>  Compose the main body of test-nested test job. 

Ah, this is what I was missing earlier.  You really need to order this
so that things come after things which depend on them.

Typically:
 * cleanups
 * define new TestSupport facilities
 * define new ts-* scripts if any
 * define new recipes
 * updates to make-flight to define new jobs.

> +proc need-hosts/test-nested {} {return host}
> +proc run-job/test-nested {} {
> +run-ts . = ts-debian-hvm-install + host + nested + nested_L1

ts-debian-hvm-install takes only two arguments.  You are passing 3.
I guess this is in further patches...

Ian.



[Xen-devel] [rumpuserxen test] 34448: regressions - FAIL

2015-02-11 Thread xen . org
flight 34448 rumpuserxen real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34448/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 build-i386-rumpuserxen    6 xen-build fail REGR. vs. 33866
 build-amd64-rumpuserxen   6 xen-build fail REGR. vs. 33866

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-rumpuserxen-amd64  1 build-check(1)   blocked n/a
 test-amd64-i386-rumpuserxen-i386  1 build-check(1)   blocked  n/a

version targeted for testing:
 rumpuserxen  8af836e751ed191f3e2918668649710dd307e0b5
baseline version:
 rumpuserxen  30d72f3fc5e35cd53afd82c8179cc0e0b11146ad


People who touched revisions under test:
  Ian Jackson 
  Martin Lucina 


jobs:
 build-amd64                          pass
 build-i386                           pass
 build-amd64-pvops                    pass
 build-i386-pvops                     pass
 build-amd64-rumpuserxen              fail
 build-i386-rumpuserxen               fail
 test-amd64-amd64-rumpuserxen-amd64   blocked
 test-amd64-i386-rumpuserxen-i386     blocked



sg-report-flight on osstest.cam.xci-test.com
logs: /home/xc_osstest/logs
images: /home/xc_osstest/images

Logs, config files, etc. are available at
http://www.chiark.greenend.org.uk/~xensrcts/logs

Test harness code can be found at
http://xenbits.xensource.com/gitweb?p=osstest.git;a=summary


Not pushing.


commit 8af836e751ed191f3e2918668649710dd307e0b5
Author: Martin Lucina 
Date:   Tue Feb 10 10:49:23 2015 +0100

Pull in latest buildrump.sh and src-netbsd

For fixes to libc compat symbols missing (issue #21)

commit e28e2b9daf7ab2922913889d90ec438b9bee3d56
Author: Ian Jackson 
Date:   Wed Feb 4 16:29:26 2015 +

app-tools: Support old -D__RUMPUSER_XEN__ for now

Released versions of Xen (Xen 4.5) rely on __RUMPUSER_XEN__ being
defined.

A patch to change this in Xen upstream exists and will be backported,
but until that makes it through to a stable point release of Xen 4.5,
we should support both #defines.

This commit partially reverts 91d56232d987
   Renaming platform macros, app-tools and autoconf target string

Signed-off-by: Ian Jackson 
CC: Martin Lucina 
CC: Ian Campbell 
CC: Wei Liu 

commit 05e06b0fe52918d6575e33b7d7551d85c93f7aff
Author: Martin Lucina 
Date:   Mon Feb 2 18:01:52 2015 +0100

Sync Travis CI configuration with app-tools rename

Signed-off-by: Martin Lucina 

commit 3b36d1f55a08e1849ccd5424afb0fbe29647bd6c
Author: Martin Lucina 
Date:   Mon Feb 2 18:00:36 2015 +0100

Remove even older rumpxen-app-* variants of app-tools

Signed-off-by: Martin Lucina 

commit 91d56232d987f5df594723ed46b9000b4d43e21a
Author: Martin Lucina 
Date:   Mon Feb 2 17:52:41 2015 +0100

Renaming platform macros, app-tools and autoconf target string

As discussed at: http://thread.gmane.org/gmane.comp.rumpkernel.user/739

This commit renames the platform macros, app-tools and autoconf target
string to be consistent with current naming of the entire stack:

app-tools/rumpapp-xen-* -> app-tools/rumprun-xen-*
$ARCH-rumpxen-netbsd -> $ARCH-rumprun-netbsd
-D__RUMPUSER_XEN__ -D__RUMPAPP__ -> -D__RUMPRUN__

Signed-off-by: Martin Lucina 



Re: [Xen-devel] [PATCH OSSTEST 08/12] Add test job for nest test case

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 08/12] Add test job for nest test case"):
> This patch adds creation of the nested test job; when
>  job creation procedure is invoked.
...
> +  job_create_test test-$xenarch$kern-$dom0arch-nested test-nested xl \
> + $xenarch $dom0arch \

Have I missed the patch where the recipe is defined ?

> +nested_image=$NESTED_OS_IMAGE \
> +nested2_image=$NESTED_OS_IMAGE \

I don't seem to have seen definitions for these either.

Ian.



Re: [Xen-devel] [PATCH OSSTEST 09/12] Add build hvm job for nested test use

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 09/12] Add build hvm job for nested test 
use"):
>  Add build-debain-hvm build job. The $TREE_LINUX and
>  $REVISION_LINU can be designated in standalone.config.

What is this for ?  It seems very similar to the build-$arch-pvops
job.

Ian.



Re: [Xen-devel] [PATCH v8 4/7] xen: Add vmware_port support

2015-02-11 Thread Andrew Cooper
On 11/02/15 07:56, Jan Beulich wrote:
 On 10.02.15 at 20:30,  wrote:
>> While coding this up I have hit issues that I need input on:
>>
>> As a HVM_PARAM_ item, I would assume I should be following
>> what HVM_PARAM_VIRIDIAN does.  It has this comment:
>>
>> case HVM_PARAM_VIRIDIAN:
>> /* This should only ever be set once by the tools and
>> read by the guest. */
>>
>> Which is almost true.  However the code allows you to change from 0 to
>> non-zero any time in the life of the DomU.  I am assuming that this is
>> why xc_domain_save() and xc_domain_restore() save and restore this
>> HVM_PARAM_ item.
>>
>> With the enable of vmware_port the same way, I feel it would be a bug
>> to allow the enable after "create" to not also adjust QEMU.  Currently
>> there is no way for the hypervisor to tell QEMU to enable vmware_port
>> handling.  So to avoid adding this code to xen and QEMU, it looks to
>> me that adding code to make this a true write only 1 time would be
>> needed so that you cannot use the hypercall to change later.
>>
>> So, should I extend this change to cover other HVM_PARAM_?
>>
>> Is all this additional code (xc_domain_save(), xc_domain_restore(),
>> write only 1 time) still better than a domain_create() flag?
> I suppose for your case it's indeed the right approach. Which other
> params this may be true for as well I can't immediately say, but I'd
> certainly like to ask for adjustments to others to be in separate
> patches (and perhaps even a separate series), with proper
> rationale for each of them. I guess Andrew will have further input
> for you on this matter...

My recommendation is still to use a creation flag.  The described
problem is exactly the reason why I dislike the use of hvmparams for
booleans like this which really do need to be consistent for the
lifetime of the guest.

I had hoped to see whether I could fix some of this up as part of the
fixes to guest cpuid handling, but that work is still a while off and
not of practical consideration for the short term.

~Andrew




Re: [Xen-devel] [PATCH OSSTEST 07/12] For hvm guest configuration, config console to 'hvc0'

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 07/12] For hvm guest configuration, config 
console to 'hvc0'"):
> ---
>  Osstest/TestSupport.pm | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/Osstest/TestSupport.pm b/Osstest/TestSupport.pm
> index c23bbc7..864805e 100644
> --- a/Osstest/TestSupport.pm
> +++ b/Osstest/TestSupport.pm
> @@ -1753,7 +1753,11 @@ sub target_kernkind_check ($) {
>  if ($kernkind eq 'pvops') {
>  store_runvar($pfx."rootdev", 'xvda') if $isguest;
>  store_runvar($pfx."console", 'hvc0');
> -} elsif ($kernkind !~ m/2618/) {
> +}
> +elsif ($kernkind eq 'hvm'){
> +store_runvar($pfx."console", 'hvc0');   #nested hvm guest shall 
> not append console=xvc0; I guess this applies to all hvm guests.
> +}
> +elsif ($kernkind !~ m/2618/) {

I don't understand why this is necessary.  Surely all the kernels here
are pvops so the kernkind should be 'pvops' in all cases and the
console will be set to hvc0 anyway ?

Ian.



Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Jan Beulich
>>> On 11.02.15 at 17:33,  wrote:
> On 11/02/15 13:13, Jan Beulich wrote:
> On 11.02.15 at 12:52,  wrote:
>>> On 11/02/15 08:28, Kai Huang wrote:
>>>> We handle above two cases by flushing PML buffer at the beginning of
>>>> all VMEXITs. This solves the first case above, and it also solves the
>>>> second case, as prior to paging_log_dirty_op, domain_pause is called,
>>>> which kicks vcpus (that are in guest mode) out of guest mode via
>>>> sending IPI, which cause VMEXIT, to them.
>>>>
>>>> This also makes log-dirty radix tree more updated as PML buffer is
>>>> flushed on basis of all VMEXITs but not only PML buffer full VMEXIT.
>>> My gut feeling is that this is substantial overhead on a common path,
>>> but this largely depends on how the dirty bits can be cleared efficiently.
>> I agree on the overhead part, but I don't see what relation this has
>> to the dirty bit clearing - a PML buffer flush doesn't involve any
>> alterations of D bits.
> 
> I admit that I was off-by-one level when considering the
> misconfiguration overhead.  It would be inefficient (but not unsafe as
> far as I can tell) to clear all D bits at once; the PML could end up
> with repeated gfns in it, or different vcpus could end up with the same
> gfn, depending on the exact access pattern, which will add to the flush
> overhead.

Why would that be? A misconfiguration exit means no access to
a given range was possible at all before, i.e. all subordinate pages
would have the D bit clear if they were reachable. What you
describe would - afaict - be a problem only if we didn't go over the
whole guest address space at once.

Jan




Re: [Xen-devel] PML (Page Modification Logging) design for Xen

2015-02-11 Thread Andrew Cooper
On 11/02/15 13:13, Jan Beulich wrote:
 On 11.02.15 at 12:52,  wrote:
>> On 11/02/15 08:28, Kai Huang wrote:
>>> With PML, we don't have to use write protection but just clear D-bit
>>> of EPT entry of guest memory to do dirty logging, with an additional
>>> PML buffer full VMEXIT for 512 dirty GPAs. Theoretically, this can
>>> reduce hypervisor overhead when guest is in dirty logging mode, and
>>> therefore more CPU cycles can be allocated to guest, so it's expected
>>> benchmarks in guest will have better performance comparing to non-PML.
>> One issue with basic EPT A/D tracking was the scan of the EPT tables. 
>> Here, hardware will give us a list of affected gfns, but how is Xen
>> supposed to efficiently clear the dirty bits again?  Using EPT
>> misconfiguration is no better than the existing fault path.
> Why not? The misconfiguration exit ought to clear the D bit for all
> 511 entries in the L1 table (and set it for the one entry that is
> currently serving the access). All further D bit handling will then
> be PML based.
>
>>> - PML buffer flush
>>>
>>> There are two places we need to flush PML buffer. The first place is
>>> PML buffer full VMEXIT handler (apparently), and the second place is
>>> in paging_log_dirty_op (either peek or clean), as vcpus are running
>>> asynchronously along with paging_log_dirty_op is called from userspace
>>> via hypercall, and it's possible there are dirty GPAs logged in vcpus'
>>> PML buffers but not full. Therefore we'd better to flush all vcpus'
>>> PML buffers before reporting dirty GPAs to userspace.
>> Why apparently?  It would be quite easy for a guest to dirty 512 frames
>> without otherwise taking a vmexit.
> I silently replaced apparently with obviously while reading...
>
>>> We handle above two cases by flushing PML buffer at the beginning of
>>> all VMEXITs. This solves the first case above, and it also solves the
>>> second case, as prior to paging_log_dirty_op, domain_pause is called,
>>> which kicks vcpus (that are in guest mode) out of guest mode via
>>> sending IPI, which cause VMEXIT, to them.
>>>
>>> This also makes log-dirty radix tree more updated as PML buffer is
>>> flushed on basis of all VMEXITs but not only PML buffer full VMEXIT.
>> My gut feeling is that this is substantial overhead on a common path,
>> but this largely depends on how the dirty bits can be cleared efficiently.
> I agree on the overhead part, but I don't see what relation this has
> to the dirty bit clearing - a PML buffer flush doesn't involve any
> alterations of D bits.

I admit that I was off-by-one level when considering the
misconfiguration overhead.  It would be inefficient (but not unsafe as
far as I can tell) to clear all D bits at once; the PML could end up
with repeated gfns in it, or different vcpus could end up with the same
gfn, depending on the exact access pattern, which will add to the flush
overhead.
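
To make the flush cost concrete, here is a toy user-space model of draining per-vcpu PML buffers into a shared dirty log. It is a sketch under several assumptions: the buffer index is assumed to start at the last slot and count down, a plain bitmap stands in for the log-dirty radix tree, and every name (struct pml, pml_log(), pml_flush(), MAX_GFNS) is invented for the example. Only the 512-entry buffer size comes from the thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define PML_ENTRIES   512   /* entries per PML buffer, as stated in the thread */
#define PAGE_SHIFT    12
#define MAX_GFNS      4096  /* arbitrary size of this toy guest */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Toy per-vcpu state; slots above 'index' hold logged GPAs. */
struct pml {
    uint64_t buf[PML_ENTRIES];
    int index;              /* next slot to fill; -1 once the buffer is full */
};

/* A bitmap stands in for the log-dirty radix tree. */
static unsigned long dirty_bitmap[MAX_GFNS / BITS_PER_WORD];

static void pml_init(struct pml *p)
{
    p->index = PML_ENTRIES - 1;
}

/* Models the hardware logging one write; returns 0 when the buffer is full. */
static int pml_log(struct pml *p, uint64_t gpa)
{
    if (p->index < 0)
        return 0;           /* this is where a "PML buffer full" exit would occur */
    p->buf[p->index--] = gpa;
    return 1;
}

/* Drain one buffer; returns how many gfns were newly marked dirty. */
static int pml_flush(struct pml *p)
{
    int newly_dirty = 0;

    for (int i = p->index + 1; i < PML_ENTRIES; i++) {
        uint64_t gfn = p->buf[i] >> PAGE_SHIFT;
        unsigned long mask = 1UL << (gfn % BITS_PER_WORD);

        assert(gfn < MAX_GFNS);
        /* Repeated gfns cost a loop iteration but change nothing. */
        if (!(dirty_bitmap[gfn / BITS_PER_WORD] & mask)) {
            dirty_bitmap[gfn / BITS_PER_WORD] |= mask;
            newly_dirty++;
        }
    }
    p->index = PML_ENTRIES - 1;
    return newly_dirty;
}
```

In this model a gfn that appears twice in one buffer, or once in each of two vcpus' buffers, is still walked on every flush even though it adds nothing to the log, which is the extra flush overhead being described.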

~Andrew




Re: [Xen-devel] [PATCH v4 20/21] libxlu: introduce new APIs

2015-02-11 Thread Ian Jackson
Wei Liu writes ("[PATCH v4 20/21] libxlu: introduce new APIs"):
> These APIs can be used to manipulate XLU_ConfigValue and XLU_ConfigList.
...
> +const char *xlu_cfg_value_get_string(const XLU_ConfigValue *value)
> +{
> +assert(value->type == XLU_STRING);
> +return value->u.string;
> +}

Most of the existing xlu_cfg_... functions return null (or -1) setting
EINVAL if the type of the supplied config item is not correct.

But these new functions are not really suitable for use directly
because they crash on incorrect configuration input.

Wouldn't it be better if these functions had calling conventions
similar to xlu_cfg_get_string et al (returning errno values, taking
dont_warn, etc.) ?
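
As a rough illustration of that calling convention, the sketch below reports a type mismatch to the caller instead of asserting. The types and names (struct config_value, value_get_string()) are simplified stand-ins invented for the example and do not match the real libxlu definitions; the dont_warn handling is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified value type; the real XLU_ConfigValue differs. */
enum value_type { VALUE_STRING, VALUE_LIST };

struct config_value {
    enum value_type type;
    union {
        const char *string;
        struct {
            int nvalues;
            struct config_value **values;
        } list;
    } u;
};

/*
 * Store the string in *value_r and return 0, or return EINVAL when the
 * value is missing or is not a string. Bad configuration input becomes an
 * error the caller can handle rather than a crash.
 */
static int value_get_string(const struct config_value *value,
                            const char **value_r)
{
    assert(value_r != NULL);  /* a caller bug, not bad configuration input */

    if (value == NULL || value->type != VALUE_STRING)
        return EINVAL;
    *value_r = value->u.string;
    return 0;
}
```

The assert that remains only guards against a programming error in the caller; anything that depends on what the user wrote in the config file goes through the error return.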

Thanks,
Ian.



Re: [Xen-devel] [PATCH v4 19/21] libxlu: nested list support

2015-02-11 Thread Ian Jackson
Wei Liu writes ("[PATCH v4 19/21] libxlu: nested list support"):
> 1. Extend grammar of parser.
> 2. Adjust internal functional to accept XLU_ConfigValue instead of

 ^functions

Otherwise,

Acked-by: Ian Jackson 



Re: [Xen-devel] [PATCH v4 18/21] libxlu: rework internal representation of setting

2015-02-11 Thread Ian Jackson
Wei Liu writes ("[PATCH v4 18/21] libxlu: rework internal representation of 
setting"):
> This patches does following things:
...
> +void xlu__cfg_list_append(CfgParseContext *ctx,
> +  XLU_ConfigValue *list,
> +  char *atom)
> +{
> +XLU_ConfigValue **new_val = NULL;
> +XLU_ConfigValue *val = NULL;
>  if (ctx->err) return;
...
> -if (set->nvalues >= set->avalues) {
> -int new_avalues;
> -char **new_values;
> -
> -if (set->avalues > INT_MAX / 100) { ctx->err= ERANGE; return; }
> -new_avalues= set->avalues * 4;
> -new_values= realloc(set->values,
> -sizeof(*new_values) * new_avalues);
> -if (!new_values) { ctx->err= errno; free(atom); return; }
> -set->values= new_values;
> -set->avalues= new_avalues;

This is a standard expanding-array pattern which arranges not to
realloc the array each time.

> -}
> -set->values[set->nvalues++]= atom;
> +new_val = realloc(list->u.list.values,
> +  sizeof(*new_val) * (list->u.list.nvalues+1));
> +if (!new_val) goto xe;

But you replace it here with one which has quadratic performance.

I don't know whether people are going to specify lists with hundreds
or thousands of elements, but it doesn't seem impossible.

I think you should retain the existing avalues stuff.
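
For reference, a self-contained sketch of the expanding-array pattern. The struct and function names (str_list, str_list_append()) are hypothetical and only mirror the nvalues/avalues fields from the removed hunk; the growth factor of 4 and the INT_MAX / 100 guard are taken from that hunk, while starting from an empty array is an addition for the example.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical stand-in for a config list; not the libxlu definition. */
struct str_list {
    int nvalues;    /* slots in use */
    int avalues;    /* slots allocated */
    char **values;
};

/*
 * Append atom to the list. Returns 0 on success or an errno value; on
 * failure the list is unchanged and the caller still owns atom.
 */
static int str_list_append(struct str_list *list, char *atom)
{
    assert(list != NULL && list->nvalues <= list->avalues);

    if (list->nvalues >= list->avalues) {
        int new_avalues;
        char **new_values;

        if (list->avalues > INT_MAX / 100)
            return ERANGE;
        /* Grow geometrically so realloc runs O(log n) times, not n times. */
        new_avalues = list->avalues ? list->avalues * 4 : 4;
        new_values = realloc(list->values,
                             sizeof(*new_values) * new_avalues);
        if (!new_values)
            return ENOMEM;
        list->values = new_values;
        list->avalues = new_avalues;
    }
    list->values[list->nvalues++] = atom;
    return 0;
}
```

Growing by one slot per append can copy the whole array each time, which is where the quadratic cost comes from; with geometric growth the total copying stays proportional to the final length.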

Apart from that this all looks good to me.

Thanks,
Ian.



[Xen-devel] [xen-unstable test] 34426: regressions - FAIL

2015-02-11 Thread xen . org
flight 34426 xen-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34426/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-amd64-xl-pvh-intel  5 xen-boot fail REGR. vs. 34341

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail like 34341

Tests which did not succeed, but are not blocking:
 test-armhf-armhf-xl-sedf                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-midway               10 migrate-support-check fail never pass
 test-armhf-armhf-libvirt                 10 migrate-support-check fail never pass
 test-armhf-armhf-xl-sedf-pin             10 migrate-support-check fail never pass
 test-amd64-i386-libvirt                  10 migrate-support-check fail never pass
 test-armhf-armhf-xl                      10 migrate-support-check fail never pass
 test-armhf-armhf-xl-multivcpu            10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pcipt-intel           9 guest-start           fail never pass
 test-amd64-amd64-libvirt                 10 migrate-support-check fail never pass
 test-amd64-amd64-xl-pvh-amd               9 guest-start           fail never pass
 test-armhf-armhf-xl-credit2              10 migrate-support-check fail never pass
 test-amd64-i386-xl-qemuu-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3        14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64     14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-qemuu-winxpsp3        14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-win7-amd64     14 guest-stop            fail never pass
 test-amd64-amd64-xl-win7-amd64           14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-win7-amd64      14 guest-stop            fail never pass
 test-amd64-i386-xl-win7-amd64            14 guest-stop            fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3-vcpus1       14 guest-stop            fail never pass
 test-amd64-i386-xl-winxpsp3              14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3       14 guest-stop            fail never pass
 test-amd64-amd64-xl-winxpsp3             14 guest-stop            fail never pass
 test-amd64-amd64-xl-qemuu-winxpsp3       14 guest-stop            fail never pass

version targeted for testing:
 xen  8bc64413be51c91b8f5e790b71e0d94727f9f004
baseline version:
 xen  001324547356af86875fad5003f679571a6b8f1c


People who touched revisions under test:
  Ian Jackson 
  Wei Liu 


jobs:
 build-amd64                                pass
 build-armhf                                pass
 build-i386                                 pass
 build-amd64-libvirt                        pass
 build-armhf-libvirt                        pass
 build-i386-libvirt                         pass
 build-amd64-oldkern                        pass
 build-i386-oldkern                         pass
 build-amd64-pvops                          pass
 build-armhf-pvops                          pass
 build-i386-pvops                           pass
 build-amd64-rumpuserxen                    pass
 build-i386-rumpuserxen                     pass
 test-amd64-amd64-xl                        pass
 test-armhf-armhf-xl                        pass
 test-amd64-i386-xl                         pass
 test-amd64-amd64-xl-pvh-amd                fail
 test-amd64-i386-rhel6hvm-amd               pass
 test-amd64-i386-qemut-rhel6hvm-amd         pass
 test-amd64-i386-qemuu-rhel6hvm-amd         pass
 test-amd64-amd64-xl-qemut-debianhvm-amd64  pass
 test-amd64-i386-xl-qemut-debianhvm-amd64   pass
 test-amd64-amd64-xl-qemuu-debianhvm-amd64  pass
 test-amd64-i386-xl-qemuu-debianhvm-amd64   pass
 test-amd64-i386-freebsd10-amd64            pass
 test-amd64-amd64-xl-qemuu-ovmf-amd64       pass
 test-amd64-i386-xl-qemuu-ovmf-amd64        pass
 test-amd64-amd64-rumpuserxen-amd64         pass
 test-amd64-amd64-xl-qemut-win7-amd64       fail
 test-amd64-i386-

Re: [Xen-devel] [PATCH] x86: simplify non-atomic bitops

2015-02-11 Thread Jan Beulich
>>> On 11.02.15 at 16:14,  wrote:
> On 11/02/15 13:39, Jan Beulich wrote:
>> @@ -55,12 +54,9 @@ static inline void set_bit(int nr, volat
>>   * If it's called on the same region of memory simultaneously, the effect
>>   * may be that only one operation succeeds.
>>   */
>> -static inline void __set_bit(int nr, volatile void *addr)
>> +static inline void __set_bit(int nr, void *addr)
>>  {
>> -asm volatile (
>> -"btsl %1,%0"
>> -: "=m" (ADDR)
>> -: "Ir" (nr), "m" (ADDR) : "memory");
>> +asm volatile ( "btsl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );
> 
> You presumably want to s/ADDR/addr/ to avoid re-gaining the volatile
> attribute on the pointer?

I don't think it makes a difference, but yes, that would get things
into more consistent shape. Looking at it again the memory
operand could also legitimately be just an input, as we have a
memory clobber.
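
For anyone following along, a standalone sketch of the "+m" form may help.
It is not the Xen header: sketch_set_bit() is a hypothetical name, the
operand is written directly through addr rather than via the ADDR macro,
and the plain C fallback exists only so the snippet builds on non-x86
targets. On x86 with GCC-style inline asm it should behave as below.

```c
#include <assert.h>

/* Non-atomic: set bit nr in the word array starting at addr (sketch). */
static inline void sketch_set_bit(int nr, void *addr)
{
    assert(nr >= 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /*
     * "+m" marks the operand as both read and written, replacing the
     * separate "=m" output and "m" input.  The "memory" clobber covers
     * the words beyond the one named by the operand, which btsl reaches
     * when nr >= 32.
     */
    __asm__ volatile ( "btsl %1,%0"
                       : "+m" (*(int *)addr)
                       : "Ir" (nr)
                       : "memory" );
#else
    /* Portable fallback with the same effect for non-negative nr. */
    ((unsigned int *)addr)[nr / 32] |= 1u << (nr % 32);
#endif
}
```

Because of the "memory" clobber the compiler has to assume the whole array
may have changed, which is why the memory operand could arguably be
demoted to a plain input, as mentioned above.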

Jan




Re: [Xen-devel] stubdom vtpm build failure in staging

2015-02-11 Thread Olaf Hering
On Wed, Jan 28, Xu, Quan wrote:

> Thanks, I will check and fix it tomorrow. It is 23:12 PM Pacific time now.

Any progress?
These typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported
compilers do not cope with current staging:

# for i in `grep -w typedef stubdom/vtpmmgr/tcg.h | sed -n '/;/{s@^.* @@;s@;@@p}'`
# do
# if test -n "`git grep -wn $i|grep -w typedef|grep -v stubdom/vtpmmgr/tcg.h`"
# then
# echo $i
# fi
# done

BYTE
BOOL
UINT16
UINT32
UINT64
TPM_HANDLE
TPM_ALGORITHM_ID

TPMI_RH_HIERARCHY_AUTH and TPM_ALG_ID are defined twice in the same file.

This change works for me:

---
 stubdom/vtpmmgr/odd_types.h  | 11 +++
 stubdom/vtpmmgr/tcg.h|  9 +
 stubdom/vtpmmgr/tpm2_types.h | 11 +--
 3 files changed, 13 insertions(+), 18 deletions(-)
 create mode 100644 stubdom/vtpmmgr/odd_types.h

diff --git a/stubdom/vtpmmgr/odd_types.h b/stubdom/vtpmmgr/odd_types.h
new file mode 100644
index 000..d72da9b
--- /dev/null
+++ b/stubdom/vtpmmgr/odd_types.h
@@ -0,0 +1,11 @@
+#ifndef VTPM_ODD_TYPES
+#define VTPM_ODD_TYPES 1
+typedef unsigned char BYTE;
+typedef unsigned char BOOL;
+typedef uint16_t UINT16;
+typedef uint32_t UINT32;
+typedef uint64_t UINT64;
+typedef UINT32 TPM_HANDLE;
+typedef UINT32 TPM_ALGORITHM_ID;
+#endif
+
diff --git a/stubdom/vtpmmgr/tcg.h b/stubdom/vtpmmgr/tcg.h
index 7321ec6..cac1bbc 100644
--- a/stubdom/vtpmmgr/tcg.h
+++ b/stubdom/vtpmmgr/tcg.h
@@ -401,16 +401,10 @@
 
 
 // *** TYPEDEFS *
-typedef unsigned char BYTE;
-typedef unsigned char BOOL;
-typedef uint16_t UINT16;
-typedef uint32_t UINT32;
-typedef uint64_t UINT64;
-
+#include "odd_types.h"
 typedef UINT32 TPM_RESULT;
 typedef UINT32 TPM_PCRINDEX;
 typedef UINT32 TPM_DIRINDEX;
-typedef UINT32 TPM_HANDLE;
 typedef TPM_HANDLE TPM_AUTHHANDLE;
 typedef TPM_HANDLE TCPA_HASHHANDLE;
 typedef TPM_HANDLE TCPA_HMACHANDLE;
@@ -422,7 +416,6 @@ typedef UINT32 TPM_COMMAND_CODE;
 typedef UINT16 TPM_PROTOCOL_ID;
 typedef BYTE TPM_AUTH_DATA_USAGE;
 typedef UINT16 TPM_ENTITY_TYPE;
-typedef UINT32 TPM_ALGORITHM_ID;
 typedef UINT16 TPM_KEY_USAGE;
 typedef UINT16 TPM_STARTUP_TYPE;
 typedef UINT32 TPM_CAPABILITY_AREA;
diff --git a/stubdom/vtpmmgr/tpm2_types.h b/stubdom/vtpmmgr/tpm2_types.h
index ac2830d..63564cd 100644
--- a/stubdom/vtpmmgr/tpm2_types.h
+++ b/stubdom/vtpmmgr/tpm2_types.h
@@ -83,12 +83,8 @@
 #define MAX_ECC_KEY_BYTES ((MAX_ECC_KEY_BITS + 7) / 8)
 
 
-typedef unsigned char BYTE;
-typedef unsigned char BOOL;
+#include "odd_types.h"
 typedef uint8_t   UINT8;
-typedef uint16_t  UINT16;
-typedef uint32_t  UINT32;
-typedef uint64_t  UINT64;
 
 // TPM2 command code
 
@@ -216,7 +212,6 @@ typedef UINT16 TPM_ST;
 
 
 // TPM Handle types
-typedef UINT32 TPM_HANDLE;
 typedef UINT8 TPM_HT;
 
 
@@ -233,7 +228,6 @@ typedef UINT32 TPM_RH;
 #define TPM_RH_LAST   (TPM_RH)(0x400C)
 
 // Table 4 -- DocumentationClarity Types 
-typedef UINT32 TPM_ALGORITHM_ID;
 typedef UINT32 TPM_MODIFIER_INDICATOR;
 typedef UINT32 TPM_SESSION_OFFSET;
 typedef UINT16 TPM_KEY_SIZE;
@@ -261,8 +255,6 @@ typedef BYTE TPMA_LOCALITY;
 // Table 37 -- TPMI_YES_NO Type 
 typedef BYTE TPMI_YES_NO;
 
-typedef TPM_HANDLE TPMI_RH_HIERARCHY_AUTH;
-
 // Table 38 -- TPMI_DH_OBJECT Type 
 typedef TPM_HANDLE TPMI_DH_OBJECT;
 
@@ -304,7 +296,6 @@ typedef TPM_HANDLE TPMI_RH_LOCKOUT;
 
 // Table 7 -- TPM_ALG_ID
 typedef UINT16 TPM_ALG_ID;
-typedef UINT16 TPM_ALG_ID;
 
 #defineTPM2_ALG_ERROR (TPM_ALG_ID)(0x) // a: ; D:
 #defineTPM2_ALG_FIRST (TPM_ALG_ID)(0x0001) // a: ; D:
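
The guard used by odd_types.h in the patch above can be illustrated in a
single file. This is only a sketch: the SKETCH_ODD_TYPES macro and
sketch_check_handle_types() are made-up names, and the two guarded blocks
stand in for the header being included once from tcg.h and once from
tpm2_types.h. Compilers running in pre-C11 modes may reject a repeated
typedef of the same name even when the type is identical, which is the
failure the guard avoids.

```c
#include <assert.h>
#include <stdint.h>

/* First "inclusion": the guard is not yet defined, typedefs are emitted. */
#ifndef SKETCH_ODD_TYPES
#define SKETCH_ODD_TYPES 1
typedef unsigned char BYTE;
typedef uint32_t UINT32;
typedef UINT32 TPM_HANDLE;
#endif

/* Second "inclusion": the guard is defined, so nothing is redeclared. */
#ifndef SKETCH_ODD_TYPES
#define SKETCH_ODD_TYPES 1
typedef unsigned char BYTE;
typedef uint32_t UINT32;
typedef UINT32 TPM_HANDLE;
#endif

/* Both users build on the single shared definition. */
typedef TPM_HANDLE TPM_AUTHHANDLE;   /* as in tcg.h */
typedef TPM_HANDLE TPMI_DH_OBJECT;   /* as in tpm2_types.h */

/* Sanity check that the derived types agree with the shared base type. */
static void sketch_check_handle_types(void)
{
    assert(sizeof(TPM_AUTHHANDLE) == sizeof(UINT32));
    assert(sizeof(TPMI_DH_OBJECT) == sizeof(TPM_HANDLE));
    assert(sizeof(BYTE) == 1);
}
```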

Olaf



[Xen-devel] Processed: Re: [RFC] Tweaking the release process for Xen 4.6

2015-02-11 Thread xen
Processing commands for x...@bugs.xenproject.org:

> create ^
Created new bug #48 rooted at `<20150210150424.ga32...@zion.uk.xensource.com>'
Title: `Re: [RFC] Tweaking the release process for Xen 4.6'
> title it Tweaking the release process for Xen 4.6
Set title for #48 to `Tweaking the release process for Xen 4.6'
> Thanks
Command failed: Unknown command `Thanks'. at 
/srv/xen-devel-bugs/lib/emesinae/control.pl line 457,  line 3.
Stop processing here.

Modified/created Bugs:
 - 48: http://bugs.xenproject.org/xen/bug/48 (new)

---
Xen Hypervisor Bug Tracker
See http://wiki.xen.org/wiki/Reporting_Bugs_against_Xen for information on 
reporting bugs
Contact xen-bugs-ow...@bugs.xenproject.org with any infrastructure issues



Re: [Xen-devel] [PATCH] x86: simplify non-atomic bitops

2015-02-11 Thread Andrew Cooper
On 11/02/15 13:39, Jan Beulich wrote:
> - being non-atomic, their pointer arguments shouldn't be volatile-
>   qualified
> - their (half fake) memory operands can be a single "+m" instead of
>   being both an output and an input
>
> Signed-off-by: Jan Beulich 
> ---
> v2: Drop "+m" related sentence from comment at the top of the file as
> being wrong (the referenced indication in gcc's documentation got
> removed quite some time ago too).
>
> --- a/xen/include/asm-x86/bitops.h
> +++ b/xen/include/asm-x86/bitops.h
> @@ -14,8 +14,7 @@
>   * operand is both read from and written to. Since the operand is in fact a
>   * word array, we also specify "memory" in the clobbers list to indicate that
>   * words other than the one directly addressed by the memory operand may be
> - * modified. We don't use "+m" because the gcc manual says that it should be
> - * used only when the constraint allows the operand to reside in a register.
> + * modified.
>   */
>  
>  #define ADDR (*(volatile long *) addr)
> @@ -55,12 +54,9 @@ static inline void set_bit(int nr, volat
>   * If it's called on the same region of memory simultaneously, the effect
>   * may be that only one operation succeeds.
>   */
> -static inline void __set_bit(int nr, volatile void *addr)
> +static inline void __set_bit(int nr, void *addr)
>  {
> -asm volatile (
> -"btsl %1,%0"
> -: "=m" (ADDR)
> -: "Ir" (nr), "m" (ADDR) : "memory");
> +asm volatile ( "btsl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );

You presumably want to s/ADDR/addr/ to avoid re-gaining the volatile
attribute on the pointer?

~Andrew

>  }
>  #define __set_bit(nr, addr) ({  \
>  if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
> @@ -95,12 +91,9 @@ static inline void clear_bit(int nr, vol
>   * If it's called on the same region of memory simultaneously, the effect
>   * may be that only one operation succeeds.
>   */
> -static inline void __clear_bit(int nr, volatile void *addr)
> +static inline void __clear_bit(int nr, void *addr)
>  {
> -asm volatile (
> -"btrl %1,%0"
> -: "=m" (ADDR)
> -: "Ir" (nr), "m" (ADDR) : "memory");
> +asm volatile ( "btrl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );
>  }
>  #define __clear_bit(nr, addr) ({\
>  if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
> @@ -116,12 +109,9 @@ static inline void __clear_bit(int nr, v
>   * If it's called on the same region of memory simultaneously, the effect
>   * may be that only one operation succeeds.
>   */
> -static inline void __change_bit(int nr, volatile void *addr)
> +static inline void __change_bit(int nr, void *addr)
>  {
> -asm volatile (
> -"btcl %1,%0"
> -: "=m" (ADDR)
> -: "Ir" (nr), "m" (ADDR) : "memory");
> +asm volatile ( "btcl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );
>  }
>  #define __change_bit(nr, addr) ({   \
>  if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
> @@ -181,14 +171,14 @@ static inline int test_and_set_bit(int n
>   * If two examples of this operation race, one can appear to succeed
>   * but actually fail.  You must protect multiple accesses with a lock.
>   */
> -static inline int __test_and_set_bit(int nr, volatile void *addr)
> +static inline int __test_and_set_bit(int nr, void *addr)
>  {
>  int oldbit;
>  
>  asm volatile (
>  "btsl %2,%1\n\tsbbl %0,%0"
> -: "=r" (oldbit), "=m" (ADDR)
> -: "Ir" (nr), "m" (ADDR) : "memory");
> +: "=r" (oldbit), "+m" (ADDR)
> +: "Ir" (nr) : "memory" );
>  return oldbit;
>  }
>  #define __test_and_set_bit(nr, addr) ({ \
> @@ -228,14 +218,14 @@ static inline int test_and_clear_bit(int
>   * If two examples of this operation race, one can appear to succeed
>   * but actually fail.  You must protect multiple accesses with a lock.
>   */
> -static inline int __test_and_clear_bit(int nr, volatile void *addr)
> +static inline int __test_and_clear_bit(int nr, void *addr)
>  {
>  int oldbit;
>  
>  asm volatile (
>  "btrl %2,%1\n\tsbbl %0,%0"
> -: "=r" (oldbit), "=m" (ADDR)
> -: "Ir" (nr), "m" (ADDR) : "memory");
> +: "=r" (oldbit), "+m" (ADDR)
> +: "Ir" (nr) : "memory" );
>  return oldbit;
>  }
>  #define __test_and_clear_bit(nr, addr) ({   \
> @@ -244,14 +234,14 @@ static inline int __test_and_clear_bit(i
>  })
>  
>  /* WARNING: non atomic and it can be reordered! */
> -static inline int __test_and_change_bit(int nr, volatile void *addr)
> +static inline int __test_and_change_bit(int nr, void *addr)
>  {
>  int oldbit;
>  
>  asm volatile (
>  "btcl %2,%1\n\tsbbl %0,%0"
> -: "=r" (oldbit), "=m" (ADDR)
> -: "Ir" (nr), "m" (ADDR) : "memory");
> +: "=r" (oldbit), "+m" (ADDR)
> +: "Ir" (nr) : "memory" );
>  return oldbit;
>  }
>  #define __test_and_change_bit(nr, addr) ({ 

Re: [Xen-devel] [PATCH v3 3/3] hvmemul_do_io: Do not retry if no ioreq server exists for this I/O.

2015-02-11 Thread Paul Durrant
> -Original Message-
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: 11 February 2015 13:37
> To: Paul Durrant
> Cc: Andrew Cooper; Ian Campbell; Wei Liu; George Dunlap; Ian Jackson;
> Stefano Stabellini; xen-devel@lists.xen.org; Don Slutz; Keir (Xen.org)
> Subject: Re: [PATCH v3 3/3] hvmemul_do_io: Do not retry if no ioreq server
> exists for this I/O.
> 
> >>> On 10.02.15 at 23:52,  wrote:
> > This saves a VMENTRY and a VMEXIT since we no longer retry the
> > ioport read on backing DM not handling a given ioreq.
> >
> > There are 2 case about "no ioreq server exists for this I/O":
> >
> > 1) No ioreq servers (PVH case)
> > 2) No ioreq servers for this I/O (non PVH case)
> >
> > The routine hvm_has_dm() used to check for the empty list, the PVH
> > case (#1).
> >
> > By changing from hvm_has_dm() to hvm_select_ioreq_server() both
> > cases are considered.  Doing it this way allows
> > hvm_send_assist_req() to only have 2 possible return values.
> >
> > The key part of skipping the retry is to do "rc = X86EMUL_OKAY"
> > which is what the error path on the call to hvm_has_dm() does in
> > hvmemul_do_io() (the only call on hvm_has_dm()).
> >
> > Since this case is no longer handled in hvm_send_assist_req(), move
> > the call to hvm_complete_assist_req() into hvmemul_do_io().
> >
> > As part of this change, do the work of hvm_complete_assist_req() in
> > the PVH case.  Acting more like real hardware looks to be better.
> >
> > Adding "rc = X86EMUL_OKAY" in the failing case of
> > hvm_send_assist_req() would break what was done in commit
> > bac0999325056a3b3a92f7622df7ffbc5388b1c3 and commit
> > f20f3c8ece5c10fa7626f253d28f570a43b23208.  We are currently doing
> > the succeeding case of hvm_send_assist_req() and retying the I/O.
> >
> > Since hvm_select_ioreq_server() has already been called, switch to
> > using hvm_send_assist_req_to_ioreq_server().
> >
> > Since there is no longer any calls to hvm_send_assist_req(), drop
> > that routine and rename hvm_send_assist_req_to_ioreq_server() to
> > hvm_send_assist_req.
> >
> > Since hvm_send_assist_req() is an extern, add an ASSERT() on s.
> >
> > Signed-off-by: Don Slutz 
> > ---
> > Reviewed-by: Paul Durrant 
> 
> > So Paul, does your R-b stand despite the code changes in v3?
> 

Yes, I am happy with the changes. Getting rid of hvm_has_dm() does make things 
better.

  Paul

> Jan




Re: [Xen-devel] [PATCH v2] introduce and use relaxed cpumask bitops

2015-02-11 Thread Andrew Cooper
On 11/02/15 13:42, Jan Beulich wrote:
> Using atomic (LOCKed on x86) bitops for certain of the operations on
> cpumask_t is overkill when the variables aren't concurrently accessible
> (e.g. local function variables, or due to explicit locking). Introduce
> alternatives using non-atomic bitops and use them where appropriate.
>
> Note that this
> - adds a volatile qualifier to cpumask_test_and_{clear,set}_cpu()
>   (should have been there from the beginning, like is the case for
>   cpumask_{clear,set}_cpu())
> - replaces several cpumask_clear()+cpumask_set_cpu(, n) pairs by the
>   simpler cpumask_copy(, cpumask_of(n)) (or just cpumask_of(n) if we
>   can do without copying)
>
> Signed-off-by: Jan Beulich 
> Acked-by: George Dunlap 

Reviewed-by: Andrew Cooper 
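
For readers skimming the patch, the two ideas in the commit message can be
sketched in portable C. Nothing here is the Xen cpumask API: sketch_mask_t
and the sketch_mask_*() helpers are hypothetical stand-ins, with the
"relaxed" helpers playing the role of the new __cpumask_*() variants and
sketch_mask_of() playing the role of cpumask_copy(, cpumask_of(n)).

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define SKETCH_NR_CPUS   128
#define SKETCH_WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
    ((SKETCH_NR_CPUS + SKETCH_WORD_BITS - 1) / SKETCH_WORD_BITS)

typedef struct { unsigned long bits[SKETCH_WORDS]; } sketch_mask_t;

static void sketch_mask_clear(sketch_mask_t *m)
{
    memset(m, 0, sizeof(*m));
}

/*
 * Plain read-modify-write with no locked instruction or atomic builtin.
 * Only safe when nothing else can touch *m concurrently, e.g. a local
 * variable or a mask protected by a lock.
 */
static void sketch_mask_set_cpu_relaxed(unsigned int cpu, sketch_mask_t *m)
{
    assert(cpu < SKETCH_NR_CPUS);
    m->bits[cpu / SKETCH_WORD_BITS] |= 1UL << (cpu % SKETCH_WORD_BITS);
}

/* Non-atomic test-and-clear: returns the old bit value (0 or 1). */
static int sketch_mask_test_and_clear_cpu_relaxed(unsigned int cpu,
                                                  sketch_mask_t *m)
{
    unsigned long bit = 1UL << (cpu % SKETCH_WORD_BITS);
    unsigned long *w = &m->bits[cpu / SKETCH_WORD_BITS];
    int old = (*w & bit) != 0;

    assert(cpu < SKETCH_NR_CPUS);
    *w &= ~bit;
    return old;
}

/* Build a single-CPU mask in one step instead of clear + set. */
static sketch_mask_t sketch_mask_of(unsigned int cpu)
{
    sketch_mask_t m;

    sketch_mask_clear(&m);
    sketch_mask_set_cpu_relaxed(cpu, &m);
    return m;
}
```

Assigning the result of sketch_mask_of(n) yields the same mask as the
clear-then-set pair, which is the equivalence the core_parking.c hunks
below rely on.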

> ---
> v2: Make naming of new functions consistent with exisiting ones.
>
> --- a/xen/arch/x86/hpet.c
> +++ b/xen/arch/x86/hpet.c
> @@ -158,7 +158,7 @@ static void evt_do_broadcast(cpumask_t *
>  {
>  unsigned int cpu = smp_processor_id();
>  
> -if ( cpumask_test_and_clear_cpu(cpu, mask) )
> +if ( __cpumask_test_and_clear_cpu(cpu, mask) )
>  raise_softirq(TIMER_SOFTIRQ);
>  
>  cpuidle_wakeup_mwait(mask);
> @@ -197,7 +197,7 @@ again:
>  continue;
>  
>  if ( deadline <= now )
> -cpumask_set_cpu(cpu, &mask);
> +__cpumask_set_cpu(cpu, &mask);
>  else if ( deadline < next_event )
>  next_event = deadline;
>  }
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1450,7 +1450,7 @@ void desc_guest_eoi(struct irq_desc *des
>  
>  cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
>  
> -if ( cpumask_test_and_clear_cpu(smp_processor_id(), &cpu_eoi_map) )
> +if ( __cpumask_test_and_clear_cpu(smp_processor_id(), &cpu_eoi_map) )
>  {
>  __set_eoi_ready(desc);
>  spin_unlock(&desc->lock);
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -3216,7 +3216,7 @@ long do_mmuext_op(
>  for_each_online_cpu(cpu)
>  if ( !cpumask_intersects(&mask,
>   per_cpu(cpu_sibling_mask, cpu)) )
> -cpumask_set_cpu(cpu, &mask);
> +__cpumask_set_cpu(cpu, &mask);
>  flush_mask(&mask, FLUSH_CACHE);
>  }
>  else
> --- a/xen/arch/x86/platform_hypercall.c
> +++ b/xen/arch/x86/platform_hypercall.c
> @@ -489,7 +489,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PA
>  
>  if ( !idletime )
>  {
> -cpumask_clear_cpu(cpu, cpumap);
> +__cpumask_clear_cpu(cpu, cpumap);
>  continue;
>  }
>  
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -179,7 +179,7 @@ static void smp_send_timer_broadcast_ipi
>  
>  if ( cpumask_test_cpu(cpu, &mask) )
>  {
> -cpumask_clear_cpu(cpu, &mask);
> +__cpumask_clear_cpu(cpu, &mask);
>  raise_softirq(TIMER_SOFTIRQ);
>  }
>  
> --- a/xen/common/core_parking.c
> +++ b/xen/common/core_parking.c
> @@ -75,11 +75,10 @@ static unsigned int core_parking_perform
>  if ( core_weight < core_tmp )
>  {
>  core_weight = core_tmp;
> -cpumask_clear(&core_candidate_map);
> -cpumask_set_cpu(cpu, &core_candidate_map);
> +cpumask_copy(&core_candidate_map, cpumask_of(cpu));
>  }
>  else if ( core_weight == core_tmp )
> -cpumask_set_cpu(cpu, &core_candidate_map);
> +__cpumask_set_cpu(cpu, &core_candidate_map);
>  }
>  
>  for_each_cpu(cpu, &core_candidate_map)
> @@ -88,11 +87,10 @@ static unsigned int core_parking_perform
>  if ( sibling_weight < sibling_tmp )
>  {
>  sibling_weight = sibling_tmp;
> -cpumask_clear(&sibling_candidate_map);
> -cpumask_set_cpu(cpu, &sibling_candidate_map);
> +cpumask_copy(&sibling_candidate_map, cpumask_of(cpu));
>  }
>  else if ( sibling_weight == sibling_tmp )
> -cpumask_set_cpu(cpu, &sibling_candidate_map);
> +__cpumask_set_cpu(cpu, &sibling_candidate_map);
>  }
>  
>  cpu = cpumask_first(&sibling_candidate_map);
> @@ -135,11 +133,10 @@ static unsigned int core_parking_power(u
>  if ( core_weight > core_tmp )
>  {
>  core_weight = core_tmp;
> -cpumask_clear(&core_candidate_map);
> -cpumask_set_cpu(cpu, &core_candidate_map);
> +cpumask_copy(&core_candidate_map, cpumask_of(cpu));
>  }
>  else if ( core_weight == core_tmp )
> -cpumask_set_cpu(cpu, &core_candidate_map);
> +__cpumask_set_cpu(cpu, &core_candidate_map);
>  }
>  
>  for_each_cpu(cpu, &core

Re: [Xen-devel] Query: Boot time allocation of irq descriptors

2015-02-11 Thread Jan Beulich
>>> On 11.02.15 at 16:03,  wrote:
> On Wed, Feb 11, 2015 at 8:25 PM, Andrew Cooper
>  wrote:
>> On 11/02/15 14:50, Vijay Kilari wrote:
>>> Hi ,
>>>
>>>   I just glaced at the x86 code, here nr_irqs are set to 1024, which 
> includes
>>> normal irq's and MSI's. Memory for these descriptors are allocated at boot 
> time.
>>> is it correct?
>>>
>>> int __init init_irq_data(void)
>>> {
>>>
>>> ...
>>> for (vector = 0; vector < NR_VECTORS; ++vector)
>>> this_cpu(vector_irq)[vector] = INT_MIN;
>>>
>>> irq_desc = xzalloc_array(struct irq_desc, nr_irqs);
>>>
>>> ...
>>> }
>>>
>>>
>>> In xen/include/asm-x86/irq.h
>>>
>>> #define MSI_IRQ(irq)   ((irq) >= nr_irqs_gsi && (irq) < nr_irqs)
>>
>> What do you think is incorrect about it?
> 
> Nothing wrong with it.
> 
>   I am trying to add MSI support for arm64 where the number of
> MSI interrupts are not limited to 1024. It can support large number of 
> MSI's.

So where did you spot this 1024? What I find (and wrote) is

if ( nr_irqs == 0 )
nr_irqs = cpu_has_apic ?
  max(16U + num_present_cpus() * NR_DYNAMIC_VECTORS,
  8 * nr_irqs_gsi) :
  nr_irqs_gsi;
else if ( nr_irqs < 16 )
nr_irqs = 16;
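
To see what that yields in practice, the same logic can be wrapped in a
small standalone function. This is a sketch only: sketch_default_nr_irqs()
and sketch_max() are hypothetical, and the values that are globals or
macros in Xen (cpu_has_apic, num_present_cpus(), NR_DYNAMIC_VECTORS,
nr_irqs_gsi) are passed in as parameters so that no particular value is
assumed for them.

```c
#include <assert.h>
#include <limits.h>

static unsigned int sketch_max(unsigned int a, unsigned int b)
{
    return a > b ? a : b;
}

/*
 * nr_irqs is the value requested by the administrator (0 if none).
 * Returns the number of IRQ descriptors that would be allocated.
 */
static unsigned int sketch_default_nr_irqs(unsigned int nr_irqs,
                                           int has_apic,
                                           unsigned int present_cpus,
                                           unsigned int nr_dynamic_vectors,
                                           unsigned int nr_irqs_gsi)
{
    /* Sketch-only guard against unsigned wrap-around in the product. */
    assert(nr_dynamic_vectors == 0 ||
           present_cpus <= (UINT_MAX - 16U) / nr_dynamic_vectors);

    if ( nr_irqs == 0 )
        nr_irqs = has_apic ?
                  sketch_max(16U + present_cpus * nr_dynamic_vectors,
                             8 * nr_irqs_gsi) :
                  nr_irqs_gsi;
    else if ( nr_irqs < 16 )
        nr_irqs = 16;

    return nr_irqs;
}
```

With made-up inputs of 4 present CPUs, 192 dynamic vectors per CPU and 24
GSIs, the default comes out as max(16 + 4 * 192, 8 * 24) = 784, so the
result scales with the machine rather than being fixed at 1024.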

Jan




[Xen-devel] [PATCH v2 2/3] x86/traps: Avoid interleaved writes when updating potentially-live descriptors

2015-02-11 Thread Andrew Cooper
Signed-off-by: Andrew Cooper 
CC: Jan Beulich 
CC: Tim Deegan 

---

v2:
 * Use _write_gate_lower() instead of opencoding write_atomic()
 * Drop write_atomic() in _write_gate_lower().  It doesn't appear to make a
   practical difference (although certainly does cause a difference to the
   register scheduling).
---
 xen/include/asm-x86/processor.h |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 2773ea8..c5c647a 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -444,9 +444,12 @@ struct __packed __cacheline_aligned tss_struct {
  * descriptor table entry. */
 static always_inline void set_ist(idt_entry_t *idt, unsigned long ist)
 {
+idt_entry_t new = *idt;
+
 /* IST is a 3 bit field, 32 bits into the IDT entry. */
 ASSERT(ist <= IST_MAX);
-idt->a = (idt->a & ~(7UL << 32)) | (ist << 32);
+new.a = (idt->a & ~(7UL << 32)) | (ist << 32);
+_write_gate_lower(idt, &new);
 }
 
 #define IDT_ENTRIES 256
-- 
1.7.10.4
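
As an illustration of the intent (not of the Xen implementation), the
pattern is: compute the complete new lower word in a local, then publish
it with one store, so that an observer never sees the entry with the IST
field cleared but not yet set. In the sketch below sketch_idt_entry_t,
sketch_set_ist() and SKETCH_IST_MAX are hypothetical names, and a
volatile-qualified 64-bit access stands in for _write_gate_lower(), whose
body is not shown in this thread. The C standard does not guarantee that
such a store is a single instruction; on 64-bit x86 it normally is.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IST_MAX 7u  /* 3-bit field; the real IST_MAX may be smaller */

typedef struct { uint64_t a, b; } sketch_idt_entry_t;

static void sketch_set_ist(volatile sketch_idt_entry_t *idt, uint64_t ist)
{
    uint64_t lower;

    /* IST is a 3 bit field, 32 bits into the IDT entry. */
    assert(ist <= SKETCH_IST_MAX);

    /* Build the full new value first ... */
    lower = (idt->a & ~((uint64_t)7 << 32)) | (ist << 32);

    /* ... then write it back in one go, leaving the upper word alone. */
    idt->a = lower;
}
```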




Re: [Xen-devel] [PATCH OSSTEST 05/12] Add and expose some testsupport APIs

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 05/12] Add and expose some testsupport APIs"):
> When install L2 guest, we will need to invoke
>  'select_ether' to get guest MAC address. So here expose select_ether().

I'm not sure whether you actually need to do this.  I will look at the
rest of your series to see why prepareguest() isn't suitable.  But
this part of the patch is fine in principle.

> And

These seem like two independent patches.  They should be split up.

>  also, we added another function 'guest_editconfig_cd' and expose it.
>   This function bascically changes guest boot device sequence and
>  alter its on_reboot behavior to restart.

I don't understand why guest_editconfig_nocd isn't sufficient.

Ian.



Re: [Xen-devel] [PATCH OSSTEST 03/12] Designate vif device model to e1000

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 03/12] Designate vif device model to e1000"):
> Designate vif model to 'e1000', otherwise, with default
>  device model, the L1 eth0 interface disappear, hence xenbridge cannot work.
>  Maybe this limitation can be removed later after some fix it. For now, we
>  have to accomodate to it.

I don't understand this, I'm afraid.  Can you please explain the bug
in more detail in the commit message ?

It is definitely not acceptable to change the default network card for
all guests in prepareguest_part_xencfg.  It would be OK to provide a
guest-specific runvar to specify the guest network card, and it might
be OK to set that in the nested-specific test job creation.

Thanks,
Ian.



Re: [Xen-devel] [PATCH OSSTEST 02/12] Increase boot timer to accomodate to nest test

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 02/12] Increase boot timer to accomodate to 
nest test"):
> In nested test case, guest boot will take more time.
>  Increase the timer to 200 seconds.

Can we make this conditional somehow ?

I think it should probably be picked up from a runvar.

We don't currently have timeouts directly in runvars and we probably
don't want them in make-flight.  So perhaps we should have a runvar
named after the host indicating that it is a nested virt host.

Obviously that would mean this patch would have to come later in the
series.  I'll have to look at the rest of the series to have a clearer
idea what the right thing looks like.

Thanks,
Ian.



Re: [Xen-devel] Query: Boot time allocation of irq descriptors

2015-02-11 Thread Jan Beulich
>>> On 11.02.15 at 15:50,  wrote:
>   I just glaced at the x86 code, here nr_irqs are set to 1024, which 
> includes
> normal irq's and MSI's. Memory for these descriptors are allocated at boot 
> time.
> is it correct?
> 
> int __init init_irq_data(void)
> {
> 
> ...
> for (vector = 0; vector < NR_VECTORS; ++vector)
> this_cpu(vector_irq)[vector] = INT_MIN;
> 
> irq_desc = xzalloc_array(struct irq_desc, nr_irqs);

One day we'd like to allocate at least the MSI ones only on demand...

Jan




Re: [Xen-devel] [PATCH OSSTEST 06/12] Manipulate $ho IP assignment for nest L2 situation

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 06/12] Manipulate $ho IP assignment for nest 
L2 situation"):
>  In L2 installation context, its host (L1) IP address is not queried
> from DNS, but from previous step of L1 installation, in which, L1 IP
> is stored in run var.

> -$ho->{IpStatic} = get_host_property($ho,'ip-addr');
> +if ($name eq 'nested') {

This is definitely the wrong test.

It would be easier to read this series if you introduced the framework
first, and then applied all the specific differences afterwards.

Instead of keying off $name I think you probably need to make a
variant of selecthost that takes an existing guest ($gho) and converts
it into a useable host ($ho).

It would probably be necessary to split out the bulk of the existing
selecthost into a core function.

I think you also want a general way to specify how the L1's host
properties are set.

Ian.



Re: [Xen-devel] [PATCH OSSTEST 04/12] Just some indentation adustments.

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 04/12] Just some indentation adustments."):
> -   target_putfilecontents_root_stash
> +  target_putfilecontents_root_stash

This seems to be just tab/space changes.  I don't think we really need
to bother about these.  Do you find the discrepancies very annoying
for some reason ?

Thanks,
Ian.



Re: [Xen-devel] Query: Boot time allocation of irq descriptors

2015-02-11 Thread Vijay Kilari
On Wed, Feb 11, 2015 at 8:25 PM, Andrew Cooper
 wrote:
> On 11/02/15 14:50, Vijay Kilari wrote:
>> Hi ,
>>
>>   I just glaced at the x86 code, here nr_irqs are set to 1024, which includes
>> normal irq's and MSI's. Memory for these descriptors are allocated at boot 
>> time.
>> is it correct?
>>
>> int __init init_irq_data(void)
>> {
>>
>> ...
>> for (vector = 0; vector < NR_VECTORS; ++vector)
>> this_cpu(vector_irq)[vector] = INT_MIN;
>>
>> irq_desc = xzalloc_array(struct irq_desc, nr_irqs);
>>
>> ...
>> }
>>
>>
>> In xen/include/asm-x86/irq.h
>>
>> #define MSI_IRQ(irq)   ((irq) >= nr_irqs_gsi && (irq) < nr_irqs)
>
> What do you think is incorrect about it?

Nothing wrong with it.

  I am trying to add MSI support for arm64, where the number of
MSI interrupts is not limited to 1024; it can support a large number of MSIs.
I am just trying to understand how many MSIs x86 supports and whether there
is any generic code that I can use for arm64.

It looks like I will have to do something specific for arm64.



[Xen-devel] [PATCH] tools: require at least pixman 0.21.8 for qemu-xen

2015-02-11 Thread Olaf Hering
Avoid a late build failure on openSUSE 11.4, which ships only pixman-0.20:


[  211s] ERROR: pixman >= 0.21.8 not present. Your options:
[  211s]  (1) Preferred: Install the pixman devel package (any recent
[  211s]  distro should have packages as Xorg needs pixman too).
[  211s]  (2) Fetch the pixman submodule, using:
[  211s]  git submodule update --init pixman


Please run autogen.sh after applying this patch.

Signed-off-by: Olaf Hering 
Cc: Ian Jackson 
Cc: Stefano Stabellini 
Cc: Ian Campbell 
Cc: Wei Liu 
---
 tools/configure| 18 +-
 tools/configure.ac |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/configure b/tools/configure
index e7dac75..035ce5b 100755
--- a/tools/configure
+++ b/tools/configure
@@ -7724,12 +7724,12 @@ if test -n "$pixman_CFLAGS"; then
 pkg_cv_pixman_CFLAGS="$pixman_CFLAGS"
  elif test -n "$PKG_CONFIG"; then
 if test -n "$PKG_CONFIG" && \
-{ { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"pixman-1\""; } >&5
-  ($PKG_CONFIG --exists --print-errors "pixman-1") 2>&5
+{ { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"pixman-1 >= 0.21.8\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "pixman-1 >= 0.21.8") 2>&5
   ac_status=$?
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
-  pkg_cv_pixman_CFLAGS=`$PKG_CONFIG --cflags "pixman-1" 2>/dev/null`
+  pkg_cv_pixman_CFLAGS=`$PKG_CONFIG --cflags "pixman-1 >= 0.21.8" 2>/dev/null`
  test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
@@ -7741,12 +7741,12 @@ if test -n "$pixman_LIBS"; then
 pkg_cv_pixman_LIBS="$pixman_LIBS"
  elif test -n "$PKG_CONFIG"; then
 if test -n "$PKG_CONFIG" && \
-{ { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"pixman-1\""; } >&5
-  ($PKG_CONFIG --exists --print-errors "pixman-1") 2>&5
+{ { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"pixman-1 >= 0.21.8\""; } >&5
+  ($PKG_CONFIG --exists --print-errors "pixman-1 >= 0.21.8") 2>&5
   ac_status=$?
   $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
   test $ac_status = 0; }; then
-  pkg_cv_pixman_LIBS=`$PKG_CONFIG --libs "pixman-1" 2>/dev/null`
+  pkg_cv_pixman_LIBS=`$PKG_CONFIG --libs "pixman-1 >= 0.21.8" 2>/dev/null`
  test "x$?" != "x0" && pkg_failed=yes
 else
   pkg_failed=yes
@@ -7767,14 +7767,14 @@ else
 _pkg_short_errors_supported=no
 fi
 if test $_pkg_short_errors_supported = yes; then
-   pixman_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "pixman-1" 2>&1`
+   pixman_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "pixman-1 >= 0.21.8" 2>&1`
 else
-   pixman_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "pixman-1" 2>&1`
+   pixman_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "pixman-1 >= 0.21.8" 2>&1`
 fi
# Put the nasty error message in config.log where it belongs
echo "$pixman_PKG_ERRORS" >&5
 
-   as_fn_error $? "Package requirements (pixman-1) were not met:
+   as_fn_error $? "Package requirements (pixman-1 >= 0.21.8) were not met:
 
 $pixman_PKG_ERRORS
 
diff --git a/tools/configure.ac b/tools/configure.ac
index 03dadd7..4e0abdb 100644
--- a/tools/configure.ac
+++ b/tools/configure.ac
@@ -320,7 +320,7 @@ esac
 dnl The following are only required when upstream QEMU is built
 AS_IF([test "x$qemu_xen" = "xy"], [
 PKG_CHECK_MODULES(glib, [glib-2.0 >= 2.12])
-PKG_CHECK_MODULES(pixman, pixman-1)
+PKG_CHECK_MODULES(pixman, [pixman-1 >= 0.21.8])
 ])
 AX_CHECK_FETCHER
 

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] Query: Boot time allocation of irq descriptors

2015-02-11 Thread Andrew Cooper
On 11/02/15 14:50, Vijay Kilari wrote:
> Hi ,
>
>   I just glanced at the x86 code. Here nr_irqs is set to 1024, which includes
> normal IRQs and MSIs. Memory for these descriptors is allocated at boot time.
> Is that correct?
>
> int __init init_irq_data(void)
> {
>
> ...
> for (vector = 0; vector < NR_VECTORS; ++vector)
> this_cpu(vector_irq)[vector] = INT_MIN;
>
> irq_desc = xzalloc_array(struct irq_desc, nr_irqs);
>
> ...
> }
>
>
> In xen/include/asm-x86/irq.h
>
> #define MSI_IRQ(irq)   ((irq) >= nr_irqs_gsi && (irq) < nr_irqs)

What do you think is incorrect about it?

~Andrew



[Xen-devel] Query: Boot time allocation of irq descriptors

2015-02-11 Thread Vijay Kilari
Hi ,

  I just glanced at the x86 code. Here nr_irqs is set to 1024, which includes
normal IRQs and MSIs. Memory for these descriptors is allocated at boot time.
Is that correct?

int __init init_irq_data(void)
{

...
for (vector = 0; vector < NR_VECTORS; ++vector)
this_cpu(vector_irq)[vector] = INT_MIN;

irq_desc = xzalloc_array(struct irq_desc, nr_irqs);

...
}


In xen/include/asm-x86/irq.h

#define MSI_IRQ(irq)   ((irq) >= nr_irqs_gsi && (irq) < nr_irqs)


Regards
Vijay



Re: [Xen-devel] [PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive

2015-02-11 Thread Ian Jackson
Robert Ho writes ("[PATCH OSSTEST 01/12] Add support of parsing grub which has 'submenu' primitive"):
>  With an HVM kernel built from the Linux stable kernel tree,
>  the auto-generated grub2 menu has a 'submenu' primitive above the
>  'menuentry' items. Xen boot entries are grouped into a submenu. This
>  patch adds support for such grub formats. It also adjusts
>  some indentation.

Thanks for this submission.  Dealing with submenus is definitely
something we want to do.

I haven't looked at the code in detail yet but I have a general
question: we currently count menu entries and eventually write
GRUB_DEFAULT=  into /etc/default/grub.

Does this work properly if the entry is in a submenu ?  I guess you
have probably tested this but I thought I should ask...

Can you please not adjust the whitespace ?  osstest in general doesn't
have a requirement for any particular whitespace use, and certainly if
there are to be any whitespace changes they ought to be in a separate
patch.

Thanks,
Ian.



Re: [Xen-devel] [PATCH OSSTEST v6 9/9] mfi-common, make-flight: create XSM test jobs

2015-02-11 Thread Ian Jackson
Wei Liu writes ("Re: [PATCH OSSTEST v6 9/9] mfi-common, make-flight: create XSM 
test jobs"):
> Here is the updated version:
...
> Duplicate Debian PV and HVM test jobs for XSM testing.

Thanks.  This series (v7, then) is currently in the osstest
self-push-gate.

Ian.



[Xen-devel] [OSSTEST PATCH v2 10/10] rump kernel tests: Repeat the xenstorels test 50 times

2015-02-11 Thread Ian Jackson
Ian Campbell writes ("Re: [OSSTEST PATCH 10/10] rump kernel tests: Repeat the 
xenstorels test 50 times"):
> On Fri, 2015-02-06 at 19:17 +, Ian Jackson wrote:
> > Add a new step which uses repeat-ts to run
> > ts-rumpuserxen-demo-xenstorels many times.
> 
> Acked-by: Ian Campbell 

Thanks.

After reviewing the output of my full-flight adhoc test, I decided
that the order of steps ought to be different, so here is a v2 of this
patch.

(The rest of the series, including the testid fiddling, worked
correctly.)

Thanks,
Ian.

From f9b70e699a4175ff871f165cbf70ff07847c3abd Mon Sep 17 00:00:00 2001
From: Ian Jackson 
Date: Fri, 6 Feb 2015 17:09:40 +
Subject: [OSSTEST PATCH] rump kernel tests: Repeat the xenstorels test 50
 times

Add a new step which uses repeat-ts to run
ts-rumpuserxen-demo-xenstorels many times.

We have to run ts-guest-destroy-hard after each time, to destroy the
guest which the demo script leaves lying about.

Strategically placed `+'s in the repeat-ts command line arrange that
the testid ends up being
   rumpuserxen-demo-xenstorels/xenstorels.repeat

Signed-off-by: Ian Jackson 
---
v2: Run the test after, rather than before, the explicit
 ts-guest-destroy-hard.  That will avoid blocking the single
 destroy test if the repeat fails.

No longer specify to tolerate failures of the post-run-demo
 destroy, as if the test passes so must the destroy.  Now by-hand
 testing may need a different ts-repeat-test rune, but in practice
 by-hand testing will probably involve a shell loop or something
 anyway.
---
 sg-run-job |3 +++
 1 file changed, 3 insertions(+)

diff --git a/sg-run-job b/sg-run-job
index 0a49c93..94d091b 100755
--- a/sg-run-job
+++ b/sg-run-job
@@ -328,6 +328,9 @@ proc run-job/test-rumpuserxen {} {
 run-ts . =   ts-rumpuserxen-demo-setup  + host + $g
 run-ts . =   ts-rumpuserxen-demo-xenstorels + host + $g
 run-ts . =   ts-guest-destroy-hard  + host + $g
+repeat-ts 50 =.repeat \
+ ts-rumpuserxen-demo-xenstorels + host + $g \; \
+ +   ts-guest-destroy-hardhost   $g   +
 }
 
 #-- builds --
-- 
1.7.10.4




Re: [Xen-devel] [PATCH RFC 33/35] arm : acpi enable efi for acpi

2015-02-11 Thread Julien Grall

Hi Jan,

On 11/02/2015 18:31, Jan Beulich wrote:

On 11.02.15 at 10:57,  wrote:

Hi Ian,

On 05/02/2015 20:05, Ian Campbell wrote:

On Thu, 2015-02-05 at 11:58 +, Jan Beulich wrote:

On 05.02.15 at 06:31,  wrote:

--- a/xen/common/efi/runtime.c
+++ b/xen/common/efi/runtime.c
@@ -11,7 +11,13 @@ DEFINE_XEN_GUEST_HANDLE(CHAR16);
#ifndef COMPAT

#ifdef CONFIG_ARM  /* Disabled until runtime services implemented */


This comment seems irrelevant now.


+
+#if defined(CONFIG_ARM_64) && defined(CONFIG_ACPI)


#ifdef CONFIG_ACPI


This is common code, and I can't see ACPI and EFI being always in the
same supported state (or else we could drop one of the two).


EFI without ACPI is certainly a possibility on ARM64.


We would need to define a new protocol in order to boot ACPI without EFI.

Currently the ACPI code fetches the RSDP pointer in two different ways,
depending on efi_enabled:
* efi_enabled == 1 => Use EFI to get the pointer
* efi_enabled == 0 => Use the x86 legacy mode

On ARM64, we have to use the first one.


How would that work when not booting from EFI? Surely you can't use x86
legacy mode, but if there is the possibility of ACPI without EFI
(the opposite of what Ian indicated would be a possibility), then
there ought to be another method to find RSDP on ARM too.


I should have been clearer in my previous mail. I wasn't trying to
justify this patch (I think it's wrong too), but I wanted to explain our
current use case for ACPI.


We would, obviously, have to implement it when a new way to get ACPI
comes up. Currently, ACPI can only be retrieved via EFI on ARM64.


So if efi_enabled is not set, we should bail out rather than trying to
use the legacy mode. That would help to correctly support ARM64 platforms
with only DT support (and no EFI).
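
The selection logic discussed above can be sketched in C roughly as
follows. This is only an illustration under stated assumptions: every
name in it (sketch_arch, sketch_rsdp_src, sketch_pick_rsdp_source,
sketch_rsdp_selfcheck) is hypothetical and does not exist in Xen. Only
the efi_enabled flag and the two lookup paths come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of these exist in Xen. */
enum sketch_arch { SKETCH_ARCH_X86, SKETCH_ARCH_ARM64 };

enum sketch_rsdp_src {
    SKETCH_RSDP_NONE,   /* no way to find the RSDP: bail out (e.g. use DT) */
    SKETCH_RSDP_EFI,    /* RSDP pointer obtained via EFI */
    SKETCH_RSDP_LEGACY  /* x86-only legacy lookup */
};

/* Decide where the RSDP pointer should come from. */
static inline enum sketch_rsdp_src
sketch_pick_rsdp_source(enum sketch_arch arch, bool efi_enabled)
{
    if ( efi_enabled )
        return SKETCH_RSDP_EFI;

    /* The legacy mode only makes sense on x86. */
    if ( arch == SKETCH_ARCH_X86 )
        return SKETCH_RSDP_LEGACY;

    /* ARM64 without EFI: do not attempt the legacy mode. */
    return SKETCH_RSDP_NONE;
}

/* Small self-check of the three interesting combinations. */
static inline void sketch_rsdp_selfcheck(void)
{
    assert(sketch_pick_rsdp_source(SKETCH_ARCH_ARM64, true) == SKETCH_RSDP_EFI);
    assert(sketch_pick_rsdp_source(SKETCH_ARCH_X86, false) == SKETCH_RSDP_LEGACY);
    assert(sketch_pick_rsdp_source(SKETCH_ARCH_ARM64, false) == SKETCH_RSDP_NONE);
}
```

Should another way to obtain ACPI tables on ARM64 appear later, it
would presumably become one more value of the enum rather than a change
to the x86 path.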


Regards,

--
Julien Grall



[Xen-devel] Xen OVMF regression

2015-02-11 Thread Wei Liu
Hi Anthony and Laszlo

The following commit causes Xen HVM guests to fail to boot.

commit 66b280df282ae82888d2eb416bfeda3f65afa386
Author: Laszlo Ersek 
Date:   Thu Nov 20 09:58:28 2014 +

OvmfPkg: AcpiPlatformDxe: make dependency on PCI enumeration explicit

The ACPI payload that OVMF downloads from QEMU via fw_cfg depends on the
PCI enumeration and resource assignment performed by
MdeModulePkg/Bus/Pci/PciBusDxe.

...

The patch itself is simple and well reasoned. I couldn't immediately see
what causes the Xen guest to crash.

All this commit does is change the dependency of a module. So I think
it triggered some sort of domino effect that affected other Xen
components. My bet is that there are some missing dependencies in those
two Xen modules (XenBusDxe and XenBlkDxe), but I can't say for sure.

After this patch, the following errors show up:

(d44)  BlockSize : 512
(d44)  LastBlock : 1E997FF
(d44) XenBus: BAR at F200
(XEN) multi.c:3322:d44v0 write to pagetable during event injection: cr2=0xeff62848, mfn=0x1d3b62
(XEN) multi.c:3322:d44v0 write to pagetable during event injection: cr2=0xeff62838, mfn=0x1d3b62

Eventually the guest crashes with:

(XEN) sh error: sh_remove_shadows(): can't find all shadows of mfn 1d3b01 (shadow_flags=2000)
(XEN) domain_crash called from common.c:2645
(XEN) Domain 44 (vcpu#0) crashed on cpu#0:
(XEN) [ Xen-4.6-unstable  x86_64  debug=y  Tainted:C ]
(XEN) CPU:0
(XEN) RIP:0033:[<7fb1c58b84f7>]
(XEN) RFLAGS: 00010202   CONTEXT: hvm guest
(XEN) rax: 0003   rbx: 7fff6f87ff00   rcx: 7fb1c58bb6b1
(XEN) rdx: 7fff6f87ff28   rsi: 7fb1c58a64e0   rdi: 7fff6f87ff08
(XEN) rbp: 7fb1c58a64e0   rsp: 7fff6f87fe50   r8:  0002
(XEN) r9:  000d6370025e   r10: 6f6a   r11: 6eff
(XEN) r12: 7fb1c58a4a78   r13: 7fb1c58a4000   r14: 7fb1c58a4a78
(XEN) r15: 7fb1c58a4380   cr0: 8005003b   cr4: 06f0
(XEN) cr3: efa2a000   cr2: 7fb1c5ac30d0
(XEN) ds:    es:    fs:    gs:    ss: 002b   cs: 0033

Any idea how this can be fixed?

Wei.



Re: [Xen-devel] [PATCH] domctl: do away with tool stack based retrying

2015-02-11 Thread Andrew Cooper
On 11/02/15 13:47, Jan Beulich wrote:
> XEN_DOMCTL_destroydomain so far is being special cased in libxc to
> reinvoke the operation when getting back EAGAIN. Quite a few other
> domctl-s have gained continuations, so I see no reason not to use them
> here too.
>
> Signed-off-by: Jan Beulich 

In particular, it ought to be much more efficient as it avoids the
kernel/user context switches, and associated TLB flushes.

Reviewed-by: Andrew Cooper 
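
To make the efficiency argument above concrete, here is a minimal,
self-contained C model of the two schemes. It is a sketch under stated
assumptions, not Xen code: all sketch_* names are hypothetical, the
SKETCH_ERESTART value is a placeholder, and a plain loop stands in for
what hypercall_create_continuation() arranges in the real hypervisor.
The boundary_crossings counter models how often control comes back to
the tool stack.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder value for this sketch only. */
#define SKETCH_ERESTART 85

struct sketch_domain {
    int pages_left;          /* teardown work still to be done */
    int boundary_crossings;  /* returns to the (modelled) tool stack */
};

/* Preemptible teardown: does at most `budget` units of work per call. */
static inline int sketch_domain_kill(struct sketch_domain *d, int budget)
{
    assert(budget > 0);
    while ( d->pages_left > 0 && budget-- > 0 )
        d->pages_left--;
    return d->pages_left ? -SKETCH_ERESTART : 0;
}

/* Old scheme: ERESTART is turned into EAGAIN and handed to the caller. */
static inline int sketch_hypercall_old(struct sketch_domain *d)
{
    int rc = sketch_domain_kill(d, 4);

    d->boundary_crossings++;
    return rc == -SKETCH_ERESTART ? -EAGAIN : rc;
}

/* Old tool stack side: retry for as long as EAGAIN comes back. */
static inline int sketch_destroy_old(struct sketch_domain *d)
{
    int ret;

    do {
        ret = sketch_hypercall_old(d);
    } while ( ret == -EAGAIN );
    return ret;
}

/* New scheme: the restart happens below the tool stack. */
static inline int sketch_destroy_new(struct sketch_domain *d)
{
    int rc;

    do {
        rc = sketch_domain_kill(d, 4);   /* stands in for the continuation */
    } while ( rc == -SKETCH_ERESTART );
    d->boundary_crossings++;
    return rc;
}
```

With ten units of work and a budget of four per call, the old scheme
returns to the tool stack three times, while the new one returns once.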

>
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -112,14 +112,10 @@ int xc_domain_unpause(xc_interface *xch,
>  int xc_domain_destroy(xc_interface *xch,
>uint32_t domid)
>  {
> -int ret;
>  DECLARE_DOMCTL;
>  domctl.cmd = XEN_DOMCTL_destroydomain;
>  domctl.domain = (domid_t)domid;
> -do {
> -ret = do_domctl(xch, &domctl);
> -} while ( ret && (errno == EAGAIN) );
> -return ret;
> +return do_domctl(xch, &domctl);
>  }
>  
>  int xc_domain_shutdown(xc_interface *xch,
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -617,13 +617,9 @@ int domain_kill(struct domain *d)
>  case DOMDYING_dying:
>  rc = domain_relinquish_resources(d);
>  if ( rc != 0 )
> -{
> -if ( rc == -ERESTART )
> -rc = -EAGAIN;
>  break;
> -}
>  if ( cpupool_move_domain(d, cpupool0) )
> -return -EAGAIN;
> +return -ERESTART;
>  for_each_vcpu ( d, v )
>  unmap_vcpu_info(v);
>  d->is_dying = DOMDYING_dead;
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -692,10 +692,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xe
>  break;
>  
>  case XEN_DOMCTL_destroydomain:
> -{
>  ret = domain_kill(d);
> -}
> -break;
> +if ( ret == -ERESTART )
> +ret = hypercall_create_continuation(
> +__HYPERVISOR_domctl, "h", u_domctl);
> +break;
>  
>  case XEN_DOMCTL_setnodeaffinity:
>  {



[Xen-devel] [PATCH] domctl: do away with tool stack based retrying

2015-02-11 Thread Jan Beulich
XEN_DOMCTL_destroydomain so far is being special cased in libxc to
reinvoke the operation when getting back EAGAIN. Quite a few other
domctl-s have gained continuations, so I see no reason not to use them
here too.

Signed-off-by: Jan Beulich 

--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -112,14 +112,10 @@ int xc_domain_unpause(xc_interface *xch,
 int xc_domain_destroy(xc_interface *xch,
   uint32_t domid)
 {
-int ret;
 DECLARE_DOMCTL;
 domctl.cmd = XEN_DOMCTL_destroydomain;
 domctl.domain = (domid_t)domid;
-do {
-ret = do_domctl(xch, &domctl);
-} while ( ret && (errno == EAGAIN) );
-return ret;
+return do_domctl(xch, &domctl);
 }
 
 int xc_domain_shutdown(xc_interface *xch,
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -617,13 +617,9 @@ int domain_kill(struct domain *d)
 case DOMDYING_dying:
 rc = domain_relinquish_resources(d);
 if ( rc != 0 )
-{
-if ( rc == -ERESTART )
-rc = -EAGAIN;
 break;
-}
 if ( cpupool_move_domain(d, cpupool0) )
-return -EAGAIN;
+return -ERESTART;
 for_each_vcpu ( d, v )
 unmap_vcpu_info(v);
 d->is_dying = DOMDYING_dead;
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -692,10 +692,11 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xe
 break;
 
 case XEN_DOMCTL_destroydomain:
-{
 ret = domain_kill(d);
-}
-break;
+if ( ret == -ERESTART )
+ret = hypercall_create_continuation(
+__HYPERVISOR_domctl, "h", u_domctl);
+break;
 
 case XEN_DOMCTL_setnodeaffinity:
 {





[Xen-devel] [PATCH v2] introduce and use relaxed cpumask bitops

2015-02-11 Thread Jan Beulich
Using atomic (LOCKed on x86) bitops for certain of the operations on
cpumask_t is overkill when the variables aren't concurrently accessible
(e.g. local function variables, or due to explicit locking). Introduce
alternatives using non-atomic bitops and use them where appropriate.

Note that this
- adds a volatile qualifier to cpumask_test_and_{clear,set}_cpu()
  (should have been there from the beginning, like is the case for
  cpumask_{clear,set}_cpu())
- replaces several cpumask_clear()+cpumask_set_cpu(, n) pairs by the
  simpler cpumask_copy(, cpumask_of(n)) (or just cpumask_of(n) if we
  can do without copying)

Signed-off-by: Jan Beulich 
Acked-by: George Dunlap 
---
v2: Make naming of new functions consistent with existing ones.
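
For readers unfamiliar with the distinction, the following stand-alone
sketch contrasts an atomic and a non-atomic ("relaxed") bit operation on
a CPU mask. It is an illustration under stated assumptions, not the
code in this patch: the sketch_* names are hypothetical, the mask size
is fixed at 64 CPUs, and the atomic variant relies on the GCC/Clang
__atomic_fetch_or() builtin rather than Xen's own bitops.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS       64
#define SKETCH_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_WORDS \
    ((SKETCH_NR_CPUS + SKETCH_BITS_PER_WORD - 1) / SKETCH_BITS_PER_WORD)

typedef struct sketch_cpumask {
    unsigned long bits[SKETCH_WORDS];
} sketch_cpumask_t;

/* Atomic variant: safe when other CPUs may modify the mask concurrently. */
static inline void sketch_cpumask_set_cpu(unsigned int cpu,
                                          sketch_cpumask_t *m)
{
    assert(cpu < SKETCH_NR_CPUS);
    __atomic_fetch_or(&m->bits[cpu / SKETCH_BITS_PER_WORD],
                      1UL << (cpu % SKETCH_BITS_PER_WORD),
                      __ATOMIC_SEQ_CST);
}

/* Relaxed variant: plain read-modify-write, for masks nobody else touches. */
static inline void sketch_cpumask_set_cpu_relaxed(unsigned int cpu,
                                                  sketch_cpumask_t *m)
{
    assert(cpu < SKETCH_NR_CPUS);
    m->bits[cpu / SKETCH_BITS_PER_WORD] |=
        1UL << (cpu % SKETCH_BITS_PER_WORD);
}

/* Relaxed test-and-clear: returns whether the bit was set beforehand. */
static inline bool sketch_cpumask_test_and_clear_cpu_relaxed(
    unsigned int cpu, sketch_cpumask_t *m)
{
    unsigned long bit, *word;
    bool was_set;

    assert(cpu < SKETCH_NR_CPUS);
    bit = 1UL << (cpu % SKETCH_BITS_PER_WORD);
    word = &m->bits[cpu / SKETCH_BITS_PER_WORD];
    was_set = (*word & bit) != 0;
    *word &= ~bit;
    return was_set;
}
```

The relaxed variants are only correct for masks that cannot be accessed
concurrently, such as a local variable or a mask protected by a lock.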

--- a/xen/arch/x86/hpet.c
+++ b/xen/arch/x86/hpet.c
@@ -158,7 +158,7 @@ static void evt_do_broadcast(cpumask_t *
 {
 unsigned int cpu = smp_processor_id();
 
-if ( cpumask_test_and_clear_cpu(cpu, mask) )
+if ( __cpumask_test_and_clear_cpu(cpu, mask) )
 raise_softirq(TIMER_SOFTIRQ);
 
 cpuidle_wakeup_mwait(mask);
@@ -197,7 +197,7 @@ again:
 continue;
 
 if ( deadline <= now )
-cpumask_set_cpu(cpu, &mask);
+__cpumask_set_cpu(cpu, &mask);
 else if ( deadline < next_event )
 next_event = deadline;
 }
--- a/xen/arch/x86/irq.c
+++ b/xen/arch/x86/irq.c
@@ -1450,7 +1450,7 @@ void desc_guest_eoi(struct irq_desc *des
 
 cpumask_copy(&cpu_eoi_map, action->cpu_eoi_map);
 
-if ( cpumask_test_and_clear_cpu(smp_processor_id(), &cpu_eoi_map) )
+if ( __cpumask_test_and_clear_cpu(smp_processor_id(), &cpu_eoi_map) )
 {
 __set_eoi_ready(desc);
 spin_unlock(&desc->lock);
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3216,7 +3216,7 @@ long do_mmuext_op(
 for_each_online_cpu(cpu)
 if ( !cpumask_intersects(&mask,
  per_cpu(cpu_sibling_mask, cpu)) )
-cpumask_set_cpu(cpu, &mask);
+__cpumask_set_cpu(cpu, &mask);
 flush_mask(&mask, FLUSH_CACHE);
 }
 else
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -489,7 +489,7 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PA
 
 if ( !idletime )
 {
-cpumask_clear_cpu(cpu, cpumap);
+__cpumask_clear_cpu(cpu, cpumap);
 continue;
 }
 
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -179,7 +179,7 @@ static void smp_send_timer_broadcast_ipi
 
 if ( cpumask_test_cpu(cpu, &mask) )
 {
-cpumask_clear_cpu(cpu, &mask);
+__cpumask_clear_cpu(cpu, &mask);
 raise_softirq(TIMER_SOFTIRQ);
 }
 
--- a/xen/common/core_parking.c
+++ b/xen/common/core_parking.c
@@ -75,11 +75,10 @@ static unsigned int core_parking_perform
 if ( core_weight < core_tmp )
 {
 core_weight = core_tmp;
-cpumask_clear(&core_candidate_map);
-cpumask_set_cpu(cpu, &core_candidate_map);
+cpumask_copy(&core_candidate_map, cpumask_of(cpu));
 }
 else if ( core_weight == core_tmp )
-cpumask_set_cpu(cpu, &core_candidate_map);
+__cpumask_set_cpu(cpu, &core_candidate_map);
 }
 
 for_each_cpu(cpu, &core_candidate_map)
@@ -88,11 +87,10 @@ static unsigned int core_parking_perform
 if ( sibling_weight < sibling_tmp )
 {
 sibling_weight = sibling_tmp;
-cpumask_clear(&sibling_candidate_map);
-cpumask_set_cpu(cpu, &sibling_candidate_map);
+cpumask_copy(&sibling_candidate_map, cpumask_of(cpu));
 }
 else if ( sibling_weight == sibling_tmp )
-cpumask_set_cpu(cpu, &sibling_candidate_map);
+__cpumask_set_cpu(cpu, &sibling_candidate_map);
 }
 
 cpu = cpumask_first(&sibling_candidate_map);
@@ -135,11 +133,10 @@ static unsigned int core_parking_power(u
 if ( core_weight > core_tmp )
 {
 core_weight = core_tmp;
-cpumask_clear(&core_candidate_map);
-cpumask_set_cpu(cpu, &core_candidate_map);
+cpumask_copy(&core_candidate_map, cpumask_of(cpu));
 }
 else if ( core_weight == core_tmp )
-cpumask_set_cpu(cpu, &core_candidate_map);
+__cpumask_set_cpu(cpu, &core_candidate_map);
 }
 
 for_each_cpu(cpu, &core_candidate_map)
@@ -148,11 +145,10 @@ static unsigned int core_parking_power(u
 if ( sibling_weight > sibling_tmp )
 {
 sibling_weight = sibling_tmp;
-cpumask_clear(&sibling_candidate_map);
-cpumask_set_cpu(cpu, &sibling_candidate_map);
+  

[Xen-devel] [qemu-upstream-unstable test] 34396: regressions - FAIL

2015-02-11 Thread xen . org
flight 34396 qemu-upstream-unstable real [real]
http://www.chiark.greenend.org.uk/~xensrcts/logs/34396/

Regressions :-(

Tests which did not succeed and are blocking,
including tests which could not be run:
 test-amd64-i386-freebsd10-i386 11 guest-localmigrate  fail REGR. vs. 33488
 test-amd64-i386-freebsd10-amd64 11 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-win7-amd64 10 guest-localmigrate   fail REGR. vs. 33488
 test-amd64-amd64-xl-win7-amd64 10 guest-localmigrate  fail REGR. vs. 33488
 test-amd64-i386-xl-winxpsp3-vcpus1 10 guest-localmigrate  fail REGR. vs. 33488
 test-amd64-i386-xl-winxpsp3  10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-rhel6hvm-amd 6 leak-check/basis(6) running in 34247 [st=running!]
 test-amd64-amd64-xl-winxpsp3 10 guest-localmigrate fail in 34247 REGR. vs. 33488

Tests which are failing intermittently (not blocking):
 test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail pass in 34247
 test-amd64-amd64-xl-winxpsp3  7 windows-install fail pass in 34319

Regressions which are regarded as allowable (not blocking):
 test-amd64-i386-libvirt   9 guest-start  fail   like 33488
 test-amd64-amd64-libvirt  9 guest-start  fail   like 33488
 test-amd64-amd64-xl-qemuu-debianhvm-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-debianhvm-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-ovmf-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-win7-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-winxpsp3-vcpus1 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-winxpsp3 10 guest-localmigrate   fail REGR. vs. 33488
 test-amd64-i386-xl-qemuu-ovmf-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-win7-amd64 10 guest-localmigrate fail REGR. vs. 33488
 test-amd64-amd64-xl-qemuu-winxpsp3 10 guest-localmigrate  fail REGR. vs. 33488
 test-armhf-armhf-libvirt 13 guest-destroy   fail in 34247 blocked in 33488
 test-armhf-armhf-xl-multivcpu 14 leak-check/check fail in 34247 blocked in 33488

Tests which did not succeed, but are not blocking:
 test-amd64-amd64-xl-pvh-intel  9 guest-start  fail  never pass
 test-armhf-armhf-libvirt 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf-pin 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-sedf 10 migrate-support-check fail   never pass
 test-armhf-armhf-xl-multivcpu 10 migrate-support-check fail  never pass
 test-armhf-armhf-xl  10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pvh-amd   9 guest-start  fail   never pass
 test-armhf-armhf-xl-midway   10 migrate-support-check fail   never pass
 test-amd64-amd64-xl-pcipt-intel  9 guest-start fail never pass
 test-armhf-armhf-xl-credit2   5 xen-boot fail   never pass
 test-amd64-i386-xl-qemut-winxpsp3 14 guest-stop fail never pass
 test-amd64-i386-xl-qemut-winxpsp3-vcpus1 14 guest-stop fail never pass
 test-amd64-amd64-xl-qemut-winxpsp3 14 guest-stop   fail never pass
 test-amd64-i386-xl-qemut-win7-amd64 14 guest-stop  fail never pass
 test-amd64-amd64-xl-qemut-win7-amd64 14 guest-stop fail never pass
 test-amd64-amd64-libvirt 10 migrate-support-check fail in 34247 never pass

version targeted for testing:
 qemuu be11dc1e9172f91e798a8f831b30c14b479e08e8
baseline version:
 qemuu 0d37748342e29854db7c9f6c47d7f58c6cfba6b2


People who touched revisions under test:
  Don Slutz 
  Paul Durrant 
  Stefano Stabellini 


jobs:
 build-amd64  pass
 build-armhf  pass
 build-i386   pass
 build-amd64-libvirt  pass
 build-armhf-libvirt  pass
 build-i386-libvirt   pass
 build-amd64-pvops  pass
 build-armhf-pvops  pass
 build-i386-pvops pass
 test-amd64-amd64-xl  pass
 test-armhf-armhf-xl  pass
 test-amd64-i386-xl   pass
 test-amd64-amd64-xl-pvh-amd  fail
 test-amd64-i386-rhel6hvm-amd pass
 test-amd64-i386-qemut-rhel6hvm-amd   pass
 test-amd

[Xen-devel] [PATCH] x86: simplify non-atomic bitops

2015-02-11 Thread Jan Beulich
- being non-atomic, their pointer arguments shouldn't be volatile-
  qualified
- their (half fake) memory operands can be a single "+m" instead of
  being both an output and an input

Signed-off-by: Jan Beulich 
---
v2: Drop "+m" related sentence from comment at the top of the file as
being wrong (the referenced indication in gcc's documentation got
removed quite some time ago too).
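
As a stand-alone illustration of the second point, the sketch below
shows a non-atomic bit set using a single "+m" operand, which tells the
compiler that the memory operand is both read and written. It is a
sketch under stated assumptions rather than the patched header: the
name sketch_set_bit is hypothetical, the inline assembly is only used
when building with GCC or Clang for x86, and a plain C fallback with
the same little-endian 32-bit word layout is used elsewhere.

```c
#include <assert.h>

/* Non-atomic: set bit `nr` in the word array starting at `addr`. */
static inline void sketch_set_bit(int nr, void *addr)
{
    assert(nr >= 0);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /*
     * "+m" marks the operand as read and written, replacing the older
     * pair of an "=m" output plus an identical "m" input. The "memory"
     * clobber covers words other than the one directly addressed.
     */
    __asm__ __volatile__ ( "btsl %1,%0"
                           : "+m" (*(volatile long *)addr)
                           : "Ir" (nr)
                           : "memory" );
#else
    /* Portable fallback, assuming 32-bit words as btsl uses. */
    ((unsigned int *)addr)[nr / 32] |= 1u << (nr % 32);
#endif
}
```

Like the functions in the patch, this must not be used on memory that
another CPU may modify at the same time.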

--- a/xen/include/asm-x86/bitops.h
+++ b/xen/include/asm-x86/bitops.h
@@ -14,8 +14,7 @@
  * operand is both read from and written to. Since the operand is in fact a
  * word array, we also specify "memory" in the clobbers list to indicate that
  * words other than the one directly addressed by the memory operand may be
- * modified. We don't use "+m" because the gcc manual says that it should be
- * used only when the constraint allows the operand to reside in a register.
+ * modified.
  */
 
 #define ADDR (*(volatile long *) addr)
@@ -55,12 +54,9 @@ static inline void set_bit(int nr, volat
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __set_bit(int nr, volatile void *addr)
+static inline void __set_bit(int nr, void *addr)
 {
-asm volatile (
-"btsl %1,%0"
-: "=m" (ADDR)
-: "Ir" (nr), "m" (ADDR) : "memory");
+asm volatile ( "btsl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );
 }
 #define __set_bit(nr, addr) ({  \
 if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
@@ -95,12 +91,9 @@ static inline void clear_bit(int nr, vol
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __clear_bit(int nr, volatile void *addr)
+static inline void __clear_bit(int nr, void *addr)
 {
-asm volatile (
-"btrl %1,%0"
-: "=m" (ADDR)
-: "Ir" (nr), "m" (ADDR) : "memory");
+asm volatile ( "btrl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );
 }
 #define __clear_bit(nr, addr) ({\
 if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
@@ -116,12 +109,9 @@ static inline void __clear_bit(int nr, v
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __change_bit(int nr, volatile void *addr)
+static inline void __change_bit(int nr, void *addr)
 {
-asm volatile (
-"btcl %1,%0"
-: "=m" (ADDR)
-: "Ir" (nr), "m" (ADDR) : "memory");
+asm volatile ( "btcl %1,%0" : "+m" (ADDR) : "Ir" (nr) : "memory" );
 }
 #define __change_bit(nr, addr) ({   \
 if ( bitop_bad_size(addr) ) __bitop_bad_size(); \
@@ -181,14 +171,14 @@ static inline int test_and_set_bit(int n
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_set_bit(int nr, volatile void *addr)
+static inline int __test_and_set_bit(int nr, void *addr)
 {
 int oldbit;
 
 asm volatile (
 "btsl %2,%1\n\tsbbl %0,%0"
-: "=r" (oldbit), "=m" (ADDR)
-: "Ir" (nr), "m" (ADDR) : "memory");
+: "=r" (oldbit), "+m" (ADDR)
+: "Ir" (nr) : "memory" );
 return oldbit;
 }
 #define __test_and_set_bit(nr, addr) ({ \
@@ -228,14 +218,14 @@ static inline int test_and_clear_bit(int
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_clear_bit(int nr, volatile void *addr)
+static inline int __test_and_clear_bit(int nr, void *addr)
 {
 int oldbit;
 
 asm volatile (
 "btrl %2,%1\n\tsbbl %0,%0"
-: "=r" (oldbit), "=m" (ADDR)
-: "Ir" (nr), "m" (ADDR) : "memory");
+: "=r" (oldbit), "+m" (ADDR)
+: "Ir" (nr) : "memory" );
 return oldbit;
 }
 #define __test_and_clear_bit(nr, addr) ({   \
@@ -244,14 +234,14 @@ static inline int __test_and_clear_bit(i
 })
 
 /* WARNING: non atomic and it can be reordered! */
-static inline int __test_and_change_bit(int nr, volatile void *addr)
+static inline int __test_and_change_bit(int nr, void *addr)
 {
 int oldbit;
 
 asm volatile (
 "btcl %2,%1\n\tsbbl %0,%0"
-: "=r" (oldbit), "=m" (ADDR)
-: "Ir" (nr), "m" (ADDR) : "memory");
+: "=r" (oldbit), "+m" (ADDR)
+: "Ir" (nr) : "memory" );
 return oldbit;
 }
 #define __test_and_change_bit(nr, addr) ({  \



Re: [Xen-devel] [PATCH v3 3/3] hvmemul_do_io: Do not retry if no ioreq server exists for this I/O.

2015-02-11 Thread Jan Beulich
>>> On 10.02.15 at 23:52,  wrote:
> This saves a VMENTRY and a VMEXIT since we no longer retry the
> ioport read on backing DM not handling a given ioreq.
> 
> There are two cases of "no ioreq server exists for this I/O":
> 
> 1) No ioreq servers (PVH case)
> 2) No ioreq servers for this I/O (non PVH case)
> 
> The routine hvm_has_dm() used to check for the empty list, the PVH
> case (#1).
> 
> By changing from hvm_has_dm() to hvm_select_ioreq_server() both
> cases are considered.  Doing it this way allows
> hvm_send_assist_req() to only have 2 possible return values.
> 
> The key part of skipping the retry is to do "rc = X86EMUL_OKAY"
> which is what the error path on the call to hvm_has_dm() does in
> hvmemul_do_io() (the only call on hvm_has_dm()).
> 
> Since this case is no longer handled in hvm_send_assist_req(), move
> the call to hvm_complete_assist_req() into hvmemul_do_io().
> 
> As part of this change, do the work of hvm_complete_assist_req() in
> the PVH case.  Acting more like real hardware looks to be better.
> 
> Adding "rc = X86EMUL_OKAY" in the failing case of
> hvm_send_assist_req() would break what was done in commit
> bac0999325056a3b3a92f7622df7ffbc5388b1c3 and commit
> f20f3c8ece5c10fa7626f253d28f570a43b23208.  We are currently doing
> the succeeding case of hvm_send_assist_req() and retrying the I/O.
> 
> Since hvm_select_ioreq_server() has already been called, switch to
> using hvm_send_assist_req_to_ioreq_server().
> 
> Since there are no longer any calls to hvm_send_assist_req(), drop
> that routine and rename hvm_send_assist_req_to_ioreq_server() to
> hvm_send_assist_req.
> 
> Since hvm_send_assist_req() is an extern, add an ASSERT() on s.
> 
> Signed-off-by: Don Slutz 
> ---
> Reviewed-by: Paul Durrant 

So Paul, does your R-b stand despite the code changes in v3?

Jan



