date:20110203

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Avi Kivity

On 02/02/2011 05:52 PM, Jan Kiszka wrote:

>>
>  If there is no problem in the logic of this commit (and I do not see
>  one yet) then we somewhere miss kicking vcpu when interrupt, that should be
>  handled, arrives?

I'm not yet confident about the logic of the kernel patch: mov to cr8 is
serializing. If the guest raises the tpr and then signals this with a
succeeding, non vm-exiting instruction to the other vcpus, one of those
could inject an interrupt with a higher priority than the previous tpr,
but a lower one than current tpr. QEMU user space would accept this
interrupt - and would likely surprise the guest. Do I miss something?

apic_get_interrupt() is only called from the vcpu thread, so it should 
see a correct tpr.

The only difference I can see with the patch is that we may issue a 
spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

--
error compiling committee.c: too many arguments to function

[Qemu-devel] [PING 0.14] Missing vnc patch

2011-02-03 Thread Corentin Chary

vnc: qemu can die if the client is disconnected while updating screen
http://patchwork.ozlabs.org/patch/80334/

(what about a "stable" alias instead of [PING x.x] ? Like sta...@kernel.org)

Thanks,
-- 
Corentin Chary
http://xf.iksaif.net

[Qemu-devel] Support for BMIPS4380 CPU?

2011-02-03 Thread xeros


Hi,
I'd like to ask about support for newer CPUs in QEMU, like this one:
system type : BCM3556C0 DTV platform
processor   : 0
cpu model   : Broadcom BMIPS4380 V4.4  FPU V0.1
BogoMIPS: 403.45
wait instruction: yes
microsecond timers  : yes
tlb_entries : 32
extra interrupt vector  : yes
hardware watchpoint : no
ASEs implemented: mips16
shadow register sets: 1
core: 0
VCED exceptions : not available
VCEI exceptions : not available

processor   : 1
cpu model   : Broadcom BMIPS4380 V4.4  FPU V0.1
BogoMIPS: 403.45
wait instruction: yes
microsecond timers  : yes
tlb_entries : 32
extra interrupt vector  : yes
hardware watchpoint : no
ASEs implemented: mips16
shadow register sets: 1
core: 0
VCED exceptions : not available
VCEI exceptions : not available

I've tried all the available -cpu and -M options combinations (in QEMU 
from 0.11 to 0.13) but still no luck to even start booting the Linux 
kernel compiled for this CPU.
When I boot the other Linux system image prepared for 'malta' machine and 
try to run any even static compiled binaries from machine which uses the 
not yet supported CPU I get 'Illegal instruction' message.


'file' command output on such binaries is:
busybox: ELF 32-bit LSB executable, MIPS, MIPS32 version 1 (SYSV), 
statically linked, stripped


While the ones compiled for system in 'malta' QEMU virtual machine is:
busybox-orig-qemu: ELF 32-bit LSB executable, MIPS, MIPS-I version 1 
(SYSV), statically linked, stripped


Thank you in advance.

--
xeros

[Qemu-devel] Re: RFC: New API for PPC for vcpu mmu access

2011-02-03 Thread Alexander Graf


On 02.02.2011, at 23:08, Scott Wood wrote:

> On Wed, 2 Feb 2011 22:33:41 +0100
> Alexander Graf  wrote:
> 
>> 
>> On 02.02.2011, at 21:33, Yoder Stuart-B08248 wrote:
>> 
>>> Below is a proposal for a new API for PPC to allow KVM clients
>>> to set MMU state in a vcpu.
>>> 
>>> BookE processors have one or more software managed TLBs and
>>> currently there is no mechanism for Qemu to initialize
>>> or access them.  This is needed for normal initialization
>>> as well as debug.
>>> 
>>> There are 4 APIs:
>>> 
>>> -KVM_PPC_SET_MMU_TYPE allows the client to negotiate the type
>>> of MMU with KVM-- the type determines the size and format
>>> of the data in the other APIs
>> 
>> This should be done through the PVR hint in sregs, no? Usually a single CPU 
>> type only has a single MMU type.
> 
> Well, for one, we don't have sregs or a PVR hint on Book E yet. :-)

Ah, right. The BookE code just passes its host PVR to the guest. :)

> But also, there could be differing levels of support -- e.g. on e500mc,
> we have no plans to support exposing the hardware virtualization
> features in a nested manner (nor am I sure that it's reasonably
> possible).  But if someone does it, that would be a change in the
> interface between Qemu and KVM to allow the extra fields to be set,
> with no change in PVR.
> 
> Likewise, a new chip could introduce new capabilities, but still be
> capable of working the old way.
> 
> Plus, basing it on PVR means Qemu needs to be updated every time
> there's a new chip with a new PVR.

Ok, convinced. We need a way to choose an mmu model.

> 
>>> -KVM_PPC_INVALIDATE_TLB invalidates all TLB entries in all
>>> TLBs in the vcpu
>>> 
>>> -KVM_PPC_SET_TLBE sets a TLB entry-- the Power architecture
>>> specifies the format of the MMU data passed in
>> 
>> This seems to fine-grained. I'd prefer a list of all TLB entries to be 
>> pushed in either direction. What's the foreseeable number of TLB entries 
>> within the next 10 years?
> 
> I have no idea what things will look like 10 years down the road, but
> currently e500mc has 576 entries (512 TLB0, 64 TLB1).

That sums up to 64 * 576 bytes, which is 36kb. Ouch. Certainly nothing we want 
to transfer every time qemu feels like resolving an EA.

> 
>> Having the whole stack available would make the sync with qemu easier and 
>> also allows us to only do a single ioctl for all the TLB management. Thanks 
>> to the PVR we know the size of the TLB, so we don't have to shove that 
>> around.
> 
> No, we don't know the size (or necessarily even the structure) of the
> TLB.  KVM may provide a guest TLB that is larger than what hardware has,
> as a cache to reduce the number of TLB misses that have to go to the
> guest (we do this now in another hypervisor).
> 
> Plus sometimes it's just simpler -- why bother halving the size of the
> guest TLB when running on e500v1?

Makes sense. So we basically need an ioctl that tells KVM the MMU type and TLB 
size. Remember, the userspace tool is the place for policies :). Maybe this 
even needs to be potentially runtime switchable, in case you boot off with 
u-boot in the guest, load a kernel and the kernel activates some PV extensions.

> 
>>> KVM_PPC_INVALIDATE_TLB
>>> --
>>> 
>>> Capability: KVM_CAP_PPC_MMU
>>> Architectures: powerpc
>>> Type: vcpu ioctl
>>> Parameters: none
>>> Returns: 0 on success, -1 on error
>>> 
>>> Invalidates all TLB entries in all TLBs of the vcpu.
>> 
>> The only reason we need to do this is because there's no proper reset 
>> function in qemu for the e500 tlb. I'd prefer to have that there and push 
>> the TLB contents down on reset.
> 
> The other way to look at it is that there's no need for a reset
> function if all the state is properly settable. :-)

You make it sound as if it was hard to implement a reset function in qemu :). 
Really, that's where it belongs.

> 
> Which we want anyway for debugging (and migration, though I wonder if
> anyone would actually use that with embedded hardware).

We certainly should not close the door on migration either way. So all the 
state has to be 100% user space receivable.

> 
>> Haven't fully made up my mind on the tlb entry structure yet. Maybe 
>> something like
>> 
>> struct kvm_ppc_booke_tlbe {
>>__u64 data[8];
>> };
>> 
>> would be enough? The rest is implementation dependent anyways. Exposing 
>> those details to user space doesn't buy us anything. By keeping it generic 
>> we can at least still build against older kernel headers :).
> 
> If it's not exposed to userspace, how is userspace going to
> interpret/fill in the data?

It can overlay cast according to the MMU type. So userspace still has to know 
the layout of the tlbe, but it doesn't have to be defined with a huge amount of 
anonymous unions. An alternative would be to explicitly define each mmu type's 
entries:

struct kvm_ppc_booke_tlbe {
union {
struct {
...
} tlbe_e500;
struct {
...
} tlbe_e50

[Qemu-devel] Re: RFC: New API for PPC for vcpu mmu access

2011-02-03 Thread Alexander Graf


On 02.02.2011, at 23:34, Yoder Stuart-B08248 wrote:

> 
> 
>> -Original Message-
>> From: Alexander Graf [mailto:ag...@suse.de]
>> Sent: Wednesday, February 02, 2011 3:34 PM
>> To: Yoder Stuart-B08248
>> Cc: kvm-...@vger.kernel.org; k...@vger.kernel.org; qemu-devel@nongnu.org
>> Subject: Re: RFC: New API for PPC for vcpu mmu access
>> 
>> 
>> On 02.02.2011, at 21:33, Yoder Stuart-B08248 wrote:
>> 
>>> Below is a proposal for a new API for PPC to allow KVM clients to set
>>> MMU state in a vcpu.
>>> 
>>> BookE processors have one or more software managed TLBs and currently
>>> there is no mechanism for Qemu to initialize or access them.  This is
>>> needed for normal initialization as well as debug.
>>> 
>>> There are 4 APIs:
>>> 
>>> -KVM_PPC_SET_MMU_TYPE allows the client to negotiate the type of MMU
>>> with KVM-- the type determines the size and format of the data in the
>>> other APIs
>> 
>> This should be done through the PVR hint in sregs, no? Usually a single CPU
>> type only has a single MMU type.
>> 
>>> -KVM_PPC_INVALIDATE_TLB invalidates all TLB entries in all TLBs in the
>>> vcpu
>>> 
>>> -KVM_PPC_SET_TLBE sets a TLB entry-- the Power architecture specifies
>>> the format of the MMU data passed in
>> 
>> This seems to fine-grained. I'd prefer a list of all TLB entries to be
>> pushed in either direction. What's the foreseeable number of TLB entries
>> within the next 10 years?
>> 
>> Having the whole stack available would make the sync with qemu easier and
>> also allows us to only do a single ioctl for all the TLB management. Thanks
>> to the PVR we know the size of the TLB, so we don't have to shove that
>> around.
> 
> Yes, we thought about that approach but the idea here, as Scott 
> described, was to provide an API that could work if user space
> is unaware of the geometry of the TLB.

Userspace shouldn't be unaware of the TLB, that's the whole point :). In 
principle, all state should be fetchable from userspace - so it has to know the 
geometry.

> Take a look at Power ISA Version 2.06.1 (on power.org) at the definition
> of TLBnCFG in Book E.  The NENTRY and ASSOC fields now have meaning that
> allow TLB geometries that cannot be described in the TLBnCFG
> registers.

It's certainly not easy to assemble all the required information in userspace, 
but we need to do so nevertheless - having the state available simply buys us a 
lot in terms of flexibility.

> I think the use case where this API would be used the most
> would be from a gdb stub that needed to look up an effective
> address.

I agree. As I stated in my previous mail, there's probably good rationale to 
explicitly tune that path. That doesn't mean that we have to leave out the 
generic one. It should only be an acceleration.

After all, this whole flexibility thing with all the potential possibilities is 
KVM's strong point. We should not close the doors on those :).


Alex

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka

On 2011-02-03 08:42, Gleb Natapov wrote:
> On Wed, Feb 02, 2011 at 05:51:32PM +0100, Jan Kiszka wrote:
 Just did so, and I can no longer reproduce the problem. Hmm...

>>> If there is no problem in the logic of this commit (and I do not see
>>> one yet) then we somewhere miss kicking vcpu when interrupt, that 
>>> should be
>>> handled, arrives?
>>
>> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>> serializing. If the guest raises the tpr and then signals this with a
>> succeeding, non vm-exiting instruction to the other vcpus, one of those
>> could inject an interrupt with a higher priority than the previous tpr,
>> but a lower one than current tpr. QEMU user space would accept this
>> interrupt - and would likely surprise the guest. Do I miss something?
>>
> Injection happens by vcpu thread on cpu entry:
> run->request_interrupt_window = kvm_arch_try_push_interrupts(env);
> and tpr is synced on vcpu exit, so I do not yet see how what you describe
> above may happen since during injection vcpu should see correct tpr.

 Hmm, maybe this is the key: Once we call into apic_get_interrupt
 (because CPU_INTERRUPT_HARD was set as described above) and we find a
 pending irq below the tpr, we inject a spurious vector instead.

>>> That should be easy to verify. I expect Windows to BSOD upon receiving
>>> spurious vector though.
>>
>> I hacked spurious irq injection away, but the issue remains. At the same
>> time, Windows is receiving tons of spurious interrupts without any
>> complaints, even without that tpr optimization in the kernel. So this is
>> obviously not yet the key.
>>
>> Let's try your idea that we miss a wakeup.
>>
> That is unlikely too. If vcpu missed wakeup, "info cpus" would solve the
> hang since it would kick vcpu out of the kernel and missed interrupt would be
> injected on re-entry.

Yeah, and it wouldn't explain the various BSOFs I'm seeing (you get an
even broader spectrum when trying the Windows installations DVDs).

We are probably digging at the wrong site.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka

On 2011-02-03 09:18, Avi Kivity wrote:
> On 02/02/2011 05:52 PM, Jan Kiszka wrote:

>>>  If there is no problem in the logic of this commit (and I do not see
>>>  one yet) then we somewhere miss kicking vcpu when interrupt, that should be
>>>  handled, arrives?
>>
>> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>> serializing. If the guest raises the tpr and then signals this with a
>> succeeding, non vm-exiting instruction to the other vcpus, one of those
>> could inject an interrupt with a higher priority than the previous tpr,
>> but a lower one than current tpr. QEMU user space would accept this
>> interrupt - and would likely surprise the guest. Do I miss something?
> 
> apic_get_interrupt() is only called from the vcpu thread, so it should 
> see a correct tpr.
> 
> The only difference I can see with the patch is that we may issue a 
> spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

I tested this yesterday, and it doesn't confuse Windows. It actually
receives quite a few spurious IRQs in normal operation, w/ or w/o the
kernel's tpr optimization.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [Bug 712337] [NEW] connecthon basic test5 failed with qemu 0.14 on Virtfs path in guest

2011-02-03 Thread Madhuri

Public bug reported:

connecthon basic test named test5 is failing with bigfile write failed
bad address on .L passthru and .L mapped Virtfs path in guest. with
fedora12

Bug is with latest qemu-0.14.0-rc0

connecthon tarball /root/project_CI/client/tests/connecthon/cthon04.tgz
02/03 08:55:09 INFO |kvm_subpro:0880| 11:55:08 ERROR| [stderr]  ./test5: 
(/root/mount3/test2011-02-0311:55) 'bigfile' write failed : Bad address
02/03 08:55:09 INFO |kvm_subpro:0880| 11:55:08 ERROR| Test failed: Command 
<./runtests -N 100 -b -t /root/mount3/test2011-02-0311:55> failed, rc=1, 
Command returned non-zero exit status
02/03 08:55:09 INFO |kvm_subpro:0880| * Command: 
02/03 08:55:09 INFO |kvm_subpro:0880| ./runtests -N 100 -b -t 
/root/mount3/test2011-02-0311:55
02/03 08:55:09 INFO |kvm_subpro:0880| Exit status: 1
02/03 08:55:09 INFO |kvm_subpro:0880| Duration: 0
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| stdout:
02/03 08:55:09 INFO |kvm_subpro:0880| ... Pass 1 ...
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| Starting BASIC tests: test directory 
/root/mount3/test2011-02-0311:55 (arg: -t)
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| ./test1: File and directory creation test
02/03 08:55:09 INFO |kvm_subpro:0880|   created 155 files 62 directories 5 
levels deep in 0.6  seconds
02/03 08:55:09 INFO |kvm_subpro:0880|   ./test1 ok.
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| ./test2: File and directory removal test
02/03 08:55:09 INFO |kvm_subpro:0880|   removed 155 files 62 directories 5 
levels deep in 0.4  seconds
02/03 08:55:09 INFO |kvm_subpro:0880|   ./test2 ok.
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| ./test3: lookups across mount point
02/03 08:55:09 INFO |kvm_subpro:0880|   500 getcwd and stat calls in 0.0  
seconds
02/03 08:55:09 INFO |kvm_subpro:0880|   ./test3 ok.
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| ./test4: setattr, getattr, and lookup
02/03 08:55:09 INFO |kvm_subpro:0880|   1000 chmods and stats on 10 files in 
0.24 seconds
02/03 08:55:09 INFO |kvm_subpro:0880|   ./test4 ok.
02/03 08:55:09 INFO |kvm_subpro:0880| 
02/03 08:55:09 INFO |kvm_subpro:0880| ./test5: read and write
02/03 08:55:09 INFO |kvm_subpro:0880| basic tests failed
02/03 08:55:09 INFO |kvm_subpro:0880| stderr:
02/03 08:55:09 INFO |kvm_subpro:0880|   ./test5: 
(/root/mount3/test2011-02-0311:55) 'bigfile' write failed : Bad address
02/03 08:55:09 INFO |kvm_subpro:0880| 11:55:08 INFO | Test finished after 1 
iterations.
02/03 08:55:10 INFO |kvm_subpro:0880| 11:55:09 ERROR| child process failed
02/03 08:55:10 INFO |kvm_subpro:0880| 11:55:09 INFO |   FAIL
connecthon.itera-pass-dotl-100-test-bt  connecthon.itera-pass-dotl-100-test-bt  
timestamp=1296752109

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712337

Title:
  connecthon basic test5 failed with qemu 0.14 on Virtfs path in guest

Status in QEMU:
  New

Bug description:
  connecthon basic test named test5 is failing with bigfile write failed
  bad address on .L passthru and .L mapped Virtfs path in guest. with
  fedora12

  Bug is with latest qemu-0.14.0-rc0

  connecthon tarball /root/project_CI/client/tests/connecthon/cthon04.tgz
  02/03 08:55:09 INFO |kvm_subpro:0880| 11:55:08 ERROR| [stderr]
./test5: (/root/mount3/test2011-02-0311:55) 'bigfile' write failed : Bad address
  02/03 08:55:09 INFO |kvm_subpro:0880| 11:55:08 ERROR| Test failed: Command 
<./runtests -N 100 -b -t /root/mount3/test2011-02-0311:55> failed, rc=1, 
Command returned non-zero exit status
  02/03 08:55:09 INFO |kvm_subpro:0880| * Command: 
  02/03 08:55:09 INFO |kvm_subpro:0880| ./runtests -N 100 -b -t 
/root/mount3/test2011-02-0311:55
  02/03 08:55:09 INFO |kvm_subpro:0880| Exit status: 1
  02/03 08:55:09 INFO |kvm_subpro:0880| Duration: 0
  02/03 08:55:09 INFO |kvm_subpro:0880| 
  02/03 08:55:09 INFO |kvm_subpro:0880| stdout:
  02/03 08:55:09 INFO |kvm_subpro:0880| ... Pass 1 ...
  02/03 08:55:09 INFO |kvm_subpro:0880| 
  02/03 08:55:09 INFO |kvm_subpro:0880| Starting BASIC tests: test directory 
/root/mount3/test2011-02-0311:55 (arg: -t)
  02/03 08:55:09 INFO |kvm_subpro:0880| 
  02/03 08:55:09 INFO |kvm_subpro:0880| ./test1: File and directory creation 
test
  02/03 08:55:09 INFO |kvm_subpro:0880| created 155 files 62 
directories 5 levels deep in 0.6  seconds
  02/03 08:55:09 INFO |kvm_subpro:0880| ./test1 ok.
  02/03 08:55:09 INFO |kvm_subpro:0880| 
  02/03 08:55:09 INFO |kvm_subpro:0880| ./test2: File and directory removal test
  02/03 08:55:09 INFO |kvm_subpro:0880| removed 155 files 62 
directories 5 levels deep in 0.4  seconds
  02/03 08:55:09 INFO |kvm_subpro:0880| ./test2 ok.
  02/03 08:

[Qemu-devel] RFC: usb redirection protocol v0.2

2011-02-03 Thread Hans de Goede


Hi All,

Attached is version 0.2 of the protocol for the usb redirection stuff
I've been working on. This contains a number of changes based on Gerd
Hoffmann's recommendations and some things which I noticed while
actually implementing things.

Note that this version still is missing a packet type for interrupt
transfers. I've not gotten around to thinking much about those yet,
so rather then adding something adhoc, I've just left it out for now.

Note you can also access this draft here:
http://people.fedoraproject.org/~jwrdegoede/usb-redirection-protocol-0.2.txt

Regards,

Hans
USB Network Redirection protocol description version 0.2 (unreleased/wip)

Revisions
-

Version 0.1
-Initial version (released as initial RFC without a version number)

Version 0.2
-Remove usb_redir_report_descriptor packet, as it is not possible to get
 the cached descriptors from the OS on all platforms and we can do without
-Replace vm-host with usb-guest
-Replace the synchroneous / asynchroneous commands nomenclature with
 control / data packets
-Move the packet id to the main packet header shared by all packets
-Add note: "All integers in the protocol are send over the pipe in least
 significant byte first order."
-Add note: "All structs are packed"
-s/data_size/length/
-Add a usb_redir_cancel_data_packet packet
-Add usb_redir_reset and usb_redir_reset_status packets


USB redirerection protocol (draft)
--

The protocol described in this document is meant for tunneling usb transfers
to a single usb device. Note: not an entire hub, only a single device.

The most significant use case for this is taking a usb device attached to
some machine "a" which acts as a client / viewer to a virtual machine "v"
hosted on another machine "b", and make the usb device show up inside the
virtual machine as if it were attached directly to the virtual machine "v".

The described protocol assumes a reliable ordered bidirectional transport is
available, for example a tcp socket. All integers in the protocol are send
over the pipe in least significant byte first order. All structs send over
the pipe are packed (no padding).

Definitions:
usb-device: The usb-device whose usb transfers are being tunneled.
usb-guest: The entity connecting to the usb-device and using it as if
connected directly to it. For example a virtual machine running a guest
os which accesses a usb-device over the network as if it is part of the
virtual machine.
usb-host: The entity making the usb-device available for use by a usb-guest.
For example a deamon on a machine which "exports" the usb-device over the
network which then "appears" inside a virtual machine on another machine.


Basic packet structure / communication
--

Each packet exchanged between the usb-guest and the usb-host starts with a
usb_redir_header, followed by an optional packet type specific header
follow by optional additional data.

The usb_redir_header each packet starts with looks as follows:

struct usb_redir_header {
uint32_t type;
uint32_t length;
uint32_t id;
}

type:This identifies the type of packet, from the type enum
length:  Length of the optional type specific packet header + the optional
 additional data. Can be 0.
id:  A unique id, generated by the usb-guest when sending a packet,
 the usb-host will use the same id in its response packet, allowing
 the usb-guest to match responses to its original requests.

There are 2 types of packets:

1) control packets
2) data packets

Control packets are handled synchroneously inside the usb-host, it will hand
the request over to the host os and then *wait* for a response. The usb-host
will thus stop processing further packets. Where as for data packets the
usb-host hands them over to the host os with the request to let the usb-host
process know when there is a respone from the usb-device.

Note that control packets should only be send to the usb-host when no data
packets are pending on the device / interface / endpoint affected by the
control packet. Any pending data packets will get dropped, and any active
iso streams / allocated bulk streams will get stopped / free-ed.


Packet type list


control packets:
usb_redir_hello
usb_redir_reset
usb_redir_reset_status
usb_redir_set_configuration
usb_redir_get_configuration
usb_redir_configuration_status
usb_redir_set_alt_setting
usb_redir_get_alt_setting
usb_redir_alt_setting_status
usb_redir_start_iso_stream
usb_redir_stop_iso_stream
usb_redir_iso_stream_status
usb_redir_alloc_bulk_streams
usb_redir_free_bulk_streams
usb_redir_bulk_streams_status
usb_redir_cancel_data_packet

data packets:
usb_redir_control_packet
usb_redir_bulk_packet
usb_redir_iso_packet


usb_redir_hello
---

usb_redir_header.type:usb_redir_hello
usb_redir_header.length:  

struct usb_redir_hello_header {
char version[64];
uint32_t capabilities[0];
}

No pac

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Avi Kivity


On 02/03/2011 11:32 AM, Jan Kiszka wrote:

On 2011-02-03 09:18, Avi Kivity wrote:
>  On 02/02/2011 05:52 PM, Jan Kiszka wrote:

>>>   If there is no problem in the logic of this commit (and I do not see
>>>   one yet) then we somewhere miss kicking vcpu when interrupt, that should 
be
>>>   handled, arrives?
>>
>>  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>>  serializing. If the guest raises the tpr and then signals this with a
>>  succeeding, non vm-exiting instruction to the other vcpus, one of those
>>  could inject an interrupt with a higher priority than the previous tpr,
>>  but a lower one than current tpr. QEMU user space would accept this
>>  interrupt - and would likely surprise the guest. Do I miss something?
>
>  apic_get_interrupt() is only called from the vcpu thread, so it should
>  see a correct tpr.
>
>  The only difference I can see with the patch is that we may issue a
>  spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

I tested this yesterday, and it doesn't confuse Windows. It actually
receives quite a few spurious IRQs in normal operation, w/ or w/o the
kernel's tpr optimization.


I don't see why there should be any spurious interrupts in normal 
operation.  From the docs, these happen due to an INTA cycle racing with 
raising the TPR, but in ioapic mode, there shouldn't be any INTA cycles.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Marcelo Tosatti

On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
> On 2011-02-03 09:18, Avi Kivity wrote:
> > On 02/02/2011 05:52 PM, Jan Kiszka wrote:
> 
> >>>  If there is no problem in the logic of this commit (and I do not see
> >>>  one yet) then we somewhere miss kicking vcpu when interrupt, that should 
> >>> be
> >>>  handled, arrives?
> >>
> >> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
> >> serializing. If the guest raises the tpr and then signals this with a
> >> succeeding, non vm-exiting instruction to the other vcpus, one of those
> >> could inject an interrupt with a higher priority than the previous tpr,
> >> but a lower one than current tpr. QEMU user space would accept this
> >> interrupt - and would likely surprise the guest. Do I miss something?
> > 
> > apic_get_interrupt() is only called from the vcpu thread, so it should 
> > see a correct tpr.
> > 
> > The only difference I can see with the patch is that we may issue a 
> > spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
> 
> I tested this yesterday, and it doesn't confuse Windows. It actually
> receives quite a few spurious IRQs in normal operation, w/ or w/o the
> kernel's tpr optimization.

http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html

tpr of a vcpu should always be inspected in vcpu context, instead of 
iothread context?

Re: [Qemu-devel] [PING 0.14] Missing patches (mostly fixes)

2011-02-03 Thread Markus Armbruster

Also:

From: Gleb Natapov 
Subject: [Qemu-devel] [PATCH] do not pass NULL to strdup.
Date: Wed, 2 Feb 2011 17:34:34 +0200
Message-ID: <20110202153434.gp14...@redhat.com>

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka

On 2011-02-03 11:04, Marcelo Tosatti wrote:
> On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
>> On 2011-02-03 09:18, Avi Kivity wrote:
>>> On 02/02/2011 05:52 PM, Jan Kiszka wrote:
>>
>  If there is no problem in the logic of this commit (and I do not see
>  one yet) then we somewhere miss kicking vcpu when interrupt, that should 
> be
>  handled, arrives?

 I'm not yet confident about the logic of the kernel patch: mov to cr8 is
 serializing. If the guest raises the tpr and then signals this with a
 succeeding, non vm-exiting instruction to the other vcpus, one of those
 could inject an interrupt with a higher priority than the previous tpr,
 but a lower one than current tpr. QEMU user space would accept this
 interrupt - and would likely surprise the guest. Do I miss something?
>>>
>>> apic_get_interrupt() is only called from the vcpu thread, so it should 
>>> see a correct tpr.
>>>
>>> The only difference I can see with the patch is that we may issue a 
>>> spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
>>
>> I tested this yesterday, and it doesn't confuse Windows. It actually
>> receives quite a few spurious IRQs in normal operation, w/ or w/o the
>> kernel's tpr optimization.
> 
> http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html

Don't get the scenario yet: We do not inject (or set isr) over the
context of apic_set_irq caller.

> 
> tpr of a vcpu should always be inspected in vcpu context, instead of 
> iothread context?

Maybe this is true for the in-kernel model, but I don't see the issue
(anymore) for the way user space works.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka

On 2011-02-03 11:01, Avi Kivity wrote:
> On 02/03/2011 11:32 AM, Jan Kiszka wrote:
>> On 2011-02-03 09:18, Avi Kivity wrote:
>>>  On 02/02/2011 05:52 PM, Jan Kiszka wrote:
>>
>   If there is no problem in the logic of this commit (and I do not see
>   one yet) then we somewhere miss kicking vcpu when interrupt, that 
> should be
>   handled, arrives?

  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
  serializing. If the guest raises the tpr and then signals this with a
  succeeding, non vm-exiting instruction to the other vcpus, one of those
  could inject an interrupt with a higher priority than the previous tpr,
  but a lower one than current tpr. QEMU user space would accept this
  interrupt - and would likely surprise the guest. Do I miss something?
>>>
>>>  apic_get_interrupt() is only called from the vcpu thread, so it should
>>>  see a correct tpr.
>>>
>>>  The only difference I can see with the patch is that we may issue a
>>>  spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
>>
>> I tested this yesterday, and it doesn't confuse Windows. It actually
>> receives quite a few spurious IRQs in normal operation, w/ or w/o the
>> kernel's tpr optimization.
> 
> I don't see why there should be any spurious interrupts in normal 
> operation.  From the docs, these happen due to an INTA cycle racing with 
> raising the TPR, but in ioapic mode, there shouldn't be any INTA cycles.
> 

I added an instrumentation to the line of apic_get_interrupt that
returns the spurious vector, and it triggered fairly often. Just didn't
examined why this happens even without the tpr optimization.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: KVM call minutes for Feb 1

2011-02-03 Thread Marcelo Tosatti

On Tue, Feb 01, 2011 at 06:34:50PM +0100, Jan Kiszka wrote:
> On 2011-02-01 18:20, Anthony Liguori wrote:
> > On 02/01/2011 11:03 AM, Jan Kiszka wrote:
> >> On 2011-02-01 17:53, Anthony Liguori wrote:
> >>
> >>> On 02/01/2011 10:36 AM, Jan Kiszka wrote:
> >>>  
>  On 2011-02-01 16:54, Chris Wright wrote:
> 
> 
> > KVM upstream merge: status, plans, coordination
> > - Jan has a git tree, consolidating
> > - qemu-kvm io threading is still an issue
> > - Anthony wants to just merge
> > - concerns with non-x86 arch and merge
> > - concerns with big-bang patch merge and following stability
> > - post 0.14 conversion to glib mainloop, non-upstreamed qemu-kvm will be
> > a problem if it's not there by then
> > - testing and nuances are still an issue (e.g. stefan berger's mmio 
> > read issue)
> > - qemu-kvm still evolving, needs to get sync'd or it will keep diverging
> > - 2 implementations of main init, cpu init, Jan has merged them into one
> > - qemu-kvm-x86.c file that's only a few hundred lines
> > - review as one patch to see the fundamental difference
> >
> >  
>  More precisely, my current work flow is to pick some function(s), e.g.
>  kvm_cpu_exec/kvm_run, and start wondering "What needs to be done to
>  upstream so that qemu-kvm could use that implementation?". If they
>  differ, the reasons need to be understood and patched away, either by
>  fixing/enhancing upstream or simplifying qemu-kvm. Once the upstream
>  changes are merged back, a qemu-kvm patch is posted to switch to that
>  version.
> 
>  Any help will be welcome, either via review of my subtle regressions or
>  on resolving concrete differences.
> 
>  E.g. posix-aio-compat.c: Why does qemu-kvm differ here? If it's because
>  of its own iothread code, can we wrap that away or do we need to
>  consolidate the threading code first? Or do we need to fix something in
>  upstream?
> 
> 
> >>> I bet it's the eventfd thing.  It's arbitrary.  If you've got a small
> >>> diff post your series, I'd be happy to take a look at it and see what I
> >>> can explain.
> >>>
> >>>  
> >> Looks like it's around signalfd and its emulation:
> >>
> > 
> > I really meant the compatfd thing.
> > 
> > signalfd can't really be emulated properly so in upstream we switched to 
> > a pipe() which Avi didn't like.
> > 
> > But with glib, this all goes away anyway so we should just drop the 
> > qemu-kvm changes and use the upstream version.  Once we enable I/O 
> > thread in qemu.git, we no longer need to use signals for I/O completion 
> > which I think everyone would agree is a better solution.
> Don't understand: If we do not need SIGIO for AIO emulation in threaded
> mode, why wasn't that stubbed out already? If that helps reducing
> worries about the signalfd emulation (which is likely a non-issue anyway
> as anyone with serious workload should run a kernel with such support).

qemu-kvm has this modification for performance reasons.
SIGUSR2 can't be blocked otherwise. See example test case at
https://patchwork.kernel.org/patch/20817/.

Problem is that you can't block the AIO signal and process it via
signalfd because of synchronous IO emulation:

- submit io
- qemu_aio_wait

Since the aio signal is processed in main_loop_wait by the iothread, the
above deadlocks. To be more clear:

SIGUSR2 unblocked:
signal -> aio_signal_handler -> write(posix_fd)

SIGUSR2 blocked:
signal -> signalfd -> aio_signal_handler -> write(posix_fd)

It would be good to maintain this behaviour upstream, before switching
(can be selective on CONFIG_IOTHREAD), IMO.

RE: [Qemu-devel] Re: [PING 0.14] Missing patches (mostly fixes)

2011-02-03 Thread Laurent Vivier

>On Wed, Feb 02, 2011 at 08:28:15PM +0100, Stefan Weil wrote:
>> [PATCH] linux-user: Fix possible realloc memory leak  
>> (http://patchwork.ozlabs.org/patch/79217/)
>
>Looks ok for me.

And this one ?

linux-user: correct core dump format

http://patchwork.ozlabs.org/patch/78464/

Laurent

-- 
- laur...@vivier.eu  -
"Tout ce qui est impossible reste à accomplir"Jules Verne
"Things are only impossible until they're not" Jean-Luc Picard

[Qemu-devel] Re: KVM call minutes for Feb 1

2011-02-03 Thread Marcelo Tosatti

On Tue, Feb 01, 2011 at 12:53:36PM -0500, Christoph Hellwig wrote:
> On Tue, Feb 01, 2011 at 05:36:13PM +0100, Jan Kiszka wrote:
> > kvm_cpu_exec/kvm_run, and start wondering "What needs to be done to
> > upstream so that qemu-kvm could use that implementation?". If they
> > differ, the reasons need to be understood and patched away, either by
> > fixing/enhancing upstream or simplifying qemu-kvm. Once the upstream
> > changes are merged back, a qemu-kvm patch is posted to switch to that
> > version.
> 
> while I'm not an expert in that area I really like you're approach.  I'd
> really prefer to let you finish up all the major work that way before
> starting massive revamping like the glib main loop.  Resolving the
> qemu/qemu-kvm schisma surely is more important for the overall project
> than rewriting existing functionality to look a little nicer.

Agree.

Re: [Qemu-devel] [PATCH 01/20] qdev: add print_options callback

2011-02-03 Thread Alon Levy

Please ignore, duplicate of 01/20 already on the list. It took so long for
01/20 to appear on the list that I thought it was lost.

Alon

On Wed, Feb 02, 2011 at 11:46:07PM +0200, Alon Levy wrote:
> another callback added to PropertyInfo, for later use by PROP_TYPE_ENUM.
> Allows printing of runtime computed options when doing:
>  qemu -device foo,?
> ---
>  hw/qdev.c |   10 +-
>  hw/qdev.h |1 +
>  2 files changed, 10 insertions(+), 1 deletions(-)
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index c7fec44..0b2ad3d 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -187,7 +187,15 @@ int qdev_device_help(QemuOpts *opts)
>  if (!prop->info->parse) {
>  continue;   /* no way to set it, don't show */
>  }
> -error_printf("%s.%s=%s\n", info->name, prop->name, prop->info->name);
> +if (prop->info->print_options) {
> +char buf[256];
> +int ret;
> +ret = prop->info->print_options(info, prop, buf, sizeof(buf) - 
> 3);
> +error_printf("%s.%s=%s%s\n", info->name, prop->name, buf,
> +ret == sizeof(buf) - 3 ? "..." : "" );
> +} else {
> +error_printf("%s.%s=%s\n", info->name, prop->name, 
> prop->info->name);
> +}
>  }
>  return 1;
>  }
> diff --git a/hw/qdev.h b/hw/qdev.h
> index 9808f85..fa3221b 100644
> --- a/hw/qdev.h
> +++ b/hw/qdev.h
> @@ -109,6 +109,7 @@ struct PropertyInfo {
>  enum PropertyType type;
>  int (*parse)(DeviceState *dev, Property *prop, const char *str);
>  int (*print)(DeviceState *dev, Property *prop, char *dest, size_t len);
> +int (*print_options)(DeviceInfo *info, Property *prop, char *dest, 
> size_t len);
>  void (*free)(DeviceState *dev, Property *prop);
>  };
>  
> -- 
> 1.7.4
> 
>

[Qemu-devel] Re: [PATCH 7/7] ahci: work around bug with level interrupts

2011-02-03 Thread Jan Kiszka

On 2011-02-02 15:39, Alexander Graf wrote:
> When using level based interrupts, the interrupt is treated the same as an
> edge triggered one: leaving the line up does not retrigger the interrupt.
> 
> In fact, when not lowering the line, we won't ever get a new interrupt inside
> the guest. So let's always retrigger an interrupt as soon as the OS ack'ed
> something on the device. This way we're sure the guest doesn't starve on
> interrupts until someone fixes the actual interrupt path.
> 
> Signed-off-by: Alexander Graf 
> 
> ---
> 
> v2 -> v3:
> 
>   - add comment about the interrupt hack
> 
> v3 -> v4:
> 
>   - embed non-workaround version in the code
> ---
>  hw/ide/ahci.c |   13 +
>  1 files changed, 13 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 98bdf70..10377ca 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -152,12 +152,25 @@ static void ahci_check_irq(AHCIState *s)
>  }
>  }
>  
> +/* XXX We always lower the interrupt line here because of a bug with
> +   interrupt handling in Qemu. When leaving a line up, the interrupt
> +   does not get retriggered automatically currently. Once that bug is
> +   fixed, this workaround is not necessary anymore and we only need 
> to
> +   lower in the else branch. */
> +#if 0
>  if (s->control_regs.irqstatus &&
>  (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>  ahci_irq_raise(s, NULL);
>  } else {
>  ahci_irq_lower(s, NULL);
>  }
> +#else
> +ahci_irq_lower(s, NULL);
> +if (s->control_regs.irqstatus &&
> +(s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
> +ahci_irq_raise(s, NULL);
> +}
> +#endif
>  }
>  
>  static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,

Could you check if this quick-hack obsoletes the above hack?

I was hoping it has some influence on our 64-bit Windows issue, but it
hasn't, or it's still buggy, or both. However, its intention is to
reassert still pending level-triggered IRQs on APIC EOI. This logic is
missing in the user space model but exists in KVM's kernel model (I was
asking for a test against the latter but unfortunately did not receive
an answer - I bet you already don't need your patch over qemu-kvm).

Thanks,
Jan

---
 hw/apic.c   |9 ++---
 hw/ioapic.c |   43 +--
 hw/ioapic.h |   20 
 3 files changed, 67 insertions(+), 5 deletions(-)
 create mode 100644 hw/ioapic.h

diff --git a/hw/apic.c b/hw/apic.c
index ff581f0..05a115f 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -18,6 +18,7 @@
  */
 #include "hw.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -57,7 +58,8 @@
 
 #define ESR_ILLEGAL_ADDRESS (1 << 7)
 
-#define APIC_SV_ENABLE (1 << 8)
+#define APIC_SV_DIRECTED_IO (1<<12)
+#define APIC_SV_ENABLE  (1<<8)
 
 #define MAX_APICS 255
 #define MAX_APIC_WORDS 8
@@ -420,8 +422,9 @@ static void apic_eoi(APICState *s)
 if (isrv < 0)
 return;
 reset_bit(s->isr, isrv);
-/* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
-set the remote IRR bit for level triggered interrupts. */
+if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && get_bit(s->tmr, isrv)) {
+ioapic_eio_broadcast(isrv);
+}
 apic_update_irq(s);
 }
 
diff --git a/hw/ioapic.c b/hw/ioapic.c
index 2109568..1d23474 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -23,6 +23,7 @@
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -36,7 +37,10 @@
 #define DPRINTF(fmt, ...)
 #endif
 
-#define IOAPIC_LVT_MASKED  (1<<16)
+#define MAX_IOAPICS 1
+
+#define IOAPIC_LVT_MASKED   (1ULL << 16)
+#define IOAPIC_LVT_REMOTE_IRR   (1ULL << 14)
 
 #define IOAPIC_TRIGGER_EDGE0
 #define IOAPIC_TRIGGER_LEVEL   1
@@ -61,6 +65,8 @@ struct IOAPICState {
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
 };
 
+static IOAPICState *ioapics[MAX_IOAPICS];
+
 static void ioapic_service(IOAPICState *s)
 {
 uint8_t i;
@@ -83,8 +89,11 @@ static void ioapic_service(IOAPICState *s)
 dest_mode = (entry >> 11) & 1;
 delivery_mode = (entry >> 8) & 7;
 polarity = (entry >> 13) & 1;
-if (trig_mode == IOAPIC_TRIGGER_EDGE)
+if (trig_mode == IOAPIC_TRIGGER_EDGE) {
 s->irr &= ~mask;
+} else {
+s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
+}
 if (delivery_mode == IOAPIC_DM_EXTINT)
 vector = pic_read_irq(isa_pic);
 else
@@ -131,6 +140,29 @@ static void ioapic_set_irq(void *opaque, int vector, int 
level)
 }
 }
 
+void ioapic_eio_broadcast(int vector)
+{
+IOAPICState *s;
+

[Qemu-devel] Re: [PATCH 7/7] ahci: work around bug with level interrupts

2011-02-03 Thread Alexander Graf


On 03.02.2011, at 11:30, Jan Kiszka wrote:

> On 2011-02-02 15:39, Alexander Graf wrote:
>> When using level based interrupts, the interrupt is treated the same as an
>> edge triggered one: leaving the line up does not retrigger the interrupt.
>> 
>> In fact, when not lowering the line, we won't ever get a new interrupt inside
>> the guest. So let's always retrigger an interrupt as soon as the OS ack'ed
>> something on the device. This way we're sure the guest doesn't starve on
>> interrupts until someone fixes the actual interrupt path.
>> 
>> Signed-off-by: Alexander Graf 
>> 
>> ---
>> 
>> v2 -> v3:
>> 
>>  - add comment about the interrupt hack
>> 
>> v3 -> v4:
>> 
>>  - embed non-workaround version in the code
>> ---
>> hw/ide/ahci.c |   13 +
>> 1 files changed, 13 insertions(+), 0 deletions(-)
>> 
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>> index 98bdf70..10377ca 100644
>> --- a/hw/ide/ahci.c
>> +++ b/hw/ide/ahci.c
>> @@ -152,12 +152,25 @@ static void ahci_check_irq(AHCIState *s)
>> }
>> }
>> 
>> +/* XXX We always lower the interrupt line here because of a bug with
>> +   interrupt handling in Qemu. When leaving a line up, the interrupt
>> +   does not get retriggered automatically currently. Once that bug 
>> is
>> +   fixed, this workaround is not necessary anymore and we only need 
>> to
>> +   lower in the else branch. */
>> +#if 0
>> if (s->control_regs.irqstatus &&
>> (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>> ahci_irq_raise(s, NULL);
>> } else {
>> ahci_irq_lower(s, NULL);
>> }
>> +#else
>> +ahci_irq_lower(s, NULL);
>> +if (s->control_regs.irqstatus &&
>> +(s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>> +ahci_irq_raise(s, NULL);
>> +}
>> +#endif
>> }
>> 
>> static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,
> 
> Could you check if this quick-hack obsoletes the above hack?
> 
> I was hoping it has some influence on our 64-bit Windows issue, but it
> hasn't, or it's still buggy, or both. However, its intention is to
> reassert still pending level-triggered IRQs on APIC EOI. This logic is
> missing in the user space model but exists in KVM's kernel model (I was
> asking for a test against the latter but unfortunately did not receive
> an answer - I bet you already don't need your patch over qemu-kvm).

Oh, sorry for not replying there. I work on qemu.git, so the in-kernel apic is 
out of question for testing. I tried merging qemu-kvm.git and qemu.git several 
times and every time just wasted a few hours of my life, so I gave up on 
testing things on qemu-kvm.git.

The patch works though. If everybody agrees to take it, we can drop patch 7/7 
of my ahci set.


Alex

[Qemu-devel] Re: [PATCH 7/7] ahci: work around bug with level interrupts

2011-02-03 Thread Jan Kiszka

On 2011-02-03 11:38, Alexander Graf wrote:
> 
> On 03.02.2011, at 11:30, Jan Kiszka wrote:
> 
>> On 2011-02-02 15:39, Alexander Graf wrote:
>>> When using level based interrupts, the interrupt is treated the same as an
>>> edge triggered one: leaving the line up does not retrigger the interrupt.
>>>
>>> In fact, when not lowering the line, we won't ever get a new interrupt 
>>> inside
>>> the guest. So let's always retrigger an interrupt as soon as the OS ack'ed
>>> something on the device. This way we're sure the guest doesn't starve on
>>> interrupts until someone fixes the actual interrupt path.
>>>
>>> Signed-off-by: Alexander Graf 
>>>
>>> ---
>>>
>>> v2 -> v3:
>>>
>>>  - add comment about the interrupt hack
>>>
>>> v3 -> v4:
>>>
>>>  - embed non-workaround version in the code
>>> ---
>>> hw/ide/ahci.c |   13 +
>>> 1 files changed, 13 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>>> index 98bdf70..10377ca 100644
>>> --- a/hw/ide/ahci.c
>>> +++ b/hw/ide/ahci.c
>>> @@ -152,12 +152,25 @@ static void ahci_check_irq(AHCIState *s)
>>> }
>>> }
>>>
>>> +/* XXX We always lower the interrupt line here because of a bug with
>>> +   interrupt handling in Qemu. When leaving a line up, the 
>>> interrupt
>>> +   does not get retriggered automatically currently. Once that bug 
>>> is
>>> +   fixed, this workaround is not necessary anymore and we only 
>>> need to
>>> +   lower in the else branch. */
>>> +#if 0
>>> if (s->control_regs.irqstatus &&
>>> (s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>>> ahci_irq_raise(s, NULL);
>>> } else {
>>> ahci_irq_lower(s, NULL);
>>> }
>>> +#else
>>> +ahci_irq_lower(s, NULL);
>>> +if (s->control_regs.irqstatus &&
>>> +(s->control_regs.ghc & HOST_CTL_IRQ_EN)) {
>>> +ahci_irq_raise(s, NULL);
>>> +}
>>> +#endif
>>> }
>>>
>>> static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,
>>
>> Could you check if this quick-hack obsoletes the above hack?
>>
>> I was hoping it has some influence on our 64-bit Windows issue, but it
>> hasn't, or it's still buggy, or both. However, its intention is to
>> reassert still pending level-triggered IRQs on APIC EOI. This logic is
>> missing in the user space model but exists in KVM's kernel model (I was
>> asking for a test against the latter but unfortunately did not receive
>> an answer - I bet you already don't need your patch over qemu-kvm).
> 
> Oh, sorry for not replying there. I work on qemu.git, so the in-kernel apic 
> is out of question for testing. I tried merging qemu-kvm.git and qemu.git 
> several times and every time just wasted a few hours of my life, so I gave up 
> on testing things on qemu-kvm.git.

Ah, I see. Marcelo recently merged back, resolving the tricky bits, and
now it should be much easier. Anyway.

> 
> The patch works though. If everybody agrees to take it, we can drop patch 7/7 
> of my ahci set.

Cool! I've a few more minor things to clean up here and will sent that
as a series later today, also for 0.14.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [Bug 712416] [NEW] kvm_intel kernel module crash with via nano vmx

2011-02-03 Thread khetzal

Public bug reported:

kvm module for hardware virtualisation not work properly on via nano
processors.

Tested with processor: VIA Nano processor U2250.
Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce cx8 
apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall nx 
lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng rng_en 
ace ace_en ace2 phe phe_en lahf_lm

With kernel 2.6.32: kvm not work and dmesg contains a lot of:
handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

With kernel 2.6.35: all the system crash. Nothing visible in logs

** Affects: qemu
 Importance: Undecided
 Status: New

** Affects: kvm (Ubuntu)
 Importance: Undecided
 Status: New

** Affects: kvm (Debian)
 Importance: Undecided
 Status: New

** Also affects: ubuntu
   Importance: Undecided
   Status: New

** Package changed: ubuntu => kvm (Ubuntu)

** Also affects: kvm (Debian)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712416

Title:
  kvm_intel kernel module crash with via nano vmx

Status in QEMU:
  New
Status in “kvm” package in Ubuntu:
  New
Status in “kvm” package in Debian:
  New

Bug description:
  kvm module for hardware virtualisation not work properly on via nano
  processors.

  Tested with processor: VIA Nano processor U2250.
  Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce 
cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall 
nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng 
rng_en ace ace_en ace2 phe phe_en lahf_lm

  With kernel 2.6.32: kvm not work and dmesg contains a lot of:
  handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

  With kernel 2.6.35: all the system crash. Nothing visible in logs

[Qemu-devel] Re: KVM call minutes for Feb 1

2011-02-03 Thread Paolo Bonzini


On 02/01/2011 06:53 PM, Christoph Hellwig wrote:

I'd really prefer to let you finish up all the major work that way
before starting massive revamping like the glib main loop.


Yes, the glib main loop is not going to go anywhere if it cannot be 
applied to both qemu and qemu-kvm.


(And, I believe a new main loop is a necessary condition to show that 
glib is bringing benefits.  It's basically impossible to interface with 
external libraries as long as we have our own main loop).


Paolo

Re: [Qemu-devel] [ANNOUNCE] QEMU 0.14.0-rc0.tar.gz is available

2011-02-03 Thread Miguel Di Ciurcio Filho

On Wed, Feb 2, 2011 at 11:27 PM, Anthony Liguori
 wrote:
>
> An in progress detailed change log is available at
> http://wiki.qemu.org/Changelog/0.14
>

I still need some help to list the changes, bug fixes and improvements
for i386/x86_64.

- What is the status of nested SVM/VMX?
- Are we using any new feature from the KVM module?
- Any notable performance improvement?
- Some feature of qemu-kvm has been merged?
- Anything you think might be worth mentioning on the Changelog

Just point me to the commits and I will read the log and track any
discussion on the mailing list.

I'm CCing everybody that I suppose could give me a hand :-)

Regards,

Miguel

[Qemu-devel] Re: [PATCH 0.14] savevm: fix corruption in vmstate_subsection_load().

2011-02-03 Thread Juan Quintela

Yoshiaki Tamura  wrote:
> Although it's rare to happen in live migration, when the head of a
> byte stream contains 0x05 which is the marker of subsection, the
> loader gets corrupted because vmstate_subsection_load() continues even
> the device doesn't require it.  This patch adds a checker whether
> subsection is needed, and skips following routines if not needed.

This only fixes "partially" the problem :(

I agree with bonzini that it is better than what we had until we get the
time to try a better aproach.  I was thinking about only allowing
subsections at the top level, but that still don't improve things, would
have to think more about the problem.

Reviewed-by: Juan Quintela 

> Signed-off-by: Yoshiaki Tamura 
> Acked-by: Paolo Bonzini 
> ---
>  savevm.c |   10 +-
>  1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/savevm.c b/savevm.c
> index 4453217..6d83b0f 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -1638,6 +1638,12 @@ static const VMStateDescription 
> *vmstate_get_subsection(const VMStateSubsection
>  static int vmstate_subsection_load(QEMUFile *f, const VMStateDescription 
> *vmsd,
> void *opaque)
>  {
> +const VMStateSubsection *sub = vmsd->subsections;
> +
> +if (!sub || !sub->needed) {
> +return 0;
> +}
> +
>  while (qemu_peek_byte(f) == QEMU_VM_SUBSECTION) {
>  char idstr[256];
>  int ret;
> @@ -1650,10 +1656,11 @@ static int vmstate_subsection_load(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  idstr[len] = 0;
>  version_id = qemu_get_be32(f);
>  
> -sub_vmsd = vmstate_get_subsection(vmsd->subsections, idstr);
> +sub_vmsd = vmstate_get_subsection(sub, idstr);
>  if (sub_vmsd == NULL) {
>  return -ENOENT;
>  }
> +assert(!sub_vmsd->subsections);
>  ret = vmstate_load_state(f, sub_vmsd, opaque, version_id);
>  if (ret) {
>  return ret;
> @@ -1677,6 +1684,7 @@ static void vmstate_subsection_save(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  qemu_put_byte(f, len);
>  qemu_put_buffer(f, (uint8_t *)vmsd->name, len);
>  qemu_put_be32(f, vmsd->version_id);
> +assert(!vmsd->subsections);
>  vmstate_save_state(f, vmsd, opaque);
>  }
>  sub++;

[Qemu-devel] [Bug 712416] Re: kvm_intel kernel module crash with via nano vmx

2011-02-03 Thread Matthieu CERDA

I confirm that. Same dmesg log all over the place trying to use KVM with
a VIA Nano U2250.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712416

Title:
  kvm_intel kernel module crash with via nano vmx

Status in QEMU:
  New
Status in “kvm” package in Ubuntu:
  New
Status in “kvm” package in Debian:
  New

Bug description:
  kvm module for hardware virtualisation not work properly on via nano
  processors.

  Tested with processor: VIA Nano processor U2250.
  Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce 
cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall 
nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng 
rng_en ace ace_en ace2 phe phe_en lahf_lm

  With kernel 2.6.32: kvm not work and dmesg contains a lot of:
  handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

  With kernel 2.6.35: all the system crash. Nothing visible in logs

Re: [Qemu-devel] Re: KVM call minutes for Feb 1

2011-02-03 Thread Anthony Liguori


On 02/03/2011 04:11 AM, Marcelo Tosatti wrote:

On Tue, Feb 01, 2011 at 06:34:50PM +0100, Jan Kiszka wrote:
   

On 2011-02-01 18:20, Anthony Liguori wrote:
 

On 02/01/2011 11:03 AM, Jan Kiszka wrote:
   

On 2011-02-01 17:53, Anthony Liguori wrote:

 

On 02/01/2011 10:36 AM, Jan Kiszka wrote:

   

On 2011-02-01 16:54, Chris Wright wrote:


 

KVM upstream merge: status, plans, coordination
- Jan has a git tree, consolidating
- qemu-kvm io threading is still an issue
- Anthony wants to just merge
 - concerns with non-x86 arch and merge
 - concerns with big-bang patch merge and following stability
- post 0.14 conversion to glib mainloop, non-upstreamed qemu-kvm will be
 a problem if it's not there by then
- testing and nuances are still an issue (e.g. stefan berger's mmio read issue)
- qemu-kvm still evolving, needs to get sync'd or it will keep diverging
- 2 implementations of main init, cpu init, Jan has merged them into one
 - qemu-kvm-x86.c file that's only a few hundred lines
- review as one patch to see the fundamental difference


   

More precisely, my current work flow is to pick some function(s), e.g.
kvm_cpu_exec/kvm_run, and start wondering "What needs to be done to
upstream so that qemu-kvm could use that implementation?". If they
differ, the reasons need to be understood and patched away, either by
fixing/enhancing upstream or simplifying qemu-kvm. Once the upstream
changes are merged back, a qemu-kvm patch is posted to switch to that
version.

Any help will be welcome, either via review of my subtle regressions or
on resolving concrete differences.

E.g. posix-aio-compat.c: Why does qemu-kvm differ here? If it's because
of its own iothread code, can we wrap that away or do we need to
consolidate the threading code first? Or do we need to fix something in
upstream?


 

I bet it's the eventfd thing.  It's arbitrary.  If you've got a small
diff post your series, I'd be happy to take a look at it and see what I
can explain.


   

Looks like it's around signalfd and its emulation:

 

I really meant the compatfd thing.

signalfd can't really be emulated properly so in upstream we switched to
a pipe() which Avi didn't like.

But with glib, this all goes away anyway so we should just drop the
qemu-kvm changes and use the upstream version.  Once we enable I/O
thread in qemu.git, we no longer need to use signals for I/O completion
which I think everyone would agree is a better solution.
   

Don't understand: If we do not need SIGIO for AIO emulation in threaded
mode, why wasn't that stubbed out already? If that helps reducing
worries about the signalfd emulation (which is likely a non-issue anyway
as anyone with serious workload should run a kernel with such support).
 

qemu-kvm has this modification for performance reasons.
SIGUSR2 can't be blocked otherwise. See example test case at
https://patchwork.kernel.org/patch/20817/.
   


That test-case is not realistic.  That's 10k signals per second.  With 
batching, we're at an I/O op rate that we're not even close to today.  I 
can guarantee that you won't find a real workload were you can actually 
measure the difference.


And keep in mind, the signal notification should go away so having this 
change in qemu-kvm really doesn't make sense.


Regards,

Anthony Liguori

[Qemu-devel] [PATCH v2 2/3] Correct alarm deadline computation

2011-02-03 Thread Paolo Bonzini

When the QEMU_CLOCK_HOST clock was added, computation of its
deadline was added to qemu_next_deadline, which is correct but
incomplete.

I noticed this by reading the very convoluted rules whereby
qemu_next_deadline_dyntick is computed, which miss QEMU_CLOCK_HOST
when use_icount is true.  This patch inlines qemu_next_deadline
into qemu_next_deadline_dyntick, and then corrects the logic to skip
only QEMU_CLOCK_VIRTUAL when use_icount is true.

Signed-off-by: Paolo Bonzini 
Cc: Jan Kiszka 
---
 qemu-timer.c |   15 +++
 1 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index 60283a8..c19d0a2 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -724,11 +724,18 @@ static uint64_t qemu_next_deadline_dyntick(void)
 int64_t delta;
 int64_t rtdelta;
 
-if (use_icount)
+if (!use_icount && active_timers[QEMU_CLOCK_VIRTUAL]) {
+delta = active_timers[QEMU_CLOCK_VIRTUAL]->expire_time -
+ qemu_get_clock(vm_clock);
+} else {
 delta = INT32_MAX;
-else
-delta = qemu_next_deadline();
-
+}
+if (active_timers[QEMU_CLOCK_HOST]) {
+int64_t hdelta = active_timers[QEMU_CLOCK_HOST]->expire_time -
+ qemu_get_clock_ns(host_clock);
+if (hdelta < delta)
+delta = hdelta;
+}
 if (active_timers[QEMU_CLOCK_REALTIME]) {
 rtdelta = (active_timers[QEMU_CLOCK_REALTIME]->expire_time * 100 -
  qemu_get_clock_ns(rt_clock));
-- 
1.7.3.4

[Qemu-devel] [PATCH v2 0/3] Simplify and fix alarm deadline computation

2011-02-03 Thread Paolo Bonzini

This is a rebased and updated version of the series to fix alarm
deadline computation.  I adopted Aurelien's suggestion to change
everything to nanoseconds.  I also did more testing than just
compiling, by booting with -icount.

And it wasn't enough.  Patch 1 was broken due to a mistake in converting
to QEMU_CLOCK_REALTIME to nanoseconds (in patch 1).  I found it just by
inspection, but at this point I believe this should be baked in 0.15
first before backporting.

v1->v2
Add patch 1

v2->v3
Fix patch 1

Paolo Bonzini (3):
  use nanoseconds everywhere for timeout computation
  Correct alarm deadline computation
  Unify alarm deadline computation

 qemu-timer.c |   65 +
 1 files changed, 33 insertions(+), 32 deletions(-)

-- 
1.7.3.4

[Qemu-devel] [PATCH v2 3/3] Unify alarm deadline computation

2011-02-03 Thread Paolo Bonzini

This patch shows how using the correct formula for
qemu_next_deadline_dyntick can simplify the code of
host_alarm_handler and eliminate useless duplication.

Signed-off-by: Paolo Bonzini 
---
 qemu-timer.c |   28 +++-
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index c19d0a2..d3c426c 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -635,6 +635,8 @@ void qemu_run_all_timers(void)
 qemu_run_timers(host_clock);
 }
 
+static int64_t qemu_next_alarm_deadline(void);
+
 #ifdef _WIN32
 static void CALLBACK host_alarm_handler(UINT uTimerID, UINT uMsg,
 DWORD_PTR dwUser, DWORD_PTR dw1,
@@ -677,14 +679,7 @@ static void host_alarm_handler(int host_signum)
 }
 #endif
 if (alarm_has_dynticks(t) ||
-(!use_icount &&
-qemu_timer_expired(active_timers[QEMU_CLOCK_VIRTUAL],
-   qemu_get_clock(vm_clock))) ||
-qemu_timer_expired(active_timers[QEMU_CLOCK_REALTIME],
-   qemu_get_clock(rt_clock)) ||
-qemu_timer_expired(active_timers[QEMU_CLOCK_HOST],
-   qemu_get_clock(host_clock))) {
-
+qemu_next_alarm_deadline () <= 0) {
 t->expired = alarm_has_dynticks(t);
 t->pending = 1;
 qemu_notify_event();
@@ -715,11 +710,7 @@ int64_t qemu_next_deadline(void)
 
 #ifndef _WIN32
 
-#if defined(__linux__)
-
-#define RTC_FREQ 1024
-
-static uint64_t qemu_next_deadline_dyntick(void)
+static int64_t qemu_next_alarm_deadline(void)
 {
 int64_t delta;
 int64_t rtdelta;
@@ -743,12 +734,13 @@ static uint64_t qemu_next_deadline_dyntick(void)
 delta = rtdelta;
 }
 
-if (delta < MIN_TIMER_REARM_NS)
-delta = MIN_TIMER_REARM_NS;
-
 return delta;
 }
 
+#if defined(__linux__)
+
+#define RTC_FREQ 1024
+
 static void enable_sigio_timer(int fd)
 {
 struct sigaction act;
@@ -903,7 +895,9 @@ static void dynticks_rearm_timer(struct qemu_alarm_timer *t)
 !active_timers[QEMU_CLOCK_HOST])
 return;
 
-nearest_delta_ns = qemu_next_deadline_dyntick();
+nearest_delta_ns = qemu_next_alarm_deadline();
+if (nearest_delta_ns < MIN_TIMER_REARM_NS)
+nearest_delta_ns = MIN_TIMER_REARM_NS;
 
 /* check whether a timer is already running */
 if (timer_gettime(host_timer, &timeout)) {
-- 
1.7.3.4

[Qemu-devel] [PATCH v2 1/3] use nanoseconds everywhere for timeout computation

2011-02-03 Thread Paolo Bonzini

Suggested by Aurelien Jarno.

Signed-off-by: Paolo Bonzini 
---
 qemu-timer.c |   30 +++---
 1 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/qemu-timer.c b/qemu-timer.c
index db1ec49..60283a8 100644
--- a/qemu-timer.c
+++ b/qemu-timer.c
@@ -197,8 +197,8 @@ static void qemu_rearm_alarm_timer(struct qemu_alarm_timer 
*t)
 t->rearm(t);
 }
 
-/* TODO: MIN_TIMER_REARM_US should be optimized */
-#define MIN_TIMER_REARM_US 250
+/* TODO: MIN_TIMER_REARM_NS should be optimized */
+#define MIN_TIMER_REARM_NS 25
 
 #ifdef _WIN32
 
@@ -698,11 +698,11 @@ int64_t qemu_next_deadline(void)
 
 if (active_timers[QEMU_CLOCK_VIRTUAL]) {
 delta = active_timers[QEMU_CLOCK_VIRTUAL]->expire_time -
- qemu_get_clock(vm_clock);
+ qemu_get_clock_ns(vm_clock);
 }
 if (active_timers[QEMU_CLOCK_HOST]) {
 int64_t hdelta = active_timers[QEMU_CLOCK_HOST]->expire_time -
- qemu_get_clock(host_clock);
+ qemu_get_clock_ns(host_clock);
 if (hdelta < delta)
 delta = hdelta;
 }
@@ -727,17 +727,17 @@ static uint64_t qemu_next_deadline_dyntick(void)
 if (use_icount)
 delta = INT32_MAX;
 else
-delta = (qemu_next_deadline() + 999) / 1000;
+delta = qemu_next_deadline();
 
 if (active_timers[QEMU_CLOCK_REALTIME]) {
-rtdelta = (active_timers[QEMU_CLOCK_REALTIME]->expire_time -
- qemu_get_clock(rt_clock))*1000;
+rtdelta = (active_timers[QEMU_CLOCK_REALTIME]->expire_time * 100 -
+ qemu_get_clock_ns(rt_clock));
 if (rtdelta < delta)
 delta = rtdelta;
 }
 
-if (delta < MIN_TIMER_REARM_US)
-delta = MIN_TIMER_REARM_US;
+if (delta < MIN_TIMER_REARM_NS)
+delta = MIN_TIMER_REARM_NS;
 
 return delta;
 }
@@ -887,8 +887,8 @@ static void dynticks_rearm_timer(struct qemu_alarm_timer *t)
 {
 timer_t host_timer = (timer_t)(long)t->priv;
 struct itimerspec timeout;
-int64_t nearest_delta_us = INT64_MAX;
-int64_t current_us;
+int64_t nearest_delta_ns = INT64_MAX;
+int64_t current_ns;
 
 assert(alarm_has_dynticks(t));
 if (!active_timers[QEMU_CLOCK_REALTIME] &&
@@ -896,7 +896,7 @@ static void dynticks_rearm_timer(struct qemu_alarm_timer *t)
 !active_timers[QEMU_CLOCK_HOST])
 return;
 
-nearest_delta_us = qemu_next_deadline_dyntick();
+nearest_delta_ns = qemu_next_deadline_dyntick();
 
 /* check whether a timer is already running */
 if (timer_gettime(host_timer, &timeout)) {
@@ -904,14 +904,14 @@ static void dynticks_rearm_timer(struct qemu_alarm_timer 
*t)
 fprintf(stderr, "Internal timer error: aborting\n");
 exit(1);
 }
-current_us = timeout.it_value.tv_sec * 100 + 
timeout.it_value.tv_nsec/1000;
-if (current_us && current_us <= nearest_delta_us)
+current_ns = timeout.it_value.tv_sec * 10LL + 
timeout.it_value.tv_nsec;
+if (current_ns && current_ns <= nearest_delta_ns)
 return;
 
 timeout.it_interval.tv_sec = 0;
 timeout.it_interval.tv_nsec = 0; /* 0 for one-shot timer */
-timeout.it_value.tv_sec =  nearest_delta_us / 100;
-timeout.it_value.tv_nsec = (nearest_delta_us % 100) * 1000;
+timeout.it_value.tv_sec =  nearest_delta_ns / 10;
+timeout.it_value.tv_nsec = nearest_delta_ns % 10;
 if (timer_settime(host_timer, 0 /* RELATIVE */, &timeout, NULL)) {
 perror("settime");
 fprintf(stderr, "Internal timer error: aborting\n");
-- 
1.7.3.4

[Qemu-devel] Help initializing a hw module

2011-02-03 Thread Juan Antonio Moya Vicén

Hi list,
I'm new to qemu, and developing a new hw module. And I'll appreciate any
help or link any of you can provide relating to the following problem
I'm experiencing, that I'm certain it is because of inexperience:


I'm getting ENODEV error codes each time I load the kernel module for my
specific device.

** I have a guest OS Linux. The hw I'm emulating is a stupid device,
whose kernel module tries to find at a certain IO mem position: 0xFED4
The code in the kernel module that makes the call:
 ioremap(0xFED4, 0x5000);
 if(ioread8(0xFED4) == 0)
return -ENODEV;

** My device initialization code in qemu:

static uint32_t my_access_read(void* opaque, target_phys_addr_t addr) {
return (uint32_t) 0xFF;
}
static CPUReadMemoryFunc* const my_access_read_str[3] = {
my_access_read, NULL, NULL };

void my_register_function(void) {
int addr = cpu_register_io_memory(my_access_read_str, NULL, s);
cpu_register_physical_memory(0xFED4, 0x5000, addr);
}


** I do know that Qemu is calling and executing "my_register_function()"
** I do know that the guest OS kernel module is executing the two lines
I pasted before.

But looks like "my_access_read()" function gets never called. Do you
guys have any idea how to fix this mess?
Thanks in advance


-- 


*Juan Antonio Moya*

Scientific Developer, Securiforest
Sant Joan de La Salle 42, 4th floor
08022 Barcelona (Spain)
Email: j...@securiforest.com 
Web:   http://www.securiforest.com



/ /!\ Disclaimer -- The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended
recipient, please notify the sender immediately and do not disclose the
contents to any other person, use it for any purpose, or store or copy the
information by any means./

[Qemu-devel] x86: No 64-bit Windows bootable in emulations mode

2011-02-03 Thread Jan Kiszka

Just a heads up in case someone want's to run Windows 7 x64 on an
Android phone or some other crazy host without x86 KVM support:

Our current emulation is not able to boot any 64-bit Windows version I
found. Already the installations DVDs bail out with STOP 0x005D,
which means something like "hardware not supported". I tried the various
64-bit CPU types, but none works. Don't know if its just a missing
feature bit, a missing instruction emulation, or some other emulation bug.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Gleb Natapov

On Thu, Feb 03, 2011 at 11:11:23AM +0100, Jan Kiszka wrote:
> On 2011-02-03 11:04, Marcelo Tosatti wrote:
> > On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
> >> On 2011-02-03 09:18, Avi Kivity wrote:
> >>> On 02/02/2011 05:52 PM, Jan Kiszka wrote:
> >>
> >  If there is no problem in the logic of this commit (and I do not see
> >  one yet) then we somewhere miss kicking vcpu when interrupt, that 
> > should be
> >  handled, arrives?
> 
>  I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>  serializing. If the guest raises the tpr and then signals this with a
>  succeeding, non vm-exiting instruction to the other vcpus, one of those
>  could inject an interrupt with a higher priority than the previous tpr,
>  but a lower one than current tpr. QEMU user space would accept this
>  interrupt - and would likely surprise the guest. Do I miss something?
> >>>
> >>> apic_get_interrupt() is only called from the vcpu thread, so it should 
> >>> see a correct tpr.
> >>>
> >>> The only difference I can see with the patch is that we may issue a 
> >>> spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
> >>
> >> I tested this yesterday, and it doesn't confuse Windows. It actually
> >> receives quite a few spurious IRQs in normal operation, w/ or w/o the
> >> kernel's tpr optimization.
> > 
> > http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html
> 
> Don't get the scenario yet: We do not inject (or set isr) over the
> context of apic_set_irq caller.
> 
> > 
> > tpr of a vcpu should always be inspected in vcpu context, instead of 
> > iothread context?
> 
> Maybe this is true for the in-kernel model, but I don't see the issue
> (anymore) for the way user space works.
> 
With patch below I can boot Windows7.

diff --git a/hw/apic.c b/hw/apic.c
index 146deca..fdcac88 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
 intno = get_highest_priority_int(s->irr);
 if (intno < 0)
 return -1;
-if (s->tpr && intno <= s->tpr)
+if ((s->tpr >> 4) && (intno >> 4) <= (s->tpr >> 4))
 return s->spurious_vec & 0xff;
 reset_bit(s->irr, intno);
 set_bit(s->isr, intno);
--
Gleb.

Re: [Qemu-devel] Help initializing a hw module

2011-02-03 Thread Juan Antonio Moya Vicén

I'm sorry I made a typo.

The code in the kernel module that makes the call REALLY IS:
 iobase = ioremap(0xFED4, 0x5000);
 if(ioread8(iobase) == 0)
return -ENODEV;



On 03/02/11 14:58, Juan Antonio Moya Vicén wrote:
> Hi list,
> I'm new to qemu, and developing a new hw module. And I'll appreciate any
> help or link any of you can provide relating to the following problem
> I'm experiencing, that I'm certain it is because of inexperience:
>
>
> I'm getting ENODEV error codes each time I load the kernel module for my
> specific device.
>
> ** I have a guest OS Linux. The hw I'm emulating is a stupid device,
> whose kernel module tries to find at a certain IO mem position: 0xFED4
> The code in the kernel module that makes the call:
>  ioremap(0xFED4, 0x5000);
>  if(ioread8(0xFED4) == 0)
> return -ENODEV;
>
> ** My device initialization code in qemu:
>
> static uint32_t my_access_read(void* opaque, target_phys_addr_t addr) {
> return (uint32_t) 0xFF;
> }
> static CPUReadMemoryFunc* const my_access_read_str[3] = {
> my_access_read, NULL, NULL };
>
> void my_register_function(void) {
> int addr = cpu_register_io_memory(my_access_read_str, NULL, s);
> cpu_register_physical_memory(0xFED4, 0x5000, addr);
> }
>
>
> ** I do know that Qemu is calling and executing "my_register_function()"
> ** I do know that the guest OS kernel module is executing the two lines
> I pasted before.
>
> But looks like "my_access_read()" function gets never called. Do you
> guys have any idea how to fix this mess?
> Thanks in advance
>
>
>   

-- 


*Juan Antonio Moya*

Scientific Developer, Securiforest
Sant Joan de La Salle 42, 4th floor
08022 Barcelona (Spain)
Email: j...@securiforest.com 
Web:   http://www.securiforest.com



/ /!\ Disclaimer -- The contents of this email and any attachments are
confidential and may also be privileged. If you are not the intended
recipient, please notify the sender immediately and do not disclose the
contents to any other person, use it for any purpose, or store or copy the
information by any means./

Re: [Qemu-devel] KVM: Windows 64-bit troubles with user space irqchip

2011-02-03 Thread Jan Kiszka

On 2011-02-03 15:15, Gleb Natapov wrote:
> On Thu, Feb 03, 2011 at 11:11:23AM +0100, Jan Kiszka wrote:
>> On 2011-02-03 11:04, Marcelo Tosatti wrote:
>>> On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
 On 2011-02-03 09:18, Avi Kivity wrote:
> On 02/02/2011 05:52 PM, Jan Kiszka wrote:

>>>  If there is no problem in the logic of this commit (and I do not see
>>>  one yet) then we somewhere miss kicking vcpu when interrupt, that 
>>> should be
>>>  handled, arrives?
>>
>> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
>> serializing. If the guest raises the tpr and then signals this with a
>> succeeding, non vm-exiting instruction to the other vcpus, one of those
>> could inject an interrupt with a higher priority than the previous tpr,
>> but a lower one than current tpr. QEMU user space would accept this
>> interrupt - and would likely surprise the guest. Do I miss something?
>
> apic_get_interrupt() is only called from the vcpu thread, so it should 
> see a correct tpr.
>
> The only difference I can see with the patch is that we may issue a 
> spurious cpu_interrupt().  But that shouldn't do anything bad, should it?

 I tested this yesterday, and it doesn't confuse Windows. It actually
 receives quite a few spurious IRQs in normal operation, w/ or w/o the
 kernel's tpr optimization.
>>>
>>> http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html
>>
>> Don't get the scenario yet: We do not inject (or set isr) over the
>> context of apic_set_irq caller.
>>
>>>
>>> tpr of a vcpu should always be inspected in vcpu context, instead of 
>>> iothread context?
>>
>> Maybe this is true for the in-kernel model, but I don't see the issue
>> (anymore) for the way user space works.
>>
> With patch below I can boot Windows7.
> 
> diff --git a/hw/apic.c b/hw/apic.c
> index 146deca..fdcac88 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
>  intno = get_highest_priority_int(s->irr);
>  if (intno < 0)
>  return -1;
> -if (s->tpr && intno <= s->tpr)
> +if ((s->tpr >> 4) && (intno >> 4) <= (s->tpr >> 4))
>  return s->spurious_vec & 0xff;
>  reset_bit(s->irr, intno);
>  set_bit(s->isr, intno);
> --
>   Gleb.

Cool, /me too. I would just suggest

diff --git a/hw/apic.c b/hw/apic.c
index 05a115f..13bd7b4 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -582,6 +582,7 @@ int apic_get_interrupt(DeviceState *d)
 {
 APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
 int intno;
+int tpr;
 
 /* if the APIC is installed or enabled, we let the 8259 handle the
IRQs */
@@ -594,8 +595,10 @@ int apic_get_interrupt(DeviceState *d)
 intno = get_highest_priority_int(s->irr);
 if (intno < 0)
 return -1;
-if (s->tpr && intno <= s->tpr)
+tpr = s->tpr >> 4;
+if (tpr && (intno >> 4) <= tpr) {
 return s->spurious_vec & 0xff;
+}
 reset_bit(s->irr, intno);
 set_bit(s->isr, intno);
 apic_update_irq(s);


Unfortunately, that issue was not related to the emulation mode
problems of QEMU.

Thanks!
Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] checkpatch.pl complains about untouched tab'ed lines

2011-02-03 Thread Jan Kiszka

Hi,

just a note, I can't look into this ATM:

I get complaints from checkpatch when my patch includes lines with tabs
only in the hunk's untouched context. Maybe someone feels like fixing this.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: KVM call minutes for Feb 1

2011-02-03 Thread Anthony Liguori


On 02/03/2011 06:36 AM, Paolo Bonzini wrote:

On 02/01/2011 06:53 PM, Christoph Hellwig wrote:

I'd really prefer to let you finish up all the major work that way
before starting massive revamping like the glib main loop.


Yes, the glib main loop is not going to go anywhere if it cannot be 
applied to both qemu and qemu-kvm.


(And, I believe a new main loop is a necessary condition to show that 
glib is bringing benefits.  It's basically impossible to interface 
with external libraries as long as we have our own main loop).


I agree (and have been repeatedly making the point) that we can't do 
something as drastic as changing the main loop without resolving the I/O 
thread differences in qemu-kvm.  Really, any deep threading changes 
require that we resolve this which makes this priority #1 for the 0.15 
release.


But virtio-9p needs a threadlet mechanism to make any forward progress.  
I don't want to roll out our own so I think we need to pull glib into 
the build ASAP to enable this.  But we should avoid major refactoring of 
the main loop until the two I/O thread implementations are merged.


Regards,

Anthony Liguori


Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Qemu-devel] [0.14?][PATCH 1/4] ioapic: Implement EOI handling for level-triggered IRQs

2011-02-03 Thread Jan Kiszka

Add the missing EOI broadcast from local APIC to the IOAPICs on
completion of level-triggered IRQs. This ensures that a still asserted
IRQ source properly re-triggers an APIC IRQ.

Signed-off-by: Jan Kiszka 
---
 hw/apic.c   |9 ++---
 hw/ioapic.c |   43 +--
 hw/ioapic.h |   20 
 3 files changed, 67 insertions(+), 5 deletions(-)
 create mode 100644 hw/ioapic.h

diff --git a/hw/apic.c b/hw/apic.c
index ff581f0..05a115f 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -18,6 +18,7 @@
  */
 #include "hw.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -57,7 +58,8 @@
 
 #define ESR_ILLEGAL_ADDRESS (1 << 7)
 
-#define APIC_SV_ENABLE (1 << 8)
+#define APIC_SV_DIRECTED_IO (1<<12)
+#define APIC_SV_ENABLE  (1<<8)
 
 #define MAX_APICS 255
 #define MAX_APIC_WORDS 8
@@ -420,8 +422,9 @@ static void apic_eoi(APICState *s)
 if (isrv < 0)
 return;
 reset_bit(s->isr, isrv);
-/* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
-set the remote IRR bit for level triggered interrupts. */
+if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && get_bit(s->tmr, isrv)) {
+ioapic_eio_broadcast(isrv);
+}
 apic_update_irq(s);
 }
 
diff --git a/hw/ioapic.c b/hw/ioapic.c
index 2109568..443c579 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -23,6 +23,7 @@
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -36,7 +37,10 @@
 #define DPRINTF(fmt, ...)
 #endif
 
-#define IOAPIC_LVT_MASKED  (1<<16)
+#define MAX_IOAPICS 1
+
+#define IOAPIC_LVT_MASKED   (1 << 16)
+#define IOAPIC_LVT_REMOTE_IRR   (1 << 14)
 
 #define IOAPIC_TRIGGER_EDGE0
 #define IOAPIC_TRIGGER_LEVEL   1
@@ -61,6 +65,8 @@ struct IOAPICState {
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
 };
 
+static IOAPICState *ioapics[MAX_IOAPICS];
+
 static void ioapic_service(IOAPICState *s)
 {
 uint8_t i;
@@ -83,8 +89,11 @@ static void ioapic_service(IOAPICState *s)
 dest_mode = (entry >> 11) & 1;
 delivery_mode = (entry >> 8) & 7;
 polarity = (entry >> 13) & 1;
-if (trig_mode == IOAPIC_TRIGGER_EDGE)
+if (trig_mode == IOAPIC_TRIGGER_EDGE) {
 s->irr &= ~mask;
+} else {
+s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
+}
 if (delivery_mode == IOAPIC_DM_EXTINT)
 vector = pic_read_irq(isa_pic);
 else
@@ -131,6 +140,29 @@ static void ioapic_set_irq(void *opaque, int vector, int 
level)
 }
 }
 
+void ioapic_eio_broadcast(int vector)
+{
+IOAPICState *s;
+uint64_t entry;
+int i, n;
+
+for (i = 0; i < MAX_IOAPICS; i++) {
+s = ioapics[i];
+if (!s) {
+continue;
+}
+for (n = 0; n < IOAPIC_NUM_PINS; n++) {
+entry = s->ioredtbl[n];
+if ((entry & IOAPIC_LVT_REMOTE_IRR) && (entry & 0xff) == vector) {
+s->ioredtbl[n] = entry & ~IOAPIC_LVT_REMOTE_IRR;
+if (!(entry & IOAPIC_LVT_MASKED) && (s->irr & (1 << n))) {
+ioapic_service(s);
+}
+}
+}
+}
+}
+
 static uint32_t ioapic_mem_readl(void *opaque, target_phys_addr_t addr)
 {
 IOAPICState *s = opaque;
@@ -240,6 +272,11 @@ static int ioapic_init1(SysBusDevice *dev)
 {
 IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
 int io_memory;
+static int ioapic_no;
+
+if (ioapic_no >= MAX_IOAPICS) {
+return -1;
+}
 
 io_memory = cpu_register_io_memory(ioapic_mem_read,
ioapic_mem_write, s,
@@ -248,6 +285,8 @@ static int ioapic_init1(SysBusDevice *dev)
 
 qdev_init_gpio_in(&dev->qdev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
+ioapics[ioapic_no++] = s;
+
 return 0;
 }
 
diff --git a/hw/ioapic.h b/hw/ioapic.h
new file mode 100644
index 000..c441567
--- /dev/null
+++ b/hw/ioapic.h
@@ -0,0 +1,20 @@
+/*
+ *  ioapic.c IOAPIC emulation logic
+ *
+ *  Copyright (c) 2011 Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see

[Qemu-devel] [0.14?][PATCH 0/4] IOAPIC fixes

2011-02-03 Thread Jan Kiszka

This series fixes the re-injection of level-triggered IRQs that are
still raised on APIC EOI, adds a must-have field to the vmstate of the
IOAPIC, and also aligns that vmstate with qemu-kvm.

I would recommend the whole series for 0.14, but at least patch 1 should
be applied.



Jan Kiszka (4):
  ioapic: Implement EOI handling for level-triggered IRQs
  ioapic: Save/restore irr
  ioapic: Prepare for base address relocation
  ioapic: Style & magics cleanup

 hw/apic.c|9 ++-
 hw/ioapic.c  |  244 +-
 hw/ioapic.h  |   24 ++
 hw/pc_piix.c |5 +-
 4 files changed, 206 insertions(+), 76 deletions(-)
 create mode 100644 hw/ioapic.h

[Qemu-devel] [0.14?][PATCH 4/4] ioapic: Style & magics cleanup

2011-02-03 Thread Jan Kiszka

Fix a few style issues and convert magic numbers into prober symbolic
constants, also fixing the wrong but unused IOAPIC_DM_SIPI value.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c |  177 +++---
 1 files changed, 107 insertions(+), 70 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index b53d436..a9d7fcd 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -39,20 +39,48 @@
 
 #define MAX_IOAPICS 1
 
-#define IOAPIC_LVT_MASKED   (1 << 16)
-#define IOAPIC_LVT_REMOTE_IRR   (1 << 14)
+#define IOAPIC_VERSION  0x11
 
-#define IOAPIC_TRIGGER_EDGE0
-#define IOAPIC_TRIGGER_LEVEL   1
+#define IOAPIC_LVT_DEST_SHIFT   56
+#define IOAPIC_LVT_MASKED_SHIFT 16
+#define IOAPIC_LVT_TRIGGER_MODE_SHIFT   15
+#define IOAPIC_LVT_REMOTE_IRR_SHIFT 14
+#define IOAPIC_LVT_POLARITY_SHIFT   13
+#define IOAPIC_LVT_DELIV_STATUS_SHIFT   12
+#define IOAPIC_LVT_DEST_MODE_SHIFT  11
+#define IOAPIC_LVT_DELIV_MODE_SHIFT 8
+
+#define IOAPIC_LVT_MASKED   (1 << IOAPIC_LVT_MASKED_SHIFT)
+#define IOAPIC_LVT_REMOTE_IRR   (1 << IOAPIC_LVT_REMOTE_IRR_SHIFT)
+
+#define IOAPIC_TRIGGER_EDGE 0
+#define IOAPIC_TRIGGER_LEVEL1
 
 /*io{apic,sapic} delivery mode*/
-#define IOAPIC_DM_FIXED0x0
-#define IOAPIC_DM_LOWEST_PRIORITY  0x1
-#define IOAPIC_DM_PMI  0x2
-#define IOAPIC_DM_NMI  0x4
-#define IOAPIC_DM_INIT 0x5
-#define IOAPIC_DM_SIPI 0x5
-#define IOAPIC_DM_EXTINT   0x7
+#define IOAPIC_DM_FIXED 0x0
+#define IOAPIC_DM_LOWEST_PRIORITY   0x1
+#define IOAPIC_DM_PMI   0x2
+#define IOAPIC_DM_NMI   0x4
+#define IOAPIC_DM_INIT  0x5
+#define IOAPIC_DM_SIPI  0x6
+#define IOAPIC_DM_EXTINT0x7
+#define IOAPIC_DM_MASK  0x7
+
+#define IOAPIC_VECTOR_MASK  0xff
+
+#define IOAPIC_IOREGSEL 0x00
+#define IOAPIC_IOWIN0x10
+
+#define IOAPIC_REG_ID   0x00
+#define IOAPIC_REG_VER  0x01
+#define IOAPIC_REG_ARB  0x02
+#define IOAPIC_REG_REDTBL_BASE  0x10
+#define IOAPIC_ID   0x00
+
+#define IOAPIC_ID_SHIFT 24
+#define IOAPIC_ID_MASK  0xf
+
+#define IOAPIC_VER_ENTRIES_SHIFT16
 
 typedef struct IOAPICState IOAPICState;
 
@@ -60,7 +88,6 @@ struct IOAPICState {
 SysBusDevice busdev;
 uint8_t id;
 uint8_t ioregsel;
-
 uint32_t irr;
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
 uint64_t default_base_address;
@@ -86,21 +113,22 @@ static void ioapic_service(IOAPICState *s)
 if (s->irr & mask) {
 entry = s->ioredtbl[i];
 if (!(entry & IOAPIC_LVT_MASKED)) {
-trig_mode = ((entry >> 15) & 1);
-dest = entry >> 56;
-dest_mode = (entry >> 11) & 1;
-delivery_mode = (entry >> 8) & 7;
-polarity = (entry >> 13) & 1;
+trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
+dest = entry >> IOAPIC_LVT_DEST_SHIFT;
+dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
+delivery_mode =
+(entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
+polarity = (entry >> IOAPIC_LVT_POLARITY_SHIFT) & 1;
 if (trig_mode == IOAPIC_TRIGGER_EDGE) {
 s->irr &= ~mask;
 } else {
 s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
 }
-if (delivery_mode == IOAPIC_DM_EXTINT)
+if (delivery_mode == IOAPIC_DM_EXTINT) {
 vector = pic_read_irq(isa_pic);
-else
-vector = entry & 0xff;
-
+} else {
+vector = entry & IOAPIC_VECTOR_MASK;
+}
 apic_deliver_irq(dest, dest_mode, delivery_mode,
  vector, polarity, trig_mode);
 }
@@ -116,15 +144,16 @@ static void ioapic_set_irq(void *opaque, int vector, int 
level)
  * to GSI 2.  GSI maps to ioapic 1-1.  This is not
  * the cleanest way of doing it but it should work. */
 
-DPRINTF("%s: %s vec %x\n", __func__, level? "raise" : "lower", vector);
-if (vector == 0)
+DPRINTF("%s: %s vec %x\n", __func__, level ? "raise" : "lower", vector);
+if (vector == 0) {
 vector = 2;
-
+}
 if (vector >= 0 && vector < IOAPIC_NUM_PINS) {
 uint32_t mask = 1 << vector;
 uint64_t entry = s->ioredtbl[vector];
 
-if ((entry >> 15) & 1) {
+if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) ==
+IOAPIC_TRIGGER_LEVEL) {
 /* level triggered */

[Qemu-devel] [0.14?][PATCH 2/4] ioapic: Save/restore irr

2011-02-03 Thread Jan Kiszka

This is a guest modifiable state that must be saved/restored properly.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c |   13 -
 1 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 443c579..c7019f5 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -231,14 +231,25 @@ static void ioapic_mem_writel(void *opaque, 
target_phys_addr_t addr, uint32_t va
 }
 }
 
+static int ioapic_pre_load(void *opaque)
+{
+IOAPICState *s = opaque;
+
+/* in case we are loading version 1, set these to sane values */
+s->irr = 0;
+return 0;
+}
+
 static const VMStateDescription vmstate_ioapic = {
 .name = "ioapic",
-.version_id = 1,
+.version_id = 2,
+.pre_load = ioapic_pre_load,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
 .fields  = (VMStateField []) {
 VMSTATE_UINT8(id, IOAPICState),
 VMSTATE_UINT8(ioregsel, IOAPICState),
+VMSTATE_UINT32_V(irr, IOAPICState, 2),
 VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
 VMSTATE_END_OF_LIST()
 }
-- 
1.7.1

[Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

The registers of real IOAPICs can be relocated during runtime (via
chipset registers). We don't support this yet, but qemu-kvm carries the
current base address in its version 2 vmstate.

To align both implementations for migratability, add the proper
infrastructure to accept initial as well as updated base addresses and
include the current address in the vmstate. This is done in a way that
will also allow multiple IOAPICs in the future.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c  |   17 +
 hw/ioapic.h  |4 
 hw/pc_piix.c |5 ++---
 3 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index c7019f5..b53d436 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -63,6 +63,8 @@ struct IOAPICState {
 
 uint32_t irr;
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
+uint64_t default_base_address;
+uint64_t current_base_address;
 };
 
 static IOAPICState *ioapics[MAX_IOAPICS];
@@ -237,6 +239,7 @@ static int ioapic_pre_load(void *opaque)
 
 /* in case we are loading version 1, set these to sane values */
 s->irr = 0;
+s->current_base_address = s->default_base_address;
 return 0;
 }
 
@@ -249,6 +252,7 @@ static const VMStateDescription vmstate_ioapic = {
 .fields  = (VMStateField []) {
 VMSTATE_UINT8(id, IOAPICState),
 VMSTATE_UINT8(ioregsel, IOAPICState),
+VMSTATE_UINT64_V(current_base_address, IOAPICState, 2),
 VMSTATE_UINT32_V(irr, IOAPICState, 2),
 VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
 VMSTATE_END_OF_LIST()
@@ -260,6 +264,8 @@ static void ioapic_reset(DeviceState *d)
 IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, d);
 int i;
 
+s->current_base_address = s->default_base_address;
+sysbus_mmio_map(&s->busdev, 0, s->current_base_address);
 s->id = 0;
 s->ioregsel = 0;
 s->irr = 0;
@@ -279,6 +285,17 @@ static CPUWriteMemoryFunc * const ioapic_mem_write[3] = {
 ioapic_mem_writel,
 };
 
+void ioapic_set_base_address(DeviceState *d, target_phys_addr_t addr)
+{
+IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, d);
+
+s->current_base_address = addr;
+if (!s->default_base_address) {
+s->default_base_address = addr;
+}
+sysbus_mmio_map(&s->busdev, 0, addr);
+}
+
 static int ioapic_init1(SysBusDevice *dev)
 {
 IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
diff --git a/hw/ioapic.h b/hw/ioapic.h
index c441567..1130d60 100644
--- a/hw/ioapic.h
+++ b/hw/ioapic.h
@@ -17,4 +17,8 @@
  * License along with this library; if not, see .
  */
 
+#define IOAPIC_DEFAULT_BASE_ADDRESS0xfec0
+
 void ioapic_eio_broadcast(int vector);
+
+void ioapic_set_base_address(DeviceState *d, target_phys_addr_t addr);
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 7b74473..479334f 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -25,6 +25,7 @@
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "pci.h"
 #include "usb-uhci.h"
 #include "usb-ohci.h"
@@ -46,13 +47,11 @@ static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 static void ioapic_init(IsaIrqState *isa_irq_state)
 {
 DeviceState *dev;
-SysBusDevice *d;
 unsigned int i;
 
 dev = qdev_create(NULL, "ioapic");
 qdev_init_nofail(dev);
-d = sysbus_from_qdev(dev);
-sysbus_mmio_map(d, 0, 0xfec0);
+ioapic_set_base_address(dev, IOAPIC_DEFAULT_BASE_ADDRESS);
 
 for (i = 0; i < IOAPIC_NUM_PINS; i++) {
 isa_irq_state->ioapic[i] = qdev_get_gpio_in(dev, i);
-- 
1.7.1

[Qemu-devel] [Bug 712416] Re: kvm_intel kernel module crash with via nano vmx

2011-02-03 Thread Serge Hallyn

Thanks for reporting this bug and helping to make Ubuntu better.

A bit of searching suggests that VIA Nano was until recently very little
tested and supported.   The kernels you are working with are rather old
compared to upstream.  So in order to see whether you should be talking
to us or kvm upstream, could you please test this with the latest daily
kernel build (see http://kernel.ubuntu.com/~kernel-
ppa/mainline/daily/current/) and the latest qemu-kvm build (see
https://launchpad.net/~ubuntu-server-edgers/+archive/server-edgers-qemu-
kvm).

** Changed in: kvm (Ubuntu)
   Status: New => Incomplete

** Bug watch added: Debian Bug tracker #570244
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570244

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712416

Title:
  kvm_intel kernel module crash with via nano vmx

Status in QEMU:
  New
Status in “kvm” package in Ubuntu:
  Incomplete
Status in “kvm” package in Debian:
  New

Bug description:
  kvm module for hardware virtualisation not work properly on via nano
  processors.

  Tested with processor: VIA Nano processor U2250.
  Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce 
cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall 
nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng 
rng_en ace ace_en ace2 phe phe_en lahf_lm

  With kernel 2.6.32: kvm not work and dmesg contains a lot of:
  handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

  With kernel 2.6.35: all the system crash. Nothing visible in logs

[Qemu-devel] [Bug 712416] Re: kvm_intel kernel module crash with via nano vmx

2011-02-03 Thread Serge Hallyn

Note that this appears to be a dup of http://bugs.debian.org/cgi-
bin/bugreport.cgi?bug=570244.  See also thread at
http://www.spinics.net/lists/kvm/msg41122.html.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712416

Title:
  kvm_intel kernel module crash with via nano vmx

Status in QEMU:
  New
Status in “kvm” package in Ubuntu:
  Incomplete
Status in “kvm” package in Debian:
  New

Bug description:
  kvm module for hardware virtualisation not work properly on via nano
  processors.

  Tested with processor: VIA Nano processor U2250.
  Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce 
cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall 
nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng 
rng_en ace ace_en ace2 phe phe_en lahf_lm

  With kernel 2.6.32: kvm not work and dmesg contains a lot of:
  handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

  With kernel 2.6.35: all the system crash. Nothing visible in logs

[Qemu-devel] [PATCH] QCOW2: bug fix - read base image beyond its size

2011-02-03 Thread Chunqiang Tang

This patch fixes the following bug in QCOW2. For a QCOW2 image that is larger
than its base image, when handling a read request straddling over the end of the
base image, the QCOW2 driver attempts to read beyond the end of the base image
and the request would fail.

This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool.
The following test triggered the bug.

dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1098561536
dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=593099264
./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b 
/var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1098561536
./qemu-io --auto --seed=30477694 --truth=/var/ramdisk/truth.raw --format=qcow2 
--test=blksim:/var/ramdisk/test.qcow2 --verify_write=true 
--compare_before=false --compare_after=true --round=10 --parallel=100 
--io_size=10485760 --fail_prob=0 --cancel_prob=0 --instant_qemubh=true

Signed-off-by: Chunqiang Tang 
---
 block/qcow2.c |5 ++---
 cutils.c  |   31 +++
 qemu-common.h |2 ++
 3 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index dbe4fdd..8c906d1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -355,7 +355,7 @@ int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector 
*qiov,
 else
 n1 = bs->total_sectors - sector_num;
 
-qemu_iovec_memset(qiov, 0, 512 * (nb_sectors - n1));
+qemu_iovec_memset_skip(qiov, 0, 512 * (nb_sectors - n1), 512 * n1);
 
 return n1;
 }
@@ -478,8 +478,7 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
 if (n1 > 0) {
 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
 acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
-&acb->hd_qiov, acb->cur_nr_sectors,
-   qcow2_aio_read_cb, acb);
+&acb->hd_qiov, n1, qcow2_aio_read_cb, acb);
 if (acb->hd_aiocb == NULL)
 goto done;
 } else {
diff --git a/cutils.c b/cutils.c
index 8d562b2..e215210 100644
--- a/cutils.c
+++ b/cutils.c
@@ -267,6 +267,37 @@ void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t 
count)
 }
 }
 
+void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count, 
+size_t skip)
+{
+int i;
+size_t done;
+void *iov_base;
+uint64_t iov_len;
+
+done = 0;
+for (i = 0; (i < qiov->niov) && (done != count); i++) {
+if (skip >= qiov->iov[i].iov_len) {
+/* Skip the whole iov */
+skip -= qiov->iov[i].iov_len;
+continue;
+} else {
+/* Skip only part (or nothing) of the iov */
+iov_base = (uint8_t*) qiov->iov[i].iov_base + skip;
+iov_len = qiov->iov[i].iov_len - skip;
+skip = 0;
+}
+
+if (done + iov_len > count) {
+memset (iov_base, c, count - done);
+break;
+} else {
+memset (iov_base, c, iov_len);
+}
+done += iov_len;
+}
+}
+
 #ifndef _WIN32
 /* Sets a specific flag */
 int fcntl_setfl(int fd, int flag)
diff --git a/qemu-common.h b/qemu-common.h
index c7ff280..7951109 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -322,6 +322,8 @@ void qemu_iovec_reset(QEMUIOVector *qiov);
 void qemu_iovec_to_buffer(QEMUIOVector *qiov, void *buf);
 void qemu_iovec_from_buffer(QEMUIOVector *qiov, const void *buf, size_t count);
 void qemu_iovec_memset(QEMUIOVector *qiov, int c, size_t count);
+void qemu_iovec_memset_skip(QEMUIOVector *qiov, int c, size_t count, 
+size_t skip);
 
 struct Monitor;
 typedef struct Monitor Monitor;
-- 
1.7.0.4

[Qemu-devel] [Bug 670776] Re: Build errors on TEGRA2(ubuntu), testandset()

2011-02-03 Thread Peter Maydell

This patch which I proposed would fix this bug:
http://patchwork.ozlabs.org/patch/81205/

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/670776

Title:
  Build errors on TEGRA2(ubuntu), testandset()

Status in QEMU:
  New

Bug description:
  Build errors on TEGRA2(ubuntu)

  /tmp/ccqt9Y5t.s: Assembler messages:
  /tmp/ccqt9Y5t.s:1899: Error: selected processor does not support Thumb mode 
'swp r4, r4, [r2]'
  /tmp/ccqt9Y5t.s:1974: Error: selected processor does not support Thumb mode 
'swp r5, r5, [ip]'
  make[1]: *** [exec.o Error 1

  qemu-lock.h

  #elif defined(__arm__)
  static inline int testandset (int *spinlock)
  {
  register unsigned int ret;
  __asm__ __volatile__("swp %0, %1, [%2]"
   : "=r"(ret)
   : "0"(1), "r"(spinlock));
  return ret;
  }

[Qemu-devel] Re: [RFC: 0/2] patch for QEMU HPET periodic timer emulation to alleviate time drift

2011-02-03 Thread Jan Kiszka

On 2011-02-03 14:43, Ulrich Obergfell wrote:
> 
> Hi,
> 
> I am observing severe backward time drift in a MS Windows Vista(tm)
> guest running on a Fedora 14 KVM host. I can reproduce the problem
> with the following steps:
> 
> 1. Use 'vncviewer' to connect to the guest's desktop.
> 2. Click on the menu title bar of a window on the guest's desktop.
> 3. Move that window around on the guest's desktop.
> 
> While I keep on moving the window around for one minute, the guest
> time falls up to 15 seconds behind host time.
> 
> The problem is caused by delayed callbacks of hpet_timer(). A timer
> interrupt is injected into the guest during each callback. However,
> interrupts are lost if delays are greater than a comparator period.
> 

Yes, that's a well known limitation of qemu, in fact. We are lacking a
generic irq coalescing infrastructure. That, once designed and
available, would also allow to fix the HPET.

> 
> This is an RFC through which I would like to get feedback on how the
> idea of a patch to compensate those lost interrupts would be received:
> 
> The patch determines the number of lost timer interrupts based on the
> number of elapsed comparator periods. Lost interrupts are compensated

That neglects coalescing of the HPET IRQs: If the timer is run regularly
but the guest is not able to retrieve the injected IRQs, you should
still see drifts with your patches.

> by gradually injecting additional interrupts during the subsequent
> timer intervals, starting at a rate of one additional interrupt per
> interval. If further interrupts are lost while compensation is still
> in progress, the rate is increased. The algorithm imposes a limit on
> the rate and on the 'backlog' of lost interrupts to be injected. The
> patch can be enabled via a qemu command line option.
> 
>   -hpet [device=none|present][,driftfix=none|slew]
> 
> The 'device=none' option is equivalent to the '-no-hpet' option, and
> the 'driftfix=slew' option enables the patch (similar to RTC).
> 
> 
> The second and third part of this series of email contain the patch:
> 
> - Code part 1 introduces the qemu command line option.
> - Code part 2 implements compensation of lost interrupts.
> 
> Please review and please comment.
> 

Generally, this issue needs to be attacked at qemu level (added to CC),
not qemu-kvm.

We had a lengthy discussion on the list last year. We (including qemu
people) basically agreed that we needs a generic feedback infrastructure
to track coalesced IRQs for periodic, clock providing devices to allow
reinjection (which would include reinjection of completely missed timer
events like in your series).

However, there was one unsolved design issue remain IIRC:

http://thread.gmane.org/gmane.comp.emulators.qemu/73181

Once we have a proper answer for this, we can resume creating the
de-coalescing framework.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH] e1000: multi-buffer packet support

2011-02-03 Thread Michael S. Tsirkin

e1000 supports multi-buffer packets larger than rxbuf_size.

This fixes the following (on linux):
- in guest: ifconfig eth1 mtu 16110
- in host: ifconfig tap0 mtu 16110
   ping -s 16082 

Red Hat bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=602205

Signed-off-by: Michael S. Tsirkin 
---
 hw/e1000.c |   25 -
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index af101bd..2c133ab 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -642,6 +642,8 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 uint16_t vlan_special = 0;
 uint8_t vlan_status = 0, vlan_offset = 0;
 uint8_t min_buf[MIN_BUF_SIZE];
+size_t desc_offset;
+size_t desc_size;
 
 if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
 return -1;
@@ -654,7 +656,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 size = sizeof(min_buf);
 }
 
-if (size > s->rxbuf_size) {
+if (0 && size > s->rxbuf_size) {
 DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
(unsigned long)size, s->rxbuf_size);
 return -1;
@@ -672,8 +674,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 }
 
 rdh_start = s->mac_reg[RDH];
+desc_offset = 0;
 do {
+desc_size = size - desc_offset;
+if (desc_size > s->rxbuf_size) {
+desc_size = s->rxbuf_size;
+}
 if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->check_rxov) {
+/* Discard all data written so far */
+s->mac_reg[RDH] = rdh_start;
 set_ics(s, 0, E1000_ICS_RXO);
 return -1;
 }
@@ -684,9 +693,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 desc.status |= (vlan_status | E1000_RXD_STAT_DD);
 if (desc.buffer_addr) {
 cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
-  (void *)(buf + vlan_offset), size);
-desc.length = cpu_to_le16(size + fcs_len(s));
-desc.status |= E1000_RXD_STAT_EOP|E1000_RXD_STAT_IXSM;
+  (void *)(buf + desc_offset + 
vlan_offset),
+  desc_size);
+desc_offset += desc_size;
+if (desc_offset >= size) {
+desc.length = cpu_to_le16(desc_size + fcs_len(s));
+desc.status |= E1000_RXD_STAT_EOP | E1000_RXD_STAT_IXSM;
+} else {
+desc.length = cpu_to_le16(desc_size);
+}
 } else { // as per intel docs; skip descriptors with null buf addr
 DBGOUT(RX, "Null RX descriptor!!\n");
 }
@@ -702,7 +717,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 set_ics(s, 0, E1000_ICS_RXO);
 return -1;
 }
-} while (desc.buffer_addr == 0);
+} while (desc_offset < size);
 
 s->mac_reg[GPRC]++;
 s->mac_reg[TPR]++;
-- 
1.7.3.2.91.g446ac

[Qemu-devel] Re: [PATCH v2] make tsc stable over migration and machine start

2011-02-03 Thread Marcelo Tosatti

On Wed, Feb 02, 2011 at 07:16:20AM -0500, Glauber Costa wrote:
> If the machine is stopped, we should not record two different tsc values
> upon a save operation. The same problem happens with kvmclock.
> 
> But kvmclock is taking a different diretion, being now seen as a separate
> device. Since this is unlikely to happen with the tsc, I am taking the
> approach here of simply registering a handler for state change, and
> using a per-CPUState variable that prevents double updates for the TSC.
> 
> Signed-off-by: Glauber Costa 
> CC: Jan Kiszka 
> 
> ---
> v2: updated tsc validation logic, as asked by Jan
> ---
>  target-i386/cpu.h |1 +
>  target-i386/kvm.c |   18 +-
>  2 files changed, 18 insertions(+), 1 deletions(-)

Please regenerate against uq/master.

[Qemu-devel] Re: [PATCH] vnc: Fix password expiration through 'change vnc ""'

2011-02-03 Thread Daniel P. Berrange

On Mon, Jan 31, 2011 at 02:43:19PM -0600, Anthony Liguori wrote:
> commit 52c18be9e99dabe295321153fda7fce9f76647ac introduced a regression in the
> change vnc password command that changed the behavior of setting the VNC
> password to an empty string from disabling login to disabling authentication.
> 
> This commit refactors the code to eliminate this overloaded semantics in
> vnc_display_password and instead introduces the vnc_display_disable_login.   
> The
> monitor implementation then determines the behavior of an empty or missing
> string.

Personally I think this is a little overkill & just reverting the
original patch was fine, but from a functional POV your patch
produces the same results, so I won't argue.

> Recently, a set_password command was added that allows both the Spice and VNC
> password to be set.  This command has not shown up in a release yet so the
> behavior is not yet defined.
> 
> This patch proposes that an empty password be treated as an empty password 
> with
> no special handling.  For specifically disabling login, I believe a new 
> command
> should be introduced instead of overloading semantics.

Agreed, if some mgmt app does need to change this kind of thin
on the fly, they'll likely want more than just a toggle between
AUTH_NONE/AUTH_VNC too. eg There's AUTH_SASL, which is the only
VNC auth scheme with any real security, and the psuedo auth
schemes for providing the TLS encryption/certificate support.

> I'm not sure how Spice handles this but I would recommend that we have Spice
> and VNC have consistent semantics here for the 0.14.0 release.

Sounds like a very good idea.

> Reported-by: Neil Wilson 
> Signed-off-by: Anthony Liguori 
> 
> diff --git a/console.h b/console.h
> index 3157330..f4e4741 100644
> --- a/console.h
> +++ b/console.h
> @@ -369,6 +369,7 @@ void vnc_display_init(DisplayState *ds);
>  void vnc_display_close(DisplayState *ds);
>  int vnc_display_open(DisplayState *ds, const char *display);
>  int vnc_display_password(DisplayState *ds, const char *password);
> +int vnc_display_disable_login(DisplayState *ds);
>  int vnc_display_pw_expire(DisplayState *ds, time_t expires);
>  void do_info_vnc_print(Monitor *mon, const QObject *data);
>  void do_info_vnc(Monitor *mon, QObject **ret_data);
> diff --git a/monitor.c b/monitor.c
> index c5f54f4..24ed971 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -1018,6 +1018,13 @@ static int do_quit(Monitor *mon, const QDict *qdict, 
> QObject **ret_data)
>  
>  static int change_vnc_password(const char *password)
>  {
> +if (!password || !password[0]) {
> +if (vnc_display_disable_login(NULL)) {
> +qerror_report(QERR_SET_PASSWD_FAILED);
> +return -1;
> +}
> +}
> +
>  if (vnc_display_password(NULL, password) < 0) {
>  qerror_report(QERR_SET_PASSWD_FAILED);
>  return -1;
> @@ -1117,6 +1124,8 @@ static int set_password(Monitor *mon, const QDict 
> *qdict, QObject **ret_data)
>  qerror_report(QERR_INVALID_PARAMETER, "connected");
>  return -1;
>  }
> +/* Note that setting an empty password will not disable login through
> + * this interface. */
>  rc = vnc_display_password(NULL, password);
>  if (rc != 0) {
>  qerror_report(QERR_SET_PASSWD_FAILED);
> diff --git a/ui/vnc.c b/ui/vnc.c
> index 495d6d6..73e7ffa 100644
> --- a/ui/vnc.c
> +++ b/ui/vnc.c
> @@ -2484,6 +2484,24 @@ void vnc_display_close(DisplayState *ds)
>  #endif
>  }
>  
> +int vnc_display_disable_login(DisplayState *ds)
> +{
> +VncDisplay *vs = ds ? (VncDisplay *)ds->opaque : vnc_display;
> +
> +if (!vs) {
> +return -1;
> +}
> +
> +if (vs->password) {
> +qemu_free(vs->password);
> +}
> +
> +vs->password = NULL;
> +vs->auth = VNC_AUTH_VNC;
> +
> +return 0;
> +}
> +
>  int vnc_display_password(DisplayState *ds, const char *password)
>  {
>  VncDisplay *vs = ds ? (VncDisplay *)ds->opaque : vnc_display;
> @@ -2492,19 +2510,18 @@ int vnc_display_password(DisplayState *ds, const char 
> *password)
>  return -1;
>  }
>  
> +if (!password) {
> +/* This is not the intention of this interface but err on the side
> +   of being safe */
> +return vnc_display_disable_login(ds);
> +}
> +
>  if (vs->password) {
>  qemu_free(vs->password);
>  vs->password = NULL;
>  }
> -if (password && password[0]) {
> -if (!(vs->password = qemu_strdup(password)))
> -return -1;
> -if (vs->auth == VNC_AUTH_NONE) {
> -vs->auth = VNC_AUTH_VNC;
> -}
> -} else {
> -vs->auth = VNC_AUTH_NONE;
> -}
> +vs->password = qemu_strdup(password);
> +vs->auth = VNC_AUTH_VNC;
>  
>  return 0;
>  }

Looks good, assuming the addition of the missing 'return 0' you already
mentioned

Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http

Re: [Qemu-devel] Re: [PATCH] vnc: Fix password expiration through 'change vnc ""'

2011-02-03 Thread Anthony Liguori


On 02/03/2011 10:29 AM, Daniel P. Berrange wrote:

On Mon, Jan 31, 2011 at 02:43:19PM -0600, Anthony Liguori wrote:
   

commit 52c18be9e99dabe295321153fda7fce9f76647ac introduced a regression in the
change vnc password command that changed the behavior of setting the VNC
password to an empty string from disabling login to disabling authentication.

This commit refactors the code to eliminate this overloaded semantics in
vnc_display_password and instead introduces the vnc_display_disable_login.   The
monitor implementation then determines the behavior of an empty or missing
string.
 

Personally I think this is a little overkill&  just reverting the
original patch was fine, but from a functional POV your patch
produces the same results, so I won't argue.
   


For 0.15, I'd like to introduce a new set of commands such that we don't 
multiplex the change command anymore.   This refactoring lays the ground 
work for that.


For instance, if you created a block device with the name 'vnc', you'd 
get very unexpected results!  Multiplexing based on special values on 
top of existing commands is pretty evil.



Recently, a set_password command was added that allows both the Spice and VNC
password to be set.  This command has not shown up in a release yet so the
behavior is not yet defined.

This patch proposes that an empty password be treated as an empty password with
no special handling.  For specifically disabling login, I believe a new command
should be introduced instead of overloading semantics.
 

Agreed, if some mgmt app does need to change this kind of thin
on the fly, they'll likely want more than just a toggle between
AUTH_NONE/AUTH_VNC too. eg There's AUTH_SASL, which is the only
VNC auth scheme with any real security, and the psuedo auth
schemes for providing the TLS encryption/certificate support.

   

I'm not sure how Spice handles this but I would recommend that we have Spice
and VNC have consistent semantics here for the 0.14.0 release.
 

Sounds like a very good idea.

   

Reported-by: Neil Wilson
Signed-off-by: Anthony Liguori

diff --git a/console.h b/console.h
index 3157330..f4e4741 100644
--- a/console.h
+++ b/console.h
@@ -369,6 +369,7 @@ void vnc_display_init(DisplayState *ds);
  void vnc_display_close(DisplayState *ds);
  int vnc_display_open(DisplayState *ds, const char *display);
  int vnc_display_password(DisplayState *ds, const char *password);
+int vnc_display_disable_login(DisplayState *ds);
  int vnc_display_pw_expire(DisplayState *ds, time_t expires);
  void do_info_vnc_print(Monitor *mon, const QObject *data);
  void do_info_vnc(Monitor *mon, QObject **ret_data);
diff --git a/monitor.c b/monitor.c
index c5f54f4..24ed971 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1018,6 +1018,13 @@ static int do_quit(Monitor *mon, const QDict *qdict, 
QObject **ret_data)

  static int change_vnc_password(const char *password)
  {
+if (!password || !password[0]) {
+if (vnc_display_disable_login(NULL)) {
+qerror_report(QERR_SET_PASSWD_FAILED);
+return -1;
+}
+}
+
  if (vnc_display_password(NULL, password)<  0) {
  qerror_report(QERR_SET_PASSWD_FAILED);
  return -1;
@@ -1117,6 +1124,8 @@ static int set_password(Monitor *mon, const QDict *qdict, 
QObject **ret_data)
  qerror_report(QERR_INVALID_PARAMETER, "connected");
  return -1;
  }
+/* Note that setting an empty password will not disable login through
+ * this interface. */
  rc = vnc_display_password(NULL, password);
  if (rc != 0) {
  qerror_report(QERR_SET_PASSWD_FAILED);
diff --git a/ui/vnc.c b/ui/vnc.c
index 495d6d6..73e7ffa 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -2484,6 +2484,24 @@ void vnc_display_close(DisplayState *ds)
  #endif
  }

+int vnc_display_disable_login(DisplayState *ds)
+{
+VncDisplay *vs = ds ? (VncDisplay *)ds->opaque : vnc_display;
+
+if (!vs) {
+return -1;
+}
+
+if (vs->password) {
+qemu_free(vs->password);
+}
+
+vs->password = NULL;
+vs->auth = VNC_AUTH_VNC;
+
+return 0;
+}
+
  int vnc_display_password(DisplayState *ds, const char *password)
  {
  VncDisplay *vs = ds ? (VncDisplay *)ds->opaque : vnc_display;
@@ -2492,19 +2510,18 @@ int vnc_display_password(DisplayState *ds, const char 
*password)
  return -1;
  }

+if (!password) {
+/* This is not the intention of this interface but err on the side
+   of being safe */
+return vnc_display_disable_login(ds);
+}
+
  if (vs->password) {
  qemu_free(vs->password);
  vs->password = NULL;
  }
-if (password&&  password[0]) {
-if (!(vs->password = qemu_strdup(password)))
-return -1;
-if (vs->auth == VNC_AUTH_NONE) {
-vs->auth = VNC_AUTH_VNC;
-}
-} else {
-vs->auth = VNC_AUTH_NONE;
-}
+vs->password = qemu_strdup(password);
+vs->aut

[Qemu-devel] [Bug 712416] Re: kvm_intel kernel module crash with via nano vmx

2011-02-03 Thread khetzal

yes this appears to be a duplication but the original bugreport has been
posted one year ago and there is no evolution. Moreover, it's not posted
on the officiel qemu / kvm bugtracker ;).

I'm waiting that my current installation complete and i test the new
kernel / kvm, but i've tested one week ago with the linux kernel kvm git
branch and the git master version of kvm, and it crashed directly.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712416

Title:
  kvm_intel kernel module crash with via nano vmx

Status in QEMU:
  New
Status in “kvm” package in Ubuntu:
  Incomplete
Status in “kvm” package in Debian:
  New

Bug description:
  kvm module for hardware virtualisation not work properly on via nano
  processors.

  Tested with processor: VIA Nano processor U2250.
  Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce 
cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall 
nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng 
rng_en ace ace_en ace2 phe phe_en lahf_lm

  With kernel 2.6.32: kvm not work and dmesg contains a lot of:
  handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

  With kernel 2.6.35: all the system crash. Nothing visible in logs

[Qemu-devel] [PATCH] tap: minor code cleanup

2011-02-03 Thread Michael S. Tsirkin

remove a confusing comment.
TAP_DEFAULT_SNDBUF 0 is later translated to INT_MAX, so
let's set that value directly.

Signed-off-by: Michael S. Tsirkin 
---
 net/tap-linux.c |7 +--
 1 files changed, 1 insertions(+), 6 deletions(-)

diff --git a/net/tap-linux.c b/net/tap-linux.c
index 00f84d4..b8c0f3b 100644
--- a/net/tap-linux.c
+++ b/net/tap-linux.c
@@ -84,13 +84,8 @@ int tap_open(char *ifname, int ifname_size, int *vnet_hdr, 
int vnet_hdr_required
  * Unfortunately when it's enabled, and packets are sent
  * to other guests on the same host, the receiver
  * can lock up the transmitter indefinitely.
- *
- * To avoid packet loss, sndbuf should be set to a value lower than the tx
- * queue capacity of any destination network interface.
- * Ethernet NICs generally have txqueuelen=1000, so 1Mb is
- * a good value, given a 1500 byte MTU.
  */
-#define TAP_DEFAULT_SNDBUF 0
+#define TAP_DEFAULT_SNDBUF INT_MAX
 
 int tap_set_sndbuf(int fd, QemuOpts *opts)
 {
-- 
1.7.3.2.91.g446ac

[Qemu-devel] Re: [PATCH] e1000: multi-buffer packet support

2011-02-03 Thread Alex Williamson

On Thu, 2011-02-03 at 17:38 +0200, Michael S. Tsirkin wrote:
> e1000 supports multi-buffer packets larger than rxbuf_size.
> 
> This fixes the following (on linux):
> - in guest: ifconfig eth1 mtu 16110
> - in host: ifconfig tap0 mtu 16110
>ping -s 16082 
> 
> Red Hat bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=602205
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  hw/e1000.c |   25 -
>  1 files changed, 20 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/e1000.c b/hw/e1000.c
> index af101bd..2c133ab 100644
> --- a/hw/e1000.c
> +++ b/hw/e1000.c
> @@ -642,6 +642,8 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  uint16_t vlan_special = 0;
>  uint8_t vlan_status = 0, vlan_offset = 0;
>  uint8_t min_buf[MIN_BUF_SIZE];
> +size_t desc_offset;
> +size_t desc_size;
>  
>  if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
>  return -1;
> @@ -654,7 +656,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  size = sizeof(min_buf);
>  }
>  
> -if (size > s->rxbuf_size) {
> +if (0 && size > s->rxbuf_size) {
>  DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
> (unsigned long)size, s->rxbuf_size);
>  return -1;

Why are we saving this code if it's unreachable?

> @@ -672,8 +674,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  }
>  
>  rdh_start = s->mac_reg[RDH];
> +desc_offset = 0;
>  do {
> +desc_size = size - desc_offset;
> +if (desc_size > s->rxbuf_size) {
> +desc_size = s->rxbuf_size;
> +}
>  if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->check_rxov) {
> +/* Discard all data written so far */
> +s->mac_reg[RDH] = rdh_start;
>  set_ics(s, 0, E1000_ICS_RXO);
>  return -1;
>  }
> @@ -684,9 +693,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  desc.status |= (vlan_status | E1000_RXD_STAT_DD);
>  if (desc.buffer_addr) {
>  cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
> -  (void *)(buf + vlan_offset), size);
> -desc.length = cpu_to_le16(size + fcs_len(s));
> -desc.status |= E1000_RXD_STAT_EOP|E1000_RXD_STAT_IXSM;
> +  (void *)(buf + desc_offset + 
> vlan_offset),
> +  desc_size);
> +desc_offset += desc_size;
> +if (desc_offset >= size) {
> +desc.length = cpu_to_le16(desc_size + fcs_len(s));
> +desc.status |= E1000_RXD_STAT_EOP | E1000_RXD_STAT_IXSM;
> +} else {
> +desc.length = cpu_to_le16(desc_size);
> +}
>  } else { // as per intel docs; skip descriptors with null buf addr
>  DBGOUT(RX, "Null RX descriptor!!\n");
>  }
> @@ -702,7 +717,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  set_ics(s, 0, E1000_ICS_RXO);
>  return -1;
>  }
> -} while (desc.buffer_addr == 0);
> +} while (desc_offset < size);
>  
>  s->mac_reg[GPRC]++;
>  s->mac_reg[TPR]++;

[Qemu-devel] Re: [PATCH] e1000: multi-buffer packet support

2011-02-03 Thread Stefan Hajnoczi

On Thu, Feb 03, 2011 at 05:38:35PM +0200, Michael S. Tsirkin wrote:
> @@ -654,7 +656,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  size = sizeof(min_buf);
>  }
> 
> -if (size > s->rxbuf_size) {
> +if (0 && size > s->rxbuf_size) {
>  DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
> (unsigned long)size, s->rxbuf_size);
>  return -1;

Please remove completely or enhance this conditional to work in a
multi-buffer world.

Stefan

Re: [Qemu-devel] [PATCH 01/20] qdev: add print_options callback

2011-02-03 Thread Anthony Liguori


On 02/02/2011 02:28 PM, Alon Levy wrote:

another callback added to PropertyInfo, for later use by PROP_TYPE_ENUM.
Allows printing of runtime computed options when doing:
  qemu -device foo,?
---
  hw/qdev.c |   10 +-
  hw/qdev.h |1 +
  2 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/hw/qdev.c b/hw/qdev.c
index c7fec44..0b2ad3d 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -187,7 +187,15 @@ int qdev_device_help(QemuOpts *opts)
  if (!prop->info->parse) {
  continue;   /* no way to set it, don't show */
  }
-error_printf("%s.%s=%s\n", info->name, prop->name, prop->info->name);
+if (prop->info->print_options) {
+char buf[256];
+int ret;
+ret = prop->info->print_options(info, prop, buf, sizeof(buf) - 3);
+error_printf("%s.%s=%s%s\n", info->name, prop->name, buf,
+ret == sizeof(buf) - 3 ? "..." : "" );
+} else {
+error_printf("%s.%s=%s\n", info->name, prop->name, 
prop->info->name);
+}
  }
  return 1;
  }
diff --git a/hw/qdev.h b/hw/qdev.h
index 9808f85..fa3221b 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -109,6 +109,7 @@ struct PropertyInfo {
  enum PropertyType type;
  int (*parse)(DeviceState *dev, Property *prop, const char *str);
  int (*print)(DeviceState *dev, Property *prop, char *dest, size_t len);
+int (*print_options)(DeviceInfo *info, Property *prop, char *dest, size_t 
len);
   


I'm not clear what the different between print and print_options is and 
why two callbacks are needed.  Can you explain and comments to the patch 
clarifying why two interfaces are needed.  Even better would be if one 
interface could handle it.


Regards,

Anthony Liguori



  void (*free)(DeviceState *dev, Property *prop);
  };

Re: [Qemu-devel] [PATCH 03/20] qdev-properties: add PROP_TYPE_ENUM

2011-02-03 Thread Anthony Liguori


On 02/02/2011 02:28 PM, Alon Levy wrote:

Example usage:

EnumTable foo_enum_table[] = {
 {"bar", 1},
 {"buz", 2},
 {NULL, 0},
};

DEFINE_PROP_ENUM("foo", State, foo, 1, foo_enum_table)

When using qemu -device foodev,? it will appear as:
  foodev.foo=bar/buz
---
  hw/qdev-properties.c |   60 ++
  hw/qdev.h|   15 
  2 files changed, 75 insertions(+), 0 deletions(-)
   


Missing Signed-off-by.

Is it really necessary to introduce this in this series?  There's a lot 
missing here like the ability to determine which values are valid for an 
enum through QMP.


I think this will also unexpected break the device_add QMP interface 
since there's some special logic to handle types.  It's unclear to me 
whether we should be sending the symbolic name or the value through QMP.


Regards,

Anthony Liguori


diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c
index a493087..3157721 100644
--- a/hw/qdev-properties.c
+++ b/hw/qdev-properties.c
@@ -63,6 +63,66 @@ PropertyInfo qdev_prop_bit = {
  .print = print_bit,
  };

+/* --- Enumeration --- */
+/* Example usage:
+EnumTable foo_enum_table[] = {
+{"bar", 1},
+{"buz", 2},
+{NULL, 0},
+};
+DEFINE_PROP_ENUM("foo", State, foo, 1, foo_enum_table),
+ */
+static int parse_enum(DeviceState *dev, Property *prop, const char *str)
+{
+uint8_t *ptr = qdev_get_prop_ptr(dev, prop);
+EnumTable *option = (EnumTable*)prop->data;
+
+while (option->name != NULL) {
+if (!strncmp(str, option->name, strlen(option->name))) {
+*ptr = option->value;
+return 0;
+}
+option++;
+}
+return -EINVAL;
+}
+
+static int print_enum(DeviceState *dev, Property *prop, char *dest, size_t len)
+{
+uint32_t *p = qdev_get_prop_ptr(dev, prop);
+EnumTable *option = (EnumTable*)prop->data;
+while (option->name != NULL) {
+if (*p == option->value) {
+return snprintf(dest, len, "%s", option->name);
+}
+option++;
+}
+return 0;
+}
+
+static int print_enum_options(DeviceInfo *info, Property *prop, char *dest, 
size_t len)
+{
+int ret = 0;
+EnumTable *option = (EnumTable*)prop->data;
+while (option->name != NULL) {
+ret += snprintf(dest + ret, len - ret, "%s", option->name);
+if (option[1].name != NULL) {
+ret += snprintf(dest + ret, len - ret, "/");
+}
+option++;
+}
+return ret;
+}
+
+PropertyInfo qdev_prop_enum = {
+.name  = "enum",
+.type  = PROP_TYPE_ENUM,
+.size  = sizeof(uint32_t),
+.parse = parse_enum,
+.print = print_enum,
+.print_options = print_enum_options,
+};
+
  /* --- 8bit integer --- */

  static int parse_uint8(DeviceState *dev, Property *prop, const char *str)
diff --git a/hw/qdev.h b/hw/qdev.h
index 3d9acd7..3701d83 100644
--- a/hw/qdev.h
+++ b/hw/qdev.h
@@ -102,6 +102,7 @@ enum PropertyType {
  PROP_TYPE_VLAN,
  PROP_TYPE_PTR,
  PROP_TYPE_BIT,
+PROP_TYPE_ENUM,
  };

  struct PropertyInfo {
@@ -121,6 +122,11 @@ typedef struct GlobalProperty {
  QTAILQ_ENTRY(GlobalProperty) next;
  } GlobalProperty;

+typedef struct EnumTable {
+const char *name;
+uint32_tvalue;
+} EnumTable;
+
  /*** Board API.  This should go away once we have a machine config file.  ***/

  DeviceState *qdev_create(BusState *bus, const char *name);
@@ -235,6 +241,7 @@ extern PropertyInfo qdev_prop_drive;
  extern PropertyInfo qdev_prop_netdev;
  extern PropertyInfo qdev_prop_vlan;
  extern PropertyInfo qdev_prop_pci_devfn;
+extern PropertyInfo qdev_prop_enum;

  #define DEFINE_PROP(_name, _state, _field, _prop, _type) { \
  .name  = (_name),\
@@ -257,6 +264,14 @@ extern PropertyInfo qdev_prop_pci_devfn;
  + type_check(uint32_t,typeof_field(_state, _field)), \
  .defval= (bool[]) { (_defval) }, \
  }
+#define DEFINE_PROP_ENUM(_name, _state, _field, _defval, _options) {\
+.name  = (_name),   \
+.info  =&(qdev_prop_enum), \
+.offset= offsetof(_state, _field)   \
++ type_check(uint32_t,typeof_field(_state, _field)),\
+.defval= (uint32_t[]) { (_defval) },\
+.data  = (void*)(_options), \
+}

  #define DEFINE_PROP_UINT8(_n, _s, _f, _d)   \
  DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_uint8, uint8_t)

[Qemu-devel] Re: [PATCH] e1000: multi-buffer packet support

2011-02-03 Thread Michael S. Tsirkin

On Thu, Feb 03, 2011 at 09:36:57AM -0700, Alex Williamson wrote:
> On Thu, 2011-02-03 at 17:38 +0200, Michael S. Tsirkin wrote:
> > e1000 supports multi-buffer packets larger than rxbuf_size.
> > 
> > This fixes the following (on linux):
> > - in guest: ifconfig eth1 mtu 16110
> > - in host: ifconfig tap0 mtu 16110
> >ping -s 16082 
> > 
> > Red Hat bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=602205
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >  hw/e1000.c |   25 -
> >  1 files changed, 20 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/e1000.c b/hw/e1000.c
> > index af101bd..2c133ab 100644
> > --- a/hw/e1000.c
> > +++ b/hw/e1000.c
> > @@ -642,6 +642,8 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> > size_t size)
> >  uint16_t vlan_special = 0;
> >  uint8_t vlan_status = 0, vlan_offset = 0;
> >  uint8_t min_buf[MIN_BUF_SIZE];
> > +size_t desc_offset;
> > +size_t desc_size;
> >  
> >  if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
> >  return -1;
> > @@ -654,7 +656,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> > size_t size)
> >  size = sizeof(min_buf);
> >  }
> >  
> > -if (size > s->rxbuf_size) {
> > +if (0 && size > s->rxbuf_size) {
> >  DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
> > (unsigned long)size, s->rxbuf_size);
> >  return -1;
> 
> Why are we saving this code if it's unreachable?

Just a leftover.

> > @@ -672,8 +674,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> > size_t size)
> >  }
> >  
> >  rdh_start = s->mac_reg[RDH];
> > +desc_offset = 0;
> >  do {
> > +desc_size = size - desc_offset;
> > +if (desc_size > s->rxbuf_size) {
> > +desc_size = s->rxbuf_size;
> > +}
> >  if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->check_rxov) {
> > +/* Discard all data written so far */
> > +s->mac_reg[RDH] = rdh_start;
> >  set_ics(s, 0, E1000_ICS_RXO);
> >  return -1;
> >  }
> > @@ -684,9 +693,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> > size_t size)
> >  desc.status |= (vlan_status | E1000_RXD_STAT_DD);
> >  if (desc.buffer_addr) {
> >  cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
> > -  (void *)(buf + vlan_offset), size);
> > -desc.length = cpu_to_le16(size + fcs_len(s));
> > -desc.status |= E1000_RXD_STAT_EOP|E1000_RXD_STAT_IXSM;
> > +  (void *)(buf + desc_offset + 
> > vlan_offset),
> > +  desc_size);
> > +desc_offset += desc_size;
> > +if (desc_offset >= size) {
> > +desc.length = cpu_to_le16(desc_size + fcs_len(s));
> > +desc.status |= E1000_RXD_STAT_EOP | E1000_RXD_STAT_IXSM;
> > +} else {
> > +desc.length = cpu_to_le16(desc_size);
> > +}
> >  } else { // as per intel docs; skip descriptors with null buf addr
> >  DBGOUT(RX, "Null RX descriptor!!\n");
> >  }
> > @@ -702,7 +717,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> > size_t size)
> >  set_ics(s, 0, E1000_ICS_RXO);
> >  return -1;
> >  }
> > -} while (desc.buffer_addr == 0);
> > +} while (desc_offset < size);
> >  
> >  s->mac_reg[GPRC]++;
> >  s->mac_reg[TPR]++;
> 
>

Re: [Qemu-devel] [PATCH 05/20] usb-ccid: add CCID bus

2011-02-03 Thread Anthony Liguori


On 02/02/2011 02:28 PM, Alon Levy wrote:

A CCID device is a smart card reader. It is a USB device, defined at [1].
This patch introduces the usb-ccid device that is a ccid bus. Next patches will
introduce two card types to use it, a passthru card and an emulated card.

  [1] http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110.

Signed-off-by: Alon Levy
---
  Makefile.objs |1 +
  configure |6 +
  hw/ccid.h |   35 ++
  hw/usb-ccid.c | 1355 +
  4 files changed, 1397 insertions(+), 0 deletions(-)
  create mode 100644 hw/ccid.h
  create mode 100644 hw/usb-ccid.c

diff --git a/Makefile.objs b/Makefile.objs
index f1c7bfe..a1f3853 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -195,6 +195,7 @@ hw-obj-$(CONFIG_FDC) += fdc.o
  hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
  hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
  hw-obj-$(CONFIG_DMA) += dma.o
+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o

  # PPC devices
  hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/configure b/configure
index 598e8e1..14a035a 100755
--- a/configure
+++ b/configure
@@ -174,6 +174,7 @@ trace_backend="nop"
  trace_file="trace"
  spice=""
  rbd=""
+smartcard="yes"

  # parse CC options first
  for opt do
@@ -2472,6 +2473,7 @@ echo "Trace output file $trace_file-"
  echo "spice support $spice"
  echo "rbd support   $rbd"
  echo "xfsctl support$xfs"
+echo "smartcard support $smartcard"

  if test $sdl_too_old = "yes"; then
  echo "->  Your SDL version is too old - please upgrade to have SDL support"
@@ -2744,6 +2746,10 @@ if test "$spice" = "yes" ; then
echo "CONFIG_SPICE=y">>  $config_host_mak
  fi

+if test "$smartcard" = "yes" ; then
+  echo "CONFIG_SMARTCARD=y">>  $config_host_mak
+fi
+
  # XXX: suppress that
  if [ "$bsd" = "yes" ] ; then
echo "CONFIG_BSD=y">>  $config_host_mak
diff --git a/hw/ccid.h b/hw/ccid.h
new file mode 100644
index 000..af59070
--- /dev/null
+++ b/hw/ccid.h
@@ -0,0 +1,35 @@
+#ifndef __CCID_H__
+#define __CCID_H__
   


Missing copyright, plus identifiers can't start with __.


+
+#include "qdev.h"
+
+typedef struct CCIDCardState CCIDCardState;
+typedef struct CCIDCardInfo CCIDCardInfo;
+
+struct CCIDCardState {
+DeviceState qdev;
+uint32_tslot; // For future use with multiple slot reader.
+};
   


Please stick to C89 comments.


+
+struct CCIDCardInfo {
+DeviceInfo qdev;
+void (*print)(Monitor *mon, CCIDCardState *card, int indent);
+const uint8_t *(*get_atr)(CCIDCardState *card, uint32_t *len);
+void (*apdu_from_guest)(CCIDCardState *card, const uint8_t *apdu, uint32_t 
len);
+int (*exitfn)(CCIDCardState *card);
+int (*initfn)(CCIDCardState *card);
+};
+
+void ccid_card_send_apdu_to_guest(CCIDCardState *card, uint8_t* apdu, uint32_t 
len);
+void ccid_card_card_removed(CCIDCardState *card);
+void ccid_card_card_inserted(CCIDCardState *card);
+void ccid_card_card_error(CCIDCardState *card, uint64_t error);
+void ccid_card_qdev_register(CCIDCardInfo *card);
+
+/* support guest visible insertion/removal of ccid devices based on actual
+ * devices connected/removed. Called by card implementation (passthru, local) 
*/
+int ccid_card_ccid_attach(CCIDCardState *card);
+void ccid_card_ccid_detach(CCIDCardState *card);
+
+#endif // __CCID_H__
+
diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c
new file mode 100644
index 000..58f69a6
--- /dev/null
+++ b/hw/usb-ccid.c
@@ -0,0 +1,1355 @@
+/*
+ * CCID Device emulation
+ *
+ * Based on usb-serial.c:
+ * Copyright (c) 2006 CodeSourcery.
+ * Copyright (c) 2008 Samuel Thibault
+ * Written by Paul Brook, reused for FTDI by Samuel Thibault,
+ * Reused for CCID by Alon Levy.
+ * Contributed to by Robert Relyea
+ * Copyright (c) 2010 Red Hat.
+ *
+ * This code is licenced under the LGPL.
+ */
+
+/* References:
+ *
+ * CCID Specification Revision 1.1 April 22nd 2005
+ *  "Universal Serial Bus, Device Class: Smart Card"
+ *  Specification for Integrated Circuit(s) Cards Interface Devices
+ *
+ * Endianess note: from the spec (1.3)
+ *  "Fields that are larger than a byte are stored in little endian
+ *
+ * KNOWN BUGS
+ * 1. remove/insert can sometimes result in removed state instead of inserted.
+ * This is a result of the following:
+ *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This happens
+ *  when we send a too short packet, seen in uhci-usb.c, resulting from
+ *  a urb requesting SPD and us returning a smaller packet.
+ *  Not sure which messages trigger this.
+ *
+ * Migration note:
+ *
+ * All the VMStateDescription's are left here for future use, but
+ * not enabled right now since there is no support for USB migration.
+ *
+ * To enable define ENABLE_MIGRATION
+ */
+
+#include "qemu-common.h"
+#include "qemu-error.h"
+#include "usb.h"
+#include "monitor.h"
+
+#include "hw/ccid.h"
+
+//#define DEBUG_CCID
+
+#define DPRINTF(s, lvl, fmt, ...) \
+do { if (lvl<= s->debug) { printf("usb-ccid: " fmt , ## __VA_ARGS__); } } 
whi

Re: [Qemu-devel] [PATCH 06/20] usb-ccid: review fixes (v15-v16)

2011-02-03 Thread Anthony Liguori


On 02/02/2011 02:28 PM, Alon Levy wrote:

I'll fold it before submitting the version to be applied, but
I hope keeping it as a separate patch will make reviewing easier.
   


Hrm, can you just send out the new patches?   It's actually harder to 
review like this.


Regards,

Anthony Liguori


Behavioral changes:
  * fix abort on client answer after card remove
  * enable migration
  * remove side affect code from asserts
  * return consistent self-powered state
  * mask out reserved bits in ccid_set_parameters
  * add missing abRFU in SetParameters (no affect on linux guest)

whitefixes / comments / consts defines:
  * remove stale comment
  * remove ccid_print_pending_answers if no DEBUG_CCID
  * replace printf's with DPRINTF, remove DEBUG_CCID, add verbosity defines
  * use error_report
  * update copyright (most of the code is not original)
  * reword known bug comment
  * add missing closing quote in comment
  * add missing whitespace on one line
  * s/CCID_SetParameter/CCID_SetParameters/
  * add comments
  * use define for max packet size

Comment for "return consistent self-powered state":

the Configuration Descriptor bmAttributes claims we are self powered,
but we were returning not self powered to USB_REQ_GET_STATUS control message.

In practice, this message is not sent by a linux 2.6.35.10-74.fc14.x86_64
guest (not tested on other guests), unless you issue lsusb -v as root (for
example).

Signed-off-by: Alon Levy
---
  hw/ccid.h |7 +++
  hw/usb-ccid.c |  115 -
  2 files changed, 63 insertions(+), 59 deletions(-)

diff --git a/hw/ccid.h b/hw/ccid.h
index af59070..f5dcfae 100644
--- a/hw/ccid.h
+++ b/hw/ccid.h
@@ -6,11 +6,16 @@
  typedef struct CCIDCardState CCIDCardState;
  typedef struct CCIDCardInfo CCIDCardInfo;

+/* state of the CCID Card device (i.e. hw/ccid-card-*.c)
+ */
  struct CCIDCardState {
  DeviceState qdev;
  uint32_tslot; // For future use with multiple slot reader.
  };

+/* callbacks to be used by the CCID device (hw/usb-ccid.c) to call
+ * into the smartcard device (hw/ccid-card-*.c)
+ */
  struct CCIDCardInfo {
  DeviceInfo qdev;
  void (*print)(Monitor *mon, CCIDCardState *card, int indent);
@@ -20,6 +25,8 @@ struct CCIDCardInfo {
  int (*initfn)(CCIDCardState *card);
  };

+/* API for smartcard calling the CCID device (used by hw/ccid-card-*.c)
+ */
  void ccid_card_send_apdu_to_guest(CCIDCardState *card, uint8_t* apdu, 
uint32_t len);
  void ccid_card_card_removed(CCIDCardState *card);
  void ccid_card_card_inserted(CCIDCardState *card);
diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c
index 58f69a6..09e39ac 100644
--- a/hw/usb-ccid.c
+++ b/hw/usb-ccid.c
@@ -1,15 +1,18 @@
  /*
   * CCID Device emulation
   *
- * Based on usb-serial.c:
+ * Written by Alon Levy, with contributions from Robert Relyea.
+ *
+ * Based on usb-serial.c, see it's copyright and attributions below.
+ *
+ * This code is licenced under the GNU LGPL, version 2 or later.
+ *
+ * ---
+ *
+ * usb-serial.c copyright and attribution:
   * Copyright (c) 2006 CodeSourcery.
   * Copyright (c) 2008 Samuel Thibault
   * Written by Paul Brook, reused for FTDI by Samuel Thibault,
- * Reused for CCID by Alon Levy.
- * Contributed to by Robert Relyea
- * Copyright (c) 2010 Red Hat.
- *
- * This code is licenced under the LGPL.
   */

  /* References:
@@ -19,22 +22,16 @@
   *  Specification for Integrated Circuit(s) Cards Interface Devices
   *
   * Endianess note: from the spec (1.3)
- *  "Fields that are larger than a byte are stored in little endian
+ *  "Fields that are larger than a byte are stored in little endian"
   *
   * KNOWN BUGS
   * 1. remove/insert can sometimes result in removed state instead of inserted.
   * This is a result of the following:
- *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This happens
- *  when we send a too short packet, seen in uhci-usb.c, resulting from
- *  a urb requesting SPD and us returning a smaller packet.
+ *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This can happen
+ *  when a short packet is sent, as seen in uhci-usb.c, resulting from a urb
+ *  from the guest requesting SPD and us returning a smaller packet.
   *  Not sure which messages trigger this.
   *
- * Migration note:
- *
- * All the VMStateDescription's are left here for future use, but
- * not enabled right now since there is no support for USB migration.
- *
- * To enable define ENABLE_MIGRATION
   */

  #include "qemu-common.h"
@@ -44,11 +41,14 @@

  #include "hw/ccid.h"

-//#define DEBUG_CCID
-
  #define DPRINTF(s, lvl, fmt, ...) \
  do { if (lvl<= s->debug) { printf("usb-ccid: " fmt , ## __VA_ARGS__); } } 
while (0)

+#define D_WARN 1
+#define D_INFO 2
+#define D_MORE_INFO 3
+#define D_VERBOSE 4
+
  #define CCID_DEV_NAME "usb-ccid"

  /* The two options for variable sized buffers:
@@ -64,6 +64,8 @@ do { if (lvl<= s->debug) { printf("usb-ccid: " fmt , ## 
__VA_ARGS__); } } while
  #define Inter

[Qemu-devel] [PATCH] QCOW2: fix bug - report read success on failure

2011-02-03 Thread Chunqiang Tang

This patch fixes bugs in QCOW2's error handling paths of read operations.
When an I/O operation fails, the QCOW2 driver mistakenly reports it as success
to the uper layer.

This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool,
when it injected failures. Specifically, the following test triggered the bug.

/bin/rm -rf /var/ramdisk/truth.raw /var/ramdisk/test.qcow2 
/var/ramdisk/zero-500M.raw
dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1112250368
dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=575525376
./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b 
/var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1112250368
./qemu-io --auto --seed=184915369 --truth=/var/ramdisk/truth.raw --format=qcow2 
--test=blksim:/var/ramdisk/test.qcow2 --verify_write=true 
--compare_before=false --compare_after=true --round=10 --parallel=100 
--io_size=1048576 --fail_prob=0.1 --cancel_prob=0 --instant_qemubh=true

Signed-off-by: Chunqiang Tang 
---
 block/qcow2.c |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 8c906d1..6f6d56f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -479,8 +479,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
 BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
 acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, acb->sector_num,
 &acb->hd_qiov, n1, qcow2_aio_read_cb, acb);
-if (acb->hd_aiocb == NULL)
+if (acb->hd_aiocb == NULL) {
+ret = -EIO;
 goto done;
+}
 } else {
 ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
 if (ret < 0)
@@ -495,8 +497,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
 }
 } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
 /* add AIO support for compressed blocks ? */
-if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0)
+if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0) {
+ret = -EIO;
 goto done;
+}
 
 qemu_iovec_from_buffer(&acb->hd_qiov,
 s->cluster_cache + index_in_cluster * 512,
-- 
1.7.0.4

[Qemu-devel] [PATCHv2] e1000: multi-buffer packet support

2011-02-03 Thread Michael S. Tsirkin

e1000 supports multi-buffer packets larger than rxbuf_size.

This fixes the following (on linux):
- in guest: ifconfig eth1 mtu 16110
- in host: ifconfig tap0 mtu 16110
   ping -s 16082 

Red Hat bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=602205

Signed-off-by: Michael S. Tsirkin 
---

Changes from v1:
removed dead code

 hw/e1000.c |   29 +++--
 1 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index af101bd..3427ff3 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -642,6 +642,8 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 uint16_t vlan_special = 0;
 uint8_t vlan_status = 0, vlan_offset = 0;
 uint8_t min_buf[MIN_BUF_SIZE];
+size_t desc_offset;
+size_t desc_size;
 
 if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
 return -1;
@@ -654,12 +656,6 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 size = sizeof(min_buf);
 }
 
-if (size > s->rxbuf_size) {
-DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
-   (unsigned long)size, s->rxbuf_size);
-return -1;
-}
-
 if (!receive_filter(s, buf, size))
 return size;
 
@@ -672,8 +668,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 }
 
 rdh_start = s->mac_reg[RDH];
+desc_offset = 0;
 do {
+desc_size = size - desc_offset;
+if (desc_size > s->rxbuf_size) {
+desc_size = s->rxbuf_size;
+}
 if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->check_rxov) {
+/* Discard all data written so far */
+s->mac_reg[RDH] = rdh_start;
 set_ics(s, 0, E1000_ICS_RXO);
 return -1;
 }
@@ -684,9 +687,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 desc.status |= (vlan_status | E1000_RXD_STAT_DD);
 if (desc.buffer_addr) {
 cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
-  (void *)(buf + vlan_offset), size);
-desc.length = cpu_to_le16(size + fcs_len(s));
-desc.status |= E1000_RXD_STAT_EOP|E1000_RXD_STAT_IXSM;
+  (void *)(buf + desc_offset + 
vlan_offset),
+  desc_size);
+desc_offset += desc_size;
+if (desc_offset >= size) {
+desc.length = cpu_to_le16(desc_size + fcs_len(s));
+desc.status |= E1000_RXD_STAT_EOP | E1000_RXD_STAT_IXSM;
+} else {
+desc.length = cpu_to_le16(desc_size);
+}
 } else { // as per intel docs; skip descriptors with null buf addr
 DBGOUT(RX, "Null RX descriptor!!\n");
 }
@@ -702,7 +711,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
size_t size)
 set_ics(s, 0, E1000_ICS_RXO);
 return -1;
 }
-} while (desc.buffer_addr == 0);
+} while (desc_offset < size);
 
 s->mac_reg[GPRC]++;
 s->mac_reg[TPR]++;
-- 
1.7.3.2.91.g446ac

Re: [Qemu-devel] Re: [PATCH] vnc: Fix password expiration through 'change vnc ""'

2011-02-03 Thread Daniel P. Berrange

On Thu, Feb 03, 2011 at 10:35:51AM -0600, Anthony Liguori wrote:
> On 02/03/2011 10:29 AM, Daniel P. Berrange wrote:
> >On Mon, Jan 31, 2011 at 02:43:19PM -0600, Anthony Liguori wrote:
> >>commit 52c18be9e99dabe295321153fda7fce9f76647ac introduced a regression in 
> >>the
> >>change vnc password command that changed the behavior of setting the VNC
> >>password to an empty string from disabling login to disabling 
> >>authentication.
> >>
> >>This commit refactors the code to eliminate this overloaded semantics in
> >>vnc_display_password and instead introduces the vnc_display_disable_login.  
> >> The
> >>monitor implementation then determines the behavior of an empty or missing
> >>string.
> >Personally I think this is a little overkill&  just reverting the
> >original patch was fine, but from a functional POV your patch
> >produces the same results, so I won't argue.
> 
> For 0.15, I'd like to introduce a new set of commands such that we
> don't multiplex the change command anymore.   This refactoring lays
> the ground work for that.
> 
> For instance, if you created a block device with the name 'vnc',
> you'd get very unexpected results!  Multiplexing based on special
> values on top of existing commands is pretty evil.

Doesn't Gerd's 'set_password' command already replace the functionality
of the 'change vnc' command. So we should likely declare 'change vnc'
as deprecated in 0.14 and remove it in 0.16

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  wrote:
> The registers of real IOAPICs can be relocated during runtime (via
> chipset registers). We don't support this yet, but qemu-kvm carries the
> current base address in its version 2 vmstate.
>
> To align both implementations for migratability, add the proper
> infrastructure to accept initial as well as updated base addresses and
> include the current address in the vmstate. This is done in a way that
> will also allow multiple IOAPICs in the future.

Nack, the addresses should be device properties.

[Qemu-devel] Re: [PATCH] QCOW2: fix bug - report read success on failure

2011-02-03 Thread Chunqiang Tang

> Oops, thanks for catching this. I thought this was fixed long ago, but
> apparently it wasn't.

Not me, the testing tool caught it without my supervision. :-)

> > @@ -495,8 +497,10 @@ static void qcow2_aio_read_cb(void *opaque, int 
ret)
> >  }
> >  } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
> >  /* add AIO support for compressed blocks ? */
> > -if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0)
> > +if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0) {
> > +ret = -EIO;
> >  goto done;
> > +}
> 
> I think here we should change qcow2_decompessed_cluster() to return the
> real error code instead of -1, so that we can pass it on instead of
> turning everything into -EIO.

I agree. Could you please prepare the official patch? I just throw in 
ideas I got from my testing tool.

[Qemu-devel] Re: [PATCH] QCOW2: fix bug - report read success on failure

2011-02-03 Thread Kevin Wolf

Am 03.02.2011 17:53, schrieb Chunqiang Tang:
> This patch fixes bugs in QCOW2's error handling paths of read operations.
> When an I/O operation fails, the QCOW2 driver mistakenly reports it as success
> to the uper layer.
> 
> This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool,
> when it injected failures. Specifically, the following test triggered the bug.
> 
> /bin/rm -rf /var/ramdisk/truth.raw /var/ramdisk/test.qcow2 
> /var/ramdisk/zero-500M.raw
> dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1112250368
> dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=575525376
> ./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b 
> /var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1112250368
> ./qemu-io --auto --seed=184915369 --truth=/var/ramdisk/truth.raw 
> --format=qcow2 --test=blksim:/var/ramdisk/test.qcow2 --verify_write=true 
> --compare_before=false --compare_after=true --round=10 --parallel=100 
> --io_size=1048576 --fail_prob=0.1 --cancel_prob=0 --instant_qemubh=true
> 
> Signed-off-by: Chunqiang Tang 
> ---
>  block/qcow2.c |8 ++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 8c906d1..6f6d56f 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -479,8 +479,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
>  BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
>  acb->hd_aiocb = bdrv_aio_readv(bs->backing_hd, 
> acb->sector_num,
>  &acb->hd_qiov, n1, qcow2_aio_read_cb, 
> acb);
> -if (acb->hd_aiocb == NULL)
> +if (acb->hd_aiocb == NULL) {
> +ret = -EIO;
>  goto done;
> +}
>  } else {
>  ret = qcow2_schedule_bh(qcow2_aio_read_bh, acb);
>  if (ret < 0)

Oops, thanks for catching this. I thought this was fixed long ago, but
apparently it wasn't.

> @@ -495,8 +497,10 @@ static void qcow2_aio_read_cb(void *opaque, int ret)
>  }
>  } else if (acb->cluster_offset & QCOW_OFLAG_COMPRESSED) {
>  /* add AIO support for compressed blocks ? */
> -if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0)
> +if (qcow2_decompress_cluster(bs, acb->cluster_offset) < 0) {
> +ret = -EIO;
>  goto done;
> +}

I think here we should change qcow2_decompessed_cluster() to return the
real error code instead of -1, so that we can pass it on instead of
turning everything into -EIO.

Kevin

[Qemu-devel] Re: QCOW2 bugs releated to qcow2_aio_cancel()

2011-02-03 Thread Kevin Wolf

Am 03.02.2011 18:21, schrieb Chunqiang Tang:
> Hi Kevin,
> 
> Fast Virtual Disk (FVD) has an automated testing tool (see 
> http://wiki.qemu.org/Features/FVD/Engineering). For a long time, I knew 
> that QCOW2 could not pass the automated tests. Today I finally sit down to 
> look into those bugs. I already submitted multiple patches for different 
> bugs, but there is one case that I am not certain how to handle. Instead 
> of creating a potentially broken patch, I though you might be able to 
> handle it better than me. Bugs showed up when the testing tool injected 
> aio cancel.

We have discussed this before and yes, aio_cancel is a problem. It is
very hard to do this correctly, and so most block drivers just wait for
completion instead of actually cancelling anything.

Given that a cancel shouldn't happen too often, I think it would be
reasonable to take the same approach for qcow2. I don't think adding a
lot of complexity for getting this right is justified.

Stefan, what do you think? Maybe we could even have a default
implementation in generic block code?

Kevin

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

On 2011-02-03 18:36, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  wrote:
>> On 2011-02-03 18:03, Blue Swirl wrote:
>>> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  wrote:
 The registers of real IOAPICs can be relocated during runtime (via
 chipset registers). We don't support this yet, but qemu-kvm carries the
 current base address in its version 2 vmstate.

 To align both implementations for migratability, add the proper
 infrastructure to accept initial as well as updated base addresses and
 include the current address in the vmstate. This is done in a way that
 will also allow multiple IOAPICs in the future.
>>>
>>> Nack, the addresses should be device properties.
>>
>> Hmm we could make default_base_address a property. Will change that.
>> But current_base_address is just the same as apicbase and can't be a
>> property.
> 
> Oh, right. What will current_base_address used for? Why can't board
> just unmap IOAPIC from current address and remap it at the new
> address? Then the device would not need to know its base address.

The board could do this. The question is where we put this service, in
the context if the IOAPIC as ioapic_set_base_address (compare to
cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
each and every board code. In the latter case, the boards would also be
responsible for saving/restoring the address.

Well, unless we have a good reason, I would prefer to keep the split as
suggested, also to avoid that qemu-kvm needs to break its vmstate.

Jan

PS: Another bug in the APIC emulation: cpu_set_apic_base lacks a call to
sysbus_mmio_map to update its mapping.

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH] checkpatch.pl: don't complain about old lines with tabs

2011-02-03 Thread Blue Swirl

Don't complain when the patch includes lines with tabs
only in the hunk's untouched context.

Signed-off-by: Blue Swirl 
---
 scripts/checkpatch.pl |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 4fa06c0..075b614 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1495,7 +1495,7 @@ sub process {
next if ($realfile !~ /\.(h|c|pl)$/);

 # in QEMU, no tabs are allowed
-   if ($rawline =~ /\t/) {
+   if ($rawline =~ /^\+.*\t/) {
my $herevet = "$here\n" . cat_vet($rawline) . "\n";
ERROR("code indent should never use tabs\n" . $herevet);
$rpt_cleaners = 1;
-- 
1.6.2.4

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

On 2011-02-03 18:03, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  wrote:
>> The registers of real IOAPICs can be relocated during runtime (via
>> chipset registers). We don't support this yet, but qemu-kvm carries the
>> current base address in its version 2 vmstate.
>>
>> To align both implementations for migratability, add the proper
>> infrastructure to accept initial as well as updated base addresses and
>> include the current address in the vmstate. This is done in a way that
>> will also allow multiple IOAPICs in the future.
> 
> Nack, the addresses should be device properties.

Hmm we could make default_base_address a property. Will change that.
But current_base_address is just the same as apicbase and can't be a
property.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH 1/2] Add virtagent file system freeze/thaw

2011-02-03 Thread Michael Roth


On 02/02/2011 02:42 AM, jes.soren...@redhat.com wrote:

From: Jes Sorensen

Implement freeze/thaw support in the guest, allowing the host to
request the guest freezes all it's file systems before a live snapshot
is performed.
  - fsfreeze(): Walk the list of mounted local real file systems,
and freeze them.
  - fsthaw():   Walk the list of previously frozen file systems and
thaw them.
  - fsstatus(): Return the current status of freeze/thaw. The host must
poll this function, in case fsfreeze() returned with a
   timeout, to wait for the operation to finish.

Signed-off-by: Jes Sorensen
---
  virtagent-common.h |8 ++
  virtagent-server.c |  190 
  2 files changed, 198 insertions(+), 0 deletions(-)

diff --git a/virtagent-common.h b/virtagent-common.h
index 5d8f5c1..7c6d9ef 100644
--- a/virtagent-common.h
+++ b/virtagent-common.h
@@ -61,6 +61,14 @@ typedef struct VAContext {
  const char *channel_path;
  } VAContext;

+enum va_fsfreeze_status {
+FREEZE_ERROR = -1,
+FREEZE_THAWED = 0,
+FREEZE_INPROGRESS = 1,
+FREEZE_FROZEN = 2,
+FREEZE_THAWINPROGRESS = 3,
+};
+
  enum va_job_status {
  VA_JOB_STATUS_PENDING = 0,
  VA_JOB_STATUS_OK,
diff --git a/virtagent-server.c b/virtagent-server.c
index 7bb35b2..ffbe163 100644
--- a/virtagent-server.c
+++ b/virtagent-server.c
@@ -14,6 +14,10 @@
  #include
  #include "qemu_socket.h"
  #include "virtagent-common.h"
+#include
+#include
+#include
+#include

  static VAServerData *va_server_data;
  static bool va_enable_syslog = false; /* enable syslog'ing of RPCs */
@@ -217,6 +221,186 @@ static xmlrpc_value *va_hello(xmlrpc_env *env,
  return result;
  }

+
+/*
+ * Walk the mount table and build a list of local file systems
+ */
+
+struct direntry {
+char *dirname;
+char *devtype;
+struct direntry *next;
+};
+
+static struct direntry *va_mount_list;
+static int va_fsfreeze_status;


And what I meant in the last RFC about using "objects" was to 
encapsulate global state information for a particular group of commands 
in single data type/variable. We're gonna end up with a similar set of 
variables for stateful RPCs like copyfile and potentially a few for 
things like spice. So to avoid having things get too cluttered up I'd 
prefer something like, in this particular case:


typedef struct VAFSFreezeState {
struct direntry *mount_list;
int status;
} VAFSFeezeState;

static VAFSFreezeState va_fsfreeze_state;


+
+static int build_mount_list(void)
+{
+struct mntent *mnt;
+struct direntry *entry;
+struct direntry *next;
+char const *mtab = MOUNTED;
+FILE *fp;
+
+fp = setmntent(mtab, "r");
+if (!fp) {
+   fprintf(stderr, "unable to read mtab\n");
+   goto fail;
+}


You have tabs instead of spaces here


+
+while ((mnt = getmntent(fp))) {
+   /*
+* An entry which device name doesn't start with a '/' is
+* either a dummy file system or a network file system.
+* Add special handling for smbfs and cifs as is done by
+* coreutils as well.
+*/
+   if ((mnt->mnt_fsname[0] != '/') ||
+   (strcmp(mnt->mnt_type, "smbfs") == 0) ||
+   (strcmp(mnt->mnt_type, "cifs") == 0)) {
+   continue;
+   }
+
+   entry = qemu_malloc(sizeof(struct direntry));
+   entry->dirname = qemu_strdup(mnt->mnt_dir);
+   entry->devtype = qemu_strdup(mnt->mnt_type);
+   entry->next = va_mount_list;
+
+   va_mount_list = entry;


Here too


+}
+
+endmntent(fp);
+
+return 0;
+
+fail:
+while(va_mount_list) {
+   next = va_mount_list->next;


Here too. Might be some other spots but you get the point :)


+qemu_free(va_mount_list->dirname);
+qemu_free(va_mount_list->devtype);
+qemu_free(va_mount_list);
+va_mount_list = next;
+}
+
+return -1;
+}
+
+/*
+ * va_fsfreeze(): Walk list of mounted file systems in the guest, and
+ *   freeze the ones which are real local file systems.
+ * rpc return values: Number of file systems frozen, -1 on error.
+ */
+static xmlrpc_value *va_fsfreeze(xmlrpc_env *env,
+ xmlrpc_value *params,
+ void *user_data)
+{
+xmlrpc_int32 ret = 0, i = 0;
+xmlrpc_value *result;
+struct direntry *entry;
+int fd;
+SLOG("va_fsfreeze()");
+
+if (va_fsfreeze_status != FREEZE_THAWED) {
+ret = 0;
+goto out;
+}
+
+ret = build_mount_list();
+if (ret<  0) {
+goto out;
+}
+
+va_fsfreeze_status = FREEZE_INPROGRESS;
+
+entry = va_mount_list;
+while(entry) {
+fd = qemu_open(entry->dirname, O_RDONLY);
+if (fd == -1) {
+ret = errno;
+goto error;
+}
+ret = ioctl(fd, FIFREEZE);
+close(fd);
+if (ret<  0&&  ret != EOPNOTSUPP) {
+goto error;
+

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

On 2011-02-03 18:54, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  wrote:
>> On 2011-02-03 18:36, Blue Swirl wrote:
>>> On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  wrote:
 On 2011-02-03 18:03, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  wrote:
>> The registers of real IOAPICs can be relocated during runtime (via
>> chipset registers). We don't support this yet, but qemu-kvm carries the
>> current base address in its version 2 vmstate.
>>
>> To align both implementations for migratability, add the proper
>> infrastructure to accept initial as well as updated base addresses and
>> include the current address in the vmstate. This is done in a way that
>> will also allow multiple IOAPICs in the future.
>
> Nack, the addresses should be device properties.

 Hmm we could make default_base_address a property. Will change that.
 But current_base_address is just the same as apicbase and can't be a
 property.
>>>
>>> Oh, right. What will current_base_address used for? Why can't board
>>> just unmap IOAPIC from current address and remap it at the new
>>> address? Then the device would not need to know its base address.
>>
>> The board could do this. The question is where we put this service, in
>> the context if the IOAPIC as ioapic_set_base_address (compare to
>> cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
>> each and every board code. In the latter case, the boards would also be
>> responsible for saving/restoring the address.
> 
> How is the device relocated? Where are the chipset registers you mention?

Intel's PIIX chipsets contain a register called APICBASE (but it means
the IOAPIC), and that defines the location. The analogy in the APIC
world is the MSR_IA32_APICBASE which we maintain via the APIC state.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] Re: [PATCH v2 0/2] virtagent - fsfreeze support

2011-02-03 Thread Michael Roth


Hi Jes,

Can you add Badari (CC'd) to the next round? He's looked at potentially 
using virtagent for snapshots in the past and might have some insights 
here. Thanks.


On 02/02/2011 02:42 AM, jes.soren...@redhat.com wrote:

From: Jes Sorensen

Hi

This is a first attempt to add fsfreeze support to virtagent. The idea
is for the guest agent to walk the list of locally mounted file
systems in the guest, and issuing an ioctl to freeze them. The host
can then do a live snapshot of the guest, obtaining stable file
systems. After the snapshot, the host then calls the thaw function in
virtagent, which goes through the list of previously frozen file
systems and unfreezes them.

The list walking ignores remote file systems such as NFS and CIFS as
well as all pseudo file systems.

The guest agent code is in the first patch, and host agent code is in
the second patch. For now there is only human monitor support, but it
should be pretty straight forward to add QMP support as well.

Comments and suggestions welcome!

v2 of the patch addresses the issues pointed out by Stefan and Michael.

Cheers,
Jes


Jes Sorensen (2):
   Add virtagent file system freeze/thaw
   Add monitor commands for fsfreeze support

  hmp-commands.hx|   48 +++
  virtagent-common.h |9 ++
  virtagent-server.c |  190 ++
  virtagent.c|  235 
  virtagent.h|9 ++
  5 files changed, 491 insertions(+), 0 deletions(-)

Re: [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw

2011-02-03 Thread Michael Roth


On 02/02/2011 02:48 AM, Jes Sorensen wrote:

On 02/02/11 08:57, Stefan Hajnoczi wrote:

On Tue, Feb 1, 2011 at 10:58 AM,  wrote:

From: Jes Sorensen

Implement freeze/thaw support in the guest, allowing the host to
request the guest freezes all it's file systems before a live snapshot
is performed.
  - fsfreeze(): Walk the list of mounted local real file systems,
   and freeze them.


Does this add a requirement that guest agent code issues no disk I/O
in its main loop (e.g. logging)?  Otherwise we might deadlock
ourselves waiting for I/O which is never issued.


Yes very much so[1] - one reason why it would be nice to have virtagent
use threads to execute the actual commands. We should probably add a
flag to agent commands indicating whether they issue disk I/O or not, so
we can block attempts to execute commands that do so, while the guest is
frozen.


**Warning, epic response**

For things like logging and i/o on a frozen system...I agree we'd need 
some flag for these kinds of situations. Maybe a disable_logging() 
flagi really don't like this though... I'd imagine even syslogd() 
could block virtagent in this type of situation, so that would need to 
be disabled as well.


But doing so completely subverts our attempts and providing proper 
accounting of what the agent is doing to the user. A user can freeze the 
filesystem, knowing that logging would be disabled, then prod at 
whatever he wants. So the handling should be something specific to 
fsfreeze, with stricter requirements:


If a user calls fsfreeze(), we disable logging, but also disable the 
ability to do anything other than fsthaw() or fsstatus(). This actually 
solves the potential deadlocking problem for other RPCs as well...since 
they cant be executed in the first place.


So I think that addresses the agent deadlocking itself, post-freeze.

However, fsfreeze() itself might lock-up the agent as well...I'm not 
confident we can really put any kind of bound on how long it'll take to 
execute, and if we timeout on the client-side the agent can still block 
here.


Plus there are any number of other situations where an RPC can still 
hang things...in the future when we potentially allow things like script 
execution, they might do something like attempt to connect to a socket 
that's already in use and wait on the server for an arbitrary amount of 
time, or open a file on an nfs share that in currently unresponsive.


So a solution for these situations is still needed, and I'm starting to 
agree that threads are needed, but I don't think we should do RPCs 
concurrently (not sure if that's what is being suggested or not). At 
least, there's no pressing reason for it as things currently stand 
(there aren't currently any RPCs where fast response times are all that 
important, so it's okay to serialize them behind previous RPCs, and 
HMP/QMP are command at a time), and it's something that Im fairly 
confident can be added if the need arises in the future.


But for dealing with a situation where an RPC can hang the agent, I 
think one thread should do it. Basically:


We associate each RPC with a time limit. Some RPCs, very special ones 
that we'd trust with our kids, could potentially specify an unlimited 
timeout. The client side should use this same timeout on it's end. In 
the future we might allow the user to explicitly disable the timeout for 
a certain RPC. The logic would then be:


- read in a client RPC request
- start a thread to do RPC
- if there's a timeout, register an alarm(), with a handler 
that will call something like pthread_kill(current_worker_thread). On 
the thread side, this signal will induce a pthread_exit()

- wait for the thread to return (pthread_join(current_worker_thread))
- return it's response back to the caller if it finished, return a 
timeout indication otherwise




Cheers,
Jes

[1] speaking from experience ... a Linux desktop gets really upset if
you freeze the file systems from a command in an xterm ho hum

[Qemu-devel] problem with -vga std

2011-02-03 Thread Serge E. Hallyn

Hi,

When I grabbed the latest qemu (up to commit
bfddb47a343b4718e5768aa80bce8adead0f7fca) from git://git.qemu.org/qemu.git
and did:

./qemu-system-x86_64 -net user -net nic -vga std -usb -usbdevice tablet -m 512 
-hda /home/serge/disk.img -boot c -monitor stdio -vnc :1 -L 
/opt/qemu/share/qemu/

where disk.img has a ubuntu lucid server installed on it, I get:

error: no suitable mode found
unknown command `terminal'
out of range pointer 0x8022
Aborted.  Press any key to exit.

I bisected this to:

3b3d448e01ccfc6fdcb6e3d4ebf47418075e3bb4: Add new vgabios binaries to blobs list

Any ideas what specifically is causing this?

thanks,
-serge

[Qemu-devel] QCOW2 bugs releated to qcow2_aio_cancel()

2011-02-03 Thread Chunqiang Tang

Hi Kevin,

Fast Virtual Disk (FVD) has an automated testing tool (see 
http://wiki.qemu.org/Features/FVD/Engineering). For a long time, I knew 
that QCOW2 could not pass the automated tests. Today I finally sit down to 
look into those bugs. I already submitted multiple patches for different 
bugs, but there is one case that I am not certain how to handle. Instead 
of creating a potentially broken patch, I though you might be able to 
handle it better than me. Bugs showed up when the testing tool injected 
aio cancel. 

First, the cancelled request is not taken off the list of running 
requests, i.e., s->cluster_allocs and next_in_flight. As a result, when 
the acb is freed and reused, it formed circles in s->cluster_allocs, and 
the qcow2_alloc_cluster_offset() code below went into dead loop. I tried 
to add run_dependent_requests() into qcow2_aio_cancel(), but that does not 
solve all the problem. Dead loop still occurs. 

The second bug is related to QCowAIOCB.bh. There are several issues. 1) 
When a request is cancelled, the bh is not cancelled. 2) qcow2_aio_setup() 
does not initialize bh=NULL and relies on qcow2_aio_read_bh() to set 
bh=NULL. When the acb is reused for another request, bh!=NULL. As a 
result, qcow2_schedule_bh() fails on checking "if (acb->bh) return -EIO;" 

There may be other bugs related to qcow2_aio_cancel(), as the testing tool 
could not run long enough before it hits a bug. As a result, the coverage 
is low.

static void qcow2_aio_cancel(BlockDriverAIOCB *blockacb)
{
QCowAIOCB *acb = container_of(blockacb, QCowAIOCB, common);
if (acb->hd_aiocb)
bdrv_aio_cancel(acb->hd_aiocb);
run_dependent_requests(&acb->l2meta); /*** added **/
qemu_aio_release(acb);
}


int qcow2_alloc_cluster_offset()
{
...
/*** run into dead loop here when a cancelled was not taken off 
the list. */
QLIST_FOREACH(old_alloc, &s->cluster_allocs, next_in_flight) {
 ...
}
}

Regards,
ChunQiang (CQ) Tang, Ph.D.

Tel: +1-914-784-7412   Homepage: 
http://www.research.ibm.com/people/c/ctang

[Qemu-devel] [0.14?][PATCH v2 0/4] IOAPIC fixes

2011-02-03 Thread Jan Kiszka

Second round, addressing review comments and fixing (still unused) code:
 - pass IOAPIC default address as property
 - properly update MMIO mapping after vmload
 - switch post_load callback for all fix-ups



Jan Kiszka (4):
  ioapic: Implement EOI handling for level-triggered IRQs
  ioapic: Save/restore irr
  ioapic: Prepare for base address relocation
  ioapic: Style & magics cleanup

 hw/apic.c|9 ++-
 hw/ioapic.c  |  248 +
 hw/ioapic.h  |   24 ++
 hw/pc_piix.c |4 +-
 4 files changed, 209 insertions(+), 76 deletions(-)
 create mode 100644 hw/ioapic.h

[Qemu-devel] [0.14?][PATCH v2 4/4] ioapic: Style & magics cleanup

2011-02-03 Thread Jan Kiszka

Fix a few style issues and convert magic numbers into prober symbolic
constants, also fixing the wrong but unused IOAPIC_DM_SIPI value.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c |  177 +++---
 1 files changed, 107 insertions(+), 70 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 705803c..4f96caf 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -39,20 +39,48 @@
 
 #define MAX_IOAPICS 1
 
-#define IOAPIC_LVT_MASKED   (1 << 16)
-#define IOAPIC_LVT_REMOTE_IRR   (1 << 14)
+#define IOAPIC_VERSION  0x11
 
-#define IOAPIC_TRIGGER_EDGE0
-#define IOAPIC_TRIGGER_LEVEL   1
+#define IOAPIC_LVT_DEST_SHIFT   56
+#define IOAPIC_LVT_MASKED_SHIFT 16
+#define IOAPIC_LVT_TRIGGER_MODE_SHIFT   15
+#define IOAPIC_LVT_REMOTE_IRR_SHIFT 14
+#define IOAPIC_LVT_POLARITY_SHIFT   13
+#define IOAPIC_LVT_DELIV_STATUS_SHIFT   12
+#define IOAPIC_LVT_DEST_MODE_SHIFT  11
+#define IOAPIC_LVT_DELIV_MODE_SHIFT 8
+
+#define IOAPIC_LVT_MASKED   (1 << IOAPIC_LVT_MASKED_SHIFT)
+#define IOAPIC_LVT_REMOTE_IRR   (1 << IOAPIC_LVT_REMOTE_IRR_SHIFT)
+
+#define IOAPIC_TRIGGER_EDGE 0
+#define IOAPIC_TRIGGER_LEVEL1
 
 /*io{apic,sapic} delivery mode*/
-#define IOAPIC_DM_FIXED0x0
-#define IOAPIC_DM_LOWEST_PRIORITY  0x1
-#define IOAPIC_DM_PMI  0x2
-#define IOAPIC_DM_NMI  0x4
-#define IOAPIC_DM_INIT 0x5
-#define IOAPIC_DM_SIPI 0x5
-#define IOAPIC_DM_EXTINT   0x7
+#define IOAPIC_DM_FIXED 0x0
+#define IOAPIC_DM_LOWEST_PRIORITY   0x1
+#define IOAPIC_DM_PMI   0x2
+#define IOAPIC_DM_NMI   0x4
+#define IOAPIC_DM_INIT  0x5
+#define IOAPIC_DM_SIPI  0x6
+#define IOAPIC_DM_EXTINT0x7
+#define IOAPIC_DM_MASK  0x7
+
+#define IOAPIC_VECTOR_MASK  0xff
+
+#define IOAPIC_IOREGSEL 0x00
+#define IOAPIC_IOWIN0x10
+
+#define IOAPIC_REG_ID   0x00
+#define IOAPIC_REG_VER  0x01
+#define IOAPIC_REG_ARB  0x02
+#define IOAPIC_REG_REDTBL_BASE  0x10
+#define IOAPIC_ID   0x00
+
+#define IOAPIC_ID_SHIFT 24
+#define IOAPIC_ID_MASK  0xf
+
+#define IOAPIC_VER_ENTRIES_SHIFT16
 
 typedef struct IOAPICState IOAPICState;
 
@@ -60,7 +88,6 @@ struct IOAPICState {
 SysBusDevice busdev;
 uint8_t id;
 uint8_t ioregsel;
-
 uint32_t irr;
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
 uint64_t default_base_address;
@@ -86,21 +113,22 @@ static void ioapic_service(IOAPICState *s)
 if (s->irr & mask) {
 entry = s->ioredtbl[i];
 if (!(entry & IOAPIC_LVT_MASKED)) {
-trig_mode = ((entry >> 15) & 1);
-dest = entry >> 56;
-dest_mode = (entry >> 11) & 1;
-delivery_mode = (entry >> 8) & 7;
-polarity = (entry >> 13) & 1;
+trig_mode = ((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1);
+dest = entry >> IOAPIC_LVT_DEST_SHIFT;
+dest_mode = (entry >> IOAPIC_LVT_DEST_MODE_SHIFT) & 1;
+delivery_mode =
+(entry >> IOAPIC_LVT_DELIV_MODE_SHIFT) & IOAPIC_DM_MASK;
+polarity = (entry >> IOAPIC_LVT_POLARITY_SHIFT) & 1;
 if (trig_mode == IOAPIC_TRIGGER_EDGE) {
 s->irr &= ~mask;
 } else {
 s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
 }
-if (delivery_mode == IOAPIC_DM_EXTINT)
+if (delivery_mode == IOAPIC_DM_EXTINT) {
 vector = pic_read_irq(isa_pic);
-else
-vector = entry & 0xff;
-
+} else {
+vector = entry & IOAPIC_VECTOR_MASK;
+}
 apic_deliver_irq(dest, dest_mode, delivery_mode,
  vector, polarity, trig_mode);
 }
@@ -116,15 +144,16 @@ static void ioapic_set_irq(void *opaque, int vector, int 
level)
  * to GSI 2.  GSI maps to ioapic 1-1.  This is not
  * the cleanest way of doing it but it should work. */
 
-DPRINTF("%s: %s vec %x\n", __func__, level? "raise" : "lower", vector);
-if (vector == 0)
+DPRINTF("%s: %s vec %x\n", __func__, level ? "raise" : "lower", vector);
+if (vector == 0) {
 vector = 2;
-
+}
 if (vector >= 0 && vector < IOAPIC_NUM_PINS) {
 uint32_t mask = 1 << vector;
 uint64_t entry = s->ioredtbl[vector];
 
-if ((entry >> 15) & 1) {
+if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) ==
+IOAPIC_TRIGGER_LEVEL) {
 /* level triggered */

[Qemu-devel] [0.14?][PATCH v2 1/4] ioapic: Implement EOI handling for level-triggered IRQs

2011-02-03 Thread Jan Kiszka

Add the missing EOI broadcast from local APIC to the IOAPICs on
completion of level-triggered IRQs. This ensures that a still asserted
IRQ source properly re-triggers an APIC IRQ.

Signed-off-by: Jan Kiszka 
---
 hw/apic.c   |9 ++---
 hw/ioapic.c |   43 +--
 hw/ioapic.h |   20 
 3 files changed, 67 insertions(+), 5 deletions(-)
 create mode 100644 hw/ioapic.h

diff --git a/hw/apic.c b/hw/apic.c
index ff581f0..05a115f 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -18,6 +18,7 @@
  */
 #include "hw.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -57,7 +58,8 @@
 
 #define ESR_ILLEGAL_ADDRESS (1 << 7)
 
-#define APIC_SV_ENABLE (1 << 8)
+#define APIC_SV_DIRECTED_IO (1<<12)
+#define APIC_SV_ENABLE  (1<<8)
 
 #define MAX_APICS 255
 #define MAX_APIC_WORDS 8
@@ -420,8 +422,9 @@ static void apic_eoi(APICState *s)
 if (isrv < 0)
 return;
 reset_bit(s->isr, isrv);
-/* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
-set the remote IRR bit for level triggered interrupts. */
+if (!(s->spurious_vec & APIC_SV_DIRECTED_IO) && get_bit(s->tmr, isrv)) {
+ioapic_eio_broadcast(isrv);
+}
 apic_update_irq(s);
 }
 
diff --git a/hw/ioapic.c b/hw/ioapic.c
index 2109568..443c579 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -23,6 +23,7 @@
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "qemu-timer.h"
 #include "host-utils.h"
 #include "sysbus.h"
@@ -36,7 +37,10 @@
 #define DPRINTF(fmt, ...)
 #endif
 
-#define IOAPIC_LVT_MASKED  (1<<16)
+#define MAX_IOAPICS 1
+
+#define IOAPIC_LVT_MASKED   (1 << 16)
+#define IOAPIC_LVT_REMOTE_IRR   (1 << 14)
 
 #define IOAPIC_TRIGGER_EDGE0
 #define IOAPIC_TRIGGER_LEVEL   1
@@ -61,6 +65,8 @@ struct IOAPICState {
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
 };
 
+static IOAPICState *ioapics[MAX_IOAPICS];
+
 static void ioapic_service(IOAPICState *s)
 {
 uint8_t i;
@@ -83,8 +89,11 @@ static void ioapic_service(IOAPICState *s)
 dest_mode = (entry >> 11) & 1;
 delivery_mode = (entry >> 8) & 7;
 polarity = (entry >> 13) & 1;
-if (trig_mode == IOAPIC_TRIGGER_EDGE)
+if (trig_mode == IOAPIC_TRIGGER_EDGE) {
 s->irr &= ~mask;
+} else {
+s->ioredtbl[i] |= IOAPIC_LVT_REMOTE_IRR;
+}
 if (delivery_mode == IOAPIC_DM_EXTINT)
 vector = pic_read_irq(isa_pic);
 else
@@ -131,6 +140,29 @@ static void ioapic_set_irq(void *opaque, int vector, int 
level)
 }
 }
 
+void ioapic_eio_broadcast(int vector)
+{
+IOAPICState *s;
+uint64_t entry;
+int i, n;
+
+for (i = 0; i < MAX_IOAPICS; i++) {
+s = ioapics[i];
+if (!s) {
+continue;
+}
+for (n = 0; n < IOAPIC_NUM_PINS; n++) {
+entry = s->ioredtbl[n];
+if ((entry & IOAPIC_LVT_REMOTE_IRR) && (entry & 0xff) == vector) {
+s->ioredtbl[n] = entry & ~IOAPIC_LVT_REMOTE_IRR;
+if (!(entry & IOAPIC_LVT_MASKED) && (s->irr & (1 << n))) {
+ioapic_service(s);
+}
+}
+}
+}
+}
+
 static uint32_t ioapic_mem_readl(void *opaque, target_phys_addr_t addr)
 {
 IOAPICState *s = opaque;
@@ -240,6 +272,11 @@ static int ioapic_init1(SysBusDevice *dev)
 {
 IOAPICState *s = FROM_SYSBUS(IOAPICState, dev);
 int io_memory;
+static int ioapic_no;
+
+if (ioapic_no >= MAX_IOAPICS) {
+return -1;
+}
 
 io_memory = cpu_register_io_memory(ioapic_mem_read,
ioapic_mem_write, s,
@@ -248,6 +285,8 @@ static int ioapic_init1(SysBusDevice *dev)
 
 qdev_init_gpio_in(&dev->qdev, ioapic_set_irq, IOAPIC_NUM_PINS);
 
+ioapics[ioapic_no++] = s;
+
 return 0;
 }
 
diff --git a/hw/ioapic.h b/hw/ioapic.h
new file mode 100644
index 000..c441567
--- /dev/null
+++ b/hw/ioapic.h
@@ -0,0 +1,20 @@
+/*
+ *  ioapic.c IOAPIC emulation logic
+ *
+ *  Copyright (c) 2011 Jan Kiszka, Siemens AG
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see

Re: [Qemu-devel] Re: [PATCH] vnc: Fix password expiration through 'change vnc ""'

2011-02-03 Thread Anthony Liguori


On 02/03/2011 11:02 AM, Daniel P. Berrange wrote:

On Thu, Feb 03, 2011 at 10:35:51AM -0600, Anthony Liguori wrote:
   

On 02/03/2011 10:29 AM, Daniel P. Berrange wrote:
 

On Mon, Jan 31, 2011 at 02:43:19PM -0600, Anthony Liguori wrote:
   

commit 52c18be9e99dabe295321153fda7fce9f76647ac introduced a regression in the
change vnc password command that changed the behavior of setting the VNC
password to an empty string from disabling login to disabling authentication.

This commit refactors the code to eliminate this overloaded semantics in
vnc_display_password and instead introduces the vnc_display_disable_login.   The
monitor implementation then determines the behavior of an empty or missing
string.
 

Personally I think this is a little overkill&   just reverting the
original patch was fine, but from a functional POV your patch
produces the same results, so I won't argue.
   

For 0.15, I'd like to introduce a new set of commands such that we
don't multiplex the change command anymore.   This refactoring lays
the ground work for that.

For instance, if you created a block device with the name 'vnc',
you'd get very unexpected results!  Multiplexing based on special
values on top of existing commands is pretty evil.
 

Doesn't Gerd's 'set_password' command already replace the functionality
of the 'change vnc' command. So we should likely declare 'change vnc'
as deprecated in 0.14 and remove it in 0.16
   


Yup.  But it doesn't let you disable login.  Since that was a feature of 
'change vnc', I think we need to provide a proper interface to do this.


Likewise, we need a new interface for changing the block device.  The 
way password setting is handled is fubar right now.


I've got some new commands documented in a git tree if you're interested.

http://repo.or.cz/w/qemu/aliguori.git/blob/refs/heads/glib:/qmp-schema.json

Regards,

Anthony Liguori


Regards,
Daniel

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  wrote:
> On 2011-02-03 18:03, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  wrote:
>>> The registers of real IOAPICs can be relocated during runtime (via
>>> chipset registers). We don't support this yet, but qemu-kvm carries the
>>> current base address in its version 2 vmstate.
>>>
>>> To align both implementations for migratability, add the proper
>>> infrastructure to accept initial as well as updated base addresses and
>>> include the current address in the vmstate. This is done in a way that
>>> will also allow multiple IOAPICs in the future.
>>
>> Nack, the addresses should be device properties.
>
> Hmm we could make default_base_address a property. Will change that.
> But current_base_address is just the same as apicbase and can't be a
> property.

Oh, right. What will current_base_address used for? Why can't board
just unmap IOAPIC from current address and remap it at the new
address? Then the device would not need to know its base address.

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  wrote:
> On 2011-02-03 18:36, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  wrote:
>>> On 2011-02-03 18:03, Blue Swirl wrote:
 On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  wrote:
> The registers of real IOAPICs can be relocated during runtime (via
> chipset registers). We don't support this yet, but qemu-kvm carries the
> current base address in its version 2 vmstate.
>
> To align both implementations for migratability, add the proper
> infrastructure to accept initial as well as updated base addresses and
> include the current address in the vmstate. This is done in a way that
> will also allow multiple IOAPICs in the future.

 Nack, the addresses should be device properties.
>>>
>>> Hmm we could make default_base_address a property. Will change that.
>>> But current_base_address is just the same as apicbase and can't be a
>>> property.
>>
>> Oh, right. What will current_base_address used for? Why can't board
>> just unmap IOAPIC from current address and remap it at the new
>> address? Then the device would not need to know its base address.
>
> The board could do this. The question is where we put this service, in
> the context if the IOAPIC as ioapic_set_base_address (compare to
> cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
> each and every board code. In the latter case, the boards would also be
> responsible for saving/restoring the address.

How is the device relocated? Where are the chipset registers you mention?

[Qemu-devel] [0.14?][PATCH v2 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

The registers of real IOAPICs can be relocated during runtime (via
chipset registers). We don't support this yet, but qemu-kvm carries the
current base address in its version 2 vmstate.

To align both implementations for migratability, add the proper
infrastructure to accept initial as well as updated base addresses and
include the current address in the vmstate. This is done in a way that
will also allow multiple IOAPICs in the future.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c  |   19 +++
 hw/ioapic.h  |4 
 hw/pc_piix.c |4 +---
 3 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index edf99cc..705803c 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -63,6 +63,8 @@ struct IOAPICState {
 
 uint32_t irr;
 uint64_t ioredtbl[IOAPIC_NUM_PINS];
+uint64_t default_base_address;
+uint64_t current_base_address;
 };
 
 static IOAPICState *ioapics[MAX_IOAPICS];
@@ -238,7 +240,9 @@ static int ioapic_post_load(void *opaque, int version_id)
 if (version_id == 1) {
 /* set sane values */
 s->irr = 0;
+s->current_base_address = s->default_base_address;
 }
+sysbus_mmio_map(&s->busdev, 0, s->current_base_address);
 return 0;
 }
 
@@ -251,17 +255,27 @@ static const VMStateDescription vmstate_ioapic = {
 .fields  = (VMStateField []) {
 VMSTATE_UINT8(id, IOAPICState),
 VMSTATE_UINT8(ioregsel, IOAPICState),
+VMSTATE_UINT64_V(current_base_address, IOAPICState, 2),
 VMSTATE_UINT32_V(irr, IOAPICState, 2),
 VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
 VMSTATE_END_OF_LIST()
 }
 };
 
+void ioapic_set_base_address(DeviceState *d, target_phys_addr_t addr)
+{
+IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, d);
+
+s->current_base_address = addr;
+sysbus_mmio_map(&s->busdev, 0, addr);
+}
+
 static void ioapic_reset(DeviceState *d)
 {
 IOAPICState *s = DO_UPCAST(IOAPICState, busdev.qdev, d);
 int i;
 
+ioapic_set_base_address(d, s->default_base_address);
 s->id = 0;
 s->ioregsel = 0;
 s->irr = 0;
@@ -310,6 +324,11 @@ static SysBusDeviceInfo ioapic_info = {
 .qdev.vmsd = &vmstate_ioapic,
 .qdev.reset = ioapic_reset,
 .qdev.no_user = 1,
+.qdev.props = (Property[]) {
+DEFINE_PROP_UINT64("default_base_address", IOAPICState,
+   default_base_address, IOAPIC_DEFAULT_BASE_ADDRESS),
+DEFINE_PROP_END_OF_LIST()
+}
 };
 
 static void ioapic_register_devices(void)
diff --git a/hw/ioapic.h b/hw/ioapic.h
index c441567..1130d60 100644
--- a/hw/ioapic.h
+++ b/hw/ioapic.h
@@ -17,4 +17,8 @@
  * License along with this library; if not, see .
  */
 
+#define IOAPIC_DEFAULT_BASE_ADDRESS0xfec0
+
 void ioapic_eio_broadcast(int vector);
+
+void ioapic_set_base_address(DeviceState *d, target_phys_addr_t addr);
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 7b74473..82dce4c 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -25,6 +25,7 @@
 #include "hw.h"
 #include "pc.h"
 #include "apic.h"
+#include "ioapic.h"
 #include "pci.h"
 #include "usb-uhci.h"
 #include "usb-ohci.h"
@@ -46,13 +47,10 @@ static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
 static void ioapic_init(IsaIrqState *isa_irq_state)
 {
 DeviceState *dev;
-SysBusDevice *d;
 unsigned int i;
 
 dev = qdev_create(NULL, "ioapic");
 qdev_init_nofail(dev);
-d = sysbus_from_qdev(dev);
-sysbus_mmio_map(d, 0, 0xfec0);
 
 for (i = 0; i < IOAPIC_NUM_PINS; i++) {
 isa_irq_state->ioapic[i] = qdev_get_gpio_in(dev, i);
-- 
1.7.1

[Qemu-devel] [0.14?][PATCH v2 2/4] ioapic: Save/restore irr

2011-02-03 Thread Jan Kiszka

This is a guest modifiable state that must be saved/restored properly.

Signed-off-by: Jan Kiszka 
---
 hw/ioapic.c |   15 ++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/hw/ioapic.c b/hw/ioapic.c
index 443c579..edf99cc 100644
--- a/hw/ioapic.c
+++ b/hw/ioapic.c
@@ -231,14 +231,27 @@ static void ioapic_mem_writel(void *opaque, 
target_phys_addr_t addr, uint32_t va
 }
 }
 
+static int ioapic_post_load(void *opaque, int version_id)
+{
+IOAPICState *s = opaque;
+
+if (version_id == 1) {
+/* set sane values */
+s->irr = 0;
+}
+return 0;
+}
+
 static const VMStateDescription vmstate_ioapic = {
 .name = "ioapic",
-.version_id = 1,
+.version_id = 2,
+.post_load = ioapic_post_load,
 .minimum_version_id = 1,
 .minimum_version_id_old = 1,
 .fields  = (VMStateField []) {
 VMSTATE_UINT8(id, IOAPICState),
 VMSTATE_UINT8(ioregsel, IOAPICState),
+VMSTATE_UINT32_V(irr, IOAPICState, 2),
 VMSTATE_UINT64_ARRAY(ioredtbl, IOAPICState, IOAPIC_NUM_PINS),
 VMSTATE_END_OF_LIST()
 }
-- 
1.7.1

[Qemu-devel] Re: [PATCHv2] e1000: multi-buffer packet support

2011-02-03 Thread Alex Williamson

On Thu, 2011-02-03 at 18:49 +0200, Michael S. Tsirkin wrote:
> e1000 supports multi-buffer packets larger than rxbuf_size.
> 
> This fixes the following (on linux):
> - in guest: ifconfig eth1 mtu 16110
> - in host: ifconfig tap0 mtu 16110
>ping -s 16082 
> 
> Red Hat bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=602205
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
> 
> Changes from v1:
>   removed dead code
> 
>  hw/e1000.c |   29 +++--
>  1 files changed, 19 insertions(+), 10 deletions(-)

Acked-by: Alex Williamson 

> diff --git a/hw/e1000.c b/hw/e1000.c
> index af101bd..3427ff3 100644
> --- a/hw/e1000.c
> +++ b/hw/e1000.c
> @@ -642,6 +642,8 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  uint16_t vlan_special = 0;
>  uint8_t vlan_status = 0, vlan_offset = 0;
>  uint8_t min_buf[MIN_BUF_SIZE];
> +size_t desc_offset;
> +size_t desc_size;
>  
>  if (!(s->mac_reg[RCTL] & E1000_RCTL_EN))
>  return -1;
> @@ -654,12 +656,6 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  size = sizeof(min_buf);
>  }
>  
> -if (size > s->rxbuf_size) {
> -DBGOUT(RX, "packet too large for buffers (%lu > %d)\n",
> -   (unsigned long)size, s->rxbuf_size);
> -return -1;
> -}
> -
>  if (!receive_filter(s, buf, size))
>  return size;
>  
> @@ -672,8 +668,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  }
>  
>  rdh_start = s->mac_reg[RDH];
> +desc_offset = 0;
>  do {
> +desc_size = size - desc_offset;
> +if (desc_size > s->rxbuf_size) {
> +desc_size = s->rxbuf_size;
> +}
>  if (s->mac_reg[RDH] == s->mac_reg[RDT] && s->check_rxov) {
> +/* Discard all data written so far */
> +s->mac_reg[RDH] = rdh_start;
>  set_ics(s, 0, E1000_ICS_RXO);
>  return -1;
>  }
> @@ -684,9 +687,15 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  desc.status |= (vlan_status | E1000_RXD_STAT_DD);
>  if (desc.buffer_addr) {
>  cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
> -  (void *)(buf + vlan_offset), size);
> -desc.length = cpu_to_le16(size + fcs_len(s));
> -desc.status |= E1000_RXD_STAT_EOP|E1000_RXD_STAT_IXSM;
> +  (void *)(buf + desc_offset + 
> vlan_offset),
> +  desc_size);
> +desc_offset += desc_size;
> +if (desc_offset >= size) {
> +desc.length = cpu_to_le16(desc_size + fcs_len(s));
> +desc.status |= E1000_RXD_STAT_EOP | E1000_RXD_STAT_IXSM;
> +} else {
> +desc.length = cpu_to_le16(desc_size);
> +}
>  } else { // as per intel docs; skip descriptors with null buf addr
>  DBGOUT(RX, "Null RX descriptor!!\n");
>  }
> @@ -702,7 +711,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, 
> size_t size)
>  set_ics(s, 0, E1000_ICS_RXO);
>  return -1;
>  }
> -} while (desc.buffer_addr == 0);
> +} while (desc_offset < size);
>  
>  s->mac_reg[GPRC]++;
>  s->mac_reg[TPR]++;

Re: [Qemu-devel] [PATCH 06/20] usb-ccid: review fixes (v15-v16)

2011-02-03 Thread Alon Levy

On Thu, Feb 03, 2011 at 10:48:03AM -0600, Anthony Liguori wrote:
> On 02/02/2011 02:28 PM, Alon Levy wrote:
> >I'll fold it before submitting the version to be applied, but
> >I hope keeping it as a separate patch will make reviewing easier.
> 
> Hrm, can you just send out the new patches?   It's actually harder
> to review like this.

Yes.

> 
> Regards,
> 
> Anthony Liguori
> 
> >Behavioral changes:
> >  * fix abort on client answer after card remove
> >  * enable migration
> >  * remove side affect code from asserts
> >  * return consistent self-powered state
> >  * mask out reserved bits in ccid_set_parameters
> >  * add missing abRFU in SetParameters (no affect on linux guest)
> >
> >whitefixes / comments / consts defines:
> >  * remove stale comment
> >  * remove ccid_print_pending_answers if no DEBUG_CCID
> >  * replace printf's with DPRINTF, remove DEBUG_CCID, add verbosity defines
> >  * use error_report
> >  * update copyright (most of the code is not original)
> >  * reword known bug comment
> >  * add missing closing quote in comment
> >  * add missing whitespace on one line
> >  * s/CCID_SetParameter/CCID_SetParameters/
> >  * add comments
> >  * use define for max packet size
> >
> >Comment for "return consistent self-powered state":
> >
> >the Configuration Descriptor bmAttributes claims we are self powered,
> >but we were returning not self powered to USB_REQ_GET_STATUS control message.
> >
> >In practice, this message is not sent by a linux 2.6.35.10-74.fc14.x86_64
> >guest (not tested on other guests), unless you issue lsusb -v as root (for
> >example).
> >
> >Signed-off-by: Alon Levy
> >---
> >  hw/ccid.h |7 +++
> >  hw/usb-ccid.c |  115 
> > -
> >  2 files changed, 63 insertions(+), 59 deletions(-)
> >
> >diff --git a/hw/ccid.h b/hw/ccid.h
> >index af59070..f5dcfae 100644
> >--- a/hw/ccid.h
> >+++ b/hw/ccid.h
> >@@ -6,11 +6,16 @@
> >  typedef struct CCIDCardState CCIDCardState;
> >  typedef struct CCIDCardInfo CCIDCardInfo;
> >
> >+/* state of the CCID Card device (i.e. hw/ccid-card-*.c)
> >+ */
> >  struct CCIDCardState {
> >  DeviceState qdev;
> >  uint32_tslot; // For future use with multiple slot reader.
> >  };
> >
> >+/* callbacks to be used by the CCID device (hw/usb-ccid.c) to call
> >+ * into the smartcard device (hw/ccid-card-*.c)
> >+ */
> >  struct CCIDCardInfo {
> >  DeviceInfo qdev;
> >  void (*print)(Monitor *mon, CCIDCardState *card, int indent);
> >@@ -20,6 +25,8 @@ struct CCIDCardInfo {
> >  int (*initfn)(CCIDCardState *card);
> >  };
> >
> >+/* API for smartcard calling the CCID device (used by hw/ccid-card-*.c)
> >+ */
> >  void ccid_card_send_apdu_to_guest(CCIDCardState *card, uint8_t* apdu, 
> > uint32_t len);
> >  void ccid_card_card_removed(CCIDCardState *card);
> >  void ccid_card_card_inserted(CCIDCardState *card);
> >diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c
> >index 58f69a6..09e39ac 100644
> >--- a/hw/usb-ccid.c
> >+++ b/hw/usb-ccid.c
> >@@ -1,15 +1,18 @@
> >  /*
> >   * CCID Device emulation
> >   *
> >- * Based on usb-serial.c:
> >+ * Written by Alon Levy, with contributions from Robert Relyea.
> >+ *
> >+ * Based on usb-serial.c, see it's copyright and attributions below.
> >+ *
> >+ * This code is licenced under the GNU LGPL, version 2 or later.
> >+ *
> >+ * ---
> >+ *
> >+ * usb-serial.c copyright and attribution:
> >   * Copyright (c) 2006 CodeSourcery.
> >   * Copyright (c) 2008 Samuel Thibault
> >   * Written by Paul Brook, reused for FTDI by Samuel Thibault,
> >- * Reused for CCID by Alon Levy.
> >- * Contributed to by Robert Relyea
> >- * Copyright (c) 2010 Red Hat.
> >- *
> >- * This code is licenced under the LGPL.
> >   */
> >
> >  /* References:
> >@@ -19,22 +22,16 @@
> >   *  Specification for Integrated Circuit(s) Cards Interface Devices
> >   *
> >   * Endianess note: from the spec (1.3)
> >- *  "Fields that are larger than a byte are stored in little endian
> >+ *  "Fields that are larger than a byte are stored in little endian"
> >   *
> >   * KNOWN BUGS
> >   * 1. remove/insert can sometimes result in removed state instead of 
> > inserted.
> >   * This is a result of the following:
> >- *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This happens
> >- *  when we send a too short packet, seen in uhci-usb.c, resulting from
> >- *  a urb requesting SPD and us returning a smaller packet.
> >+ *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This can happen
> >+ *  when a short packet is sent, as seen in uhci-usb.c, resulting from a urb
> >+ *  from the guest requesting SPD and us returning a smaller packet.
> >   *  Not sure which messages trigger this.
> >   *
> >- * Migration note:
> >- *
> >- * All the VMStateDescription's are left here for future use, but
> >- * not enabled right now since there is no support for USB migration.
> >- *
> >- * To enable define ENABLE_MIGRATION
> >   */
> >
> >  #include "qemu-common.h"
> >@@ -44,

[Qemu-devel] [Bug 712416] Re: kvm_intel kernel module crash with via nano vmx

2011-02-03 Thread khetzal

Confirmation: totaly crash with latest build. Nothing in logs.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/712416

Title:
  kvm_intel kernel module crash with via nano vmx

Status in QEMU:
  New
Status in “kvm” package in Ubuntu:
  Incomplete
Status in “kvm” package in Debian:
  New

Bug description:
  kvm module for hardware virtualisation not work properly on via nano
  processors.

  Tested with processor: VIA Nano processor U2250.
  Processors flags (visible in /proc/cpuinfo): fpu vme de pse tsc msr pae mce 
cx8 apic sep mtrr pge mca cmov pat clflush acpi mmx fxsr sse sse2 ss tm syscall 
nx lm constant_tsc up rep_good pni monitor vmx est tm2 ssse3 cx16 xtpr rng 
rng_en ace ace_en ace2 phe phe_en lahf_lm

  With kernel 2.6.32: kvm not work and dmesg contains a lot of:
  handle_exception: unexpected, vectoring info 0x800d intr info 0x8b0d

  With kernel 2.6.35: all the system crash. Nothing visible in logs

Re: [Qemu-devel] [PATCH 05/20] usb-ccid: add CCID bus

2011-02-03 Thread Alon Levy

On Thu, Feb 03, 2011 at 10:46:59AM -0600, Anthony Liguori wrote:
> On 02/02/2011 02:28 PM, Alon Levy wrote:
> >A CCID device is a smart card reader. It is a USB device, defined at [1].
> >This patch introduces the usb-ccid device that is a ccid bus. Next patches 
> >will
> >introduce two card types to use it, a passthru card and an emulated card.
> >
> >  [1] http://www.usb.org/developers/devclass_docs/DWG_Smart-Card_CCID_Rev110.
> >
> >Signed-off-by: Alon Levy
> >---
> >  Makefile.objs |1 +
> >  configure |6 +
> >  hw/ccid.h |   35 ++
> >  hw/usb-ccid.c | 1355 
> > +
> >  4 files changed, 1397 insertions(+), 0 deletions(-)
> >  create mode 100644 hw/ccid.h
> >  create mode 100644 hw/usb-ccid.c
> >
> >diff --git a/Makefile.objs b/Makefile.objs
> >index f1c7bfe..a1f3853 100644
> >--- a/Makefile.objs
> >+++ b/Makefile.objs
> >@@ -195,6 +195,7 @@ hw-obj-$(CONFIG_FDC) += fdc.o
> >  hw-obj-$(CONFIG_ACPI) += acpi.o acpi_piix4.o
> >  hw-obj-$(CONFIG_APM) += pm_smbus.o apm.o
> >  hw-obj-$(CONFIG_DMA) += dma.o
> >+hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o
> >
> >  # PPC devices
> >  hw-obj-$(CONFIG_OPENPIC) += openpic.o
> >diff --git a/configure b/configure
> >index 598e8e1..14a035a 100755
> >--- a/configure
> >+++ b/configure
> >@@ -174,6 +174,7 @@ trace_backend="nop"
> >  trace_file="trace"
> >  spice=""
> >  rbd=""
> >+smartcard="yes"
> >
> >  # parse CC options first
> >  for opt do
> >@@ -2472,6 +2473,7 @@ echo "Trace output file $trace_file-"
> >  echo "spice support $spice"
> >  echo "rbd support   $rbd"
> >  echo "xfsctl support$xfs"
> >+echo "smartcard support $smartcard"
> >
> >  if test $sdl_too_old = "yes"; then
> >  echo "->  Your SDL version is too old - please upgrade to have SDL support"
> >@@ -2744,6 +2746,10 @@ if test "$spice" = "yes" ; then
> >echo "CONFIG_SPICE=y">>  $config_host_mak
> >  fi
> >
> >+if test "$smartcard" = "yes" ; then
> >+  echo "CONFIG_SMARTCARD=y">>  $config_host_mak
> >+fi
> >+
> >  # XXX: suppress that
> >  if [ "$bsd" = "yes" ] ; then
> >echo "CONFIG_BSD=y">>  $config_host_mak
> >diff --git a/hw/ccid.h b/hw/ccid.h
> >new file mode 100644
> >index 000..af59070
> >--- /dev/null
> >+++ b/hw/ccid.h
> >@@ -0,0 +1,35 @@
> >+#ifndef __CCID_H__
> >+#define __CCID_H__
> 
> Missing copyright, plus identifiers can't start with __.

Will fix. Again.

> 
> >+
> >+#include "qdev.h"
> >+
> >+typedef struct CCIDCardState CCIDCardState;
> >+typedef struct CCIDCardInfo CCIDCardInfo;
> >+
> >+struct CCIDCardState {
> >+DeviceState qdev;
> >+uint32_tslot; // For future use with multiple slot reader.
> >+};
> 
> Please stick to C89 comments.
> 

Another oops.

> >+
> >+struct CCIDCardInfo {
> >+DeviceInfo qdev;
> >+void (*print)(Monitor *mon, CCIDCardState *card, int indent);
> >+const uint8_t *(*get_atr)(CCIDCardState *card, uint32_t *len);
> >+void (*apdu_from_guest)(CCIDCardState *card, const uint8_t *apdu, 
> >uint32_t len);
> >+int (*exitfn)(CCIDCardState *card);
> >+int (*initfn)(CCIDCardState *card);
> >+};
> >+
> >+void ccid_card_send_apdu_to_guest(CCIDCardState *card, uint8_t* apdu, 
> >uint32_t len);
> >+void ccid_card_card_removed(CCIDCardState *card);
> >+void ccid_card_card_inserted(CCIDCardState *card);
> >+void ccid_card_card_error(CCIDCardState *card, uint64_t error);
> >+void ccid_card_qdev_register(CCIDCardInfo *card);
> >+
> >+/* support guest visible insertion/removal of ccid devices based on actual
> >+ * devices connected/removed. Called by card implementation (passthru, 
> >local) */
> >+int ccid_card_ccid_attach(CCIDCardState *card);
> >+void ccid_card_ccid_detach(CCIDCardState *card);
> >+
> >+#endif // __CCID_H__
> >+
> >diff --git a/hw/usb-ccid.c b/hw/usb-ccid.c
> >new file mode 100644
> >index 000..58f69a6
> >--- /dev/null
> >+++ b/hw/usb-ccid.c
> >@@ -0,0 +1,1355 @@
> >+/*
> >+ * CCID Device emulation
> >+ *
> >+ * Based on usb-serial.c:
> >+ * Copyright (c) 2006 CodeSourcery.
> >+ * Copyright (c) 2008 Samuel Thibault
> >+ * Written by Paul Brook, reused for FTDI by Samuel Thibault,
> >+ * Reused for CCID by Alon Levy.
> >+ * Contributed to by Robert Relyea
> >+ * Copyright (c) 2010 Red Hat.
> >+ *
> >+ * This code is licenced under the LGPL.
> >+ */
> >+
> >+/* References:
> >+ *
> >+ * CCID Specification Revision 1.1 April 22nd 2005
> >+ *  "Universal Serial Bus, Device Class: Smart Card"
> >+ *  Specification for Integrated Circuit(s) Cards Interface Devices
> >+ *
> >+ * Endianess note: from the spec (1.3)
> >+ *  "Fields that are larger than a byte are stored in little endian
> >+ *
> >+ * KNOWN BUGS
> >+ * 1. remove/insert can sometimes result in removed state instead of 
> >inserted.
> >+ * This is a result of the following:
> >+ *  symptom: dmesg shows ERMOTEIO (-121), pcscd shows -99. This happens
> >+ *  when we send a too short packet, seen in uhci-usb.c, resulting from
> >+ *  a urb requesting SPD and us returning a smal

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 6:01 PM, Jan Kiszka  wrote:
> On 2011-02-03 18:54, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  wrote:
>>> On 2011-02-03 18:36, Blue Swirl wrote:
 On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  wrote:
> On 2011-02-03 18:03, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  
>> wrote:
>>> The registers of real IOAPICs can be relocated during runtime (via
>>> chipset registers). We don't support this yet, but qemu-kvm carries the
>>> current base address in its version 2 vmstate.
>>>
>>> To align both implementations for migratability, add the proper
>>> infrastructure to accept initial as well as updated base addresses and
>>> include the current address in the vmstate. This is done in a way that
>>> will also allow multiple IOAPICs in the future.
>>
>> Nack, the addresses should be device properties.
>
> Hmm we could make default_base_address a property. Will change that.
> But current_base_address is just the same as apicbase and can't be a
> property.

 Oh, right. What will current_base_address used for? Why can't board
 just unmap IOAPIC from current address and remap it at the new
 address? Then the device would not need to know its base address.
>>>
>>> The board could do this. The question is where we put this service, in
>>> the context if the IOAPIC as ioapic_set_base_address (compare to
>>> cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
>>> each and every board code. In the latter case, the boards would also be
>>> responsible for saving/restoring the address.
>>
>> How is the device relocated? Where are the chipset registers you mention?
>
> Intel's PIIX chipsets contain a register called APICBASE (but it means
> the IOAPIC), and that defines the location. The analogy in the APIC
> world is the MSR_IA32_APICBASE which we maintain via the APIC state.

In ICH10 the register is called OIC—Other Interrupt Control Register
and the interesting bits APIC Range Select (ASEL).

So actually PIIX should manage IOAPIC mapping, not board level.

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

On 2011-02-03 20:01, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 6:01 PM, Jan Kiszka  wrote:
>> On 2011-02-03 18:54, Blue Swirl wrote:
>>> On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  wrote:
 On 2011-02-03 18:36, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  wrote:
>> On 2011-02-03 18:03, Blue Swirl wrote:
>>> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  
>>> wrote:
 The registers of real IOAPICs can be relocated during runtime (via
 chipset registers). We don't support this yet, but qemu-kvm carries the
 current base address in its version 2 vmstate.

 To align both implementations for migratability, add the proper
 infrastructure to accept initial as well as updated base addresses and
 include the current address in the vmstate. This is done in a way that
 will also allow multiple IOAPICs in the future.
>>>
>>> Nack, the addresses should be device properties.
>>
>> Hmm we could make default_base_address a property. Will change that.
>> But current_base_address is just the same as apicbase and can't be a
>> property.
>
> Oh, right. What will current_base_address used for? Why can't board
> just unmap IOAPIC from current address and remap it at the new
> address? Then the device would not need to know its base address.

 The board could do this. The question is where we put this service, in
 the context if the IOAPIC as ioapic_set_base_address (compare to
 cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
 each and every board code. In the latter case, the boards would also be
 responsible for saving/restoring the address.
>>>
>>> How is the device relocated? Where are the chipset registers you mention?
>>
>> Intel's PIIX chipsets contain a register called APICBASE (but it means
>> the IOAPIC), and that defines the location. The analogy in the APIC
>> world is the MSR_IA32_APICBASE which we maintain via the APIC state.
> 
> In ICH10 the register is called OIC—Other Interrupt Control Register
> and the interesting bits APIC Range Select (ASEL).
> 
> So actually PIIX should manage IOAPIC mapping, not board level.

The point is we need ioapic_set_base_address logic in multiple places
(once chipsets start to implement it). Better push it to a central place
from the beginning. Also the bit keeping. There is no difference to
apicbase.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 7:06 PM, Jan Kiszka  wrote:
> On 2011-02-03 20:01, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 6:01 PM, Jan Kiszka  wrote:
>>> On 2011-02-03 18:54, Blue Swirl wrote:
 On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  wrote:
> On 2011-02-03 18:36, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  
>> wrote:
>>> On 2011-02-03 18:03, Blue Swirl wrote:
 On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  
 wrote:
> The registers of real IOAPICs can be relocated during runtime (via
> chipset registers). We don't support this yet, but qemu-kvm carries 
> the
> current base address in its version 2 vmstate.
>
> To align both implementations for migratability, add the proper
> infrastructure to accept initial as well as updated base addresses and
> include the current address in the vmstate. This is done in a way that
> will also allow multiple IOAPICs in the future.

 Nack, the addresses should be device properties.
>>>
>>> Hmm we could make default_base_address a property. Will change that.
>>> But current_base_address is just the same as apicbase and can't be a
>>> property.
>>
>> Oh, right. What will current_base_address used for? Why can't board
>> just unmap IOAPIC from current address and remap it at the new
>> address? Then the device would not need to know its base address.
>
> The board could do this. The question is where we put this service, in
> the context if the IOAPIC as ioapic_set_base_address (compare to
> cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
> each and every board code. In the latter case, the boards would also be
> responsible for saving/restoring the address.

 How is the device relocated? Where are the chipset registers you mention?
>>>
>>> Intel's PIIX chipsets contain a register called APICBASE (but it means
>>> the IOAPIC), and that defines the location. The analogy in the APIC
>>> world is the MSR_IA32_APICBASE which we maintain via the APIC state.
>>
>> In ICH10 the register is called OIC—Other Interrupt Control Register
>> and the interesting bits APIC Range Select (ASEL).
>>
>> So actually PIIX should manage IOAPIC mapping, not board level.
>
> The point is we need ioapic_set_base_address logic in multiple places
> (once chipsets start to implement it). Better push it to a central place
> from the beginning. Also the bit keeping. There is no difference to
> apicbase.

In that case, the function should be made inline version in ioapic.h.

[Qemu-devel] [PATCH v3] make tsc stable over migration and machine start

2011-02-03 Thread Glauber Costa

If the machine is stopped, we should not record two different tsc values
upon a save operation. The same problem happens with kvmclock.

But kvmclock is taking a different diretion, being now seen as a separate
device. Since this is unlikely to happen with the tsc, I am taking the
approach here of simply registering a handler for state change, and
using a per-CPUState variable that prevents double updates for the TSC.

Signed-off-by: Glauber Costa 
CC: Jan Kiszka 

---
v2: updated tsc validation logic, as asked by Jan
v3: regenerated against uq/master
---
 target-i386/cpu.h |1 +
 target-i386/kvm.c |   18 +-
 2 files changed, 18 insertions(+), 1 deletions(-)

diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index af701a4..5f1df8b 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -734,6 +734,7 @@ typedef struct CPUX86State {
 uint32_t sipi_vector;
 uint32_t cpuid_kvm_features;
 uint32_t cpuid_svm_features;
+bool tsc_valid;
 
 /* in order to simplify APIC support, we leave this pointer to the
user */
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 05010bb..ed9557e 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -301,6 +301,15 @@ void kvm_inject_x86_mce(CPUState *cenv, int bank, uint64_t 
status,
 #endif
 }
 
+static void cpu_update_state(void *opaque, int running, int reason)
+{
+CPUState *env = opaque;
+
+if (running) {
+env->tsc_valid = false;
+}
+}
+
 int kvm_arch_init_vcpu(CPUState *env)
 {
 struct {
@@ -434,6 +443,8 @@ int kvm_arch_init_vcpu(CPUState *env)
 }
 #endif
 
+qemu_add_vm_change_state_handler(cpu_update_state, env);
+
 return kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
 }
 
@@ -1061,7 +1072,12 @@ static int kvm_get_msrs(CPUState *env)
 if (has_msr_hsave_pa) {
 msrs[n++].index = MSR_VM_HSAVE_PA;
 }
-msrs[n++].index = MSR_IA32_TSC;
+
+if (!env->tsc_valid) {
+msrs[n++].index = MSR_IA32_TSC;
+env->tsc_valid = !vm_running;
+}
+
 #ifdef TARGET_X86_64
 if (lm_capable_kernel) {
 msrs[n++].index = MSR_CSTAR;
-- 
1.7.2.3

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Jan Kiszka

On 2011-02-03 20:11, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 7:06 PM, Jan Kiszka  wrote:
>> On 2011-02-03 20:01, Blue Swirl wrote:
>>> On Thu, Feb 3, 2011 at 6:01 PM, Jan Kiszka  wrote:
 On 2011-02-03 18:54, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  wrote:
>> On 2011-02-03 18:36, Blue Swirl wrote:
>>> On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  
>>> wrote:
 On 2011-02-03 18:03, Blue Swirl wrote:
> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  
> wrote:
>> The registers of real IOAPICs can be relocated during runtime (via
>> chipset registers). We don't support this yet, but qemu-kvm carries 
>> the
>> current base address in its version 2 vmstate.
>>
>> To align both implementations for migratability, add the proper
>> infrastructure to accept initial as well as updated base addresses 
>> and
>> include the current address in the vmstate. This is done in a way 
>> that
>> will also allow multiple IOAPICs in the future.
>
> Nack, the addresses should be device properties.

 Hmm we could make default_base_address a property. Will change 
 that.
 But current_base_address is just the same as apicbase and can't be a
 property.
>>>
>>> Oh, right. What will current_base_address used for? Why can't board
>>> just unmap IOAPIC from current address and remap it at the new
>>> address? Then the device would not need to know its base address.
>>
>> The board could do this. The question is where we put this service, in
>> the context if the IOAPIC as ioapic_set_base_address (compare to
>> cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
>> each and every board code. In the latter case, the boards would also be
>> responsible for saving/restoring the address.
>
> How is the device relocated? Where are the chipset registers you mention?

 Intel's PIIX chipsets contain a register called APICBASE (but it means
 the IOAPIC), and that defines the location. The analogy in the APIC
 world is the MSR_IA32_APICBASE which we maintain via the APIC state.
>>>
>>> In ICH10 the register is called OIC—Other Interrupt Control Register
>>> and the interesting bits APIC Range Select (ASEL).
>>>
>>> So actually PIIX should manage IOAPIC mapping, not board level.
>>
>> The point is we need ioapic_set_base_address logic in multiple places
>> (once chipsets start to implement it). Better push it to a central place
>> from the beginning. Also the bit keeping. There is no difference to
>> apicbase.
> 
> In that case, the function should be made inline version in ioapic.h.

That still replicates the bit keeping.

I don't see the benefit of moving it over, even less when we want to
consolidate with a vmstate layout that is already in use.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[Qemu-devel] [PATCH 1/4] target-arm: Add CPU feature flag for v7MP

2011-02-03 Thread Peter Maydell

Add a CPU feature flag for v7MP (the multiprocessing extensions); some
instructions exist only for v7MP and not for the base v7 architecture.

Signed-off-by: Peter Maydell 
---
 target-arm/cpu.h|3 ++-
 target-arm/helper.c |6 ++
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 5bcd53a..0d96325 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -362,7 +362,8 @@ enum arm_features {
 ARM_FEATURE_DIV,
 ARM_FEATURE_M, /* Microcontroller profile.  */
 ARM_FEATURE_OMAPCP, /* OMAP specific CP15 ops handling.  */
-ARM_FEATURE_THUMB2EE
+ARM_FEATURE_THUMB2EE,
+ARM_FEATURE_V7MP/* v7 Multiprocessing Extensions */
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/helper.c b/target-arm/helper.c
index b562767..3cf9181 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -123,6 +123,11 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 set_feature(env, ARM_FEATURE_VFP_FP16);
 set_feature(env, ARM_FEATURE_NEON);
 set_feature(env, ARM_FEATURE_THUMB2EE);
+/* Note that A9 supports the MP extensions even for
+ * A9UP and single-core A9MP (which are both different
+ * and valid configurations; we don't model A9UP).
+ */
+set_feature(env, ARM_FEATURE_V7MP);
 env->vfp.xregs[ARM_VFP_FPSID] = 0x41034000; /* Guess */
 env->vfp.xregs[ARM_VFP_MVFR0] = 0x0222;
 env->vfp.xregs[ARM_VFP_MVFR1] = 0x0111;
@@ -152,6 +157,7 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t 
id)
 set_feature(env, ARM_FEATURE_NEON);
 set_feature(env, ARM_FEATURE_THUMB2EE);
 set_feature(env, ARM_FEATURE_DIV);
+set_feature(env, ARM_FEATURE_V7MP);
 break;
 case ARM_CPUID_TI915T:
 case ARM_CPUID_TI925T:
-- 
1.7.1

Re: [Qemu-devel] [0.14?][PATCH 3/4] ioapic: Prepare for base address relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 7:25 PM, Jan Kiszka  wrote:
> On 2011-02-03 20:11, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 7:06 PM, Jan Kiszka  wrote:
>>> On 2011-02-03 20:01, Blue Swirl wrote:
 On Thu, Feb 3, 2011 at 6:01 PM, Jan Kiszka  wrote:
> On 2011-02-03 18:54, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 5:43 PM, Jan Kiszka  
>> wrote:
>>> On 2011-02-03 18:36, Blue Swirl wrote:
 On Thu, Feb 3, 2011 at 5:18 PM, Jan Kiszka  
 wrote:
> On 2011-02-03 18:03, Blue Swirl wrote:
>> On Thu, Feb 3, 2011 at 2:55 PM, Jan Kiszka  
>> wrote:
>>> The registers of real IOAPICs can be relocated during runtime (via
>>> chipset registers). We don't support this yet, but qemu-kvm carries 
>>> the
>>> current base address in its version 2 vmstate.
>>>
>>> To align both implementations for migratability, add the proper
>>> infrastructure to accept initial as well as updated base addresses 
>>> and
>>> include the current address in the vmstate. This is done in a way 
>>> that
>>> will also allow multiple IOAPICs in the future.
>>
>> Nack, the addresses should be device properties.
>
> Hmm we could make default_base_address a property. Will change 
> that.
> But current_base_address is just the same as apicbase and can't be a
> property.

 Oh, right. What will current_base_address used for? Why can't board
 just unmap IOAPIC from current address and remap it at the new
 address? Then the device would not need to know its base address.
>>>
>>> The board could do this. The question is where we put this service, in
>>> the context if the IOAPIC as ioapic_set_base_address (compare to
>>> cpu_set_apic_base - which is buggy as it lacks sysbus_mmio_map) or into
>>> each and every board code. In the latter case, the boards would also be
>>> responsible for saving/restoring the address.
>>
>> How is the device relocated? Where are the chipset registers you mention?
>
> Intel's PIIX chipsets contain a register called APICBASE (but it means
> the IOAPIC), and that defines the location. The analogy in the APIC
> world is the MSR_IA32_APICBASE which we maintain via the APIC state.

 In ICH10 the register is called OIC—Other Interrupt Control Register
 and the interesting bits APIC Range Select (ASEL).

 So actually PIIX should manage IOAPIC mapping, not board level.
>>>
>>> The point is we need ioapic_set_base_address logic in multiple places
>>> (once chipsets start to implement it). Better push it to a central place
>>> from the beginning. Also the bit keeping. There is no difference to
>>> apicbase.
>>
>> In that case, the function should be made inline version in ioapic.h.
>
> That still replicates the bit keeping.
>
> I don't see the benefit of moving it over, even less when we want to
> consolidate with a vmstate layout that is already in use.

The benefit is that the device model is improved.

[Qemu-devel] Re: [RFC][PATCH] apic: Fix relocation

2011-02-03 Thread Blue Swirl

On Thu, Feb 3, 2011 at 7:32 PM, Jan Kiszka  wrote:
> When the guest remaps an APIC by modifying MSR_IA32_APICBASE, we need to
> update its mmio mapping. This is a bit tricky as multiple APICs might be
> mapped to the same address. So walk through the full list to avoid
> unmapping a region that is still in use.
>
> Signed-off-by: Jan Kiszka 
> ---
>
> RFC as I did not yet have a chance to test actual relocation. Standard
> OSes don't do this, otherwise we would have noticed this earlier.
>
>  hw/apic.c |   38 +-
>  hw/pc.c   |   10 --
>  2 files changed, 37 insertions(+), 11 deletions(-)
>
> diff --git a/hw/apic.c b/hw/apic.c
> index 05a115f..b64af59 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -294,6 +294,40 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
>                      trigger_mode);
>  }
>
> +static void apic_update_mapping(APICState *s)
> +{
> +    target_phys_addr_t new_addr;
> +    bool overlap = false;
> +    APICState *apic_iter;
> +    int i;
> +
> +    for (i = 0; i < MAX_APICS; i++) {
> +        apic_iter = local_apics[i];
> +        if (!apic_iter || apic_iter == s) {
> +            continue;
> +        }
> +        if ((apic_iter->apicbase & MSR_IA32_APICBASE_BASE) ==
> +            s->busdev.mmio[0].addr) {
> +            overlap = true;
> +            break;
> +        }
> +    }
> +    if (overlap) {
> +        /*
> +         * As APICs are pre-CPU devices, they may have identical base
> +         * addresses. We must avoid unregistering an old io-region that is
> +         * still in use by another APIC.
> +         */
> +        s->busdev.mmio[0].addr = (target_phys_addr_t)-1;
> +    }
> +    if (s->apicbase & MSR_IA32_APICBASE_ENABLE) {
> +        new_addr = s->apicbase & MSR_IA32_APICBASE_BASE;
> +    } else {
> +        new_addr = (target_phys_addr_t)-1;
> +    }
> +    sysbus_mmio_map(&s->busdev, 0, new_addr);
> +}
> +
>  void cpu_set_apic_base(DeviceState *d, uint64_t val)
>  {
>     APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> @@ -302,7 +336,7 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
>
>     if (!s)
>         return;
> -    s->apicbase = (val & 0xf000) |
> +    s->apicbase = (val & MSR_IA32_APICBASE_BASE) |
>         (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
>     /* if disabled, cannot be enabled again */
>     if (!(val & MSR_IA32_APICBASE_ENABLE)) {
> @@ -310,6 +344,7 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
>         cpu_clear_apic_feature(s->cpu_env);
>         s->spurious_vec &= ~APIC_SV_ENABLE;
>     }
> +    apic_update_mapping(s);
>  }
>
>  uint64_t cpu_get_apic_base(DeviceState *d)
> @@ -948,6 +983,7 @@ static void apic_reset(DeviceState *d)
>     bsp = cpu_is_bsp(s->cpu_env);
>     s->apicbase = 0xfee0 |
>         (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
> +    apic_update_mapping(s);

Here the device maps itself at reset, which is not OK.

>
>     apic_init_reset(d);
>
> diff --git a/hw/pc.c b/hw/pc.c
> index 4dfdc0b..294aa66 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -859,7 +859,6 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
>  {
>     DeviceState *dev;
>     SysBusDevice *d;
> -    static int apic_mapped;
>
>     dev = qdev_create(NULL, "apic");
>     qdev_prop_set_uint8(dev, "id", apic_id);
> @@ -867,15 +866,6 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
>     qdev_init_nofail(dev);
>     d = sysbus_from_qdev(dev);
>
> -    /* XXX: mapping more APICs at the same memory location */
> -    if (apic_mapped == 0) {
> -        /* NOTE: the APIC is directly connected to the CPU - it is not
> -           on the global memory bus. */
> -        /* XXX: what if the base changes? */
> -        sysbus_mmio_map(d, 0, MSI_ADDR_BASE);
> -        apic_mapped = 1;

TBH, this is not so cool either. Maybe APIC should not be a Sysbus
device at all.

[Qemu-devel] [RFC][PATCH] apic: Fix relocation

2011-02-03 Thread Jan Kiszka

When the guest remaps an APIC by modifying MSR_IA32_APICBASE, we need to
update its mmio mapping. This is a bit tricky as multiple APICs might be
mapped to the same address. So walk through the full list to avoid
unmapping a region that is still in use.

Signed-off-by: Jan Kiszka 
---

RFC as I did not yet have a chance to test actual relocation. Standard
OSes don't do this, otherwise we would have noticed this earlier.

 hw/apic.c |   38 +-
 hw/pc.c   |   10 --
 2 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 05a115f..b64af59 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -294,6 +294,40 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
  trigger_mode);
 }
 
+static void apic_update_mapping(APICState *s)
+{
+target_phys_addr_t new_addr;
+bool overlap = false;
+APICState *apic_iter;
+int i;
+
+for (i = 0; i < MAX_APICS; i++) {
+apic_iter = local_apics[i];
+if (!apic_iter || apic_iter == s) {
+continue;
+}
+if ((apic_iter->apicbase & MSR_IA32_APICBASE_BASE) ==
+s->busdev.mmio[0].addr) {
+overlap = true;
+break;
+}
+}
+if (overlap) {
+/*
+ * As APICs are pre-CPU devices, they may have identical base
+ * addresses. We must avoid unregistering an old io-region that is
+ * still in use by another APIC.
+ */
+s->busdev.mmio[0].addr = (target_phys_addr_t)-1;
+}
+if (s->apicbase & MSR_IA32_APICBASE_ENABLE) {
+new_addr = s->apicbase & MSR_IA32_APICBASE_BASE;
+} else {
+new_addr = (target_phys_addr_t)-1;
+}
+sysbus_mmio_map(&s->busdev, 0, new_addr);
+}
+
 void cpu_set_apic_base(DeviceState *d, uint64_t val)
 {
 APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
@@ -302,7 +336,7 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
 
 if (!s)
 return;
-s->apicbase = (val & 0xf000) |
+s->apicbase = (val & MSR_IA32_APICBASE_BASE) |
 (s->apicbase & (MSR_IA32_APICBASE_BSP | MSR_IA32_APICBASE_ENABLE));
 /* if disabled, cannot be enabled again */
 if (!(val & MSR_IA32_APICBASE_ENABLE)) {
@@ -310,6 +344,7 @@ void cpu_set_apic_base(DeviceState *d, uint64_t val)
 cpu_clear_apic_feature(s->cpu_env);
 s->spurious_vec &= ~APIC_SV_ENABLE;
 }
+apic_update_mapping(s);
 }
 
 uint64_t cpu_get_apic_base(DeviceState *d)
@@ -948,6 +983,7 @@ static void apic_reset(DeviceState *d)
 bsp = cpu_is_bsp(s->cpu_env);
 s->apicbase = 0xfee0 |
 (bsp ? MSR_IA32_APICBASE_BSP : 0) | MSR_IA32_APICBASE_ENABLE;
+apic_update_mapping(s);
 
 apic_init_reset(d);
 
diff --git a/hw/pc.c b/hw/pc.c
index 4dfdc0b..294aa66 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -859,7 +859,6 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 {
 DeviceState *dev;
 SysBusDevice *d;
-static int apic_mapped;
 
 dev = qdev_create(NULL, "apic");
 qdev_prop_set_uint8(dev, "id", apic_id);
@@ -867,15 +866,6 @@ static DeviceState *apic_init(void *env, uint8_t apic_id)
 qdev_init_nofail(dev);
 d = sysbus_from_qdev(dev);
 
-/* XXX: mapping more APICs at the same memory location */
-if (apic_mapped == 0) {
-/* NOTE: the APIC is directly connected to the CPU - it is not
-   on the global memory bus. */
-/* XXX: what if the base changes? */
-sysbus_mmio_map(d, 0, MSI_ADDR_BASE);
-apic_mapped = 1;
-}
-
 msix_supported = 1;
 
 return dev;
-- 
1.7.1

[Qemu-devel] [PATCH 0/4] target-arm: Fix decoding of preload and hint insns

2011-02-03 Thread Peter Maydell

The primary aim of this patch series is to fix the decoding of the
preload and hint instruction space (PLD, PLDW, PLI). Some of these
instructions (PLDW and some unallocated space which should NOP) are
v7MP only, so we introduce a feature flag for cores with the v7MP
extensions.

The flag also lets us clean up the MPIDR coprocessor register
implementation so it isn't A9 specific any more.

Tested as usual with random instruction sequences.

Peter Maydell (4):
  target-arm: Add CPU feature flag for v7MP
  target-arm: Clean up handling of MPIDR
  target-arm: Fix decoding of preload and memory hint space
  target-arm: Fix decoding of Thumb preload and hint space

 target-arm/cpu.h   |3 +-
 target-arm/helper.c|   32 +--
 target-arm/translate.c |   98 +++
 3 files changed, 102 insertions(+), 31 deletions(-)

[Qemu-devel] [PATCH 2/4] target-arm: Clean up handling of MPIDR

2011-02-03 Thread Peter Maydell

The ARM cp15 register 0,c0,c0,5 is standardised in the v7 architecture
as the MPIDR. Clean up its implementation to remove A9 specific handling.

This commit includes fixing an error in the value returned for the
MPIDR on A9, where we were erroneously claiming a cluster ID of 9.

Signed-off-by: Peter Maydell 
---
 target-arm/helper.c |   26 +-
 1 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 3cf9181..d46defc 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -1608,12 +1608,28 @@ uint32_t HELPER(get_cp15)(CPUState *env, uint32_t insn)
 return 0;
 case 3: /* TLB type register.  */
 return 0; /* No lockable TLB entries.  */
-case 5: /* CPU ID */
-if (ARM_CPUID(env) == ARM_CPUID_CORTEXA9) {
-return env->cpu_index | 0x8900;
-} else {
-return env->cpu_index;
+case 5: /* MPIDR */
+/* The MPIDR was standardised in v7; prior to
+ * this it was implemented only in the 11MPCore.
+ * For all other pre-v7 cores it does not exist.
+ */
+if (arm_feature(env, ARM_FEATURE_V7) ||
+ARM_CPUID(env) == ARM_CPUID_ARM11MPCORE) {
+int mpidr = env->cpu_index;
+/* We don't support setting cluster ID ([8..11])
+ * so these bits always RAZ.
+ */
+if (arm_feature(env, ARM_FEATURE_V7MP)) {
+mpidr |= (1 << 31);
+/* Cores which are uniprocessor (non-coherent)
+ * but still implement the MP extensions set
+ * bit 30. (For instance, A9UP.) However we do
+ * not currently model any of those cores.
+ */
+}
+return mpidr;
 }
+/* otherwise fall through to the unimplemented-reg case */
 default:
 goto bad_reg;
 }
-- 
1.7.1

1 2 >

1 - 100 of 181 matches

Mail list logo