Re: [Qemu-devel] [PATCH] tests/check-qjson: fix a leak
On 2018-09-03 07:51, Markus Armbruster wrote: > Thomas Huth writes: > >> On 2018-09-01 23:19, Marc-André Lureau wrote: >>> Spotted by ASAN: >>> = >>> ==11893==ERROR: LeakSanitizer: detected memory leaks >>> >>> Direct leak of 1120 byte(s) in 28 object(s) allocated from: >>> #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48) >>> #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) >>> #2 0x559e708b56a4 in qstring_from_str >>> /home/elmarco/src/qq/qobject/qstring.c:66 >>> #3 0x559e708b4fe0 in qstring_new >>> /home/elmarco/src/qq/qobject/qstring.c:23 >>> #4 0x559e708bda7d in parse_string >>> /home/elmarco/src/qq/qobject/json-parser.c:143 >>> #5 0x559e708c1009 in parse_literal >>> /home/elmarco/src/qq/qobject/json-parser.c:484 >>> #6 0x559e708c1627 in parse_value >>> /home/elmarco/src/qq/qobject/json-parser.c:547 >>> #7 0x559e708c1c67 in json_parser_parse >>> /home/elmarco/src/qq/qobject/json-parser.c:573 >>> #8 0x559e708bc0ff in json_message_process_token >>> /home/elmarco/src/qq/qobject/json-streamer.c:92 >>> #9 0x559e708d1655 in json_lexer_feed_char >>> /home/elmarco/src/qq/qobject/json-lexer.c:292 >>> #10 0x559e708d1fe1 in json_lexer_feed >>> /home/elmarco/src/qq/qobject/json-lexer.c:339 >>> #11 0x559e708bc856 in json_message_parser_feed >>> /home/elmarco/src/qq/qobject/json-streamer.c:121 >>> #12 0x559e708b8b4b in qobject_from_jsonv >>> /home/elmarco/src/qq/qobject/qjson.c:69 >>> #13 0x559e708b8d02 in qobject_from_json >>> /home/elmarco/src/qq/qobject/qjson.c:83 >>> #14 0x559e708a74ae in from_json_str >>> /home/elmarco/src/qq/tests/check-qjson.c:30 >>> #15 0x559e708a9f83 in utf8_string >>> /home/elmarco/src/qq/tests/check-qjson.c:781 >>> #16 0x7fd05101bc49 in test_case_run gtestutils.c:2255 >>> #17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339 >>> >>> Signed-off-by: Marc-André Lureau >>> --- >>> tests/check-qjson.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/tests/check-qjson.c b/tests/check-qjson.c >>> index cc13f3d41e..d876a7a96e 100644 >>> --- a/tests/check-qjson.c >>> +++ b/tests/check-qjson.c >>> @@ -780,6 +780,7 @@ static void utf8_string(void) >>> if (!strstr(json_out, "\\uFFFD")) { >>> str = from_json_str(json_out, j, _abort); >>> g_assert_cmpstr(qstring_get_try_str(str), ==, utf8_in); >>> +qobject_unref(str); >>> } >>> } >>> } >>> >> >> There are other occurances of from_json_str() which are not followed by >> an object_unref() ... do they need to be fixed as well? >> (e.g. at the end of escaped_string(), or earlier in utf8_string()) > > These are all asserted to be null, aren't they? Ah, right, sorry, it's still too early in the morning here, I guess ;-) Thomas
Re: [Qemu-devel] [PATCH] tests/check-qjson: fix a leak
Thomas Huth writes: > On 2018-09-01 23:19, Marc-André Lureau wrote: >> Spotted by ASAN: >> = >> ==11893==ERROR: LeakSanitizer: detected memory leaks >> >> Direct leak of 1120 byte(s) in 28 object(s) allocated from: >> #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48) >> #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) >> #2 0x559e708b56a4 in qstring_from_str >> /home/elmarco/src/qq/qobject/qstring.c:66 >> #3 0x559e708b4fe0 in qstring_new >> /home/elmarco/src/qq/qobject/qstring.c:23 >> #4 0x559e708bda7d in parse_string >> /home/elmarco/src/qq/qobject/json-parser.c:143 >> #5 0x559e708c1009 in parse_literal >> /home/elmarco/src/qq/qobject/json-parser.c:484 >> #6 0x559e708c1627 in parse_value >> /home/elmarco/src/qq/qobject/json-parser.c:547 >> #7 0x559e708c1c67 in json_parser_parse >> /home/elmarco/src/qq/qobject/json-parser.c:573 >> #8 0x559e708bc0ff in json_message_process_token >> /home/elmarco/src/qq/qobject/json-streamer.c:92 >> #9 0x559e708d1655 in json_lexer_feed_char >> /home/elmarco/src/qq/qobject/json-lexer.c:292 >> #10 0x559e708d1fe1 in json_lexer_feed >> /home/elmarco/src/qq/qobject/json-lexer.c:339 >> #11 0x559e708bc856 in json_message_parser_feed >> /home/elmarco/src/qq/qobject/json-streamer.c:121 >> #12 0x559e708b8b4b in qobject_from_jsonv >> /home/elmarco/src/qq/qobject/qjson.c:69 >> #13 0x559e708b8d02 in qobject_from_json >> /home/elmarco/src/qq/qobject/qjson.c:83 >> #14 0x559e708a74ae in from_json_str >> /home/elmarco/src/qq/tests/check-qjson.c:30 >> #15 0x559e708a9f83 in utf8_string >> /home/elmarco/src/qq/tests/check-qjson.c:781 >> #16 0x7fd05101bc49 in test_case_run gtestutils.c:2255 >> #17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339 >> >> Signed-off-by: Marc-André Lureau >> --- >> tests/check-qjson.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/tests/check-qjson.c b/tests/check-qjson.c >> index cc13f3d41e..d876a7a96e 100644 >> --- a/tests/check-qjson.c >> +++ b/tests/check-qjson.c >> @@ -780,6 +780,7 @@ static void utf8_string(void) >> if (!strstr(json_out, "\\uFFFD")) { >> str = from_json_str(json_out, j, _abort); >> g_assert_cmpstr(qstring_get_try_str(str), ==, utf8_in); >> +qobject_unref(str); >> } >> } >> } >> > > There are other occurances of from_json_str() which are not followed by > an object_unref() ... do they need to be fixed as well? > (e.g. at the end of escaped_string(), or earlier in utf8_string()) These are all asserted to be null, aren't they?
Re: [Qemu-devel] [PATCH] tests/check-qjson: fix a leak
Marc-André Lureau writes: > Spotted by ASAN: > = > ==11893==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 1120 byte(s) in 28 object(s) allocated from: > #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48) > #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) > #2 0x559e708b56a4 in qstring_from_str > /home/elmarco/src/qq/qobject/qstring.c:66 > #3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23 > #4 0x559e708bda7d in parse_string > /home/elmarco/src/qq/qobject/json-parser.c:143 > #5 0x559e708c1009 in parse_literal > /home/elmarco/src/qq/qobject/json-parser.c:484 > #6 0x559e708c1627 in parse_value > /home/elmarco/src/qq/qobject/json-parser.c:547 > #7 0x559e708c1c67 in json_parser_parse > /home/elmarco/src/qq/qobject/json-parser.c:573 > #8 0x559e708bc0ff in json_message_process_token > /home/elmarco/src/qq/qobject/json-streamer.c:92 > #9 0x559e708d1655 in json_lexer_feed_char > /home/elmarco/src/qq/qobject/json-lexer.c:292 > #10 0x559e708d1fe1 in json_lexer_feed > /home/elmarco/src/qq/qobject/json-lexer.c:339 > #11 0x559e708bc856 in json_message_parser_feed > /home/elmarco/src/qq/qobject/json-streamer.c:121 > #12 0x559e708b8b4b in qobject_from_jsonv > /home/elmarco/src/qq/qobject/qjson.c:69 > #13 0x559e708b8d02 in qobject_from_json > /home/elmarco/src/qq/qobject/qjson.c:83 > #14 0x559e708a74ae in from_json_str > /home/elmarco/src/qq/tests/check-qjson.c:30 > #15 0x559e708a9f83 in utf8_string > /home/elmarco/src/qq/tests/check-qjson.c:781 > #16 0x7fd05101bc49 in test_case_run gtestutils.c:2255 > #17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339 > > Signed-off-by: Marc-André Lureau > --- > tests/check-qjson.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/tests/check-qjson.c b/tests/check-qjson.c > index cc13f3d41e..d876a7a96e 100644 > --- a/tests/check-qjson.c > +++ b/tests/check-qjson.c > @@ -780,6 +780,7 @@ static void utf8_string(void) > if (!strstr(json_out, "\\uFFFD")) { > str = from_json_str(json_out, j, _abort); > g_assert_cmpstr(qstring_get_try_str(str), ==, utf8_in); > +qobject_unref(str); > } > } > } Reviewed-by: Markus Armbruster Queued, thanks!
Re: [Qemu-devel] [PATCH v7 0/7] monitor: enable OOB by default
Peter Xu writes: > (this series is based on Markus's monitor-next tree so if patchew > spits something out with "apply failure" then it's expected) Easy to avoid with suitable Based: tags in the cover letter: Based-on: <20180828191048.29806-1-arm...@redhat.com> Based-on: <2018090716.1675-1-arm...@redhat.com> The first one gets you "[PULL 0/6] QAPI patches for 2018-08-28", which includes your PATCH 1+2. The second one gets you "[PULL 0/6] Monitor patches for 2018-09-01", which is monitor-next. I pushed the result for reviewers' convenience: http://repo.or.cz/qemu/armbru.git/shortlog/refs/heads/peterx-oob > > v7: > - use Markus's commit message for patch "qapi: Drop > qapi_event_send_FOO()'s Error ** argument" [Markus] > - update commit message for "qapi: remove COMMAND_DROPPED event" since > QEMU 3.0 is released [Eric/Dave] > - rebase to Markus's monitor-next tree: > http://repo.or.cz/qemu/armbru.git/shortlog/refs/heads/monitor-next > the patch "monitor: suspend monitor instead of send CMD_DROP" > re-written since people prefer to drop need_resume flag so now I > hand-made it. Dropped a few patches since not appliable any more. > > Please review. Thanks, > > Markus Armbruster (1): > qapi: Fix build_params() for empty parameter list > > Peter Xu (6): > qapi: Drop qapi_event_send_FOO()'s Error ** argument > monitor: suspend monitor instead of send CMD_DROP > qapi: remove COMMAND_DROPPED event > monitor: remove "x-oob", turn oob on by default > Revert "tests: Add parameter to qtest_init_without_qmp_handshake" > tests: add oob functional test for test-qmp-cmds > > block/block-backend.c| 8 ++--- > block/qcow2.c| 2 +- > block/quorum.c | 4 +-- > block/write-threshold.c | 3 +- > blockjob.c | 13 > cpus.c | 8 ++--- > docs/devel/qapi-code-gen.txt | 6 ++-- > docs/interop/qmp-spec.txt| 5 ++-- > dump.c | 3 +- > hw/acpi/core.c | 2 +- > hw/acpi/cpu.c| 2 +- > hw/acpi/memory_hotplug.c | 5 ++-- > hw/char/virtio-console.c | 3 +- > hw/core/qdev.c | 3 +- > hw/net/virtio-net.c | 2 +- > hw/ppc/spapr_rtc.c | 2 +- > hw/timer/mc146818rtc.c | 2 +- > hw/virtio/virtio-balloon.c | 3 +- > hw/watchdog/watchdog.c | 15 +- > include/monitor/monitor.h| 1 - > include/qapi/qmp-event.h | 3 +- > job.c| 2 +- > migration/migration.c| 4 +-- > migration/ram.c | 2 +- > monitor.c| 58 ++-- > qapi/misc.json | 40 - > scripts/qapi/common.py | 10 +++ > scripts/qapi/events.py | 23 -- > scsi/pr-manager-helper.c | 3 +- > tests/libqtest.c | 10 +++ > tests/libqtest.h | 4 +-- > tests/qmp-test.c | 6 ++-- > tests/test-qmp-cmds.c| 16 ++ > tests/test-qmp-event.c | 11 --- > ui/spice-core.c | 10 +++ > ui/vnc.c | 7 ++--- > vl.c | 21 + > 37 files changed, 120 insertions(+), 202 deletions(-) Diffstat looks friendlier without PATCH 1+2: docs/interop/qmp-spec.txt | 5 +++-- include/monitor/monitor.h | 1 - monitor.c | 55 ++- qapi/misc.json| 40 -- tests/libqtest.c | 10 - tests/libqtest.h | 4 +--- tests/qmp-test.c | 6 +++--- tests/test-qmp-cmds.c | 16 ++ vl.c | 5 - 9 files changed, 48 insertions(+), 94 deletions(-)
Re: [Qemu-devel] [PATCH v4 10/10] qmp: common 'id' handling & make QGA conform to QMP spec
Marc-André Lureau writes: > On Sat, Sep 1, 2018 at 12:59 PM, Markus Armbruster wrote: >> Marc-André Lureau writes: >> >>> Let qmp_dispatch() copy the 'id' field. That way any qmp client will >>> conform to the specification, including QGA. Furthermore, it >>> simplifies the work for qemu monitor. >>> >>> Signed-off-by: Marc-André Lureau >>> Reviewed-by: Markus Armbruster >>> Reviewed-by: Michael Roth >>> --- >>> monitor.c | 33 - >>> qapi/qmp-dispatch.c | 10 -- >>> tests/test-qga.c| 13 + >>> 3 files changed, 25 insertions(+), 31 deletions(-) >> >> Let's squash in >> >> diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt >> index 8f7da0245d..5fdfb2a4d0 100644 >> --- a/docs/interop/qmp-spec.txt >> +++ b/docs/interop/qmp-spec.txt >> @@ -109,6 +109,7 @@ or >>command execution, it is optional and will be part of the response >>if provided. The "id" member can be any json-value. A json-number >>incremented for each successive command works fine. >> +- Older versions of the QEMU Guest agent do not support "id". > > Good idea. Are you taking and updating the patch, or Michael? I left the QGA stuff out of my "[PULL 0/6] Monitor patches for 2018-09-01". If it's still pending when I do my next pull request for QMP stuff, I can include it. Of course, Michael is welcome to take it earlier.
Re: [Qemu-devel] [PATCH] tests/check-qjson: fix a leak
On 2018-09-01 23:19, Marc-André Lureau wrote: > Spotted by ASAN: > = > ==11893==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 1120 byte(s) in 28 object(s) allocated from: > #0 0x7fd0515b0c48 in malloc (/lib64/libasan.so.5+0xeec48) > #1 0x7fd050ffa3c5 in g_malloc (/lib64/libglib-2.0.so.0+0x523c5) > #2 0x559e708b56a4 in qstring_from_str > /home/elmarco/src/qq/qobject/qstring.c:66 > #3 0x559e708b4fe0 in qstring_new /home/elmarco/src/qq/qobject/qstring.c:23 > #4 0x559e708bda7d in parse_string > /home/elmarco/src/qq/qobject/json-parser.c:143 > #5 0x559e708c1009 in parse_literal > /home/elmarco/src/qq/qobject/json-parser.c:484 > #6 0x559e708c1627 in parse_value > /home/elmarco/src/qq/qobject/json-parser.c:547 > #7 0x559e708c1c67 in json_parser_parse > /home/elmarco/src/qq/qobject/json-parser.c:573 > #8 0x559e708bc0ff in json_message_process_token > /home/elmarco/src/qq/qobject/json-streamer.c:92 > #9 0x559e708d1655 in json_lexer_feed_char > /home/elmarco/src/qq/qobject/json-lexer.c:292 > #10 0x559e708d1fe1 in json_lexer_feed > /home/elmarco/src/qq/qobject/json-lexer.c:339 > #11 0x559e708bc856 in json_message_parser_feed > /home/elmarco/src/qq/qobject/json-streamer.c:121 > #12 0x559e708b8b4b in qobject_from_jsonv > /home/elmarco/src/qq/qobject/qjson.c:69 > #13 0x559e708b8d02 in qobject_from_json > /home/elmarco/src/qq/qobject/qjson.c:83 > #14 0x559e708a74ae in from_json_str > /home/elmarco/src/qq/tests/check-qjson.c:30 > #15 0x559e708a9f83 in utf8_string > /home/elmarco/src/qq/tests/check-qjson.c:781 > #16 0x7fd05101bc49 in test_case_run gtestutils.c:2255 > #17 0x7fd05101bc49 in g_test_run_suite_internal gtestutils.c:2339 > > Signed-off-by: Marc-André Lureau > --- > tests/check-qjson.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/tests/check-qjson.c b/tests/check-qjson.c > index cc13f3d41e..d876a7a96e 100644 > --- a/tests/check-qjson.c > +++ b/tests/check-qjson.c > @@ -780,6 +780,7 @@ static void utf8_string(void) > if (!strstr(json_out, "\\uFFFD")) { > str = from_json_str(json_out, j, _abort); > g_assert_cmpstr(qstring_get_try_str(str), ==, utf8_in); > +qobject_unref(str); > } > } > } > There are other occurances of from_json_str() which are not followed by an object_unref() ... do they need to be fixed as well? (e.g. at the end of escaped_string(), or earlier in utf8_string()) Thomas
[Qemu-devel] [Bug 1789751] Re: Using -overcommit flag leads to qemu crashing
It seems to be fixed already: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=1fdd4748711a62d863744 ** Changed in: qemu Status: New => Fix Committed -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1789751 Title: Using -overcommit flag leads to qemu crashing Status in QEMU: Fix Committed Bug description: Running the following leads to a qemu crash on startup: jwhite@laptop:~/os$ qemu-system-i386 -overcommit cpu-pm=on qemu-system-i386: -overcommit cpu-pm=on: There is no option group 'overcommit' Segmentation fault (core dumped) jwhite@laptop:~/os$ This fixes the issue: --- ../tmp/qemu-3.0.0/vl.c2018-08-14 12:10:35.0 -0700 +++ vl.c 2018-08-29 14:59:30.151554120 -0700 @@ -2987,6 +2987,7 @@ qemu_add_opts(_object_opts); qemu_add_opts(_tpmdev_opts); qemu_add_opts(_realtime_opts); +qemu_add_opts(_overcommit_opts); qemu_add_opts(_msg_opts); qemu_add_opts(_name_opts); qemu_add_opts(_numa_opts); To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1789751/+subscriptions
Re: [Qemu-devel] [PULL 8/9] tests: add qmp/qom-set-without-value test
On 2018-08-31 16:35, Markus Armbruster wrote: > Thomas Huth writes: > >> On 2018-08-31 15:24, Marc-André Lureau wrote: >>> Hi >>> On Fri, Aug 31, 2018 at 3:18 PM Thomas Huth wrote: On 2018-08-31 14:04, Markus Armbruster wrote: > Thomas Huth writes: > >> From: Marc-André Lureau >> >> test_qom_set_without_value() is about a bug in infrastructure used by >> the QMP core, fixed in commit c489780203. We covered the bug in >> infrastructure unit tests (commit bce3035a44). I wrote that test >> earlier, to cover QMP level as well, the test could go into qmp-test. >> >> Signed-off-by: Marc-André Lureau >> Reviewed-by: Thomas Huth >> Signed-off-by: Thomas Huth >> --- >> tests/qmp-test.c | 14 ++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/tests/qmp-test.c b/tests/qmp-test.c >> index b347228..2b923f0 100644 >> --- a/tests/qmp-test.c >> +++ b/tests/qmp-test.c >> @@ -321,6 +321,19 @@ static void test_qmp_preconfig(void) >> qtest_quit(qs); >> } >> >> +static void test_qom_set_without_value(void) >> +{ >> +QTestState *qts; >> +QDict *resp; >> + >> +qts = qtest_init(common_args); >> +resp = qtest_qmp(qts, "{'execute': 'qom-set', 'arguments':" >> + " { 'path': '/machine', 'property': 'rtc-time' } >> }"); >> +g_assert_nonnull(resp); >> +qmp_assert_error_class(resp, "GenericError"); >> +qtest_quit(qts); >> +} >> + >> int main(int argc, char *argv[]) >> { >> g_test_init(, , NULL); >> @@ -328,6 +341,7 @@ int main(int argc, char *argv[]) >> qtest_add_func("qmp/protocol", test_qmp_protocol); >> qtest_add_func("qmp/oob", test_qmp_oob); >> qtest_add_func("qmp/preconfig", test_qmp_preconfig); >> +qtest_add_func("qmp/qom-set-without-value", >> test_qom_set_without_value); >> >> return g_test_run(); >> } > > I'm afraid you missed my objection to naming: > Message-ID: <8736uvujxx@dusky.pond.sub.org> > https://lists.nongnu.org/archive/html/qemu-devel/2018-08/msg06368.html Sorry about that, I was not on CC: for that series. I used the patches from v5 where Marc-André put me on CC:. > If you could work that into PULL v2, I'd be obliged. If not, I'll have > to address it in a follow-up patch. IMHO the naming is not that bad ... OTOH, I think Peter might already be away? ... so we've got plenty of time to sort this out anyway. Marc-André, could you send a new version of the patch? >>> >>> Tbh, I don't care so much about the naming of the test, but (for once) >>> I respectfully don't think Markus suggestion is better. >>> >>> The function checks "qom-set" without 'value' argument: >>> "qom-set-without-value", no brainer.. > > Nope, that's not what it tests. It tests the visitor, the marshalling > code generator, and the QMP core handle a certain kind of invalid > arguments correctly. It does not test qom-set. I explained all that > already. > >>> Naming it "invalid-arg" is so generic that I wouldn't be able what it does. > > I can accept "missing-any" or "missing-any-arg". I object to any name > involving qom-set, because the test is not about qom-set at all. > > If it was, then putting it in qmp-test.c would be wrong. > >> Ok, then let's keep it this way. As I said, IMHO the current naming is >> not really bad, and I also don't have any suggestions for a perfect name >> right now. > > We don't need a perfect name. We need one that's not actively > misleading. Ok, then let's cancel this PULL request. I'll send a new one after the "vacation freeze" (i.e. in three weeks), that should be enough time for both of you to come to an agreement about the best naming here. Thomas
[Qemu-devel] [PATCH V12 17/19] COLO: notify net filters about checkpoint/failover event
From: zhanghailiang Notify all net filters about the checkpoint and failover event. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 25d279decf..365e913e51 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -31,6 +31,7 @@ #include "qapi/qapi-events-migration.h" #include "qapi/qmp/qerror.h" #include "sysemu/cpus.h" +#include "net/filter.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -83,6 +84,12 @@ static void secondary_vm_do_failover(void) error_report_err(local_err); } +/* Notify all filters of all NIC to do checkpoint */ +colo_notify_filters_event(COLO_EVENT_FAILOVER, _err); +if (local_err) { +error_report_err(local_err); +} + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side"); /* recover runstate to normal migration finish state */ @@ -782,6 +789,14 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +/* Notify all filters of all NIC to do checkpoint */ +colo_notify_filters_event(COLO_EVENT_CHECKPOINT, _err); + +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vmstate_loading = false; vm_start(); trace_colo_vm_state_change("stop", "run"); -- 2.17.GIT
[Qemu-devel] [PATCH V12 16/19] filter-rewriter: handle checkpoint and failover event
After one round of checkpoint, the states between PVM and SVM become consistent, so it is unnecessary to adjust the sequence of net packets for old connections, besides, while failover happens, filter-rewriter will into failover mode that needn't handle the new TCP connection. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- net/colo-compare.c| 12 - net/colo.c| 8 ++ net/colo.h| 2 ++ net/filter-rewriter.c | 57 +++ 4 files changed, 73 insertions(+), 6 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 426eab5973..a39191d522 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -116,6 +116,12 @@ enum { SECONDARY_IN, }; +static void colo_compare_inconsistency_notify(void) +{ +notifier_list_notify(_compare_notifiers, +migrate_get_current()); +} + static int compare_chr_send(CompareState *s, const uint8_t *buf, uint32_t size, @@ -562,12 +568,6 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time) } } -static void colo_compare_inconsistency_notify(void) -{ -notifier_list_notify(_compare_notifiers, -migrate_get_current()); -} - void colo_compare_register_notifier(Notifier *notify) { notifier_list_add(_compare_notifiers, notify); diff --git a/net/colo.c b/net/colo.c index 97c8fc928f..49176bf07b 100644 --- a/net/colo.c +++ b/net/colo.c @@ -221,3 +221,11 @@ Connection *connection_get(GHashTable *connection_track_table, return conn; } + +bool connection_has_tracked(GHashTable *connection_track_table, +ConnectionKey *key) +{ +Connection *conn = g_hash_table_lookup(connection_track_table, key); + +return conn ? true : false; +} diff --git a/net/colo.h b/net/colo.h index 0277e0e9ba..11c5226488 100644 --- a/net/colo.h +++ b/net/colo.h @@ -98,6 +98,8 @@ void connection_destroy(void *opaque); Connection *connection_get(GHashTable *connection_track_table, ConnectionKey *key, GQueue *conn_list); +bool connection_has_tracked(GHashTable *connection_track_table, +ConnectionKey *key); void connection_hashtable_reset(GHashTable *connection_track_table); Packet *packet_new(const void *data, int size, int vnet_hdr_len); void packet_destroy(void *opaque, void *user_data); diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index f18a71bf2e..359494fc72 100644 --- a/net/filter-rewriter.c +++ b/net/filter-rewriter.c @@ -20,11 +20,15 @@ #include "qemu/main-loop.h" #include "qemu/iov.h" #include "net/checksum.h" +#include "net/colo.h" +#include "migration/colo.h" #define FILTER_COLO_REWRITER(obj) \ OBJECT_CHECK(RewriterState, (obj), TYPE_FILTER_REWRITER) #define TYPE_FILTER_REWRITER "filter-rewriter" +#define FAILOVER_MODE_ON true +#define FAILOVER_MODE_OFF false typedef struct RewriterState { NetFilterState parent_obj; @@ -32,8 +36,14 @@ typedef struct RewriterState { /* hashtable to save connection */ GHashTable *connection_track_table; bool vnet_hdr; +bool failover_mode; } RewriterState; +static void filter_rewriter_failover_mode(RewriterState *s) +{ +s->failover_mode = FAILOVER_MODE_ON; +} + static void filter_rewriter_flush(NetFilterState *nf) { RewriterState *s = FILTER_COLO_REWRITER(nf); @@ -269,6 +279,13 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf, */ reverse_connection_key(); } + +/* After failover we needn't change new TCP packet */ +if (s->failover_mode && +!connection_has_tracked(s->connection_track_table, )) { +goto out; +} + conn = connection_get(s->connection_track_table, , NULL); @@ -302,11 +319,49 @@ static ssize_t colo_rewriter_receive_iov(NetFilterState *nf, } } +out: packet_destroy(pkt, NULL); pkt = NULL; return 0; } +static void reset_seq_offset(gpointer key, gpointer value, gpointer user_data) +{ +Connection *conn = (Connection *)value; + +conn->offset = 0; +} + +static gboolean offset_is_nonzero(gpointer key, + gpointer value, + gpointer user_data) +{ +Connection *conn = (Connection *)value; + +return conn->offset ? true : false; +} + +static void colo_rewriter_handle_event(NetFilterState *nf, int event, + Error **errp) +{ +RewriterState *rs = FILTER_COLO_REWRITER(nf); + +switch (event) { +case COLO_EVENT_CHECKPOINT: +g_hash_table_foreach(rs->connection_track_table, +reset_seq_offset, NULL); +break; +case COLO_EVENT_FAILOVER: +if
[Qemu-devel] [PATCH V12 18/19] COLO: quick failover process by kick COLO thread
From: zhanghailiang COLO thread may sleep at qemu_sem_wait(>colo_checkpoint_sem), while failover works begin, It's better to wakeup it to quick the process. Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 8 1 file changed, 8 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 365e913e51..3a3efb916b 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -131,6 +131,11 @@ static void primary_vm_do_failover(void) migrate_set_state(>state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); +/* + * kick COLO thread which might wait at + * qemu_sem_wait(>colo_checkpoint_sem). + */ +colo_checkpoint_notify(migrate_get_current()); /* * Wake up COLO thread which may blocked in recv() or send(), @@ -539,6 +544,9 @@ static void colo_process_checkpoint(MigrationState *s) qemu_sem_wait(>colo_checkpoint_sem); +if (s->state != MIGRATION_STATUS_COLO) { +goto out; +} ret = colo_do_checkpoint_transaction(s, bioc, fb); if (ret < 0) { goto out; -- 2.17.GIT
[Qemu-devel] [PATCH V12 15/19] filter: Add handle_event method for NetFilterClass
Filter needs to process the event of checkpoint/failover or other event passed by COLO frame. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- include/net/filter.h | 5 + net/filter.c | 17 + net/net.c| 19 +++ 3 files changed, 41 insertions(+) diff --git a/include/net/filter.h b/include/net/filter.h index 435acd6f82..49da666ac0 100644 --- a/include/net/filter.h +++ b/include/net/filter.h @@ -38,6 +38,8 @@ typedef ssize_t (FilterReceiveIOV)(NetFilterState *nc, typedef void (FilterStatusChanged) (NetFilterState *nf, Error **errp); +typedef void (FilterHandleEvent) (NetFilterState *nf, int event, Error **errp); + typedef struct NetFilterClass { ObjectClass parent_class; @@ -45,6 +47,7 @@ typedef struct NetFilterClass { FilterSetup *setup; FilterCleanup *cleanup; FilterStatusChanged *status_changed; +FilterHandleEvent *handle_event; /* mandatory */ FilterReceiveIOV *receive_iov; } NetFilterClass; @@ -77,4 +80,6 @@ ssize_t qemu_netfilter_pass_to_next(NetClientState *sender, int iovcnt, void *opaque); +void colo_notify_filters_event(int event, Error **errp); + #endif /* QEMU_NET_FILTER_H */ diff --git a/net/filter.c b/net/filter.c index 2fd7d7d663..c9f9e5fa08 100644 --- a/net/filter.c +++ b/net/filter.c @@ -17,6 +17,8 @@ #include "net/vhost_net.h" #include "qom/object_interfaces.h" #include "qemu/iov.h" +#include "net/colo.h" +#include "migration/colo.h" static inline bool qemu_can_skip_netfilter(NetFilterState *nf) { @@ -245,11 +247,26 @@ static void netfilter_finalize(Object *obj) g_free(nf->netdev_id); } +static void default_handle_event(NetFilterState *nf, int event, Error **errp) +{ +switch (event) { +case COLO_EVENT_CHECKPOINT: +break; +case COLO_EVENT_FAILOVER: +object_property_set_str(OBJECT(nf), "off", "status", errp); +break; +default: +break; +} +} + static void netfilter_class_init(ObjectClass *oc, void *data) { UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc); +NetFilterClass *nfc = NETFILTER_CLASS(oc); ucc->complete = netfilter_complete; +nfc->handle_event = default_handle_event; } static const TypeInfo netfilter_info = { diff --git a/net/net.c b/net/net.c index 2a3133990c..fd8efebfdb 100644 --- a/net/net.c +++ b/net/net.c @@ -1331,6 +1331,25 @@ void hmp_info_network(Monitor *mon, const QDict *qdict) } } +void colo_notify_filters_event(int event, Error **errp) +{ +NetClientState *nc; +NetFilterState *nf; +NetFilterClass *nfc = NULL; +Error *local_err = NULL; + +QTAILQ_FOREACH(nc, _clients, next) { +QTAILQ_FOREACH(nf, >filters, next) { +nfc = NETFILTER_GET_CLASS(OBJECT(nf)); +nfc->handle_event(nf, event, _err); +if (local_err) { +error_propagate(errp, local_err); +return; +} +} +} +} + void qmp_set_link(const char *name, bool up, Error **errp) { NetClientState *ncs[MAX_QUEUE_NUM]; -- 2.17.GIT
[Qemu-devel] [PATCH V12 13/19] savevm: split the process of different stages for loadvm/savevm
There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 41 - migration/savevm.c | 36 +--- migration/savevm.h | 4 3 files changed, 65 insertions(+), 16 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 8c4ba373aa..25d279decf 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -30,6 +30,7 @@ #include "block/block.h" #include "qapi/qapi-events-migration.h" #include "qapi/qmp/qerror.h" +#include "sysemu/cpus.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -401,23 +402,34 @@ static int colo_do_checkpoint_transaction(MigrationState *s, /* Disable block migration */ migrate_set_block_enabled(false, _err); -qemu_savevm_state_header(fb); -qemu_savevm_state_setup(fb); qemu_mutex_lock_iothread(); replication_do_checkpoint_all(_err); if (local_err) { qemu_mutex_unlock_iothread(); goto out; } -qemu_savevm_state_complete_precopy(fb, false, false); -qemu_mutex_unlock_iothread(); - -qemu_fflush(fb); colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, _err); if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} +/* Note: device state is saved into buffer */ +ret = qemu_save_device_state(fb); + +qemu_mutex_unlock_iothread(); +if (ret < 0) { goto out; } +/* + * Only save VM's live state, which not including device state. + * TODO: We may need a timeout mechanism to prevent COLO process + * to be blocked here. + */ +qemu_savevm_live_state(s->to_dst_file); + +qemu_fflush(fb); + /* * We need the size of the VMstate data in Secondary side, * With which we can decide how much data should be read. @@ -635,6 +647,7 @@ void *colo_process_incoming_thread(void *opaque) uint64_t total_size; uint64_t value; Error *local_err = NULL; +int ret; rcu_register_thread(); qemu_sem_init(>colo_incoming_sem, 0); @@ -708,6 +721,16 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +qemu_mutex_lock_iothread(); +cpu_synchronize_all_pre_loadvm(); +ret = qemu_loadvm_state_main(mis->from_src_file, mis); +qemu_mutex_unlock_iothread(); + +if (ret < 0) { +error_report("Load VM's live state (ram) error"); +goto out; +} + value = colo_receive_message_value(mis->from_src_file, COLO_MESSAGE_VMSTATE_SIZE, _err); if (local_err) { @@ -739,10 +762,10 @@ void *colo_process_incoming_thread(void *opaque) } qemu_mutex_lock_iothread(); -qemu_system_reset(SHUTDOWN_CAUSE_NONE); vmstate_loading = true; -if (qemu_loadvm_state(fb) < 0) { -error_report("COLO: loadvm failed"); +ret = qemu_load_device_state(fb); +if (ret < 0) { +error_report("COLO: load device state failed"); qemu_mutex_unlock_iothread(); goto out; } diff --git a/migration/savevm.c b/migration/savevm.c index 96db539064..5636cbeae2 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1378,13 +1378,21 @@ done: return ret; } -static int qemu_save_device_state(QEMUFile *f) +void qemu_savevm_live_state(QEMUFile *f) { -SaveStateEntry *se; +/* save QEMU_VM_SECTION_END section */ +qemu_savevm_state_complete_precopy(f, true, false); +qemu_put_byte(f, QEMU_VM_EOF); +} -qemu_put_be32(f, QEMU_VM_FILE_MAGIC); -qemu_put_be32(f, QEMU_VM_FILE_VERSION); +int qemu_save_device_state(QEMUFile *f) +{ +SaveStateEntry *se; +if (!migration_in_colo_state()) { +qemu_put_be32(f, QEMU_VM_FILE_MAGIC); +qemu_put_be32(f, QEMU_VM_FILE_VERSION); +} cpu_synchronize_all_states(); QTAILQ_FOREACH(se, _state.handlers, entry) { @@ -1440,8 +1448,6 @@ enum LoadVMExitCodes { LOADVM_QUIT = 1, }; -static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis); - /* -- incoming postcopy
[Qemu-devel] [PATCH V12 14/19] COLO: flush host dirty ram from cache
From: zhanghailiang Don't need to flush all VM's ram from cache, only flush the dirty pages since last checkpoint Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: zhanghailiang Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 9 + 1 file changed, 9 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index a478d85740..739458be5b 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3665,6 +3665,7 @@ int colo_init_ram_cache(void) } ram_state = g_new0(RAMState, 1); ram_state->migration_dirty_pages = 0; +memory_global_dirty_log_start(); return 0; @@ -3686,6 +3687,7 @@ void colo_release_ram_cache(void) { RAMBlock *block; +memory_global_dirty_log_stop(); RAMBLOCK_FOREACH_MIGRATABLE(block) { g_free(block->bmap); block->bmap = NULL; @@ -3936,6 +3938,13 @@ static void colo_flush_ram_cache(void) void *src_host; unsigned long offset = 0; +memory_global_dirty_log_sync(); +rcu_read_lock(); +RAMBLOCK_FOREACH_MIGRATABLE(block) { +migration_bitmap_sync_range(ram_state, block, 0, block->used_length); +} +rcu_read_unlock(); + trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages); rcu_read_lock(); block = QLIST_FIRST_RCU(_list.blocks); -- 2.17.GIT
[Qemu-devel] [PATCH V12 11/19] qapi/migration.json: Rename COLO unknown mode to none mode.
From: Zhang Chen Suggested by Markus Armbruster rename COLO unknown mode to none mode. Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Eric Blake Reviewed-by: Markus Armbruster --- migration/colo-failover.c | 2 +- migration/colo.c | 2 +- qapi/migration.json | 10 +- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/migration/colo-failover.c b/migration/colo-failover.c index 0ae0c41221..4854a96c92 100644 --- a/migration/colo-failover.c +++ b/migration/colo-failover.c @@ -77,7 +77,7 @@ FailoverStatus failover_get_state(void) void qmp_x_colo_lost_heartbeat(Error **errp) { -if (get_colo_mode() == COLO_MODE_UNKNOWN) { +if (get_colo_mode() == COLO_MODE_NONE) { error_setg(errp, QERR_FEATURE_DISABLED, "colo"); return; } diff --git a/migration/colo.c b/migration/colo.c index 761fbb9f78..9107ac124a 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -160,7 +160,7 @@ COLOMode get_colo_mode(void) } else if (migration_incoming_in_colo_state()) { return COLO_MODE_SECONDARY; } else { -return COLO_MODE_UNKNOWN; +return COLO_MODE_NONE; } } diff --git a/qapi/migration.json b/qapi/migration.json index 9b3b1c0c4e..c0bc77fd8d 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -899,18 +899,18 @@ ## # @COLOMode: # -# The colo mode +# The COLO current mode. # -# @unknown: unknown mode +# @none: COLO is disabled. # -# @primary: master side +# @primary: COLO node in primary side. # -# @secondary: slave side +# @secondary: COLO node in slave side. # # Since: 2.8 ## { 'enum': 'COLOMode', - 'data': [ 'unknown', 'primary', 'secondary'] } + 'data': [ 'none', 'primary', 'secondary'] } ## # @FailoverStatus: -- 2.17.GIT
[Qemu-devel] [PATCH V12 12/19] qapi: Add new command to query colo status
Libvirt or other high level software can use this command query colo status. You can test this command like that: {'execute':'query-colo-status'} Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- migration/colo.c| 21 + qapi/migration.json | 32 2 files changed, 53 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index 9107ac124a..8c4ba373aa 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -29,6 +29,7 @@ #include "net/colo.h" #include "block/block.h" #include "qapi/qapi-events-migration.h" +#include "qapi/qmp/qerror.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -237,6 +238,26 @@ void qmp_xen_colo_do_checkpoint(Error **errp) #endif } +COLOStatus *qmp_query_colo_status(Error **errp) +{ +COLOStatus *s = g_new0(COLOStatus, 1); + +s->mode = get_colo_mode(); + +switch (failover_get_state()) { +case FAILOVER_STATUS_NONE: +s->reason = COLO_EXIT_REASON_NONE; +break; +case FAILOVER_STATUS_REQUIRE: +s->reason = COLO_EXIT_REASON_REQUEST; +break; +default: +s->reason = COLO_EXIT_REASON_ERROR; +} + +return s; +} + static void colo_send_message(QEMUFile *f, COLOMessage msg, Error **errp) { diff --git a/qapi/migration.json b/qapi/migration.json index c0bc77fd8d..840a118d11 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1283,6 +1283,38 @@ ## { 'command': 'xen-colo-do-checkpoint' } +## +# @COLOStatus: +# +# The result format for 'query-colo-status'. +# +# @mode: COLO running mode. If COLO is running, this field will return +#'primary' or 'secondary'. +# +# @reason: describes the reason for the COLO exit. +# +# Since: 3.0 +## +{ 'struct': 'COLOStatus', + 'data': { 'mode': 'COLOMode', 'reason': 'COLOExitReason' } } + +## +# @query-colo-status: +# +# Query COLO status while the vm is running. +# +# Returns: A @COLOStatus object showing the status. +# +# Example: +# +# -> { "execute": "query-colo-status" } +# <- { "return": { "mode": "primary", "active": true, "reason": "request" } } +# +# Since: 3.0 +## +{ 'command': 'query-colo-status', + 'returns': 'COLOStatus' } + ## # @migrate-recover: # -- 2.17.GIT
[Qemu-devel] [PATCH V12 19/19] docs: Add COLO status diagram to COLO-FT.txt
From: Zhang Chen This diagram make user better understand COLO. Suggested by Markus Armbruster. Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- docs/COLO-FT.txt | 34 ++ 1 file changed, 34 insertions(+) diff --git a/docs/COLO-FT.txt b/docs/COLO-FT.txt index d7c7dcda8f..d5007895d1 100644 --- a/docs/COLO-FT.txt +++ b/docs/COLO-FT.txt @@ -110,6 +110,40 @@ Note: HeartBeat has not been implemented yet, so you need to trigger failover process by using 'x-colo-lost-heartbeat' command. +== COLO operation status == + ++-+ +| | +|Start COLO | +| | ++++ + | + | Main qmp command: + | migrate-set-capabilities with x-colo + | migrate + | + v ++++ +| | +| COLO running | +| | ++++ + | + | Main qmp command: + | x-colo-lost-heartbeat + | or + | some error happened + v ++++ +| | send qmp event: +| COLO failover | COLO_EXIT +| | ++-+ + +COLO use the qmp command switching and report operation status. +The diagram just write the main qmp command, you can get the detail +in test procedure. + == Test procedure == 1. Startup qemu Primary: -- 2.17.GIT
[Qemu-devel] [PATCH V12 10/19] qmp event: Add COLO_EXIT event to notify users while exited COLO
From: zhanghailiang If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x-colo-lost-heartbeat', Users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users that we exited COLO mode. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Eric Blake Reviewed-by: Markus Armbruster --- migration/colo.c| 31 +++ qapi/migration.json | 38 ++ 2 files changed, 69 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index d3163b51c8..761fbb9f78 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -28,6 +28,7 @@ #include "net/colo-compare.h" #include "net/colo.h" #include "block/block.h" +#include "qapi/qapi-events-migration.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -514,6 +515,23 @@ out: qemu_fclose(fb); } +/* + * There are only two reasons we can get here, some error happened + * or the user triggered failover. + */ +switch (failover_get_state()) { +case FAILOVER_STATUS_NONE: +qapi_event_send_colo_exit(COLO_MODE_PRIMARY, + COLO_EXIT_REASON_ERROR, NULL); +break; +case FAILOVER_STATUS_REQUIRE: +qapi_event_send_colo_exit(COLO_MODE_PRIMARY, + COLO_EXIT_REASON_REQUEST, NULL); +break; +default: +abort(); +} + /* Hope this not to be too long to wait here */ qemu_sem_wait(>colo_exit_sem); qemu_sem_destroy(>colo_exit_sem); @@ -746,6 +764,19 @@ out: error_report_err(local_err); } +switch (failover_get_state()) { +case FAILOVER_STATUS_NONE: +qapi_event_send_colo_exit(COLO_MODE_SECONDARY, + COLO_EXIT_REASON_ERROR, NULL); +break; +case FAILOVER_STATUS_REQUIRE: +qapi_event_send_colo_exit(COLO_MODE_SECONDARY, + COLO_EXIT_REASON_REQUEST, NULL); +break; +default: +abort(); +} + if (fb) { qemu_fclose(fb); } diff --git a/qapi/migration.json b/qapi/migration.json index f62d3f9a4b..9b3b1c0c4e 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -932,6 +932,44 @@ { 'enum': 'FailoverStatus', 'data': [ 'none', 'require', 'active', 'completed', 'relaunch' ] } +## +# @COLO_EXIT: +# +# Emitted when VM finishes COLO mode due to some errors happening or +# at the request of users. +# +# @mode: report COLO mode when COLO exited. +# +# @reason: describes the reason for the COLO exit. +# +# Since: 3.1 +# +# Example: +# +# <- { "timestamp": {"seconds": 2032141960, "microseconds": 417172}, +# "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } } +# +## +{ 'event': 'COLO_EXIT', + 'data': {'mode': 'COLOMode', 'reason': 'COLOExitReason' } } + +## +# @COLOExitReason: +# +# The reason for a COLO exit +# +# @none: no failover has ever happened. This can't occur in the +# COLO_EXIT event, only in the result of query-colo-status. +# +# @request: COLO exit is due to an external request +# +# @error: COLO exit is due to an internal error +# +# Since: 3.1 +## +{ 'enum': 'COLOExitReason', + 'data': [ 'none', 'request', 'error' ] } + ## # @x-colo-lost-heartbeat: # -- 2.17.GIT
[Qemu-devel] [PATCH V12 06/19] COLO: Remove colo_state migration struct
We need to know if migration is going into COLO state for incoming side before start normal migration. Instead by using the VMStateDescription to send colo_state from source side to destination side, we use MIG_CMD_ENABLE_COLO to indicate whether COLO is enabled or not. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert --- include/migration/colo.h | 5 +-- migration/Makefile.objs | 2 +- migration/colo-comm.c| 76 migration/colo.c | 13 ++- migration/migration.c| 23 +++- migration/savevm.c | 17 + migration/savevm.h | 1 + migration/trace-events | 1 + vl.c | 2 -- 9 files changed, 57 insertions(+), 83 deletions(-) delete mode 100644 migration/colo-comm.c diff --git a/include/migration/colo.h b/include/migration/colo.h index fefb2fcf4c..99ce17aca7 100644 --- a/include/migration/colo.h +++ b/include/migration/colo.h @@ -28,8 +28,9 @@ void migrate_start_colo_process(MigrationState *s); bool migration_in_colo_state(void); /* loadvm */ -bool migration_incoming_enable_colo(void); -void migration_incoming_exit_colo(void); +void migration_incoming_enable_colo(void); +void migration_incoming_disable_colo(void); +bool migration_incoming_colo_enabled(void); void *colo_process_incoming_thread(void *opaque); bool migration_incoming_in_colo_state(void); diff --git a/migration/Makefile.objs b/migration/Makefile.objs index c83ec47ba8..a4f3bafd86 100644 --- a/migration/Makefile.objs +++ b/migration/Makefile.objs @@ -1,6 +1,6 @@ common-obj-y += migration.o socket.o fd.o exec.o common-obj-y += tls.o channel.o savevm.o -common-obj-y += colo-comm.o colo.o colo-failover.o +common-obj-y += colo.o colo-failover.o common-obj-y += vmstate.o vmstate-types.o page_cache.o common-obj-y += qemu-file.o global_state.o common-obj-y += qemu-file-channel.o diff --git a/migration/colo-comm.c b/migration/colo-comm.c deleted file mode 100644 index df26e4dfe7..00 --- a/migration/colo-comm.c +++ /dev/null @@ -1,76 +0,0 @@ -/* - * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO) - * (a.k.a. Fault Tolerance or Continuous Replication) - * - * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD. - * Copyright (c) 2016 FUJITSU LIMITED - * Copyright (c) 2016 Intel Corporation - * - * This work is licensed under the terms of the GNU GPL, version 2 or - * later. See the COPYING file in the top-level directory. - * - */ - -#include "qemu/osdep.h" -#include "migration.h" -#include "migration/colo.h" -#include "migration/vmstate.h" -#include "trace.h" - -typedef struct { - bool colo_requested; -} COLOInfo; - -static COLOInfo colo_info; - -COLOMode get_colo_mode(void) -{ -if (migration_in_colo_state()) { -return COLO_MODE_PRIMARY; -} else if (migration_incoming_in_colo_state()) { -return COLO_MODE_SECONDARY; -} else { -return COLO_MODE_UNKNOWN; -} -} - -static int colo_info_pre_save(void *opaque) -{ -COLOInfo *s = opaque; - -s->colo_requested = migrate_colo_enabled(); - -return 0; -} - -static bool colo_info_need(void *opaque) -{ - return migrate_colo_enabled(); -} - -static const VMStateDescription colo_state = { -.name = "COLOState", -.version_id = 1, -.minimum_version_id = 1, -.pre_save = colo_info_pre_save, -.needed = colo_info_need, -.fields = (VMStateField[]) { -VMSTATE_BOOL(colo_requested, COLOInfo), -VMSTATE_END_OF_LIST() -}, -}; - -void colo_info_init(void) -{ -vmstate_register(NULL, 0, _state, _info); -} - -bool migration_incoming_enable_colo(void) -{ -return colo_info.colo_requested; -} - -void migration_incoming_exit_colo(void) -{ -colo_info.colo_requested = false; -} diff --git a/migration/colo.c b/migration/colo.c index af04010061..d3163b51c8 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -152,6 +152,17 @@ static void primary_vm_do_failover(void) qemu_sem_post(>colo_exit_sem); } +COLOMode get_colo_mode(void) +{ +if (migration_in_colo_state()) { +return COLO_MODE_PRIMARY; +} else if (migration_incoming_in_colo_state()) { +return COLO_MODE_SECONDARY; +} else { +return COLO_MODE_UNKNOWN; +} +} + void colo_do_failover(MigrationState *s) { /* Make sure VM stopped while failover happened. */ @@ -746,7 +757,7 @@ out: if (mis->to_src_file) { qemu_fclose(mis->to_src_file); } -migration_incoming_exit_colo(); +migration_incoming_disable_colo(); rcu_unregister_thread(); return NULL; diff --git a/migration/migration.c b/migration/migration.c index f4ab7b2d01..950b22df63 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -296,6 +296,22 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname, return migrate_send_rp_message(mis, msg_type, msglen, bufc); } +static bool
[Qemu-devel] [PATCH V12 09/19] COLO: Flush memory data from ram cache
During the time of VM's running, PVM may dirty some pages, we will transfer PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkpoint time. So, the content of SVM's RAM cache will always be same with PVM's memory after checkpoint. Instead of flushing all content of PVM's RAM cache into SVM's MEMORY, we do this in a more efficient way: Only flush any page that dirtied by PVM since last checkpoint. In this way, we can ensure SVM's memory same with PVM's. Besides, we must ensure flush RAM cache before load device state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c| 37 + migration/trace-events | 2 ++ 2 files changed, 39 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index df8202216f..a478d85740 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3925,6 +3925,39 @@ static bool postcopy_is_running(void) return ps >= POSTCOPY_INCOMING_LISTENING && ps < POSTCOPY_INCOMING_END; } +/* + * Flush content of RAM cache into SVM's memory. + * Only flush the pages that be dirtied by PVM or SVM or both. + */ +static void colo_flush_ram_cache(void) +{ +RAMBlock *block = NULL; +void *dst_host; +void *src_host; +unsigned long offset = 0; + +trace_colo_flush_ram_cache_begin(ram_state->migration_dirty_pages); +rcu_read_lock(); +block = QLIST_FIRST_RCU(_list.blocks); + +while (block) { +offset = migration_bitmap_find_dirty(ram_state, block, offset); + +if (offset << TARGET_PAGE_BITS >= block->used_length) { +offset = 0; +block = QLIST_NEXT_RCU(block, next); +} else { +migration_bitmap_clear_dirty(ram_state, block, offset); +dst_host = block->host + (offset << TARGET_PAGE_BITS); +src_host = block->colo_cache + (offset << TARGET_PAGE_BITS); +memcpy(dst_host, src_host, TARGET_PAGE_SIZE); +} +} + +rcu_read_unlock(); +trace_colo_flush_ram_cache_end(); +} + static int ram_load(QEMUFile *f, void *opaque, int version_id) { int flags = 0, ret = 0, invalid_flags = 0; @@ -4101,6 +4134,10 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) ret |= wait_for_decompress_done(); rcu_read_unlock(); trace_ram_load_complete(ret, seq_iter); + +if (!ret && migration_incoming_in_colo_state()) { +colo_flush_ram_cache(); +} return ret; } diff --git a/migration/trace-events b/migration/trace-events index fa0ff3f3bf..bd2d0cd25a 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -102,6 +102,8 @@ ram_dirty_bitmap_sync_start(void) "" ram_dirty_bitmap_sync_wait(void) "" ram_dirty_bitmap_sync_complete(void) "" ram_state_resume_prepare(uint64_t v) "%" PRId64 +colo_flush_ram_cache_begin(uint64_t dirty_pages) "dirty_pages %" PRIu64 +colo_flush_ram_cache_end(void) "" # migration/migration.c await_return_path_close_on_source_close(void) "" -- 2.17.GIT
[Qemu-devel] [PATCH V12 05/19] COLO: Add block replication into colo process
Make sure master start block replication after slave's block replication started. Besides, we need to activate VM's blocks before goes into COLO state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- migration/colo.c | 43 +++ migration/migration.c | 10 ++ 2 files changed, 53 insertions(+) diff --git a/migration/colo.c b/migration/colo.c index f4bdfde170..af04010061 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -27,6 +27,7 @@ #include "replication.h" #include "net/colo-compare.h" #include "net/colo.h" +#include "block/block.h" static bool vmstate_loading; static Notifier packets_compare_notifier; @@ -56,6 +57,7 @@ static void secondary_vm_do_failover(void) { int old_state; MigrationIncomingState *mis = migration_incoming_get_current(); +Error *local_err = NULL; /* Can not do failover during the process of VM's loading VMstate, Or * it will break the secondary VM. @@ -73,6 +75,11 @@ static void secondary_vm_do_failover(void) migrate_set_state(>state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); +replication_stop_all(true, _err); +if (local_err) { +error_report_err(local_err); +} + if (!autostart) { error_report("\"-S\" qemu option will be ignored in secondary side"); /* recover runstate to normal migration finish state */ @@ -110,6 +117,7 @@ static void primary_vm_do_failover(void) { MigrationState *s = migrate_get_current(); int old_state; +Error *local_err = NULL; migrate_set_state(>state, MIGRATION_STATUS_COLO, MIGRATION_STATUS_COMPLETED); @@ -133,6 +141,13 @@ static void primary_vm_do_failover(void) FailoverStatus_str(old_state)); return; } + +replication_stop_all(true, _err); +if (local_err) { +error_report_err(local_err); +local_err = NULL; +} + /* Notify COLO thread that failover work is finished */ qemu_sem_post(>colo_exit_sem); } @@ -356,6 +371,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s, qemu_savevm_state_header(fb); qemu_savevm_state_setup(fb); qemu_mutex_lock_iothread(); +replication_do_checkpoint_all(_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} qemu_savevm_state_complete_precopy(fb, false, false); qemu_mutex_unlock_iothread(); @@ -446,6 +466,12 @@ static void colo_process_checkpoint(MigrationState *s) object_unref(OBJECT(bioc)); qemu_mutex_lock_iothread(); +replication_start_all(REPLICATION_MODE_PRIMARY, _err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vm_start(); qemu_mutex_unlock_iothread(); trace_colo_vm_state_change("stop", "run"); @@ -586,6 +612,11 @@ void *colo_process_incoming_thread(void *opaque) object_unref(OBJECT(bioc)); qemu_mutex_lock_iothread(); +replication_start_all(REPLICATION_MODE_SECONDARY, _err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} vm_start(); trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); @@ -666,6 +697,18 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +replication_get_error_all(_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} +/* discard colo disk buffer */ +replication_do_checkpoint_all(_err); +if (local_err) { +qemu_mutex_unlock_iothread(); +goto out; +} + vmstate_loading = false; vm_start(); trace_colo_vm_state_change("stop", "run"); diff --git a/migration/migration.c b/migration/migration.c index 2f3e9d40d1..f4ab7b2d01 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -386,6 +386,7 @@ static void process_incoming_migration_co(void *opaque) MigrationIncomingState *mis = migration_incoming_get_current(); PostcopyState ps; int ret; +Error *local_err = NULL; assert(mis->from_src_file); mis->migration_incoming_co = qemu_coroutine_self(); @@ -418,6 +419,15 @@ static void process_incoming_migration_co(void *opaque) /* we get COLO info, and know if we are in COLO mode */ if (!ret && migration_incoming_enable_colo()) { +/* Make sure all file formats flush their mutable metadata */ +bdrv_invalidate_cache_all(_err); +if (local_err) { +migrate_set_state(>state, MIGRATION_STATUS_ACTIVE, +MIGRATION_STATUS_FAILED); +error_report_err(local_err); +exit(EXIT_FAILURE); +} + qemu_thread_create(>colo_incoming_thread, "COLO incoming", colo_process_incoming_thread, mis,
[Qemu-devel] [PATCH V12 08/19] ram/COLO: Record the dirty pages that SVM received
We record the address of the dirty pages that received, it will help flushing pages that cached into SVM. Here, it is a trick, we record dirty pages by re-using migration dirty bitmap. In the later patch, we will start the dirty log for SVM, just like migration, in this way, we can record both the dirty pages caused by PVM and SVM, we only flush those dirty pages from RAM cache while do checkpoint. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert --- migration/ram.c | 43 --- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index a63315533e..df8202216f 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3409,6 +3409,15 @@ static inline void *colo_cache_from_block_offset(RAMBlock *block, __func__, block->idstr); return NULL; } + +/* +* During colo checkpoint, we need bitmap of these migrated pages. +* It help us to decide which pages in ram cache should be flushed +* into VM's RAM later. +*/ +if (!test_and_set_bit(offset >> TARGET_PAGE_BITS, block->bmap)) { +ram_state->migration_dirty_pages++; +} return block->colo_cache + offset; } @@ -3626,7 +3635,7 @@ int colo_init_ram_cache(void) RAMBlock *block; rcu_read_lock(); -QLIST_FOREACH_RCU(block, _list.blocks, next) { +RAMBLOCK_FOREACH_MIGRATABLE(block) { block->colo_cache = qemu_anon_ram_alloc(block->used_length, NULL, false); @@ -3639,10 +3648,29 @@ int colo_init_ram_cache(void) memcpy(block->colo_cache, block->host, block->used_length); } rcu_read_unlock(); +/* +* Record the dirty pages that sent by PVM, we use this dirty bitmap together +* with to decide which page in cache should be flushed into SVM's RAM. Here +* we use the same name 'ram_bitmap' as for migration. +*/ +if (ram_bytes_total()) { +RAMBlock *block; + +RAMBLOCK_FOREACH_MIGRATABLE(block) { +unsigned long pages = block->max_length >> TARGET_PAGE_BITS; + +block->bmap = bitmap_new(pages); +bitmap_set(block->bmap, 0, pages); +} +} +ram_state = g_new0(RAMState, 1); +ram_state->migration_dirty_pages = 0; + return 0; out_locked: -QLIST_FOREACH_RCU(block, _list.blocks, next) { + +RAMBLOCK_FOREACH_MIGRATABLE(block) { if (block->colo_cache) { qemu_anon_ram_free(block->colo_cache, block->used_length); block->colo_cache = NULL; @@ -3658,14 +3686,23 @@ void colo_release_ram_cache(void) { RAMBlock *block; +RAMBLOCK_FOREACH_MIGRATABLE(block) { +g_free(block->bmap); +block->bmap = NULL; +} + rcu_read_lock(); -QLIST_FOREACH_RCU(block, _list.blocks, next) { + +RAMBLOCK_FOREACH_MIGRATABLE(block) { if (block->colo_cache) { qemu_anon_ram_free(block->colo_cache, block->used_length); block->colo_cache = NULL; } } + rcu_read_unlock(); +g_free(ram_state); +ram_state = NULL; } /** -- 2.17.GIT
[Qemu-devel] [PATCH V12 04/19] COLO: integrate colo compare with colo frame
For COLO FT, both the PVM and SVM run at the same time, only sync the state while it needs. So here, let SVM runs while not doing checkpoint, change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100. Besides, we forgot to release colo_checkpoint_semd and colo_delay_timer, fix them here. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert --- migration/colo.c | 42 -- migration/migration.c | 6 ++ 2 files changed, 42 insertions(+), 6 deletions(-) diff --git a/migration/colo.c b/migration/colo.c index 88936f5962..f4bdfde170 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -25,8 +25,11 @@ #include "qemu/error-report.h" #include "migration/failover.h" #include "replication.h" +#include "net/colo-compare.h" +#include "net/colo.h" static bool vmstate_loading; +static Notifier packets_compare_notifier; #define COLO_BUFFER_BASE_SIZE (4 * 1024 * 1024) @@ -343,6 +346,11 @@ static int colo_do_checkpoint_transaction(MigrationState *s, goto out; } +colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, _err); +if (local_err) { +goto out; +} + /* Disable block migration */ migrate_set_block_enabled(false, _err); qemu_savevm_state_header(fb); @@ -400,6 +408,11 @@ out: return ret; } +static void colo_compare_notify_checkpoint(Notifier *notifier, void *data) +{ +colo_checkpoint_notify(data); +} + static void colo_process_checkpoint(MigrationState *s) { QIOChannelBuffer *bioc; @@ -416,6 +429,9 @@ static void colo_process_checkpoint(MigrationState *s) goto out; } +packets_compare_notifier.notify = colo_compare_notify_checkpoint; +colo_compare_register_notifier(_compare_notifier); + /* * Wait for Secondary finish loading VM states and enter COLO * restore. @@ -461,11 +477,21 @@ out: qemu_fclose(fb); } -timer_del(s->colo_delay_timer); - /* Hope this not to be too long to wait here */ qemu_sem_wait(>colo_exit_sem); qemu_sem_destroy(>colo_exit_sem); + +/* + * It is safe to unregister notifier after failover finished. + * Besides, colo_delay_timer and colo_checkpoint_sem can't be + * released befor unregister notifier, or there will be use-after-free + * error. + */ +colo_compare_unregister_notifier(_compare_notifier); +timer_del(s->colo_delay_timer); +timer_free(s->colo_delay_timer); +qemu_sem_destroy(>colo_checkpoint_sem); + /* * Must be called after failover BH is completed, * Or the failover BH may shutdown the wrong fd that @@ -559,6 +585,11 @@ void *colo_process_incoming_thread(void *opaque) fb = qemu_fopen_channel_input(QIO_CHANNEL(bioc)); object_unref(OBJECT(bioc)); +qemu_mutex_lock_iothread(); +vm_start(); +trace_colo_vm_state_change("stop", "run"); +qemu_mutex_unlock_iothread(); + colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY, _err); if (local_err) { @@ -578,6 +609,11 @@ void *colo_process_incoming_thread(void *opaque) goto out; } +qemu_mutex_lock_iothread(); +vm_stop_force_state(RUN_STATE_COLO); +trace_colo_vm_state_change("run", "stop"); +qemu_mutex_unlock_iothread(); + /* FIXME: This is unnecessary for periodic checkpoint mode */ colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY, _err); @@ -631,6 +667,8 @@ void *colo_process_incoming_thread(void *opaque) } vmstate_loading = false; +vm_start(); +trace_colo_vm_state_change("stop", "run"); qemu_mutex_unlock_iothread(); if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) { diff --git a/migration/migration.c b/migration/migration.c index 4b316ec343..2f3e9d40d1 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -76,10 +76,8 @@ /* Migration XBZRLE default cache size */ #define DEFAULT_MIGRATE_XBZRLE_CACHE_SIZE (64 * 1024 * 1024) -/* The delay time (in ms) between two COLO checkpoints - * Note: Please change this default value to 1 when we support hybrid mode. - */ -#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200 +/* The delay time (in ms) between two COLO checkpoints */ +#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100) #define DEFAULT_MIGRATE_MULTIFD_CHANNELS 2 #define DEFAULT_MIGRATE_MULTIFD_PAGE_COUNT 16 -- 2.17.GIT
[Qemu-devel] [PATCH V12 03/19] colo-compare: use notifier to notify packets comparing result
It's a good idea to use notifier to notify COLO frame of inconsistent packets comparing. Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Signed-off-by: zhanghailiang --- net/colo-compare.c | 37 ++--- net/colo-compare.h | 2 ++ 2 files changed, 28 insertions(+), 11 deletions(-) diff --git a/net/colo-compare.c b/net/colo-compare.c index 80e6532e8b..426eab5973 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -29,6 +29,7 @@ #include "sysemu/iothread.h" #include "net/colo-compare.h" #include "migration/colo.h" +#include "migration/migration.h" #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ @@ -37,6 +38,9 @@ static QTAILQ_HEAD(, CompareState) net_compares = QTAILQ_HEAD_INITIALIZER(net_compares); +static NotifierList colo_compare_notifiers = +NOTIFIER_LIST_INITIALIZER(colo_compare_notifiers); + #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 @@ -427,10 +431,7 @@ sec: qemu_hexdump((char *)spkt->data, stderr, "colo-compare spkt", spkt->size); -/* - * colo_compare_inconsistent_notify(); - * TODO: notice to checkpoint(); - */ +colo_compare_inconsistency_notify(); } } @@ -561,8 +562,24 @@ static int colo_old_packet_check_one(Packet *pkt, int64_t *check_time) } } +static void colo_compare_inconsistency_notify(void) +{ +notifier_list_notify(_compare_notifiers, +migrate_get_current()); +} + +void colo_compare_register_notifier(Notifier *notify) +{ +notifier_list_add(_compare_notifiers, notify); +} + +void colo_compare_unregister_notifier(Notifier *notify) +{ +notifier_remove(notify); +} + static int colo_old_packet_check_one_conn(Connection *conn, - void *user_data) + void *user_data) { GList *result = NULL; int64_t check_time = REGULAR_PACKET_CHECK_MS; @@ -573,10 +590,7 @@ static int colo_old_packet_check_one_conn(Connection *conn, if (result) { /* Do checkpoint will flush old packet */ -/* - * TODO: Notify colo frame to do checkpoint. - * colo_compare_inconsistent_notify(); - */ +colo_compare_inconsistency_notify(); return 0; } @@ -620,11 +634,12 @@ static void colo_compare_packet(CompareState *s, Connection *conn, /* * If one packet arrive late, the secondary_list or * primary_list will be empty, so we can't compare it - * until next comparison. + * until next comparison. If the packets in the list are + * timeout, it will trigger a checkpoint request. */ trace_colo_compare_main("packet different"); g_queue_push_head(>primary_list, pkt); -/* TODO: colo_notify_checkpoint();*/ +colo_compare_inconsistency_notify(); break; } } diff --git a/net/colo-compare.h b/net/colo-compare.h index 1b1ce76aea..22ddd512e2 100644 --- a/net/colo-compare.h +++ b/net/colo-compare.h @@ -18,5 +18,7 @@ #define QEMU_COLO_COMPARE_H void colo_notify_compares_event(void *opaque, int event, Error **errp); +void colo_compare_register_notifier(Notifier *notify); +void colo_compare_unregister_notifier(Notifier *notify); #endif /* QEMU_COLO_COMPARE_H */ -- 2.17.GIT
[Qemu-devel] [PATCH V12 07/19] COLO: Load dirty pages into SVM's RAM cache firstly
We should not load PVM's state directly into SVM, because there maybe some errors happen when SVM is receving data, which will break SVM. We need to ensure receving all data before load the state into SVM. We use an extra memory to cache these data (PVM's ram). The ram cache in secondary side is initially the same as SVM/PVM's memory. And in the process of checkpoint, we cache the dirty pages of PVM into this ram cache firstly, so this ram cache always the same as PVM's memory at every checkpoint, then we flush this cached ram to SVM after we receive all PVM's state. Signed-off-by: zhanghailiang Signed-off-by: Li Zhijian Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen Reviewed-by: Dr. David Alan Gilbert --- include/exec/ram_addr.h | 1 + migration/migration.c | 7 migration/ram.c | 83 - migration/ram.h | 4 ++ migration/savevm.c | 2 +- 5 files changed, 94 insertions(+), 3 deletions(-) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 3abb639056..9ecd911c3e 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -27,6 +27,7 @@ struct RAMBlock { struct rcu_head rcu; struct MemoryRegion *mr; uint8_t *host; +uint8_t *colo_cache; /* For colo, VM's ram cache */ ram_addr_t offset; ram_addr_t used_length; ram_addr_t max_length; diff --git a/migration/migration.c b/migration/migration.c index 950b22df63..d67fcff359 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -444,6 +444,11 @@ static void process_incoming_migration_co(void *opaque) exit(EXIT_FAILURE); } +if (colo_init_ram_cache() < 0) { +error_report("Init ram cache failed"); +exit(EXIT_FAILURE); +} + qemu_thread_create(>colo_incoming_thread, "COLO incoming", colo_process_incoming_thread, mis, QEMU_THREAD_JOINABLE); mis->have_colo_incoming_thread = true; @@ -451,6 +456,8 @@ static void process_incoming_migration_co(void *opaque) /* Wait checkpoint incoming thread exit before free resource */ qemu_thread_join(>colo_incoming_thread); +/* We hold the global iothread lock, so it is safe here */ +colo_release_ram_cache(); } if (ret < 0) { diff --git a/migration/ram.c b/migration/ram.c index 79c89425a3..a63315533e 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3398,6 +3398,20 @@ static inline void *host_from_ram_block_offset(RAMBlock *block, return block->host + offset; } +static inline void *colo_cache_from_block_offset(RAMBlock *block, + ram_addr_t offset) +{ +if (!offset_in_ramblock(block, offset)) { +return NULL; +} +if (!block->colo_cache) { +error_report("%s: colo_cache is NULL in block :%s", + __func__, block->idstr); +return NULL; +} +return block->colo_cache + offset; +} + /** * ram_handle_compressed: handle the zero page case * @@ -3602,6 +3616,58 @@ static void decompress_data_with_multi_threads(QEMUFile *f, qemu_mutex_unlock(_done_lock); } +/* + * colo cache: this is for secondary VM, we cache the whole + * memory of the secondary VM, it is need to hold the global lock + * to call this helper. + */ +int colo_init_ram_cache(void) +{ +RAMBlock *block; + +rcu_read_lock(); +QLIST_FOREACH_RCU(block, _list.blocks, next) { +block->colo_cache = qemu_anon_ram_alloc(block->used_length, +NULL, +false); +if (!block->colo_cache) { +error_report("%s: Can't alloc memory for COLO cache of block %s," + "size 0x" RAM_ADDR_FMT, __func__, block->idstr, + block->used_length); +goto out_locked; +} +memcpy(block->colo_cache, block->host, block->used_length); +} +rcu_read_unlock(); +return 0; + +out_locked: +QLIST_FOREACH_RCU(block, _list.blocks, next) { +if (block->colo_cache) { +qemu_anon_ram_free(block->colo_cache, block->used_length); +block->colo_cache = NULL; +} +} + +rcu_read_unlock(); +return -errno; +} + +/* It is need to hold the global lock to call this helper */ +void colo_release_ram_cache(void) +{ +RAMBlock *block; + +rcu_read_lock(); +QLIST_FOREACH_RCU(block, _list.blocks, next) { +if (block->colo_cache) { +qemu_anon_ram_free(block->colo_cache, block->used_length); +block->colo_cache = NULL; +} +} +rcu_read_unlock(); +} + /** * ram_load_setup: Setup RAM for migration incoming side * @@ -3618,6 +3684,7 @@ static int ram_load_setup(QEMUFile *f, void *opaque) xbzrle_load_setup(); ramblock_recv_map_init(); + return 0; } @@ -3638,6 +3705,7 @@ static int
[Qemu-devel] [Bug 1787505] Re: Solaris host: no network connection, mouse pointer mismatch
** Changed in: qemu Status: New => Invalid -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1787505 Title: Solaris host: no network connection, mouse pointer mismatch Status in QEMU: Invalid Bug description: This is probably a bit far afield but on a Solaris 10 SPARC host (Sun M3000) running a Windows XP guest like this: ./qemu-system-x86_64 -m 1024 -boot d -smp 3 -net nic -net user -hda /bkpool/qemuimages/XP.img -cdrom /bkpool/qemuimages/xp.iso & the vnc server starts up and Windows boots normally. However, there is no network connectivity. There are no network devices visible in XP's Networking tab of Control Panel and a ping of the local router reports "unreachable". Also, the keyboard works fine but the guest mouse pointer is offset from the host mouse position by an amount that varies by screen position. This makes it impossible to point to locations near the edge of the qemu window. This seems to be a previously reported problem, but the suggested fix, " -device usb-tablet", prevents qemu from even starting: qemu-system-x86_64: -device usb-tablet: No 'usb-bus' bus found for device 'usb-tablet' The physical mouse is connected to the USB port of a Sun Ray 2fs controlling the M3000 via Sun Ray server. I apologize if this is a configuration issue and not a bug but I don't know where else to report it and have been unable to find a solution in the documentation. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1787505/+subscriptions
[Qemu-devel] [PATCH V12 02/19] colo-compare: implement the process of checkpoint
While do checkpoint, we need to flush all the unhandled packets, By using the filter notifier mechanism, we can easily to notify every compare object to do this process, which runs inside of compare threads as a coroutine. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- include/migration/colo.h | 6 net/colo-compare.c | 78 net/colo-compare.h | 22 3 files changed, 106 insertions(+) create mode 100644 net/colo-compare.h diff --git a/include/migration/colo.h b/include/migration/colo.h index 2fe48ad353..fefb2fcf4c 100644 --- a/include/migration/colo.h +++ b/include/migration/colo.h @@ -16,6 +16,12 @@ #include "qemu-common.h" #include "qapi/qapi-types-migration.h" +enum colo_event { +COLO_EVENT_NONE, +COLO_EVENT_CHECKPOINT, +COLO_EVENT_FAILOVER, +}; + void colo_info_init(void); void migrate_start_colo_process(MigrationState *s); diff --git a/net/colo-compare.c b/net/colo-compare.c index dd745a491b..80e6532e8b 100644 --- a/net/colo-compare.c +++ b/net/colo-compare.c @@ -27,11 +27,16 @@ #include "qemu/sockets.h" #include "colo.h" #include "sysemu/iothread.h" +#include "net/colo-compare.h" +#include "migration/colo.h" #define TYPE_COLO_COMPARE "colo-compare" #define COLO_COMPARE(obj) \ OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE) +static QTAILQ_HEAD(, CompareState) net_compares = + QTAILQ_HEAD_INITIALIZER(net_compares); + #define COMPARE_READ_LEN_MAX NET_BUFSIZE #define MAX_QUEUE_SIZE 1024 @@ -41,6 +46,10 @@ /* TODO: Should be configurable */ #define REGULAR_PACKET_CHECK_MS 3000 +static QemuMutex event_mtx; +static QemuCond event_complete_cond; +static int event_unhandled_count; + /* * + CompareState ++ * | | @@ -87,6 +96,11 @@ typedef struct CompareState { IOThread *iothread; GMainContext *worker_context; QEMUTimer *packet_check_timer; + +QEMUBH *event_bh; +enum colo_event event; + +QTAILQ_ENTRY(CompareState) next; } CompareState; typedef struct CompareClass { @@ -736,6 +750,25 @@ static void check_old_packet_regular(void *opaque) REGULAR_PACKET_CHECK_MS); } +/* Public API, Used for COLO frame to notify compare event */ +void colo_notify_compares_event(void *opaque, int event, Error **errp) +{ +CompareState *s; + +qemu_mutex_lock(_mtx); +QTAILQ_FOREACH(s, _compares, next) { +s->event = event; +qemu_bh_schedule(s->event_bh); +event_unhandled_count++; +} +/* Wait all compare threads to finish handling this event */ +while (event_unhandled_count > 0) { +qemu_cond_wait(_complete_cond, _mtx); +} + +qemu_mutex_unlock(_mtx); +} + static void colo_compare_timer_init(CompareState *s) { AioContext *ctx = iothread_get_aio_context(s->iothread); @@ -756,6 +789,30 @@ static void colo_compare_timer_del(CompareState *s) } } +static void colo_flush_packets(void *opaque, void *user_data); + +static void colo_compare_handle_event(void *opaque) +{ +CompareState *s = opaque; + +switch (s->event) { +case COLO_EVENT_CHECKPOINT: +g_queue_foreach(>conn_list, colo_flush_packets, s); +break; +case COLO_EVENT_FAILOVER: +break; +default: +break; +} + +assert(event_unhandled_count > 0); + +qemu_mutex_lock(_mtx); +event_unhandled_count--; +qemu_cond_broadcast(_complete_cond); +qemu_mutex_unlock(_mtx); +} + static void colo_compare_iothread(CompareState *s) { object_ref(OBJECT(s->iothread)); @@ -769,6 +826,7 @@ static void colo_compare_iothread(CompareState *s) s, s->worker_context, true); colo_compare_timer_init(s); +s->event_bh = qemu_bh_new(colo_compare_handle_event, s); } static char *compare_get_pri_indev(Object *obj, Error **errp) @@ -926,8 +984,13 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp) net_socket_rs_init(>pri_rs, compare_pri_rs_finalize, s->vnet_hdr); net_socket_rs_init(>sec_rs, compare_sec_rs_finalize, s->vnet_hdr); +QTAILQ_INSERT_TAIL(_compares, s, next); + g_queue_init(>conn_list); +qemu_mutex_init(_mtx); +qemu_cond_init(_complete_cond); + s->connection_track_table = g_hash_table_new_full(connection_key_hash, connection_key_equal, g_free, @@ -990,6 +1053,7 @@ static void colo_compare_init(Object *obj) static void colo_compare_finalize(Object *obj) { CompareState *s = COLO_COMPARE(obj); +CompareState *tmp = NULL; qemu_chr_fe_deinit(>chr_pri_in, false); qemu_chr_fe_deinit(>chr_sec_in, false); @@ -997,6 +1061,16 @@ static void colo_compare_finalize(Object *obj) if (s->iothread) { colo_compare_timer_del(s); } + +qemu_bh_delete(s->event_bh); + +QTAILQ_FOREACH(tmp,
[Qemu-devel] [PATCH V12 01/19] filter-rewriter: Add TCP state machine and fix memory leak in connection_track_table
We add almost full TCP state machine in filter-rewriter, except TCPS_LISTEN and some simplify in VM active close FIN states. After a net connection is closed, we didn't clear its releated resources in connection_track_table, which will lead to memory leak. Let't track the state of net connection, if it is closed, its related resources will be cleared up. Signed-off-by: zhanghailiang Signed-off-by: Zhang Chen Signed-off-by: Zhang Chen --- net/colo.c| 2 +- net/colo.h| 9 ++-- net/filter-rewriter.c | 105 ++ 3 files changed, 100 insertions(+), 16 deletions(-) diff --git a/net/colo.c b/net/colo.c index 6dda4ed66e..97c8fc928f 100644 --- a/net/colo.c +++ b/net/colo.c @@ -137,7 +137,7 @@ Connection *connection_new(ConnectionKey *key) conn->ip_proto = key->ip_proto; conn->processing = false; conn->offset = 0; -conn->syn_flag = 0; +conn->tcp_state = TCPS_CLOSED; conn->pack = 0; conn->sack = 0; g_queue_init(>primary_list); diff --git a/net/colo.h b/net/colo.h index da6c36dcf7..0277e0e9ba 100644 --- a/net/colo.h +++ b/net/colo.h @@ -18,6 +18,7 @@ #include "slirp/slirp.h" #include "qemu/jhash.h" #include "qemu/timer.h" +#include "slirp/tcp.h" #define HASHTABLE_MAX_SIZE 16384 @@ -81,11 +82,9 @@ typedef struct Connection { uint32_t sack; /* offset = secondary_seq - primary_seq */ tcp_seq offset; -/* - * we use this flag update offset func - * run once in independent tcp connection - */ -int syn_flag; + +int tcp_state; /* TCP FSM state */ +tcp_seq fin_ack_seq; /* the seq of 'fin=1,ack=1' */ } Connection; uint32_t connection_key_hash(const void *opaque); diff --git a/net/filter-rewriter.c b/net/filter-rewriter.c index f584e4eba4..f18a71bf2e 100644 --- a/net/filter-rewriter.c +++ b/net/filter-rewriter.c @@ -59,9 +59,9 @@ static int is_tcp_packet(Packet *pkt) } /* handle tcp packet from primary guest */ -static int handle_primary_tcp_pkt(NetFilterState *nf, +static int handle_primary_tcp_pkt(RewriterState *rf, Connection *conn, - Packet *pkt) + Packet *pkt, ConnectionKey *key) { struct tcphdr *tcp_pkt; @@ -74,23 +74,28 @@ static int handle_primary_tcp_pkt(NetFilterState *nf, trace_colo_filter_rewriter_conn_offset(conn->offset); } +if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == (TH_ACK | TH_SYN)) && +conn->tcp_state == TCPS_SYN_SENT) { +conn->tcp_state = TCPS_ESTABLISHED; +} + if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_SYN)) { /* * we use this flag update offset func * run once in independent tcp connection */ -conn->syn_flag = 1; +conn->tcp_state = TCPS_SYN_RECEIVED; } if (((tcp_pkt->th_flags & (TH_ACK | TH_SYN)) == TH_ACK)) { -if (conn->syn_flag) { +if (conn->tcp_state == TCPS_SYN_RECEIVED) { /* * offset = secondary_seq - primary seq * ack packet sent by guest from primary node, * so we use th_ack - 1 get primary_seq */ conn->offset -= (ntohl(tcp_pkt->th_ack) - 1); -conn->syn_flag = 0; +conn->tcp_state = TCPS_ESTABLISHED; } if (conn->offset) { /* handle packets to the secondary from the primary */ @@ -99,15 +104,63 @@ static int handle_primary_tcp_pkt(NetFilterState *nf, net_checksum_calculate((uint8_t *)pkt->data + pkt->vnet_hdr_len, pkt->size - pkt->vnet_hdr_len); } +/* + * Case 1: + * Step 3: + * The *server* side of this connect is VM, *client* tries to close + * the connection. + * + * We got 'ack=1' packets from client side, it acks 'fin=1, ack=1' + * packet from server side. From this point, we can ensure that there + * will be no packets in the connection, except that, some errors + * happen between the path of 'filter object' and vNIC, if this rare + * case really happen, we can still create a new connection, + * So it is safe to remove the connection from connection_track_table. + * + */ +if ((conn->tcp_state == TCPS_LAST_ACK) && +(ntohl(tcp_pkt->th_ack) == (conn->fin_ack_seq + 1))) { +conn->tcp_state = TCPS_CLOSED; +g_hash_table_remove(rf->connection_track_table, key); +} +} + +if ((tcp_pkt->th_flags & TH_FIN) == TH_FIN) { +/* + * Case 1: + * Step 1: + * The *server* side of this connect is VM, *client* tries to close + * the connection. We will into CLOSE_WAIT status. + */ +if (conn->tcp_state == TCPS_ESTABLISHED) { +conn->tcp_state = TCPS_CLOSE_WAIT; +} + +
[Qemu-devel] [PATCH V12 00/19] COLO: integrate colo frame with block replication and COLO proxy
Hi~ All~ COLO Frame, block replication and COLO proxy(colo-compare,filter-mirror, filter-redirector,filter-rewriter) have been exist in qemu for long time, it's time to integrate these three parts to make COLO really works. In this series, we have some optimizations for COLO frame, including separating the process of saving ram and device state, using an COLO_EXIT event to notify users that VM exits COLO, for these parts, most of them have been reviewed long time ago in old version, but since this series have just rebased on upstream which had merged a new series of migration, parts of pathes in this series deserve review again. We use notifier/callback method for COLO compare to notify COLO frame about net packets inconsistent event, and add a handle_event method for NetFilterClass to help COLO frame to notify filters and colo-compare about checkpoint/failover event, it is flexible. For the neweset version, please refer to: https://github.com/zhangckid/qemu/tree/qemu-colo-18sep1 Please review, thanks. V12: - Rebased on upstream. - Removed the patch 15/20 for feature work as Jason's comments. - Fixed failover bugs in patch 16/19. - Renamed the dummy_handle_event to default_handle_event in patch 15/19. - Cleaned needless check job. V11: - Rebased on upstream. - Used "RAMBLOCK_FOREACH_MIGRATABLE()" to replace "QLIST_FOREACH_RCU()" in patch 08/20. - Fixed COLO related qapi command's since version in patch 10/20. V10: - Rebased on upstream. - Removed the "active" in COLOState. - Fixed some comments. V9: - Rebased on upstream codes. - Addressed Jason's comments add TCP state machine track in filter-rewriter. - Fix some bug in colo-compare. - Fix typo. - Add filter-rewriter failover handle. - Add net client type check in colo-compare. - Add COLO state diagram. - Addressed Markus and Daive's comments. V8: - Rebased on upstream codes. - Addressed Markus's comments in patch 10/17. - Addressed Markus's comments in patch 11/17. - Removed some comments in patch 4/17. - Moved the "migration_bitmap_clear_dirty()" to suitable position in patch 9/17. - Rewrote the patch 07/17 to address Davie's comments. - Moved the "qemu_savevm_live_state" out of the qemu_mutex_lock_iothread. - Fixed the bug that in some status COLO vm crash with segmentation fault. V7: - Addressed Markus's comments in 11/17. - Rebased on upstream. V6: - Addressed Eric Blake's comments, use the enum to feedback in patch 11/17. - Fixed QAPI command separator problem in patch 11/17. Zhang Chen (15): filter-rewriter: Add TCP state machine and fix memory leak in connection_track_table colo-compare: implement the process of checkpoint colo-compare: use notifier to notify packets comparing result COLO: integrate colo compare with colo frame COLO: Add block replication into colo process COLO: Remove colo_state migration struct COLO: Load dirty pages into SVM's RAM cache firstly ram/COLO: Record the dirty pages that SVM received COLO: Flush memory data from ram cache qapi/migration.json: Rename COLO unknown mode to none mode. qapi: Add new command to query colo status savevm: split the process of different stages for loadvm/savevm filter: Add handle_event method for NetFilterClass filter-rewriter: handle checkpoint and failover event docs: Add COLO status diagram to COLO-FT.txt zhanghailiang (4): qmp event: Add COLO_EXIT event to notify users while exited COLO COLO: flush host dirty ram from cache COLO: notify net filters about checkpoint/failover event COLO: quick failover process by kick COLO thread docs/COLO-FT.txt | 34 ++ include/exec/ram_addr.h | 1 + include/migration/colo.h | 11 +- include/net/filter.h | 5 + migration/Makefile.objs | 2 +- migration/colo-comm.c | 76 -- migration/colo-failover.c | 2 +- migration/colo.c | 212 -- migration/migration.c | 46 - migration/ram.c | 166 - migration/ram.h | 4 + migration/savevm.c| 53 -- migration/savevm.h| 5 + migration/trace-events| 3 + net/colo-compare.c| 115 +++-- net/colo-compare.h| 24 + net/colo.c| 10 +- net/colo.h| 11 +- net/filter-rewriter.c | 162 +++-- net/filter.c | 17 +++ net/net.c | 19 qapi/migration.json | 80 +- vl.c | 2 - 23 files changed, 921 insertions(+), 139 deletions(-) delete mode 100644 migration/colo-comm.c create mode 100644 net/colo-compare.h -- 2.17.GIT
[Qemu-devel] [PATCH v7 5/7] monitor: remove "x-oob", turn oob on by default
OOB commands were introduced in commit cf869d53172. Unfortunately, we ran into a regression, and had to disable them by default for 2.12 (commit be933ffc23). http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html The regression has since been fixed (commit 951702f39c7 "monitor: bind dispatch bh to iohandler context"). Time to re-enable OOB. This patch partly reverts be933ffc23 (monitor: new parameter "x-oob"), and turns it on again for non-MUX QMPs. Note that we can't enable Out-Of-Band for monitors with MUX-typed chardev backends, because not all the chardev frontends can run without main thread, or can run in multiple threads. Some trivial touch-up in the test code is required to make sure qmp-test won't break. Reviewed-by: Markus Armbruster Signed-off-by: Peter Xu --- include/monitor/monitor.h | 1 - monitor.c | 22 ++ tests/libqtest.c | 2 +- tests/qmp-test.c | 2 +- vl.c | 5 - 5 files changed, 8 insertions(+), 24 deletions(-) diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 2ef5e04b37..00ded7972c 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -13,7 +13,6 @@ extern __thread Monitor *cur_mon; #define MONITOR_USE_READLINE 0x02 #define MONITOR_USE_CONTROL 0x04 #define MONITOR_USE_PRETTY0x08 -#define MONITOR_USE_OOB 0x10 bool monitor_cur_is_qmp(void); diff --git a/monitor.c b/monitor.c index 9e6cad2af6..c2fd742f91 100644 --- a/monitor.c +++ b/monitor.c @@ -4529,19 +4529,12 @@ void monitor_init(Chardev *chr, int flags) { Monitor *mon = g_malloc(sizeof(*mon)); bool use_readline = flags & MONITOR_USE_READLINE; -bool use_oob = flags & MONITOR_USE_OOB; - -if (use_oob) { -if (CHARDEV_IS_MUX(chr)) { -error_report("Monitor out-of-band is not supported with " - "MUX typed chardev backend"); -exit(1); -} -if (use_readline) { -error_report("Monitor out-of-band is only supported by QMP"); -exit(1); -} -} +/* + * Note: we can't enable Out-Of-Band for monitors with MUX-typed + * chardev backends, because not all the chardev frontends can run + * without main thread, or can run in multiple threads. + */ +bool use_oob = (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr); monitor_data_init(mon, false, use_oob); @@ -4631,9 +4624,6 @@ QemuOptsList qemu_mon_opts = { },{ .name = "pretty", .type = QEMU_OPT_BOOL, -},{ -.name = "x-oob", -.type = QEMU_OPT_BOOL, }, { /* end of list */ } }, diff --git a/tests/libqtest.c b/tests/libqtest.c index d635c5bea0..ebd92f22f6 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -231,7 +231,7 @@ QTestState *qtest_init_without_qmp_handshake(bool use_oob, "-display none " "%s", qemu_binary, socket_path, getenv("QTEST_LOG") ? "/dev/fd/2" : "/dev/null", - qmp_socket_path, use_oob ? ",x-oob=on" : "", + qmp_socket_path, "", extra_args ?: ""); execlp("/bin/sh", "sh", "-c", command, NULL); exit(1); diff --git a/tests/qmp-test.c b/tests/qmp-test.c index 4ae2245484..5302bd07b9 100644 --- a/tests/qmp-test.c +++ b/tests/qmp-test.c @@ -143,7 +143,7 @@ static void test_qmp_protocol(void) g_assert(q); test_version(qdict_get(q, "version")); capabilities = qdict_get_qlist(q, "capabilities"); -g_assert(capabilities && qlist_empty(capabilities)); +g_assert(capabilities); qobject_unref(resp); /* Test valid command before handshake */ diff --git a/vl.c b/vl.c index 10dd690e30..0206f0c512 100644 --- a/vl.c +++ b/vl.c @@ -2323,11 +2323,6 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp) if (qemu_opt_get_bool(opts, "pretty", 0)) flags |= MONITOR_USE_PRETTY; -/* OOB is off by default */ -if (qemu_opt_get_bool(opts, "x-oob", 0)) { -flags |= MONITOR_USE_OOB; -} - chardev = qemu_opt_get(opts, "chardev"); chr = qemu_chr_find(chardev); if (chr == NULL) { -- 2.17.1
Re: [Qemu-devel] [PATCH 1/1] kvm: add call to qemu_add_opts() for -overcommit option
On 2018-08-15 19:57, prasad.singamse...@oracle.com wrote: > From: Prasad Singamsetty > > qemu command fails to process -overcommit option. Add the missing > call to qemu_add_opts() in vl.c. > > Signed-off-by: Prasad Singamsetty > --- > vl.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/vl.c b/vl.c > index 16b913f9d5..12d27fa028 100644 > --- a/vl.c > +++ b/vl.c > @@ -2987,6 +2987,7 @@ int main(int argc, char **argv, char **envp) > qemu_add_opts(_object_opts); > qemu_add_opts(_tpmdev_opts); > qemu_add_opts(_realtime_opts); > +qemu_add_opts(_overcommit_opts); > qemu_add_opts(_msg_opts); > qemu_add_opts(_name_opts); > qemu_add_opts(_numa_opts); > I think this should go into the stable release as well. Thomas
[Qemu-devel] [PATCH v7 4/7] qapi: remove COMMAND_DROPPED event
Now it was not used any more, drop it. We can still do that since out-of-band is still experimental, and this event is only used when out-of-band is enabled. Signed-off-by: Peter Xu --- docs/interop/qmp-spec.txt | 5 +++-- qapi/misc.json| 40 --- 2 files changed, 3 insertions(+), 42 deletions(-) diff --git a/docs/interop/qmp-spec.txt b/docs/interop/qmp-spec.txt index 8f7da0245d..67e44a8120 100644 --- a/docs/interop/qmp-spec.txt +++ b/docs/interop/qmp-spec.txt @@ -130,8 +130,9 @@ to pass "id" with out-of-band commands. Passing it with all commands is recommended for clients that accept capability "oob". If the client sends in-band commands faster than the server can -execute them, the server will eventually drop commands to limit the -queue length. The sever sends event COMMAND_DROPPED then. +execute them, the server will stop reading the requests from the QMP +channel until the request queue length is reduced to an acceptable +range. Only a few commands support out-of-band execution. The ones that do have "allow-oob": true in output of query-qmp-schema. diff --git a/qapi/misc.json b/qapi/misc.json index d450cfef21..2c1a6119bf 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -3432,46 +3432,6 @@ ## { 'command': 'query-sev-capabilities', 'returns': 'SevCapability' } -## -# @CommandDropReason: -# -# Reasons that caused one command to be dropped. -# -# @queue-full: the command queue is full. This can only occur when -# the client sends a new non-oob command before the -# response to the previous non-oob command has been -# received. -# -# Since: 2.12 -## -{ 'enum': 'CommandDropReason', - 'data': [ 'queue-full' ] } - -## -# @COMMAND_DROPPED: -# -# Emitted when a command is dropped due to some reason. Commands can -# only be dropped when the oob capability is enabled. -# -# @id: The dropped command's "id" field. -# FIXME Broken by design. Events are broadcast to all monitors. If -# another monitor's client has a command with the same ID in flight, -# the event will incorrectly claim that command was dropped. -# -# @reason: The reason why the command is dropped. -# -# Since: 2.12 -# -# Example: -# -# { "event": "COMMAND_DROPPED", -# "data": {"result": {"id": "libvirt-102", -# "reason": "queue-full" } } } -# -## -{ 'event': 'COMMAND_DROPPED' , - 'data': { 'id': 'any', 'reason': 'CommandDropReason' } } - ## # @set-numa-node: # -- 2.17.1
[Qemu-devel] [PATCH v7 6/7] Revert "tests: Add parameter to qtest_init_without_qmp_handshake"
This reverts commit ddee57e0176f6ab53b13c6c97605b62737a8fd7a. Meanwhile, revert one line from fa198ad9bdef to make sure qtest_init_without_qmp_handshake() will only pass in one parameter. Reviewed-by: Markus Armbruster Signed-off-by: Peter Xu --- tests/libqtest.c | 10 -- tests/libqtest.h | 4 +--- tests/qmp-test.c | 4 ++-- 3 files changed, 7 insertions(+), 11 deletions(-) diff --git a/tests/libqtest.c b/tests/libqtest.c index ebd92f22f6..3c594abbc2 100644 --- a/tests/libqtest.c +++ b/tests/libqtest.c @@ -191,8 +191,7 @@ static const char *qtest_qemu_binary(void) return qemu_bin; } -QTestState *qtest_init_without_qmp_handshake(bool use_oob, - const char *extra_args) +QTestState *qtest_init_without_qmp_handshake(const char *extra_args) { QTestState *s; int sock, qmpsock, i; @@ -225,13 +224,12 @@ QTestState *qtest_init_without_qmp_handshake(bool use_oob, command = g_strdup_printf("exec %s " "-qtest unix:%s,nowait " "-qtest-log %s " - "-chardev socket,path=%s,nowait,id=char0 " - "-mon chardev=char0,mode=control%s " + "-qmp unix:%s,nowait " "-machine accel=qtest " "-display none " "%s", qemu_binary, socket_path, getenv("QTEST_LOG") ? "/dev/fd/2" : "/dev/null", - qmp_socket_path, "", + qmp_socket_path, extra_args ?: ""); execlp("/bin/sh", "sh", "-c", command, NULL); exit(1); @@ -266,7 +264,7 @@ QTestState *qtest_init_without_qmp_handshake(bool use_oob, QTestState *qtest_init(const char *extra_args) { -QTestState *s = qtest_init_without_qmp_handshake(false, extra_args); +QTestState *s = qtest_init_without_qmp_handshake(extra_args); QDict *greeting; /* Read the QMP greeting and then do the handshake */ diff --git a/tests/libqtest.h b/tests/libqtest.h index 36d5caecd4..49ffc1ba9f 100644 --- a/tests/libqtest.h +++ b/tests/libqtest.h @@ -55,14 +55,12 @@ QTestState *qtest_init(const char *extra_args); /** * qtest_init_without_qmp_handshake: - * @use_oob: true to have the server advertise OOB support * @extra_args: other arguments to pass to QEMU. CAUTION: these * arguments are subject to word splitting and shell evaluation. * * Returns: #QTestState instance. */ -QTestState *qtest_init_without_qmp_handshake(bool use_oob, - const char *extra_args); +QTestState *qtest_init_without_qmp_handshake(const char *extra_args); /** * qtest_quit: diff --git a/tests/qmp-test.c b/tests/qmp-test.c index 5302bd07b9..91a90d1c9d 100644 --- a/tests/qmp-test.c +++ b/tests/qmp-test.c @@ -135,7 +135,7 @@ static void test_qmp_protocol(void) QList *capabilities; QTestState *qts; -qts = qtest_init_without_qmp_handshake(false, common_args); +qts = qtest_init_without_qmp_handshake(common_args); /* Test greeting */ resp = qtest_qmp_receive(qts); @@ -249,7 +249,7 @@ static void test_qmp_oob(void) QList *capabilities; QString *qstr; -qts = qtest_init_without_qmp_handshake(true, common_args); +qts = qtest_init_without_qmp_handshake(common_args); /* Check the greeting message. */ resp = qtest_qmp_receive(qts); -- 2.17.1
[Qemu-devel] [PATCH v7 2/7] qapi: Drop qapi_event_send_FOO()'s Error ** argument
The generated qapi_event_send_FOO() take an Error ** argument. They can't actually fail, because all they do with the argument is passing it to functions that can't fail: the QObject output visitor, and the @qmp_emit callback, which is either monitor_qapi_event_queue() or event_test_emit(). Drop the argument, and pass _abort to the QObject output visitor and @qmp_emit instead. Suggested-by: Eric Blake Suggested-by: Markus Armbruster Reviewed-by: Markus Armbruster Signed-off-by: Peter Xu --- block/block-backend.c| 8 +++- block/qcow2.c| 2 +- block/quorum.c | 4 ++-- block/write-threshold.c | 3 +-- blockjob.c | 13 + cpus.c | 8 docs/devel/qapi-code-gen.txt | 6 ++ dump.c | 3 +-- hw/acpi/core.c | 2 +- hw/acpi/cpu.c| 2 +- hw/acpi/memory_hotplug.c | 5 ++--- hw/char/virtio-console.c | 3 +-- hw/core/qdev.c | 3 +-- hw/net/virtio-net.c | 2 +- hw/ppc/spapr_rtc.c | 2 +- hw/timer/mc146818rtc.c | 2 +- hw/virtio/virtio-balloon.c | 3 +-- hw/watchdog/watchdog.c | 15 +++ include/qapi/qmp-event.h | 3 +-- job.c| 2 +- migration/migration.c| 4 ++-- migration/ram.c | 2 +- monitor.c| 5 ++--- scripts/qapi/events.py | 23 ++- scsi/pr-manager-helper.c | 3 +-- tests/test-qmp-event.c | 11 +-- ui/spice-core.c | 10 -- ui/vnc.c | 7 +++ vl.c | 16 +++- 29 files changed, 69 insertions(+), 103 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index fa120630be..14a1b7ac6a 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -980,8 +980,7 @@ void blk_dev_change_media_cb(BlockBackend *blk, bool load, Error **errp) if (tray_was_open != tray_is_open) { char *id = blk_get_attached_dev_id(blk); -qapi_event_send_device_tray_moved(blk_name(blk), id, tray_is_open, - _abort); +qapi_event_send_device_tray_moved(blk_name(blk), id, tray_is_open); g_free(id); } } @@ -1665,8 +1664,7 @@ static void send_qmp_error_event(BlockBackend *blk, qapi_event_send_block_io_error(blk_name(blk), !!bs, bs ? bdrv_get_node_name(bs) : NULL, optype, action, blk_iostatus_is_enabled(blk), - error == ENOSPC, strerror(error), - _abort); + error == ENOSPC, strerror(error)); } /* This is done by device models because, while the block layer knows @@ -1782,7 +1780,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag) * the frontend experienced a tray event. */ id = blk_get_attached_dev_id(blk); qapi_event_send_device_tray_moved(blk_name(blk), id, - eject_flag, _abort); + eject_flag); g_free(id); } diff --git a/block/qcow2.c b/block/qcow2.c index ec9e6238a0..c13153735a 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -4659,7 +4659,7 @@ void qcow2_signal_corruption(BlockDriverState *bs, bool fatal, int64_t offset, *node_name != '\0', node_name, message, offset >= 0, offset, size >= 0, size, - fatal, _abort); + fatal); g_free(message); if (fatal) { diff --git a/block/quorum.c b/block/quorum.c index 9152da8c58..eb526cc0f1 100644 --- a/block/quorum.c +++ b/block/quorum.c @@ -199,7 +199,7 @@ static void quorum_report_bad(QuorumOpType type, uint64_t offset, } qapi_event_send_quorum_report_bad(type, !!msg, msg, node_name, start_sector, - end_sector - start_sector, _abort); + end_sector - start_sector); } static void quorum_report_failure(QuorumAIOCB *acb) @@ -210,7 +210,7 @@ static void quorum_report_failure(QuorumAIOCB *acb) BDRV_SECTOR_SIZE); qapi_event_send_quorum_failure(reference, start_sector, - end_sector - start_sector, _abort); + end_sector - start_sector); } static int quorum_vote_error(QuorumAIOCB *acb); diff --git a/block/write-threshold.c b/block/write-threshold.c index 1d48fc2077..85b78dc2a9 100644 --- a/block/write-threshold.c +++ b/block/write-threshold.c @@ -63,8 +63,7 @@ static int coroutine_fn before_write_notify(NotifierWithReturn
[Qemu-devel] [PATCH v7 7/7] tests: add oob functional test for test-qmp-cmds
Straightforward test just to let the test-qmp-cmds be complete. Signed-off-by: Peter Xu --- tests/test-qmp-cmds.c | 16 1 file changed, 16 insertions(+) diff --git a/tests/test-qmp-cmds.c b/tests/test-qmp-cmds.c index ab414fa0c9..23c68c7944 100644 --- a/tests/test-qmp-cmds.c +++ b/tests/test-qmp-cmds.c @@ -122,6 +122,21 @@ static void test_dispatch_cmd(void) qobject_unref(req); } +static void test_dispatch_cmd_oob(void) +{ +QDict *req = qdict_new(); +QDict *resp; + +qdict_put_str(req, "exec-oob", "test-flags-command"); + +resp = qmp_dispatch(_commands, QOBJECT(req), true); +assert(resp != NULL); +assert(!qdict_haskey(resp, "error")); + +qobject_unref(resp); +qobject_unref(req); +} + /* test commands that return an error due to invalid parameters */ static void test_dispatch_cmd_failure(void) { @@ -287,6 +302,7 @@ int main(int argc, char **argv) g_test_init(, , NULL); g_test_add_func("/qmp/dispatch_cmd", test_dispatch_cmd); +g_test_add_func("/qmp/dispatch_cmd_oob", test_dispatch_cmd_oob); g_test_add_func("/qmp/dispatch_cmd_failure", test_dispatch_cmd_failure); g_test_add_func("/qmp/dispatch_cmd_io", test_dispatch_cmd_io); g_test_add_func("/qmp/dealloc_types", test_dealloc_types); -- 2.17.1
[Qemu-devel] [PATCH v7 3/7] monitor: suspend monitor instead of send CMD_DROP
When we received too many qmp commands, previously we'll send COMMAND_DROPPED events to monitors, then we'll drop the requests. Now instead of dropping the command we stop reading from input when we notice the queue is getting full. Note that here since we removed the need_resume flag we need to be _very_ careful on the suspend/resume paring on the conditions since unmatched condition checks will hang death the monitor. Meanwhile, now we will need to read the queue length to decide whether we'll need to resume the monitor so we need to take the queue lock again even after popping from it. Signed-off-by: Peter Xu --- monitor.c | 33 +++-- 1 file changed, 15 insertions(+), 18 deletions(-) diff --git a/monitor.c b/monitor.c index 3b90c9eb5f..9e6cad2af6 100644 --- a/monitor.c +++ b/monitor.c @@ -89,6 +89,8 @@ #include "hw/s390x/storage-attributes.h" #endif +#define QMP_REQ_QUEUE_LEN_MAX (8) + /* * Supported types: * @@ -4124,29 +4126,33 @@ static QMPRequest *monitor_qmp_requests_pop_any(void) static void monitor_qmp_bh_dispatcher(void *data) { QMPRequest *req_obj = monitor_qmp_requests_pop_any(); +Monitor *mon; QDict *rsp; bool need_resume; if (!req_obj) { return; } - +mon = req_obj->mon; /* qmp_oob_enabled() might change after "qmp_capabilities" */ -need_resume = !qmp_oob_enabled(req_obj->mon); +qemu_mutex_lock(>qmp.qmp_queue_lock); +need_resume = !qmp_oob_enabled(req_obj->mon) || +mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1; +qemu_mutex_unlock(>qmp.qmp_queue_lock); if (req_obj->req) { trace_monitor_qmp_cmd_in_band(qobject_get_try_str(req_obj->id) ?: ""); -monitor_qmp_dispatch(req_obj->mon, req_obj->req, req_obj->id); +monitor_qmp_dispatch(mon, req_obj->req, req_obj->id); } else { assert(req_obj->err); rsp = qmp_error_response(req_obj->err); req_obj->err = NULL; -monitor_qmp_respond(req_obj->mon, rsp, NULL); +monitor_qmp_respond(mon, rsp, NULL); qobject_unref(rsp); } if (need_resume) { /* Pairs with the monitor_suspend() in handle_qmp_command() */ -monitor_resume(req_obj->mon); +monitor_resume(mon); } qmp_request_free(req_obj); @@ -4154,8 +4160,6 @@ static void monitor_qmp_bh_dispatcher(void *data) qemu_bh_schedule(qmp_dispatcher_bh); } -#define QMP_REQ_QUEUE_LEN_MAX (8) - static void handle_qmp_command(void *opaque, QObject *req, Error *err) { Monitor *mon = opaque; @@ -4205,19 +4209,12 @@ static void handle_qmp_command(void *opaque, QObject *req, Error *err) if (!qmp_oob_enabled(mon)) { monitor_suspend(mon); } else { -/* Drop the request if queue is full. */ -if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) { -qemu_mutex_unlock(>qmp.qmp_queue_lock); +if (mon->qmp.qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1) { /* - * FIXME @id's scope is just @mon, and broadcasting it is - * wrong. If another monitor's client has a command with - * the same ID in flight, the event will incorrectly claim - * that command was dropped. + * If this is _exactly_ the last request that we can + * queue, we suspend the monitor right now. */ -qapi_event_send_command_dropped(id, -COMMAND_DROP_REASON_QUEUE_FULL); -qmp_request_free(req_obj); -return; +monitor_suspend(mon); } } -- 2.17.1
[Qemu-devel] [PATCH v7 1/7] qapi: Fix build_params() for empty parameter list
From: Markus Armbruster build_params() returns '' instead of 'void' when there are no parameters. Can't happen now, but the next commit will change that. Signed-off-by: Markus Armbruster [peterx: compose the patch from email replies] Signed-off-by: Peter Xu --- scripts/qapi/common.py | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/scripts/qapi/common.py b/scripts/qapi/common.py index 02c5c6767a..3b0d4bf9c0 100644 --- a/scripts/qapi/common.py +++ b/scripts/qapi/common.py @@ -2070,16 +2070,14 @@ extern const QEnumLookup %(c_name)s_lookup; return ret -def build_params(arg_type, boxed, extra): -if not arg_type: -assert not boxed -return extra +def build_params(arg_type, boxed, extra=None): ret = '' sep = '' if boxed: +assert arg_type ret += '%s arg' % arg_type.c_param_type() sep = ', ' -else: +elif arg_type: assert not arg_type.variants for memb in arg_type.members: ret += sep @@ -2090,7 +2088,7 @@ def build_params(arg_type, boxed, extra): c_name(memb.name)) if extra: ret += sep + extra -return ret +return ret if ret else 'void' # -- 2.17.1
[Qemu-devel] [PATCH v7 0/7] monitor: enable OOB by default
(this series is based on Markus's monitor-next tree so if patchew spits something out with "apply failure" then it's expected) v7: - use Markus's commit message for patch "qapi: Drop qapi_event_send_FOO()'s Error ** argument" [Markus] - update commit message for "qapi: remove COMMAND_DROPPED event" since QEMU 3.0 is released [Eric/Dave] - rebase to Markus's monitor-next tree: http://repo.or.cz/qemu/armbru.git/shortlog/refs/heads/monitor-next the patch "monitor: suspend monitor instead of send CMD_DROP" re-written since people prefer to drop need_resume flag so now I hand-made it. Dropped a few patches since not appliable any more. Please review. Thanks, Markus Armbruster (1): qapi: Fix build_params() for empty parameter list Peter Xu (6): qapi: Drop qapi_event_send_FOO()'s Error ** argument monitor: suspend monitor instead of send CMD_DROP qapi: remove COMMAND_DROPPED event monitor: remove "x-oob", turn oob on by default Revert "tests: Add parameter to qtest_init_without_qmp_handshake" tests: add oob functional test for test-qmp-cmds block/block-backend.c| 8 ++--- block/qcow2.c| 2 +- block/quorum.c | 4 +-- block/write-threshold.c | 3 +- blockjob.c | 13 cpus.c | 8 ++--- docs/devel/qapi-code-gen.txt | 6 ++-- docs/interop/qmp-spec.txt| 5 ++-- dump.c | 3 +- hw/acpi/core.c | 2 +- hw/acpi/cpu.c| 2 +- hw/acpi/memory_hotplug.c | 5 ++-- hw/char/virtio-console.c | 3 +- hw/core/qdev.c | 3 +- hw/net/virtio-net.c | 2 +- hw/ppc/spapr_rtc.c | 2 +- hw/timer/mc146818rtc.c | 2 +- hw/virtio/virtio-balloon.c | 3 +- hw/watchdog/watchdog.c | 15 +- include/monitor/monitor.h| 1 - include/qapi/qmp-event.h | 3 +- job.c| 2 +- migration/migration.c| 4 +-- migration/ram.c | 2 +- monitor.c| 58 ++-- qapi/misc.json | 40 - scripts/qapi/common.py | 10 +++ scripts/qapi/events.py | 23 -- scsi/pr-manager-helper.c | 3 +- tests/libqtest.c | 10 +++ tests/libqtest.h | 4 +-- tests/qmp-test.c | 6 ++-- tests/test-qmp-cmds.c| 16 ++ tests/test-qmp-event.c | 11 --- ui/spice-core.c | 10 +++ ui/vnc.c | 7 ++--- vl.c | 21 + 37 files changed, 120 insertions(+), 202 deletions(-) -- 2.17.1
Re: [Qemu-devel] [RFC 0/6] Virtio-net: Support RSS
On 2018年08月30日 22:27, Sameeh Jubran wrote: From: Sameeh Jubran This series implements the Steering Mode feature which was introduced on the virtio-dev list a while ago, which can be found here: * https://lists.oasis-open.org/archives/virtio-dev/201805/msg00024.html The first three patches add some infrastructure support that is used in the following three patches. The ebpf filter doesn't fully work yet as I'm having an issue with the verifier which needs to be fixed. What issues did you meet? You can attach a trace buffer and get verbose debug information from that. Btw, dpdk use cls bpf and we use socket filter. The patches still need some love as not all of the cases have been handled yet most of the functionality has been implemented. One question is how indirection table is implemented, I thought it should be a map but looks not. Please see comment on patch 6. Thanks Please share your thoughts and comments so I'll move forward with sending v1 along with a fully functioning ebpf code. Sameeh Jubran (6): Add bpf support to qemu tap: Add support for bpf ioctls vhost-net: Expose vhost_net_get_fd virtio-net: implement steering mode feature virtio-net: steering mode: Implement rss support virtio-net: rss: Add bpf filter MAINTAINERS |5 + configure | 44 + hw/net/rss_bpf_insns.h | 3992 +++ hw/net/rss_tap_bpf.h| 37 + hw/net/rss_tap_bpf_program.c| 172 ++ hw/net/vhost_net.c |2 +- hw/net/virtio-net.c | 250 +- include/hw/virtio/virtio-net.h |5 + include/net/net.h |3 +- include/standard-headers/linux/virtio_net.h | 55 + net/tap-bsd.c |5 + net/tap-linux.c | 29 +- net/tap-linux.h |3 +- net/tap-solaris.c |5 + net/tap-stub.c |5 + net/tap.c |8 + net/tap_int.h |1 + qapi/net.json | 11 + scripts/update-linux-headers.sh |8 +- 19 files changed, 4627 insertions(+), 13 deletions(-) create mode 100644 hw/net/rss_bpf_insns.h create mode 100644 hw/net/rss_tap_bpf.h create mode 100644 hw/net/rss_tap_bpf_program.c
Re: [Qemu-devel] [RFC 5/6] virtio-net: steering mode: Implement rss support
On 2018年08月30日 22:27, Sameeh Jubran wrote: From: Sameeh Jubran Signed-off-by: Sameeh Jubran --- hw/net/virtio-net.c | 122 1 file changed, 105 insertions(+), 17 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index e7c4ce6f66..4a52a6a1d0 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -972,41 +972,129 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_OK; } -static int virtio_net_ctrl_steering_mode(VirtIONet *n, uint8_t cmd, + +static int virtio_net_ctrl_sm_rss(VirtIONet *n, uint32_t cmd, struct iovec *iov, unsigned int iov_cnt, struct iovec *iov_in, unsigned int iov_cnt_in, -size_t *size_in) +size_t *size_in) +{ +size_t s; +uint32_t supported_hash_function = 0; + +switch (cmd) { +case VIRTIO_NET_SM_CTRL_RSS_GET_SUPPORTED_FUNCTIONS: +supported_hash_function |= RSS_HASH_FUNCTION_TOEPLITZ; +if (!size_in) { +return VIRTIO_NET_ERR; +} +s = iov_from_buf(iov_in, iov_cnt_in, 0, +_hash_function, +supported_hash_function); Indentation looks wrong. +if (s != sizeof(n->supported_modes) || +!size_in) { +return VIRTIO_NET_ERR; +} +*size_in = s; +break; +case VIRTIO_NET_SM_CTRL_RSS_SET: +if (!n->rss_conf) { +n->rss_conf = g_malloc0( +sizeof(struct virtio_net_rss_conf)); +} else if (iov == NULL || iov_cnt == 0) { +g_free(n->rss_conf->ptrs.hash_key); +g_free(n->rss_conf->ptrs.indirection_table); +g_free(n->rss_conf); +return VIRTIO_NET_OK; +} +s = iov_to_buf(iov, iov_cnt, 0, n->rss_conf, +sizeof(struct virtio_net_rss_conf) - +sizeof(struct virtio_net_rss_conf_ptrs)); + +if (s != sizeof(struct virtio_net_rss_conf) - +sizeof(struct virtio_net_rss_conf_ptrs)) { +return VIRTIO_NET_ERR; +} +n->rss_conf->ptrs.hash_key = g_malloc0(sizeof(uint8_t) * +n->rss_conf->hash_key_length); What happens if n->rss_conf != 0 && iov != NULL? Looks like a guest trigger-able OOM? Btw e.g "conf_ptrs" sounds misleading, why not just embed hash key and indirection table pointers directly in rss_conf structure itself? +s = iov_to_buf(iov, iov_cnt, 0, n->rss_conf->ptrs.hash_key, +sizeof(uint8_t) * n->rss_conf->hash_key_length); +if (s != sizeof(uint8_t) * n->rss_conf->hash_key_length) { +g_free(n->rss_conf->ptrs.hash_key); +return VIRTIO_NET_ERR; +} +n->rss_conf->ptrs.indirection_table += g_malloc0(sizeof(uint32_t) * +n->rss_conf->indirection_table_length); +s = iov_to_buf(iov, iov_cnt, 0, +n->rss_conf->ptrs.indirection_table, sizeof(uint32_t) * +n->rss_conf->indirection_table_length); +if (s != sizeof(uint32_t) * +n->rss_conf->indirection_table_length) { +g_free(n->rss_conf->ptrs.hash_key); +g_free(n->rss_conf->ptrs.indirection_table); +return VIRTIO_NET_ERR; +} +/* do bpf magic */ +break; +default: +return VIRTIO_NET_ERR; +} + +return VIRTIO_NET_OK; +} + +static int virtio_net_ctrl_steering_mode(VirtIONet *n, uint8_t cmd, +struct iovec *iov, unsigned int iov_cnt, +struct iovec *iov_in, unsigned int iov_in_cnt, +size_t *size_in) { size_t s; struct virtio_net_steering_mode sm; +int status = 0; +size_t size_in_cmd = 0; switch (cmd) { case VIRTIO_NET_CTRL_SM_GET_SUPPORTED_MODES: if (!size_in) { return VIRTIO_NET_ERR; } - s = iov_from_buf(iov_in, iov_cnt_in, 0, - >supported_modes, sizeof(n->supported_modes)); +n->supported_modes.steering_modes |= STEERING_MODE_RSS | +STEERING_MODE_AUTO; We should have a property for RSS instead of hard coding it here. Thanks +s = iov_from_buf(iov_in, iov_in_cnt, 0, +>supported_modes, +sizeof(n->supported_modes)); if (s != sizeof(n->supported_modes) || - !size_in) { +!size_in) { return VIRTIO_NET_ERR; } - *size_in = s; - break; +*size_in = s; + break; case VIRTIO_NET_CTRL_SM_CONTROL: -s = iov_to_buf(iov, iov_cnt, 0, , sizeof(sm) - -sizeof(union command_data)); -if (s != sizeof(sm) - sizeof(union command_data)) { +s = iov_to_buf(iov, iov_cnt, 0, , sizeof(sm)); +if (s != sizeof(sm)) { +return
Re: [Qemu-devel] [RFC 4/6] virtio-net: implement steering mode feature
On 2018年08月30日 22:27, Sameeh Jubran wrote: From: Sameeh Jubran Signed-off-by: Sameeh Jubran --- hw/net/virtio-net.c | 65 + include/hw/virtio/virtio-net.h | 3 ++ include/standard-headers/linux/virtio_net.h | 55 3 files changed, 116 insertions(+), 7 deletions(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 90502fca7c..e7c4ce6f66 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -972,13 +972,53 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_OK; } +static int virtio_net_ctrl_steering_mode(VirtIONet *n, uint8_t cmd, +struct iovec *iov, unsigned int iov_cnt, +struct iovec *iov_in, unsigned int iov_cnt_in, +size_t *size_in) +{ +size_t s; +struct virtio_net_steering_mode sm; + +switch (cmd) { +case VIRTIO_NET_CTRL_SM_GET_SUPPORTED_MODES: +if (!size_in) { +return VIRTIO_NET_ERR; +} + s = iov_from_buf(iov_in, iov_cnt_in, 0, + >supported_modes, sizeof(n->supported_modes)); +if (s != sizeof(n->supported_modes) || + !size_in) { size_in has been checked in the above I think? +return VIRTIO_NET_ERR; +} + *size_in = s; + break; +case VIRTIO_NET_CTRL_SM_CONTROL: +s = iov_to_buf(iov, iov_cnt, 0, , sizeof(sm) - +sizeof(union command_data)); +if (s != sizeof(sm) - sizeof(union command_data)) { +return VIRTIO_NET_ERR; +} +/* switch (cmd) + { +dafault: +return VIRTIO_NET_ERR; + } */ + break; +default: +return VIRTIO_NET_ERR; +} + +return VIRTIO_NET_OK; +} + static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq) { VirtIONet *n = VIRTIO_NET(vdev); struct virtio_net_ctrl_hdr ctrl; virtio_net_ctrl_ack status = VIRTIO_NET_ERR; VirtQueueElement *elem; -size_t s; +size_t s, elem_in_size = 0; struct iovec *iov, *iov2; unsigned int iov_cnt; @@ -996,7 +1036,8 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq) } iov_cnt = elem->out_num; -iov2 = iov = g_memdup(elem->out_sg, sizeof(struct iovec) * elem->out_num); +iov2 = iov = g_memdup(elem->out_sg, sizeof(struct iovec) * +elem->out_num); Still looks unnecessary. s = iov_to_buf(iov, iov_cnt, 0, , sizeof(ctrl)); iov_discard_front(, _cnt, sizeof(ctrl)); if (s != sizeof(ctrl)) { @@ -1013,12 +1054,20 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq) status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt); } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) { status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt); +} else if (ctrl.class == VIRTIO_NET_CTRL_STEERING_MODE) { +size_t size_in = 0; +status = virtio_net_ctrl_steering_mode(n, ctrl.cmd, iov, iov_cnt, + elem->in_sg, elem->in_num, _in); +if (status == VIRTIO_NET_OK && size_in > 0) { +elem_in_size += size_in; +} } -s = iov_from_buf(elem->in_sg, elem->in_num, 0, , sizeof(status)); +s = iov_from_buf(elem->in_sg, elem->in_num, elem_in_size, , +sizeof(status)); assert(s == sizeof(status)); - -virtqueue_push(vq, elem, sizeof(status)); +elem_in_size += s; +virtqueue_push(vq, elem, elem_in_size); virtio_notify(vdev, vq); g_free(iov2); g_free(elem); @@ -1375,10 +1424,10 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q) n->guest_hdr_len, -1); if (out_num == VIRTQUEUE_MAX_SIZE) { goto drop; - } +} out_num += 1; out_sg = sg2; - } + } Let's avoid mixing such possible style fixes. } /* * If host wants to see the guest header as is, we can @@ -1957,6 +2006,8 @@ static void virtio_net_device_realize(DeviceState *dev, Error **errp) n->host_features |= (1ULL << VIRTIO_NET_F_MTU); } +n->host_features |= (1ULL << VIRTIO_NET_F_CTRL_STEERING_MODE); I think we need a property for virtio-net and compat this for old machine types. + if (n->net_conf.duplex_str) { if (strncmp(n->net_conf.duplex_str, "half", 5) == 0) { n->net_conf.duplex = DUPLEX_HALF; diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h index a7b53edc96..809e85481c 100644 --- a/include/hw/virtio/virtio-net.h +++ b/include/hw/virtio/virtio-net.h @@ -102,6
Re: [Qemu-devel] [RFC 3/6] vhost-net: Expose vhost_net_get_fd
On 2018年08月30日 22:27, Sameeh Jubran wrote: From: Sameeh Jubran Signed-off-by: Sameeh Jubran Better explain the motivation in the commit log. Thanks --- hw/net/vhost_net.c | 2 +- include/hw/virtio/virtio-net.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c index e037db63a3..c0bff725c9 100644 --- a/hw/net/vhost_net.c +++ b/hw/net/vhost_net.c @@ -129,7 +129,7 @@ uint64_t vhost_net_get_acked_features(VHostNetState *net) return net->dev.acked_features; } -static int vhost_net_get_fd(NetClientState *backend) +int vhost_net_get_fd(NetClientState *backend) { switch (backend->info->type) { case NET_CLIENT_DRIVER_TAP: diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h index 02484dc94c..a7b53edc96 100644 --- a/include/hw/virtio/virtio-net.h +++ b/include/hw/virtio/virtio-net.h @@ -107,4 +107,6 @@ typedef struct VirtIONet { void virtio_net_set_netclient_name(VirtIONet *n, const char *name, const char *type); +int vhost_net_get_fd(NetClientState *backend); + #endif
Re: [Qemu-devel] [RFC 2/6] tap: Add support for bpf ioctls
On 2018年08月30日 22:27, Sameeh Jubran wrote: From: Sameeh Jubran Starting from kernel v4.16 tun device supports TUNSETSTEERINGEBPF and TUNSETFILTEREBPF. Signed-off-by: Sameeh Jubran --- include/net/net.h | 3 ++- net/tap-bsd.c | 5 + net/tap-linux.c | 29 - net/tap-linux.h | 3 ++- net/tap-solaris.c | 5 + net/tap-stub.c| 5 + net/tap.c | 8 net/tap_int.h | 1 + qapi/net.json | 11 +++ 9 files changed, 67 insertions(+), 3 deletions(-) diff --git a/include/net/net.h b/include/net/net.h index 1425960f76..e7d1baac10 100644 --- a/include/net/net.h +++ b/include/net/net.h @@ -39,7 +39,6 @@ typedef struct NICConf { DEFINE_PROP_MACADDR("mac", _state, _conf.macaddr),\ DEFINE_PROP_NETDEV("netdev", _state, _conf.peers) - Looks unnecessary. /* Net clients */ typedef void (NetPoll)(NetClientState *, bool enable); @@ -60,6 +59,7 @@ typedef int (SetVnetLE)(NetClientState *, bool); typedef int (SetVnetBE)(NetClientState *, bool); typedef struct SocketReadState SocketReadState; typedef void (SocketReadStateFinalize)(SocketReadState *rs); +typedef int (SetBPFFilter)(NetClientState *, int, BPFType); Looks like SetBPFProg is better? Anyway steering prog is not a filter. typedef struct NetClientInfo { NetClientDriver type; @@ -80,6 +80,7 @@ typedef struct NetClientInfo { SetVnetHdrLen *set_vnet_hdr_len; SetVnetLE *set_vnet_le; SetVnetBE *set_vnet_be; +SetBPFFilter *set_bpf_filter; } NetClientInfo; struct NetClientState { diff --git a/net/tap-bsd.c b/net/tap-bsd.c index 6c9692263d..fccf17bad0 100644 --- a/net/tap-bsd.c +++ b/net/tap-bsd.c @@ -259,3 +259,8 @@ int tap_fd_get_ifname(int fd, char *ifname) { return -1; } + +int tap_fd_load_bpf(int fd, int bpf_fd, BPFType type) +{ +return -1; +} diff --git a/net/tap-linux.c b/net/tap-linux.c index 535b1ddb61..e8ee54f3b3 100644 --- a/net/tap-linux.c +++ b/net/tap-linux.c @@ -305,7 +305,8 @@ int tap_fd_get_ifname(int fd, char *ifname) { struct ifreq ifr; -if (ioctl(fd, TUNGETIFF, ) != 0) { +if (ioctl(fd, TUNGETIFF, ) != 0) +{ This looks unnecessary. error_report("TUNGETIFF ioctl() failed: %s", strerror(errno)); return -1; @@ -314,3 +315,29 @@ int tap_fd_get_ifname(int fd, char *ifname) pstrcpy(ifname, sizeof(ifr.ifr_name), ifr.ifr_name); return 0; } + + +int tap_fd_load_bpf(int fd, int bpf_fd, BPFType type) +{ +int ioctl_num = 0; +switch (type) +{ +case BPF_TYPE_FILTER: +ioctl_num = TUNSETFILTEREBPF; +break; + +case BPF_TYPE_STEERING: +ioctl_num = TUNSETSTEERINGEBPF; +break; + +default: +error_report("Unknown bpf_type"); +return -1; +} Indentation looks odd. + +if (ioctl(fd, ioctl_num, _fd) != 0) { +error_report("#%d ioctl() failed: %s", ioctl_num, strerror(errno)); +return -1; +} +return 0; +} diff --git a/net/tap-linux.h b/net/tap-linux.h index 2f36d100fc..7348169fc2 100644 --- a/net/tap-linux.h +++ b/net/tap-linux.h @@ -31,7 +31,8 @@ #define TUNSETQUEUE _IOW('T', 217, int) #define TUNSETVNETLE _IOW('T', 220, int) #define TUNSETVNETBE _IOW('T', 222, int) - +#define TUNSETSTEERINGEBPF _IOR('T', 224, int) +#define TUNSETFILTEREBPF _IOR('T', 225, int) #endif /* TUNSETIFF ifr flags */ diff --git a/net/tap-solaris.c b/net/tap-solaris.c index a2a92356c1..a5a6248c7d 100644 --- a/net/tap-solaris.c +++ b/net/tap-solaris.c @@ -254,3 +254,8 @@ int tap_fd_get_ifname(int fd, char *ifname) { return -1; } + +int tap_fd_load_bpf(int fd, int bpf_fd, BPFType type) +{ +return -1; +} diff --git a/net/tap-stub.c b/net/tap-stub.c index a9ab8f8293..d059a32435 100644 --- a/net/tap-stub.c +++ b/net/tap-stub.c @@ -85,3 +85,8 @@ int tap_fd_get_ifname(int fd, char *ifname) { return -1; } + +int tap_fd_load_bpf(int fd, int bpf_fd, BPFType type) +{ +return -1; +} diff --git a/net/tap.c b/net/tap.c index 2126f4882d..ee98fecd40 100644 --- a/net/tap.c +++ b/net/tap.c @@ -342,6 +342,13 @@ int tap_get_fd(NetClientState *nc) return s->fd; } +static int tap_set_bpf_filter(NetClientState *nc, int bpf_fd, BPFType type) +{ +TAPState *s = DO_UPCAST(TAPState, nc, nc); +assert(nc->info->type == NET_CLIENT_DRIVER_TAP); +return tap_fd_load_bpf(s->fd, bpf_fd, type); +} + /* fd support */ static NetClientInfo net_tap_info = { @@ -360,6 +367,7 @@ static NetClientInfo net_tap_info = { .set_vnet_hdr_len = tap_set_vnet_hdr_len, .set_vnet_le = tap_set_vnet_le, .set_vnet_be = tap_set_vnet_be, +.set_bpf_filter = tap_set_bpf_filter, }; static TAPState *net_tap_fd_init(NetClientState *peer, diff --git a/net/tap_int.h b/net/tap_int.h index 9f931d52d6..3e1603a88e 100644 --- a/net/tap_int.h +++ b/net/tap_int.h @@ -45,5
Re: [Qemu-devel] [PATCH] Fix a deadlock case in the CPU hotplug flow
On Sun, Sep 02, 2018 at 11:19:04AM -0300, Jose Ricardo Ziviani wrote: > We need to set cs->halted to 1 before calling ppc_set_compat. The reason > is that ppc_set_compat kicks up the new thread created to manage the > hotplugged KVM virtual CPU and the code drives directly to KVM_RUN > ioctl. When cs->halted is 1, the code: > > int kvm_cpu_exec(CPUState *cpu) > ... > if (kvm_arch_process_async_events(cpu)) { > atomic_set(>exit_request, 0); > return EXCP_HLT; > } > ... > > returns before it reaches KVM_RUN, giving time to the main thread to > finish its job. Otherwise we can fall in a deadlock because the KVM > thread will issue the KVM_RUN ioctl while the main thread is setting up > KVM registers. Depending on how these jobs are scheduled we'll end up > freezing QEMU. > > The following output shows kvm_vcpu_ioctl sleeping because it cannot get > the mutex and never will. > PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr. > > STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL > > PID: 61564 TASK: c03e981e0780 CPU: 48 COMMAND: "qemu-system-ppc" > #0 [c03e982679a0] __schedule at c0b10a44 > #1 [c03e98267a60] schedule at c0b113a8 > #2 [c03e98267a90] schedule_preempt_disabled at c0b11910 > #3 [c03e98267ab0] __mutex_lock at c0b132ec > #4 [c03e98267bc0] kvm_vcpu_ioctl at c0080ea03140 [kvm] > #5 [c03e98267d20] do_vfs_ioctl at c0407d30 > #6 [c03e98267dc0] ksys_ioctl at c0408674 > #7 [c03e98267e10] sys_ioctl at c04086f8 > #8 [c03e98267e30] system_call at c000b488 > > crash> struct -x kvm.vcpus 0xc03da000 > vcpus = {0xc03db488, 0xc03d52b8, 0xc039e9c8, > 0xc03d0e20, 0xc03d5828, 0x0, 0x0, ...} > > crash> struct -x kvm_vcpu.mutex.owner 0xc03d5828 > mutex.owner = { > counter = 0xc03a23a5c881 <- flag 1: waiters > }, > > crash> bt 0xc03a23a5c880 > PID: 61579 TASK: c03a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM" > (active) > > crash> struct -x kvm_vcpu.mutex.wait_list 0xc03d5828 > mutex.wait_list = { > next = 0xc03e98267b10, > prev = 0xc03e98267b10 > }, > > crash> struct -x mutex_waiter.task 0xc03e98267b10 > task = 0xc03e981e0780 > > The following command-line was used to reproduce the problem (note: gdb > and trace can change the results). > > $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \ > -enable-kvm -m 4096 \ > -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \ > -display none -nographic \ > -drive file=disk1.qcow2,format=qcow2 > ... > (qemu) device_add host-spapr-cpu-core,core-id=4 > [no interaction is possible after it, only SIGKILL to take the terminal > back] > > Signed-off-by: Jose Ricardo Ziviani Applied, thanks. > --- > hw/ppc/spapr_cpu_core.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c > index 876f0b3d9d..a73b244a3f 100644 > --- a/hw/ppc/spapr_cpu_core.c > +++ b/hw/ppc/spapr_cpu_core.c > @@ -34,16 +34,16 @@ static void spapr_cpu_reset(void *opaque) > > cpu_reset(cs); > > -/* Set compatibility mode to match the boot CPU, which was either set > - * by the machine reset code or by CAS. This should never fail. > - */ > -ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); > - > /* All CPUs start halted. CPU0 is unhalted from the machine level > * reset code and the rest are explicitly started up by the guest > * using an RTAS call */ > cs->halted = 1; > > +/* Set compatibility mode to match the boot CPU, which was either set > + * by the machine reset code or by CAS. This should never fail. > + */ > +ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); > + > env->spr[SPR_HIOR] = 0; > > lpcr = env->spr[SPR_LPCR]; -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH 0/4] Reformatted INFO files to fit the Markdown (.md) format.
Please ignore this message as a V2 was already sent. Thanks, Yoni. On 9/2/18 3:36 PM, Yoni Bettan wrote: This patch make INFO files (e.g. README, CODING_STYLE etc) much more readable when watched from GitHub GUI. ./VERSION and ./MAINTAINERS were leave untouched otherwise ./scripts/get_maintainer.pl breaks. ./COPYING and ./COPYING.LIB were leave untouched because it look likes they come from GNU and not QEMU. Visualization of the change can be found at https://github.com/ybettan/qemu/tree/md. NOTE: This last part of the message should be removed from the commit message since I will remove the 'md' brance from my fork if the patch is accepted. Yoni Bettan (4): README.md : Reformatted to fit the Markdown (.md) format. CODING_STYLE.md : Reformatted to fit the Markdown (.md) format. HACKING.md : Reformatted to fit the Markdown (.md) format. LICENSE.md : Reformatted to fit the Markdown (.md) format. CODING_STYLE => CODING_STYLE.md | 151 ++ HACKING => HACKING.md | 186 +--- LICENSE => LICENSE.md | 4 +- README => README.md | 87 +++ 4 files changed, 224 insertions(+), 204 deletions(-) rename CODING_STYLE => CODING_STYLE.md (50%) rename HACKING => HACKING.md (53%) rename LICENSE => LICENSE.md (90%) rename README => README.md (68%)
Re: [Qemu-devel] [PATCH 0/4] Reformatted INFO files to fit the Markdown (.md) format.
On 9/2/18 8:13 PM, Peter Maydell wrote: On 2 September 2018 at 13:36, Yoni Bettan wrote: This patch make INFO files (e.g. README, CODING_STYLE etc) much more readable when watched from GitHub GUI. ./VERSION and ./MAINTAINERS were leave untouched otherwise ./scripts/get_maintainer.pl breaks. ./COPYING and ./COPYING.LIB were leave untouched because it look likes they come from GNU and not QEMU. Thanks for this patch, but we don't use Markdown format. We use RST for docs files, but README, CODING_STYLE, etc are just plain text and fine as they are I think. Thanks for your reply. I sent a V2 patch before I have noticed your response so if you could please copy your response to the V2 mail I will appreciate it. Thanks, Yoni -- PMM
[Qemu-devel] [PATCH V2 3/3] HACKING.md : Formatted to fit Markdown (.md) format.
Signed-off-by: Yoni Bettan --- HACKING => HACKING.md | 186 ++ 1 file changed, 97 insertions(+), 89 deletions(-) rename HACKING => HACKING.md (53%) diff --git a/HACKING b/HACKING.md similarity index 53% rename from HACKING rename to HACKING.md index 0fc3e0fc04..f9d7631e64 100644 --- a/HACKING +++ b/HACKING.md @@ -1,86 +1,90 @@ -1. Preprocessor +## Preprocessor -1.1. Variadic macros +### Variadic macros For variadic macros, stick with this C99-like syntax: +``` #define DPRINTF(fmt, ...) \ do { printf("IRQ: " fmt, ## __VA_ARGS__); } while (0) +``` -1.2. Include directives +### Include directives Order include directives as follows: +``` #include "qemu/osdep.h" /* Always first... */ #include <...> /* then system headers... */ #include "..." /* and finally QEMU headers. */ +``` -The "qemu/osdep.h" header contains preprocessor macros that affect the behavior -of core system headers like . It must be the first include so that +The `qemu/osdep.h` header contains preprocessor macros that affect the behavior +of core system headers like ``. It must be the first include so that core system headers included by external libraries get the preprocessor macros that QEMU depends on. -Do not include "qemu/osdep.h" from header files since the .c file will have +Do not include `qemu/osdep.h` from header files since the .c file will have already included it. -2. C types +## C types It should be common sense to use the right type, but we have collected a few useful guidelines here. -2.1. Scalars +### Scalars -If you're using "int" or "long", odds are good that there's a better type. -If a variable is counting something, it should be declared with an -unsigned type. +If you're using `int` or `long`, odds are good that there's a better type. +If a variable is counting something, it should be declared with +`unsigned `. -If it's host memory-size related, size_t should be a good choice (use -ssize_t only if required). Guest RAM memory offsets must use ram_addr_t, +If it's host memory-size related, `size_t` should be a good choice (use +`ssize_t` only if required). Guest RAM memory offsets must use `ram_addr_t`, but only for RAM, it may not cover whole guest address space. -If it's file-size related, use off_t. -If it's file-offset related (i.e., signed), use off_t. -If it's just counting small numbers use "unsigned int"; +If it's file-size related, use `off_t`. +If it's file-offset related (i.e., signed), `use off_t`. +If it's just counting small numbers use `unsigned int`; (on all but oddball embedded systems, you can assume that that type is at least four bytes wide). In the event that you require a specific width, use a standard type -like int32_t, uint32_t, uint64_t, etc. The specific types are +like `int32_t`, `uint32_t`, `uint64_t`, etc. The specific types are mandatory for VMState fields. -Don't use Linux kernel internal types like u32, __u32 or __le32. +Don't use Linux kernel internal types like `u32`, `__u32` or `__le32`. -Use hwaddr for guest physical addresses except pcibus_t -for PCI addresses. In addition, ram_addr_t is a QEMU internal address +Use `hwaddr` for guest physical addresses except `pcibus_t` +for PCI addresses. In addition, `ram_addr_t` is a QEMU internal address space that maps guest RAM physical addresses into an intermediate address space that can map to host virtual address spaces. Generally -speaking, the size of guest memory can always fit into ram_addr_t but +speaking, the size of guest memory can always fit into `ram_addr_t` but it would not be correct to store an actual guest physical address in a -ram_addr_t. +`ram_addr_t`. For CPU virtual addresses there are several possible types. -vaddr is the best type to use to hold a CPU virtual address in +`vaddr` is the best type to use to hold a CPU virtual address in target-independent code. It is guaranteed to be large enough to hold a virtual address for any target, and it does not change size from target to target. It is always unsigned. -target_ulong is a type the size of a virtual address on the CPU; this means +`target_ulong` is a type the size of a virtual address on the CPU; this means it may be 32 or 64 bits depending on which target is being built. It should therefore be used only in target-specific code, and in some performance-critical built-per-target core code such as the TLB code. -There is also a signed version, target_long. -abi_ulong is for the *-user targets, and represents a type the size of -'void *' in that target's ABI. (This may not be the same as the size of a +There is also a signed version, `target_long`. +`abi_ulong` is for the `*-user` targets, and represents a type the size of +`void *` in that target's ABI. (This may not be the same as the size of a full CPU virtual address in the case of target ABIs which use 32 bit pointers -on 64 bit CPUs, like
[Qemu-devel] [PATCH V2 2/3] CODING_STYLE.md : Formatted to fit Markdown (.md) format.
Signed-off-by: Yoni Bettan --- CODING_STYLE => CODING_STYLE.md | 151 +--- 1 file changed, 80 insertions(+), 71 deletions(-) rename CODING_STYLE => CODING_STYLE.md (50%) diff --git a/CODING_STYLE b/CODING_STYLE.md similarity index 50% rename from CODING_STYLE rename to CODING_STYLE.md index ec075dedc4..776868458d 100644 --- a/CODING_STYLE +++ b/CODING_STYLE.md @@ -1,10 +1,9 @@ -QEMU Coding Style -= +# QEMU Coding Style Please use the script checkpatch.pl in the scripts directory to check patches before submitting. -1. Whitespace +## Whitespace Of course, the most important aspect in any coding style is whitespace. Crusty old coders who have trouble spotting the glasses on their noses @@ -16,20 +15,20 @@ QEMU indents are four spaces. Tabs are never used, except in Makefiles where they have been irreversibly coded into the syntax. Spaces of course are superior to tabs because: - - You have just one way to specify whitespace, not two. Ambiguity breeds +* You have just one way to specify whitespace, not two. Ambiguity breeds mistakes. - - The confusion surrounding 'use tabs to indent, spaces to justify' is gone. - - Tab indents push your code to the right, making your screen seriously +* The confusion surrounding 'use tabs to indent, spaces to justify' is gone. +* Tab indents push your code to the right, making your screen seriously unbalanced. - - Tabs will be rendered incorrectly on editors who are misconfigured not +* Tabs will be rendered incorrectly on editors who are misconfigured not to use tab stops of eight positions. - - Tabs are rendered badly in patches, causing off-by-one errors in almost +* Tabs are rendered badly in patches, causing off-by-one errors in almost every line. - - It is the QEMU coding style. +* It is the QEMU coding style. Do not leave whitespace dangling off the ends of lines. -2. Line width +## Line width Lines should be 80 characters; try not to make them longer. @@ -38,28 +37,28 @@ that use long function or symbol names. Even in that case, do not make lines much longer than 80 characters. Rationale: - - Some people like to tile their 24" screens with a 6x4 matrix of 80x24 - xterms and use vi in all of them. The best way to punish them is to - let them keep doing it. - - Code and especially patches is much more readable if limited to a sane - line length. Eighty is traditional. - - The four-space indentation makes the most common excuse ("But look - at all that white space on the left!") moot. - - It is the QEMU coding style. +* Some people like to tile their 24" screens with a 6x4 matrix of 80x24 + xterms and use vi in all of them. The best way to punish them is to + let them keep doing it. +* Code and especially patches is much more readable if limited to a sane + line length. Eighty is traditional. +* The four-space indentation makes the most common excuse ("But look + at all that white space on the left!") moot. +* It is the QEMU coding style. -3. Naming +## Naming -Variables are lower_case_with_underscores; easy to type and read. Structured -type names are in CamelCase; harder to type but standing out. Enum type -names and function type names should also be in CamelCase. Scalar type -names are lower_case_with_underscores_ending_with_a_t, like the POSIX -uint64_t and family. Note that this last convention contradicts POSIX +Variables are `lower_case_with_underscores`; easy to type and read. Structured +type names are in `CamelCase`; harder to type but standing out. Enum type +names and function type names should also be in `CamelCase`. Scalar type +names are `lower_case_with_underscores_ending_with_a_t`, like the POSIX +`uint64_t` and family. Note that this last convention contradicts POSIX and is therefore likely to be changed. -When wrapping standard library functions, use the prefix qemu_ to alert +When wrapping standard library functions, use the prefix `qemu_` to alert readers that they are seeing a wrapped version; otherwise avoid this prefix. -4. Block structure +## Block structure Every indented statement is braced; even if the block contains just one statement. The opening brace is on the line that contains the control @@ -67,69 +66,79 @@ flow statement that introduces the new block; the closing brace is on the same line as the else keyword, or on a line by itself if there is no else keyword. Example: -if (a == 5) { -printf("a was 5.\n"); -} else if (a == 6) { -printf("a was 6.\n"); -} else { -printf("a was something else entirely.\n"); -} +``` +if (a == 5) { +printf("a was 5.\n"); +} else if (a == 6) { +printf("a was 6.\n"); +} else { +printf("a was something else entirely.\n"); +} +``` -Note that 'else if' is considered a single statement; otherwise a long if/ -else if/else if/.../else sequence would need an indent for every else +Note that `else if` is considered a
[Qemu-devel] [PATCH V2 1/3] README.md : Formatted to fit Markdown (.md) format.
Also updated scripts/checkpatch.pl and made it reference to README.md instead of README. Signed-off-by: Yoni Bettan --- README => README.md | 89 ++- scripts/checkpatch.pl | 2 +- 2 files changed, 46 insertions(+), 45 deletions(-) rename README => README.md (67%) diff --git a/README b/README.md similarity index 67% rename from README rename to README.md index 49a9fd09cd..5fc06dcf8a 100644 --- a/README +++ b/README.md @@ -1,5 +1,4 @@ - QEMU README - === +# Qemu QEMU is a generic and open source machine & userspace emulator and virtualizer. @@ -27,89 +26,93 @@ It is commonly invoked indirectly via the libvirt library when using open source applications such as oVirt, OpenStack and virt-manager. QEMU as a whole is released under the GNU General Public License, -version 2. For full licensing details, consult the LICENSE file. +version 2. For full licensing details, consult the [LICENSE](LICENSE) file. -Building - +## Building QEMU is multi-platform software intended to be buildable on all modern Linux platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other UNIX targets. The simple steps to build QEMU are: - mkdir build - cd build - ../configure - make +``` +mkdir build +cd build +../configure +make +``` Additional information can also be found online via the QEMU website: - https://qemu.org/Hosts/Linux - https://qemu.org/Hosts/Mac - https://qemu.org/Hosts/W32 +* https://qemu.org/Hosts/Linux +* https://qemu.org/Hosts/Mac +* https://qemu.org/Hosts/W32 -Submitting patches -== +## Submitting patches The QEMU source code is maintained under the GIT version control system. - git clone git://git.qemu.org/qemu.git +`git clone git://git.qemu.org/qemu.git` -When submitting patches, one common approach is to use 'git -format-patch' and/or 'git send-email' to format & send the mail to the -qemu-devel@nongnu.org mailing list. All patches submitted must contain -a 'Signed-off-by' line from the author. Patches should follow the -guidelines set out in the HACKING and CODING_STYLE files. +When submitting patches, one common approach is to use `git format-patch` +and/or `git send-email` to format & send the mail to the +[qemu-devel@nongnu.org](https://lists.nongnu.org/mailman/listinfo/qemu-devel) +mailing list. All patches submitted must contain a 'Signed-off-by' line from +the author. Patches should follow the guidelines set out in the +[HACKING.md](HACKING.md) and [CODING_STYLE.md](CODING_STYLE.md) files. Additional information on submitting patches can be found online via the QEMU website - https://qemu.org/Contribute/SubmitAPatch - https://qemu.org/Contribute/TrivialPatches +* https://qemu.org/Contribute/SubmitAPatch +* https://qemu.org/Contribute/TrivialPatches The QEMU website is also maintained under source control. - git clone git://git.qemu.org/qemu-web.git - https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/ +`git clone git://git.qemu.org/qemu-web.git` +* https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/ -A 'git-publish' utility was created to make above process less +A `git-publish` utility was created to make above process less cumbersome, and is highly recommended for making regular contributions, or even just for sending consecutive patch series revisions. It also -requires a working 'git send-email' setup, and by default doesn't +requires a working `git send-email` setup, and by default doesn't automate everything, so you may want to go through the above steps manually for once. For installation instructions, please go to - https://github.com/stefanha/git-publish +* https://github.com/stefanha/git-publish The workflow with 'git-publish' is: - $ git checkout master -b my-feature - $ # work on new commits, add your 'Signed-off-by' lines to each - $ git publish +``` +git checkout master -b my-feature +# work on new commits, add your 'Signed-off-by' lines to each +git publish +``` Your patch series will be sent and tagged as my-feature-v1 if you need to refer back to it in the future. Sending v2: - $ git checkout my-feature # same topic branch - $ # making changes to the commits (using 'git rebase', for example) - $ git publish +``` +git checkout my-feature # same topic branch +# making changes to the commits (using 'git rebase', for example) +git publish +``` Your patch series will be sent with 'v2' tag in the subject and the git tip will be tagged as my-feature-v2. -Bug reporting -= +## Bug reporting The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs found when running code built from QEMU git or upstream released sources should be reported via: - https://bugs.launchpad.net/qemu/ +* https://bugs.launchpad.net/qemu/ If using QEMU via an operating system vendor pre-built binary package, it is preferable to report bugs to the vendor's own bug
[Qemu-devel] [PATCH V2 0/3] Formatted INFO files to fit Markdown (.md) format.
This patch make INFO files (e.g. README, CODING_STYLE and HACKING) much more readable when watched from GitHub GUI. ./VERSION, ./MAINTAINERS were leave untouched otherwise ./scripts/get_maintainer.pl breaks. ./COPYING, ./LICENSE ./COPYING.LIB were leave untouched because it look likes they come from GNU and not QEMU. Visualization of the change can be found at https://github.com/ybettan/qemu/tree/md. NOTE: This last part of the message should be removed from the commit message since I will remove the 'md' branch from my fork if the patch is accepted. Yoni Bettan (3): README.md : Formatted to fit Markdown (.md) format. CODING_STYLE.md : Formatted to fit Markdown (.md) format. HACKING.md : Formatted to fit Markdown (.md) format. CODING_STYLE => CODING_STYLE.md | 151 ++ HACKING => HACKING.md | 186 +--- README => README.md | 89 +++ scripts/checkpatch.pl | 2 +- 4 files changed, 223 insertions(+), 205 deletions(-) rename CODING_STYLE => CODING_STYLE.md (50%) rename HACKING => HACKING.md (53%) rename README => README.md (67%) V2: - Undone the changes on ./LICENSE - Updated scripts/get_maintainer.pl to reference to README.md instead of README - Repaired broken link to ./LICENSE in ./README.md -- 2.17.1
Re: [Qemu-devel] [PATCH 0/4] Reformatted INFO files to fit the Markdown (.md) format.
On 2 September 2018 at 13:36, Yoni Bettan wrote: > This patch make INFO files (e.g. README, CODING_STYLE etc) much more > readable when watched from GitHub GUI. > > ./VERSION and ./MAINTAINERS were leave untouched otherwise > ./scripts/get_maintainer.pl breaks. > > ./COPYING and ./COPYING.LIB were leave untouched because it look likes > they come from GNU and not QEMU. Thanks for this patch, but we don't use Markdown format. We use RST for docs files, but README, CODING_STYLE, etc are just plain text and fine as they are I think. -- PMM
Re: [Qemu-devel] [Qemu-ppc] [PATCH] Fix a deadlock case in the CPU hotplug flow
On Sun, 2 Sep 2018 11:19:04 -0300 Jose Ricardo Ziviani wrote: > We need to set cs->halted to 1 before calling ppc_set_compat. The reason > is that ppc_set_compat kicks up the new thread created to manage the > hotplugged KVM virtual CPU and the code drives directly to KVM_RUN > ioctl. When cs->halted is 1, the code: > > int kvm_cpu_exec(CPUState *cpu) > ... > if (kvm_arch_process_async_events(cpu)) { > atomic_set(>exit_request, 0); > return EXCP_HLT; > } > ... > > returns before it reaches KVM_RUN, giving time to the main thread to > finish its job. Otherwise we can fall in a deadlock because the KVM > thread will issue the KVM_RUN ioctl while the main thread is setting up > KVM registers. Depending on how these jobs are scheduled we'll end up > freezing QEMU. > > The following output shows kvm_vcpu_ioctl sleeping because it cannot get > the mutex and never will. > PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr. > > STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL > > PID: 61564 TASK: c03e981e0780 CPU: 48 COMMAND: "qemu-system-ppc" > #0 [c03e982679a0] __schedule at c0b10a44 > #1 [c03e98267a60] schedule at c0b113a8 > #2 [c03e98267a90] schedule_preempt_disabled at c0b11910 > #3 [c03e98267ab0] __mutex_lock at c0b132ec > #4 [c03e98267bc0] kvm_vcpu_ioctl at c0080ea03140 [kvm] > #5 [c03e98267d20] do_vfs_ioctl at c0407d30 > #6 [c03e98267dc0] ksys_ioctl at c0408674 > #7 [c03e98267e10] sys_ioctl at c04086f8 > #8 [c03e98267e30] system_call at c000b488 > > crash> struct -x kvm.vcpus 0xc03da000 > vcpus = {0xc03db488, 0xc03d52b8, 0xc039e9c8, > 0xc03d0e20, 0xc03d5828, 0x0, 0x0, ...} > > crash> struct -x kvm_vcpu.mutex.owner 0xc03d5828 > mutex.owner = { > counter = 0xc03a23a5c881 <- flag 1: waiters > }, > > crash> bt 0xc03a23a5c880 > PID: 61579 TASK: c03a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM" > (active) > > crash> struct -x kvm_vcpu.mutex.wait_list 0xc03d5828 > mutex.wait_list = { > next = 0xc03e98267b10, > prev = 0xc03e98267b10 > }, > > crash> struct -x mutex_waiter.task 0xc03e98267b10 > task = 0xc03e981e0780 > > The following command-line was used to reproduce the problem (note: gdb > and trace can change the results). > > $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \ > -enable-kvm -m 4096 \ > -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \ > -display none -nographic \ > -drive file=disk1.qcow2,format=qcow2 > ... > (qemu) device_add host-spapr-cpu-core,core-id=4 > [no interaction is possible after it, only SIGKILL to take the terminal > back] > > Signed-off-by: Jose Ricardo Ziviani > --- I don't seen any obvious difference with your previous post but anyway it still looks good so: Reviewed-by: Greg Kurz > hw/ppc/spapr_cpu_core.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c > index 876f0b3d9d..a73b244a3f 100644 > --- a/hw/ppc/spapr_cpu_core.c > +++ b/hw/ppc/spapr_cpu_core.c > @@ -34,16 +34,16 @@ static void spapr_cpu_reset(void *opaque) > > cpu_reset(cs); > > -/* Set compatibility mode to match the boot CPU, which was either set > - * by the machine reset code or by CAS. This should never fail. > - */ > -ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); > - > /* All CPUs start halted. CPU0 is unhalted from the machine level > * reset code and the rest are explicitly started up by the guest > * using an RTAS call */ > cs->halted = 1; > > +/* Set compatibility mode to match the boot CPU, which was either set > + * by the machine reset code or by CAS. This should never fail. > + */ > +ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); > + > env->spr[SPR_HIOR] = 0; > > lpcr = env->spr[SPR_LPCR];
Re: [Qemu-devel] [Qemu-ppc] [PATCH] Fix a deadlock case in the CPU hotplug flow
On Sun, 2 Sep 2018 02:01:12 -0300 Jose Ricardo Ziviani wrote: > From: Jose Ricardo Ziviani > > We need to set cs->halted to 1 before calling ppc_set_compat. The reason > is that ppc_set_compat kicks up the new thread created to manage the > hotplugged KVM virtual CPU and the code drives directly to KVM_RUN > ioctl. When cs->halted is 1, the code: > > int kvm_cpu_exec(CPUState *cpu) > ... > if (kvm_arch_process_async_events(cpu)) { > atomic_set(>exit_request, 0); > return EXCP_HLT; > } > ... > > returns before it reaches KVM_RUN, giving time to the main thread to > finish its job. Otherwise we can fall in a deadlock because the KVM > thread will issue the KVM_RUN ioctl while the main thread is setting up > KVM registers. Depending on how these jobs are scheduled we'll end up > freezing QEMU. > > The following output shows kvm_vcpu_ioctl sleeping because it cannot get > the mutex and never will. > PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr. > > STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL > > PID: 61564 TASK: c03e981e0780 CPU: 48 COMMAND: "qemu-system-ppc" > #0 [c03e982679a0] __schedule at c0b10a44 > #1 [c03e98267a60] schedule at c0b113a8 > #2 [c03e98267a90] schedule_preempt_disabled at c0b11910 > #3 [c03e98267ab0] __mutex_lock at c0b132ec > #4 [c03e98267bc0] kvm_vcpu_ioctl at c0080ea03140 [kvm] > #5 [c03e98267d20] do_vfs_ioctl at c0407d30 > #6 [c03e98267dc0] ksys_ioctl at c0408674 > #7 [c03e98267e10] sys_ioctl at c04086f8 > #8 [c03e98267e30] system_call at c000b488 > > crash> struct -x kvm.vcpus 0xc03da000 > vcpus = {0xc03db488, 0xc03d52b8, 0xc039e9c8, > 0xc03d0e20, 0xc03d5828, 0x0, 0x0, ...} > > crash> struct -x kvm_vcpu.mutex.owner 0xc03d5828 > mutex.owner = { > counter = 0xc03a23a5c881 <- flag 1: waiters > }, > > crash> bt 0xc03a23a5c880 > PID: 61579 TASK: c03a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM" > (active) > > crash> struct -x kvm_vcpu.mutex.wait_list 0xc03d5828 > mutex.wait_list = { > next = 0xc03e98267b10, > prev = 0xc03e98267b10 > }, > > crash> struct -x mutex_waiter.task 0xc03e98267b10 > task = 0xc03e981e0780 > > The following command-line was used to reproduce the problem (note: gdb > and trace can change the results). > > $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \ > -enable-kvm -m 4096 \ > -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \ > -display none -nographic \ > -drive file=disk1.qcow2,format=qcow2 > ... > (qemu) device_add host-spapr-cpu-core,core-id=4 > [no interaction is possible after it, only SIGKILL to take the terminal > back] > > Signed-off-by: Jose Ricardo Ziviani > --- Reviewed-by: Greg Kurz > hw/ppc/spapr_cpu_core.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c > index 876f0b3d9d..a73b244a3f 100644 > --- a/hw/ppc/spapr_cpu_core.c > +++ b/hw/ppc/spapr_cpu_core.c > @@ -34,16 +34,16 @@ static void spapr_cpu_reset(void *opaque) > > cpu_reset(cs); > > -/* Set compatibility mode to match the boot CPU, which was either set > - * by the machine reset code or by CAS. This should never fail. > - */ > -ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); > - > /* All CPUs start halted. CPU0 is unhalted from the machine level > * reset code and the rest are explicitly started up by the guest > * using an RTAS call */ > cs->halted = 1; > > +/* Set compatibility mode to match the boot CPU, which was either set > + * by the machine reset code or by CAS. This should never fail. > + */ > +ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); > + > env->spr[SPR_HIOR] = 0; > > lpcr = env->spr[SPR_LPCR];
[Qemu-devel] [PATCH] Fix a deadlock case in the CPU hotplug flow
From: Jose Ricardo Ziviani We need to set cs->halted to 1 before calling ppc_set_compat. The reason is that ppc_set_compat kicks up the new thread created to manage the hotplugged KVM virtual CPU and the code drives directly to KVM_RUN ioctl. When cs->halted is 1, the code: int kvm_cpu_exec(CPUState *cpu) ... if (kvm_arch_process_async_events(cpu)) { atomic_set(>exit_request, 0); return EXCP_HLT; } ... returns before it reaches KVM_RUN, giving time to the main thread to finish its job. Otherwise we can fall in a deadlock because the KVM thread will issue the KVM_RUN ioctl while the main thread is setting up KVM registers. Depending on how these jobs are scheduled we'll end up freezing QEMU. The following output shows kvm_vcpu_ioctl sleeping because it cannot get the mutex and never will. PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr. STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL PID: 61564 TASK: c03e981e0780 CPU: 48 COMMAND: "qemu-system-ppc" #0 [c03e982679a0] __schedule at c0b10a44 #1 [c03e98267a60] schedule at c0b113a8 #2 [c03e98267a90] schedule_preempt_disabled at c0b11910 #3 [c03e98267ab0] __mutex_lock at c0b132ec #4 [c03e98267bc0] kvm_vcpu_ioctl at c0080ea03140 [kvm] #5 [c03e98267d20] do_vfs_ioctl at c0407d30 #6 [c03e98267dc0] ksys_ioctl at c0408674 #7 [c03e98267e10] sys_ioctl at c04086f8 #8 [c03e98267e30] system_call at c000b488 crash> struct -x kvm.vcpus 0xc03da000 vcpus = {0xc03db488, 0xc03d52b8, 0xc039e9c8, 0xc03d0e20, 0xc03d5828, 0x0, 0x0, ...} crash> struct -x kvm_vcpu.mutex.owner 0xc03d5828 mutex.owner = { counter = 0xc03a23a5c881 <- flag 1: waiters }, crash> bt 0xc03a23a5c880 PID: 61579 TASK: c03a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM" (active) crash> struct -x kvm_vcpu.mutex.wait_list 0xc03d5828 mutex.wait_list = { next = 0xc03e98267b10, prev = 0xc03e98267b10 }, crash> struct -x mutex_waiter.task 0xc03e98267b10 task = 0xc03e981e0780 The following command-line was used to reproduce the problem (note: gdb and trace can change the results). $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \ -enable-kvm -m 4096 \ -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \ -display none -nographic \ -drive file=disk1.qcow2,format=qcow2 ... (qemu) device_add host-spapr-cpu-core,core-id=4 [no interaction is possible after it, only SIGKILL to take the terminal back] Signed-off-by: Jose Ricardo Ziviani --- hw/ppc/spapr_cpu_core.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c index 876f0b3d9d..a73b244a3f 100644 --- a/hw/ppc/spapr_cpu_core.c +++ b/hw/ppc/spapr_cpu_core.c @@ -34,16 +34,16 @@ static void spapr_cpu_reset(void *opaque) cpu_reset(cs); -/* Set compatibility mode to match the boot CPU, which was either set - * by the machine reset code or by CAS. This should never fail. - */ -ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); - /* All CPUs start halted. CPU0 is unhalted from the machine level * reset code and the rest are explicitly started up by the guest * using an RTAS call */ cs->halted = 1; +/* Set compatibility mode to match the boot CPU, which was either set + * by the machine reset code or by CAS. This should never fail. + */ +ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); + env->spr[SPR_HIOR] = 0; lpcr = env->spr[SPR_LPCR]; -- 2.17.1
[Qemu-devel] [PATCH 3/4] HACKING.md : Reformatted to fit the Markdown (.md) format.
Signed-off-by: Yoni Bettan --- HACKING => HACKING.md | 186 ++ 1 file changed, 97 insertions(+), 89 deletions(-) rename HACKING => HACKING.md (53%) diff --git a/HACKING b/HACKING.md similarity index 53% rename from HACKING rename to HACKING.md index 0fc3e0fc04..f9d7631e64 100644 --- a/HACKING +++ b/HACKING.md @@ -1,86 +1,90 @@ -1. Preprocessor +## Preprocessor -1.1. Variadic macros +### Variadic macros For variadic macros, stick with this C99-like syntax: +``` #define DPRINTF(fmt, ...) \ do { printf("IRQ: " fmt, ## __VA_ARGS__); } while (0) +``` -1.2. Include directives +### Include directives Order include directives as follows: +``` #include "qemu/osdep.h" /* Always first... */ #include <...> /* then system headers... */ #include "..." /* and finally QEMU headers. */ +``` -The "qemu/osdep.h" header contains preprocessor macros that affect the behavior -of core system headers like . It must be the first include so that +The `qemu/osdep.h` header contains preprocessor macros that affect the behavior +of core system headers like ``. It must be the first include so that core system headers included by external libraries get the preprocessor macros that QEMU depends on. -Do not include "qemu/osdep.h" from header files since the .c file will have +Do not include `qemu/osdep.h` from header files since the .c file will have already included it. -2. C types +## C types It should be common sense to use the right type, but we have collected a few useful guidelines here. -2.1. Scalars +### Scalars -If you're using "int" or "long", odds are good that there's a better type. -If a variable is counting something, it should be declared with an -unsigned type. +If you're using `int` or `long`, odds are good that there's a better type. +If a variable is counting something, it should be declared with +`unsigned `. -If it's host memory-size related, size_t should be a good choice (use -ssize_t only if required). Guest RAM memory offsets must use ram_addr_t, +If it's host memory-size related, `size_t` should be a good choice (use +`ssize_t` only if required). Guest RAM memory offsets must use `ram_addr_t`, but only for RAM, it may not cover whole guest address space. -If it's file-size related, use off_t. -If it's file-offset related (i.e., signed), use off_t. -If it's just counting small numbers use "unsigned int"; +If it's file-size related, use `off_t`. +If it's file-offset related (i.e., signed), `use off_t`. +If it's just counting small numbers use `unsigned int`; (on all but oddball embedded systems, you can assume that that type is at least four bytes wide). In the event that you require a specific width, use a standard type -like int32_t, uint32_t, uint64_t, etc. The specific types are +like `int32_t`, `uint32_t`, `uint64_t`, etc. The specific types are mandatory for VMState fields. -Don't use Linux kernel internal types like u32, __u32 or __le32. +Don't use Linux kernel internal types like `u32`, `__u32` or `__le32`. -Use hwaddr for guest physical addresses except pcibus_t -for PCI addresses. In addition, ram_addr_t is a QEMU internal address +Use `hwaddr` for guest physical addresses except `pcibus_t` +for PCI addresses. In addition, `ram_addr_t` is a QEMU internal address space that maps guest RAM physical addresses into an intermediate address space that can map to host virtual address spaces. Generally -speaking, the size of guest memory can always fit into ram_addr_t but +speaking, the size of guest memory can always fit into `ram_addr_t` but it would not be correct to store an actual guest physical address in a -ram_addr_t. +`ram_addr_t`. For CPU virtual addresses there are several possible types. -vaddr is the best type to use to hold a CPU virtual address in +`vaddr` is the best type to use to hold a CPU virtual address in target-independent code. It is guaranteed to be large enough to hold a virtual address for any target, and it does not change size from target to target. It is always unsigned. -target_ulong is a type the size of a virtual address on the CPU; this means +`target_ulong` is a type the size of a virtual address on the CPU; this means it may be 32 or 64 bits depending on which target is being built. It should therefore be used only in target-specific code, and in some performance-critical built-per-target core code such as the TLB code. -There is also a signed version, target_long. -abi_ulong is for the *-user targets, and represents a type the size of -'void *' in that target's ABI. (This may not be the same as the size of a +There is also a signed version, `target_long`. +`abi_ulong` is for the `*-user` targets, and represents a type the size of +`void *` in that target's ABI. (This may not be the same as the size of a full CPU virtual address in the case of target ABIs which use 32 bit pointers -on 64 bit CPUs, like
[Qemu-devel] [PATCH] Fix a deadlock case in the CPU hotplug flow
We need to set cs->halted to 1 before calling ppc_set_compat. The reason is that ppc_set_compat kicks up the new thread created to manage the hotplugged KVM virtual CPU and the code drives directly to KVM_RUN ioctl. When cs->halted is 1, the code: int kvm_cpu_exec(CPUState *cpu) ... if (kvm_arch_process_async_events(cpu)) { atomic_set(>exit_request, 0); return EXCP_HLT; } ... returns before it reaches KVM_RUN, giving time to the main thread to finish its job. Otherwise we can fall in a deadlock because the KVM thread will issue the KVM_RUN ioctl while the main thread is setting up KVM registers. Depending on how these jobs are scheduled we'll end up freezing QEMU. The following output shows kvm_vcpu_ioctl sleeping because it cannot get the mutex and never will. PS: kvm_vcpu_ioctl was triggered kvm_set_one_reg - compat_pvr. STATE: TASK_UNINTERRUPTIBLE|TASK_WAKEKILL PID: 61564 TASK: c03e981e0780 CPU: 48 COMMAND: "qemu-system-ppc" #0 [c03e982679a0] __schedule at c0b10a44 #1 [c03e98267a60] schedule at c0b113a8 #2 [c03e98267a90] schedule_preempt_disabled at c0b11910 #3 [c03e98267ab0] __mutex_lock at c0b132ec #4 [c03e98267bc0] kvm_vcpu_ioctl at c0080ea03140 [kvm] #5 [c03e98267d20] do_vfs_ioctl at c0407d30 #6 [c03e98267dc0] ksys_ioctl at c0408674 #7 [c03e98267e10] sys_ioctl at c04086f8 #8 [c03e98267e30] system_call at c000b488 crash> struct -x kvm.vcpus 0xc03da000 vcpus = {0xc03db488, 0xc03d52b8, 0xc039e9c8, 0xc03d0e20, 0xc03d5828, 0x0, 0x0, ...} crash> struct -x kvm_vcpu.mutex.owner 0xc03d5828 mutex.owner = { counter = 0xc03a23a5c881 <- flag 1: waiters }, crash> bt 0xc03a23a5c880 PID: 61579 TASK: c03a23a5c880 CPU: 9 COMMAND: "CPU 4/KVM" (active) crash> struct -x kvm_vcpu.mutex.wait_list 0xc03d5828 mutex.wait_list = { next = 0xc03e98267b10, prev = 0xc03e98267b10 }, crash> struct -x mutex_waiter.task 0xc03e98267b10 task = 0xc03e981e0780 The following command-line was used to reproduce the problem (note: gdb and trace can change the results). $ qemu-ppc/build/ppc64-softmmu/qemu-system-ppc64 -cpu host \ -enable-kvm -m 4096 \ -smp 4,maxcpus=8,sockets=1,cores=2,threads=4 \ -display none -nographic \ -drive file=disk1.qcow2,format=qcow2 ... (qemu) device_add host-spapr-cpu-core,core-id=4 [no interaction is possible after it, only SIGKILL to take the terminal back] Signed-off-by: Jose Ricardo Ziviani --- hw/ppc/spapr_cpu_core.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c index 876f0b3d9d..a73b244a3f 100644 --- a/hw/ppc/spapr_cpu_core.c +++ b/hw/ppc/spapr_cpu_core.c @@ -34,16 +34,16 @@ static void spapr_cpu_reset(void *opaque) cpu_reset(cs); -/* Set compatibility mode to match the boot CPU, which was either set - * by the machine reset code or by CAS. This should never fail. - */ -ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); - /* All CPUs start halted. CPU0 is unhalted from the machine level * reset code and the rest are explicitly started up by the guest * using an RTAS call */ cs->halted = 1; +/* Set compatibility mode to match the boot CPU, which was either set + * by the machine reset code or by CAS. This should never fail. + */ +ppc_set_compat(cpu, POWERPC_CPU(first_cpu)->compat_pvr, _abort); + env->spr[SPR_HIOR] = 0; lpcr = env->spr[SPR_LPCR]; -- 2.17.1
Re: [Qemu-devel] [PATCH v5 0/3] Some improvements in z/Arch instructions support
On 02.09.2018 02:33, Pavel Zbitskiy wrote: > Changes since v4: > > * Simplified alignment checks by using tcg_gen_qemu_ld_i64 and > MO_ALIGN*. > > Pavel Zbitskiy (3): > target/s390x: use regular spaces in translate.c > target/s390x: exception on non-aligned LPSW(E) > target/s390x: implement CVB, CVBY and CVBG > > target/s390x/helper.h | 1 + > target/s390x/insn-data.def | 4 +++ > target/s390x/int_helper.c | 52 + > target/s390x/translate.c| 21 ++--- > tests/tcg/s390x/Makefile.target | 1 + > tests/tcg/s390x/cvb.c | 18 > 6 files changed, 93 insertions(+), 4 deletions(-) > create mode 100644 tests/tcg/s390x/cvb.c > Hi Pavel, I'm leaving for vacation tomorrow and will have a look in two weeks when I'm back. Also, Conny is on vacation, so expect ~2 weeks until this gets reviewed / picked up. Thanks! -- Thanks, David / dhildenb
[Qemu-devel] [PATCH 0/4] Reformatted INFO files to fit the Markdown (.md) format.
This patch make INFO files (e.g. README, CODING_STYLE etc) much more readable when watched from GitHub GUI. ./VERSION and ./MAINTAINERS were leave untouched otherwise ./scripts/get_maintainer.pl breaks. ./COPYING and ./COPYING.LIB were leave untouched because it look likes they come from GNU and not QEMU. Visualization of the change can be found at https://github.com/ybettan/qemu/tree/md. NOTE: This last part of the message should be removed from the commit message since I will remove the 'md' brance from my fork if the patch is accepted. Yoni Bettan (4): README.md : Reformatted to fit the Markdown (.md) format. CODING_STYLE.md : Reformatted to fit the Markdown (.md) format. HACKING.md : Reformatted to fit the Markdown (.md) format. LICENSE.md : Reformatted to fit the Markdown (.md) format. CODING_STYLE => CODING_STYLE.md | 151 ++ HACKING => HACKING.md | 186 +--- LICENSE => LICENSE.md | 4 +- README => README.md | 87 +++ 4 files changed, 224 insertions(+), 204 deletions(-) rename CODING_STYLE => CODING_STYLE.md (50%) rename HACKING => HACKING.md (53%) rename LICENSE => LICENSE.md (90%) rename README => README.md (68%) -- 2.17.1
[Qemu-devel] [PATCH 1/4] README.md : Reformatted to fit the Markdown (.md) format.
Signed-off-by: Yoni Bettan --- README => README.md | 87 +++-- 1 file changed, 44 insertions(+), 43 deletions(-) rename README => README.md (68%) diff --git a/README b/README.md similarity index 68% rename from README rename to README.md index 49a9fd09cd..ac0e0528fb 100644 --- a/README +++ b/README.md @@ -1,5 +1,4 @@ - QEMU README - === +# Qemu QEMU is a generic and open source machine & userspace emulator and virtualizer. @@ -30,86 +29,90 @@ QEMU as a whole is released under the GNU General Public License, version 2. For full licensing details, consult the LICENSE file. -Building - +## Building QEMU is multi-platform software intended to be buildable on all modern Linux platforms, OS-X, Win32 (via the Mingw64 toolchain) and a variety of other UNIX targets. The simple steps to build QEMU are: - mkdir build - cd build - ../configure - make +``` +mkdir build +cd build +../configure +make +``` Additional information can also be found online via the QEMU website: - https://qemu.org/Hosts/Linux - https://qemu.org/Hosts/Mac - https://qemu.org/Hosts/W32 +* https://qemu.org/Hosts/Linux +* https://qemu.org/Hosts/Mac +* https://qemu.org/Hosts/W32 -Submitting patches -== +## Submitting patches The QEMU source code is maintained under the GIT version control system. - git clone git://git.qemu.org/qemu.git +`git clone git://git.qemu.org/qemu.git` -When submitting patches, one common approach is to use 'git -format-patch' and/or 'git send-email' to format & send the mail to the -qemu-devel@nongnu.org mailing list. All patches submitted must contain -a 'Signed-off-by' line from the author. Patches should follow the -guidelines set out in the HACKING and CODING_STYLE files. +When submitting patches, one common approach is to use `git format-patch` +and/or `git send-email` to format & send the mail to the +[qemu-devel@nongnu.org](https://lists.nongnu.org/mailman/listinfo/qemu-devel) +mailing list. All patches submitted must contain a 'Signed-off-by' line from +the author. Patches should follow the guidelines set out in the +[HACKING.md](HACKING.md) and [CODING_STYLE.md](CODING_STYLE.md) files. Additional information on submitting patches can be found online via the QEMU website - https://qemu.org/Contribute/SubmitAPatch - https://qemu.org/Contribute/TrivialPatches +* https://qemu.org/Contribute/SubmitAPatch +* https://qemu.org/Contribute/TrivialPatches The QEMU website is also maintained under source control. - git clone git://git.qemu.org/qemu-web.git - https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/ +`git clone git://git.qemu.org/qemu-web.git` +* https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/ -A 'git-publish' utility was created to make above process less +A `git-publish` utility was created to make above process less cumbersome, and is highly recommended for making regular contributions, or even just for sending consecutive patch series revisions. It also -requires a working 'git send-email' setup, and by default doesn't +requires a working `git send-email` setup, and by default doesn't automate everything, so you may want to go through the above steps manually for once. For installation instructions, please go to - https://github.com/stefanha/git-publish +* https://github.com/stefanha/git-publish The workflow with 'git-publish' is: - $ git checkout master -b my-feature - $ # work on new commits, add your 'Signed-off-by' lines to each - $ git publish +``` +git checkout master -b my-feature +# work on new commits, add your 'Signed-off-by' lines to each +git publish +``` Your patch series will be sent and tagged as my-feature-v1 if you need to refer back to it in the future. Sending v2: - $ git checkout my-feature # same topic branch - $ # making changes to the commits (using 'git rebase', for example) - $ git publish +``` +git checkout my-feature # same topic branch +# making changes to the commits (using 'git rebase', for example) +git publish +``` Your patch series will be sent with 'v2' tag in the subject and the git tip will be tagged as my-feature-v2. -Bug reporting -= +## Bug reporting The QEMU project uses Launchpad as its primary upstream bug tracker. Bugs found when running code built from QEMU git or upstream released sources should be reported via: - https://bugs.launchpad.net/qemu/ +* https://bugs.launchpad.net/qemu/ If using QEMU via an operating system vendor pre-built binary package, it is preferable to report bugs to the vendor's own bug tracker first. If @@ -118,22 +121,20 @@ reported via launchpad. For additional information on bug reporting consult: - https://qemu.org/Contribute/ReportABug +* https://qemu.org/Contribute/ReportABug -Contact -=== +## Contact The QEMU community can be contacted in a number of ways, with the two main methods being email
[Qemu-devel] [PATCH v4 1/3] x86: Data structure changes to support MSR based features
Add FeatureWordType indicator in struct FeatureWordInfo. Change feature_word_info[] accordingly. Change existing functions that refer to feature_word_info[] accordingly. Signed-off-by: Robert Hoo --- target/i386/cpu.c | 172 +- 1 file changed, 120 insertions(+), 52 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index f24295e..a252c26 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -770,17 +770,36 @@ static void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1, /* missing: CPUID_XSAVE_XSAVEC, CPUID_XSAVE_XSAVES */ +typedef enum FeatureWordType { + CPUID_FEATURE_WORD, + MSR_FEATURE_WORD, +} FeatureWordType; + typedef struct FeatureWordInfo { -/* feature flags names are taken from "Intel Processor Identification and +FeatureWordType type; + /* feature flags names are taken from "Intel Processor Identification and * the CPUID Instruction" and AMD's "CPUID Specification". * In cases of disagreement between feature naming conventions, * aliases may be added. */ const char *feat_names[32]; -uint32_t cpuid_eax; /* Input EAX for CPUID */ -bool cpuid_needs_ecx; /* CPUID instruction uses ECX as input */ -uint32_t cpuid_ecx; /* Input ECX value for CPUID */ -int cpuid_reg;/* output register (R_* constant) */ +union { +/* If type==CPUID_FEATURE_WORD */ +struct { +uint32_t eax; /* Input EAX for CPUID */ +bool needs_ecx; /* CPUID instruction uses ECX as input */ +uint32_t ecx; /* Input ECX value for CPUID */ +int reg;/* output register (R_* constant) */ +} cpuid; +/* If type==MSR_FEATURE_WORD */ +struct { +uint32_t index; +struct { /*CPUID that enumerate this MSR*/ +FeatureWord cpuid_class; +uint32_tcpuid_flag; +} cpuid_dep; +} msr; +}; uint32_t tcg_features; /* Feature flags supported by TCG */ uint32_t unmigratable_flags; /* Feature flags known to be unmigratable */ uint32_t migratable_flags; /* Feature flags known to be migratable */ @@ -790,6 +809,7 @@ typedef struct FeatureWordInfo { static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { [FEAT_1_EDX] = { +.type = CPUID_FEATURE_WORD, .feat_names = { "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce", @@ -800,10 +820,11 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { "fxsr", "sse", "sse2", "ss", "ht" /* Intel htt */, "tm", "ia64", "pbe", }, -.cpuid_eax = 1, .cpuid_reg = R_EDX, +.cpuid = {.eax = 1, .reg = R_EDX, }, .tcg_features = TCG_FEATURES, }, [FEAT_1_ECX] = { +.type = CPUID_FEATURE_WORD, .feat_names = { "pni" /* Intel,AMD sse3 */, "pclmulqdq", "dtes64", "monitor", "ds-cpl", "vmx", "smx", "est", @@ -814,7 +835,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { "tsc-deadline", "aes", "xsave", NULL /* osxsave */, "avx", "f16c", "rdrand", "hypervisor", }, -.cpuid_eax = 1, .cpuid_reg = R_ECX, +.cpuid = { .eax = 1, .reg = R_ECX, }, .tcg_features = TCG_EXT_FEATURES, }, /* Feature names that are already defined on feature_name[] but @@ -823,6 +844,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { * to features[FEAT_8000_0001_EDX] if and only if CPU vendor is AMD. */ [FEAT_8000_0001_EDX] = { +.type = CPUID_FEATURE_WORD, .feat_names = { NULL /* fpu */, NULL /* vme */, NULL /* de */, NULL /* pse */, NULL /* tsc */, NULL /* msr */, NULL /* pae */, NULL /* mce */, @@ -833,10 +855,11 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { NULL /* fxsr */, "fxsr-opt", "pdpe1gb", "rdtscp", NULL, "lm", "3dnowext", "3dnow", }, -.cpuid_eax = 0x8001, .cpuid_reg = R_EDX, +.cpuid = { .eax = 0x8001, .reg = R_EDX, }, .tcg_features = TCG_EXT2_FEATURES, }, [FEAT_8000_0001_ECX] = { +.type = CPUID_FEATURE_WORD, .feat_names = { "lahf-lm", "cmp-legacy", "svm", "extapic", "cr8legacy", "abm", "sse4a", "misalignsse", @@ -847,7 +870,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { "perfctr-nb", NULL, NULL, NULL, NULL, NULL, NULL, NULL, }, -.cpuid_eax = 0x8001, .cpuid_reg = R_ECX, +.cpuid = { .eax = 0x8001, .reg = R_ECX, }, .tcg_features = TCG_EXT3_FEATURES, /* * TOPOEXT is always allowed but can't be enabled blindly by @@ -857,6 +880,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { .no_autoenable_flags = CPUID_EXT3_TOPOEXT,
[Qemu-devel] [PATCH v4 3/3] x86: define a new MSR based feature word -- FEATURE_WORDS_ARCH_CAPABILITIES
Note RSBA is specially treated -- no matter host support it or not, qemu pretends it is supported. Signed-off-by: Robert Hoo --- target/i386/cpu.c | 27 ++- target/i386/cpu.h | 12 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index 0160e97..8ec9613 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -1129,6 +1129,24 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { .reg = R_EDX, }, .tcg_features = ~0U, }, +/*Below are MSR exposed features*/ +[FEATURE_WORDS_ARCH_CAPABILITIES] = { +.type = MSR_FEATURE_WORD, +.feat_names = { +"rdctl-no", "ibrs-all", "rsba", NULL, +"ssb-no", NULL, NULL, NULL, +NULL, NULL, NULL, NULL, +NULL, NULL, NULL, NULL, +NULL, NULL, NULL, NULL, +NULL, NULL, NULL, NULL, +NULL, NULL, NULL, NULL, +NULL, NULL, NULL, NULL, +}, +.msr = { .index = MSR_IA32_ARCH_CAPABILITIES, +.cpuid_dep = { FEAT_7_0_EDX, +CPUID_7_0_EDX_ARCH_CAPABILITIES } +}, +}, }; typedef struct X86RegisterInfo32 { @@ -3680,7 +3698,14 @@ static uint32_t x86_cpu_get_supported_feature_word(FeatureWord w, wi->cpuid.reg); break; case MSR_FEATURE_WORD: -r = kvm_arch_get_supported_msr_feature(kvm_state, +/* Special case: + * No matter host status, IA32_ARCH_CAPABILITIES.RSBA [bit 2] + * is always supported in guest. + */ +if (wi->msr.index == MSR_IA32_ARCH_CAPABILITIES) { +r = MSR_ARCH_CAP_RSBA; +} +r |= kvm_arch_get_supported_msr_feature(kvm_state, wi->msr.index); break; } diff --git a/target/i386/cpu.h b/target/i386/cpu.h index b572a8e..9662730 100644 --- a/target/i386/cpu.h +++ b/target/i386/cpu.h @@ -502,9 +502,14 @@ typedef enum FeatureWord { FEAT_6_EAX, /* CPUID[6].EAX */ FEAT_XSAVE_COMP_LO, /* CPUID[EAX=0xd,ECX=0].EAX */ FEAT_XSAVE_COMP_HI, /* CPUID[EAX=0xd,ECX=0].EDX */ +FEATURE_WORDS_NUM_CPUID, +FEATURE_WORDS_FIRST_MSR = FEATURE_WORDS_NUM_CPUID, +FEATURE_WORDS_ARCH_CAPABILITIES = FEATURE_WORDS_FIRST_MSR, FEATURE_WORDS, } FeatureWord; +#define FEATURE_WORDS_NUM_MSRS (FEATURE_WORDS - FEATURE_WORDS_FIRST_MSR) + typedef uint32_t FeatureWordArray[FEATURE_WORDS]; /* cpuid_features bits */ @@ -730,6 +735,13 @@ typedef uint32_t FeatureWordArray[FEATURE_WORDS]; #define CPUID_TOPOLOGY_LEVEL_SMT (1U << 8) #define CPUID_TOPOLOGY_LEVEL_CORE (2U << 8) +/* MSR Feature Bits */ +#define MSR_ARCH_CAP_RDCL_NO(1U << 0) +#define MSR_ARCH_CAP_IBRS_ALL (1U << 1) +#define MSR_ARCH_CAP_RSBA (1U << 2) +#define MSR_ARCH_CAP_SKIP_L1DFL_VMENTRY (1U << 3) +#define MSR_ARCH_CAP_SSB_NO (1U << 4) + #ifndef HYPERV_SPINLOCK_NEVER_RETRY #define HYPERV_SPINLOCK_NEVER_RETRY 0x #endif -- 1.8.3.1
[Qemu-devel] [PATCH v4 2/3] kvm: Add support to KVM_GET_MSR_FEATURE_INDEX_LIST and KVM_GET_MSRS system ioctl
Add kvm_get_supported_feature_msrs() to get supported MSR feature index list. Add kvm_arch_get_supported_msr_feature() to get each MSR features value. Signed-off-by: Robert Hoo --- include/sysemu/kvm.h | 2 ++ target/i386/cpu.c| 7 ++--- target/i386/kvm.c| 72 3 files changed, 78 insertions(+), 3 deletions(-) diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index 0b64b8e..97d8d9d 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -463,6 +463,8 @@ int kvm_vm_check_extension(KVMState *s, unsigned int extension); uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, uint32_t index, int reg); +uint32_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index); + void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len); diff --git a/target/i386/cpu.c b/target/i386/cpu.c index a252c26..0160e97 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -3670,7 +3670,7 @@ static uint32_t x86_cpu_get_supported_feature_word(FeatureWord w, bool migratable_only) { FeatureWordInfo *wi = _word_info[w]; -uint32_t r; +uint32_t r = 0; if (kvm_enabled()) { switch (wi->type) { @@ -3679,8 +3679,9 @@ static uint32_t x86_cpu_get_supported_feature_word(FeatureWord w, wi->cpuid.ecx, wi->cpuid.reg); break; -default: -r = 0; +case MSR_FEATURE_WORD: +r = kvm_arch_get_supported_msr_feature(kvm_state, +wi->msr.index); break; } } else if (hvf_enabled()) { diff --git a/target/i386/kvm.c b/target/i386/kvm.c index 0b2a07d..bfd8088 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -107,6 +107,7 @@ static int has_pit_state2; static bool has_msr_mcg_ext_ctl; static struct kvm_cpuid2 *cpuid_cache; +static struct kvm_msr_list *kvm_feature_msrs; int kvm_has_pit_state2(void) { @@ -420,6 +421,33 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, return ret; } +uint32_t kvm_arch_get_supported_msr_feature(KVMState *s, uint32_t index) +{ +struct { +struct kvm_msrs info; +struct kvm_msr_entry entries[1]; +} msr_data; +uint32_t ret; + +if (kvm_feature_msrs == NULL) { /*ARCH doesn't support feature MSRs*/ +return 0; +} + +msr_data.info.nmsrs = 1; +msr_data.entries[0].index = index; + +ret = kvm_ioctl(s, KVM_GET_MSRS, _data); + +if (ret != 1) { +fprintf(stderr, "KVM get MSR (index=0x%x) feature failed, %s\n", +index, strerror(-ret)); +exit(1); +} + +return msr_data.entries[0].data; +} + + typedef struct HWPoisonPage { ram_addr_t ram_addr; QLIST_ENTRY(HWPoisonPage) list; @@ -1239,6 +1267,45 @@ void kvm_arch_do_init_vcpu(X86CPU *cpu) } } +static int kvm_get_supported_feature_msrs(KVMState *s) +{ +int ret = 0; + +if (kvm_feature_msrs != NULL) { +return 0; +} + +if (!kvm_check_extension(s, KVM_CAP_GET_MSR_FEATURES)) { +return -1; +} + +struct kvm_msr_list msr_list; + +msr_list.nmsrs = 0; +ret = kvm_ioctl(s, KVM_GET_MSR_FEATURE_INDEX_LIST, _list); +if (ret < 0 && ret != -E2BIG) { +return ret; +} + +assert(msr_list.nmsrs > 0); +kvm_feature_msrs = (struct kvm_msr_list *) \ +g_malloc0(sizeof(msr_list) + + msr_list.nmsrs * sizeof(msr_list.indices[0])); + +kvm_feature_msrs->nmsrs = msr_list.nmsrs; +ret = kvm_ioctl(s, KVM_GET_MSR_FEATURE_INDEX_LIST, kvm_feature_msrs); + +if (ret < 0) { +fprintf(stderr, "Fetch KVM feature MSRs failed: %s\n", +strerror(-ret)); +g_free(kvm_feature_msrs); +kvm_feature_msrs = NULL; +return ret; +} + +return 0; +} + static int kvm_get_supported_msrs(KVMState *s) { static int kvm_supported_msrs; @@ -1392,6 +1459,11 @@ int kvm_arch_init(MachineState *ms, KVMState *s) return ret; } +ret = kvm_get_supported_feature_msrs(s); +if (ret < 0) { /*if MSR based features aren't supported, ignore it.*/ +warn_report("Get supported feature MSRs failed."); +} + uname(); lm_capable_kernel = strcmp(utsname.machine, "x86_64") == 0; -- 1.8.3.1
[Qemu-devel] [PATCH v4 0/3] x86: QEMU side support on MSR based features
KVM side has added the framework (kvm.git:d1d93fa90) to support MSR based features. Here is the QEMU part, including data structure changes/expanding, referring functions changes, and the implementations on KVM_GET_MSR_FEATURE_INDEX_LIST and KVM_GET_MSRS system ioctl. Changelog: v4: Re-organize patch set to conform to request of individually build pass. Add KVM capability check for KVM_GET_MSR_INDEX_LIST before fetch. Special treatment for MSR_IA32_ARCH_CAPABILITIES.RSBA. Use more convenient glib wrapper (g_strdup_printf) instead of native (sprintf). v3: patch 2&3 in v2 are corrupted. Re-format patches. v2: coding style changes to pass ./scripts/checkpatch.pl. Robert Hoo (3): x86: Data structure changes to support MSR based features kvm: Add support to KVM_GET_MSR_FEATURE_INDEX_LIST and KVM_GET_MSRS system ioctl x86: define a new MSR based feature word -- FEATURE_WORDS_ARCH_CAPABILITIES include/sysemu/kvm.h | 2 + target/i386/cpu.c| 200 +-- target/i386/cpu.h| 12 target/i386/kvm.c| 72 +++ 4 files changed, 233 insertions(+), 53 deletions(-) -- 1.8.3.1
[Qemu-devel] [PATCH 2/4] CODING_STYLE.md : Reformatted to fit the Markdown (.md) format.
Signed-off-by: Yoni Bettan --- CODING_STYLE => CODING_STYLE.md | 151 +--- 1 file changed, 80 insertions(+), 71 deletions(-) rename CODING_STYLE => CODING_STYLE.md (50%) diff --git a/CODING_STYLE b/CODING_STYLE.md similarity index 50% rename from CODING_STYLE rename to CODING_STYLE.md index ec075dedc4..5b1523ba63 100644 --- a/CODING_STYLE +++ b/CODING_STYLE.md @@ -1,10 +1,9 @@ -QEMU Coding Style -= +# QEMU Coding Style Please use the script checkpatch.pl in the scripts directory to check patches before submitting. -1. Whitespace +## Whitespace Of course, the most important aspect in any coding style is whitespace. Crusty old coders who have trouble spotting the glasses on their noses @@ -16,20 +15,20 @@ QEMU indents are four spaces. Tabs are never used, except in Makefiles where they have been irreversibly coded into the syntax. Spaces of course are superior to tabs because: - - You have just one way to specify whitespace, not two. Ambiguity breeds +* You have just one way to specify whitespace, not two. Ambiguity breeds mistakes. - - The confusion surrounding 'use tabs to indent, spaces to justify' is gone. - - Tab indents push your code to the right, making your screen seriously +* The confusion surrounding 'use tabs to indent, spaces to justify' is gone. +* Tab indents push your code to the right, making your screen seriously unbalanced. - - Tabs will be rendered incorrectly on editors who are misconfigured not +* Tabs will be rendered incorrectly on editors who are misconfigured not to use tab stops of eight positions. - - Tabs are rendered badly in patches, causing off-by-one errors in almost +* Tabs are rendered badly in patches, causing off-by-one errors in almost every line. - - It is the QEMU coding style. +* It is the QEMU coding style. Do not leave whitespace dangling off the ends of lines. -2. Line width +## Line width Lines should be 80 characters; try not to make them longer. @@ -38,28 +37,28 @@ that use long function or symbol names. Even in that case, do not make lines much longer than 80 characters. Rationale: - - Some people like to tile their 24" screens with a 6x4 matrix of 80x24 - xterms and use vi in all of them. The best way to punish them is to - let them keep doing it. - - Code and especially patches is much more readable if limited to a sane - line length. Eighty is traditional. - - The four-space indentation makes the most common excuse ("But look - at all that white space on the left!") moot. - - It is the QEMU coding style. +* Some people like to tile their 24" screens with a 6x4 matrix of 80x24 + xterms and use vi in all of them. The best way to punish them is to + let them keep doing it. +* Code and especially patches is much more readable if limited to a sane + line length. Eighty is traditional. +* The four-space indentation makes the most common excuse ("But look + at all that white space on the left!") moot. +* It is the QEMU coding style. -3. Naming +## Naming -Variables are lower_case_with_underscores; easy to type and read. Structured -type names are in CamelCase; harder to type but standing out. Enum type -names and function type names should also be in CamelCase. Scalar type -names are lower_case_with_underscores_ending_with_a_t, like the POSIX -uint64_t and family. Note that this last convention contradicts POSIX +Variables are `lower_case_with_underscores`; easy to type and read. Structured +type names are in `CamelCase`; harder to type but standing out. Enum type +names and function type names should also be in `CamelCase`. Scalar type +names are `lower_case_with_underscores_ending_with_a_t`, like the POSIX +`uint64_t` and family. Note that this last convention contradicts POSIX and is therefore likely to be changed. -When wrapping standard library functions, use the prefix qemu_ to alert +When wrapping standard library functions, use the prefix `qemu_` to alert readers that they are seeing a wrapped version; otherwise avoid this prefix. -4. Block structure +## Block structure Every indented statement is braced; even if the block contains just one statement. The opening brace is on the line that contains the control @@ -67,69 +66,79 @@ flow statement that introduces the new block; the closing brace is on the same line as the else keyword, or on a line by itself if there is no else keyword. Example: -if (a == 5) { -printf("a was 5.\n"); -} else if (a == 6) { -printf("a was 6.\n"); -} else { -printf("a was something else entirely.\n"); -} +``` +if (a == 5) { +printf("a was 5.\n"); +} else if (a == 6) { +printf("a was 6.\n"); +} else { +printf("a was something else entirely.\n"); +} +``` -Note that 'else if' is considered a single statement; otherwise a long if/ -else if/else if/.../else sequence would need an indent for every else +Note that `else if` is considered a
[Qemu-devel] [PATCH 4/4] LICENSE.md : Reformatted to fit the Markdown (.md) format.
Signed-off-by: Yoni Bettan --- LICENSE => LICENSE.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) rename LICENSE => LICENSE.md (90%) diff --git a/LICENSE b/LICENSE.md similarity index 90% rename from LICENSE rename to LICENSE.md index 0e0b4b9553..644e953429 100644 --- a/LICENSE +++ b/LICENSE.md @@ -1,3 +1,5 @@ +# License + The following points clarify the QEMU license: 1) QEMU as a whole is released under the GNU General Public License, @@ -11,7 +13,7 @@ option) any later version. As of July 2013, contributions under version 2 of the GNU General Public License (and no later version) are only accepted for the following files -or directories: bsd-user/, linux-user/, hw/vfio/, hw/xen/xen_pt*. +or directories: `bsd-user/`, `linux-user/`, `hw/vfio/`, `hw/xen/xen_pt*`. 3) The Tiny Code Generator (TCG) is released under the BSD license (see license headers in files). -- 2.17.1
Re: [Qemu-devel] [PATCH 0/2] pvrdma: Extend device state
On Sun, Sep 02, 2018 at 11:48:11AM +0300, Yuval Shaia wrote: > pvrdma device state is compose of the state of the following: > - rdma backend device > - vmxnet3 device on PCI function 0 > - pvrdma device on PCI function 1 > > This patch-set aim to make it. > > Patch 1: Make vmxnet3 accessible from other devices > Patch 2: Take vnmxnet3 state into account > > > Yuval Shaia (2): > vmxnet3: Move some definitions to header file > hw/pvrdma: Add support for port state > > hw/net/vmxnet3.c | 116 + > hw/net/vmxnet3_defs.h | 133 ++ > hw/rdma/vmw/pvrdma.h | 3 + > hw/rdma/vmw/pvrdma_cmd.c | 12 +++- > hw/rdma/vmw/pvrdma_main.c | 14 +++- > 5 files changed, 160 insertions(+), 118 deletions(-) > create mode 100644 hw/net/vmxnet3_defs.h Please ignore this patch-set, i will send v2 with a missing functionality. > > -- > 2.17.1 >