Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On Thu, Mar 23, 2017 at 1:37 PM, Paolo Bonzini wrote:
> On 23/03/2017 17:50, Pranith Kumar wrote:
>> On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini wrote:
>>> On 22/03/2017 21:01, Richard Henderson wrote:
>>>>> Ah, OK. Thanks for the explanation. Maybe we should check the size of
>>>>> the instruction while decoding the prefixes and error out once we
>>>>> exceed the limit. We would not generate any IR code.
>>>>
>>>> Yes.
>>>>
>>>> It would not enforce a true limit of 15 bytes, since you can't know that
>>>> until you've done the rest of the decode. But you'd be able to say that
>>>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
>>>> bytes is used.
>>>>
>>>> Which does fix the bug.
>>>
>>> Yeah, that would work for 2.9 if somebody wants to put together a patch.
>>> Ensuring that all instruction fetching happens before translation side
>>> effects is a little harder, but perhaps it's also the opportunity to get
>>> rid of s->rip_offset which is a little ugly.
>>
>> How about the following?
>>
>> diff --git a/target/i386/translate.c b/target/i386/translate.c
>> index 72c1b03a2a..67c58b8900 100644
>> --- a/target/i386/translate.c
>> +++ b/target/i386/translate.c
>> @@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
>>      s->vex_l = 0;
>>      s->vex_v = 0;
>>   next_byte:
>> +    /* The prefixes can at most be 14 bytes since x86 has an upper
>> +       limit of 15 bytes for the instruction */
>> +    if (s->pc - pc_start > 14) {
>> +        goto illegal_op;
>> +    }
>>      b = cpu_ldub_code(env, s->pc);
>>      s->pc++;
>>      /* Collect prefixes. */
>
> Please make the comment more verbose, based on Richard's remark. We
> should apply it to 2.9.
>
> Also, QEMU usually formats comments with stars on every line.

OK. I'll send a proper patch with an updated comment.

Thanks,
--
Pranith
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On 23/03/2017 17:50, Pranith Kumar wrote:
> On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini wrote:
>> On 22/03/2017 21:01, Richard Henderson wrote:
>>>> Ah, OK. Thanks for the explanation. Maybe we should check the size of
>>>> the instruction while decoding the prefixes and error out once we
>>>> exceed the limit. We would not generate any IR code.
>>>
>>> Yes.
>>>
>>> It would not enforce a true limit of 15 bytes, since you can't know that
>>> until you've done the rest of the decode. But you'd be able to say that
>>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
>>> bytes is used.
>>>
>>> Which does fix the bug.
>>
>> Yeah, that would work for 2.9 if somebody wants to put together a patch.
>> Ensuring that all instruction fetching happens before translation side
>> effects is a little harder, but perhaps it's also the opportunity to get
>> rid of s->rip_offset which is a little ugly.
>
> How about the following?
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 72c1b03a2a..67c58b8900 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
>      s->vex_l = 0;
>      s->vex_v = 0;
>   next_byte:
> +    /* The prefixes can at most be 14 bytes since x86 has an upper
> +       limit of 15 bytes for the instruction */
> +    if (s->pc - pc_start > 14) {
> +        goto illegal_op;
> +    }
>      b = cpu_ldub_code(env, s->pc);
>      s->pc++;
>      /* Collect prefixes. */

Please make the comment more verbose, based on Richard's remark. We
should apply it to 2.9.

Also, QEMU usually formats comments with stars on every line.

Paolo
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On Thu, Mar 23, 2017 at 6:27 AM, Paolo Bonzini wrote:
> On 22/03/2017 21:01, Richard Henderson wrote:
>>> Ah, OK. Thanks for the explanation. Maybe we should check the size of
>>> the instruction while decoding the prefixes and error out once we
>>> exceed the limit. We would not generate any IR code.
>>
>> Yes.
>>
>> It would not enforce a true limit of 15 bytes, since you can't know that
>> until you've done the rest of the decode. But you'd be able to say that
>> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
>> bytes is used.
>>
>> Which does fix the bug.
>
> Yeah, that would work for 2.9 if somebody wants to put together a patch.
> Ensuring that all instruction fetching happens before translation side
> effects is a little harder, but perhaps it's also the opportunity to get
> rid of s->rip_offset which is a little ugly.

How about the following?

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 72c1b03a2a..67c58b8900 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -4418,6 +4418,11 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
     s->vex_l = 0;
     s->vex_v = 0;
  next_byte:
+    /* The prefixes can at most be 14 bytes since x86 has an upper
+       limit of 15 bytes for the instruction */
+    if (s->pc - pc_start > 14) {
+        goto illegal_op;
+    }
     b = cpu_ldub_code(env, s->pc);
     s->pc++;
     /* Collect prefixes. */

--
Pranith
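For readers following the thread, here is a minimal standalone sketch of the idea behind the prefix check proposed in this message: count bytes while collecting prefixes and give up before any code is generated once more than 14 bytes have been consumed. It is not QEMU code. The names `is_legacy_prefix` and `scan_prefixes` are hypothetical, the instruction bytes come from a plain buffer rather than guest memory, and only the legacy x86 prefixes are recognised (REX and VEX handling is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy x86 prefixes: LOCK, REPNE, REP, the six segment overrides,
 * operand-size and address-size. REX/VEX are deliberately ignored here. */
static bool is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:
    case 0x2E: case 0x36: case 0x3E: case 0x26: case 0x64: case 0x65:
    case 0x66: case 0x67:
        return true;
    default:
        return false;
    }
}

/*
 * Hypothetical helper: returns the offset of the first non-prefix byte,
 * or -1 if the prefix run is too long or the buffer ends first.
 * The "pc > 14" test sits before each fetch, mirroring the position of
 * "s->pc - pc_start > 14" at the next_byte label in the patch.
 */
static int scan_prefixes(const uint8_t *code, size_t len)
{
    size_t pc = 0;

    assert(code != NULL);
    for (;;) {
        if (pc > 14) {
            return -1;              /* would be "goto illegal_op" */
        }
        if (pc >= len) {
            return -1;              /* ran out of bytes in this toy buffer */
        }
        uint8_t b = code[pc++];
        if (!is_legacy_prefix(b)) {
            return (int)(pc - 1);   /* opcode byte found */
        }
    }
}
```

With this placement, 14 prefix bytes followed by an opcode are accepted, while a 15th prefix byte is rejected before anything else is decoded.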
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On 22/03/2017 21:01, Richard Henderson wrote:
>> Ah, OK. Thanks for the explanation. Maybe we should check the size of
>> the instruction while decoding the prefixes and error out once we
>> exceed the limit. We would not generate any IR code.
>
> Yes.
>
> It would not enforce a true limit of 15 bytes, since you can't know that
> until you've done the rest of the decode. But you'd be able to say that
> no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
> bytes is used.
>
> Which does fix the bug.

Yeah, that would work for 2.9 if somebody wants to put together a patch.
Ensuring that all instruction fetching happens before translation side
effects is a little harder, but perhaps it's also the opportunity to get
rid of s->rip_offset which is a little ugly.

Paolo
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On 03/23/2017 02:29 AM, Pranith Kumar wrote:
> On Wed, Mar 22, 2017 at 11:21 AM, Peter Maydell wrote:
>> On 22 March 2017 at 15:14, Pranith Kumar wrote:
>>> On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell wrote:
>>>> This doesn't look right because it means we'll check
>>>> only after we've emitted all the code to do the
>>>> instruction operation, so the effect will be
>>>> "execute instruction, then take illegal-opcode
>>>> exception".
>>>
>>> The pc is restored to original address (s->pc = pc_start), so the
>>> exception will overwrite the generated illegal instruction and will be
>>> executed first.
>>
>> s->pc is the guest PC -- moving that backwards will
>> not do anything about the generated TCG IR that's
>> already been written. You'd need to rewind the
>> write pointer in the IR stream, which there is
>> no support for doing AFAIK.
>
> Ah, OK. Thanks for the explanation. Maybe we should check the size of
> the instruction while decoding the prefixes and error out once we
> exceed the limit. We would not generate any IR code.

Yes.

It would not enforce a true limit of 15 bytes, since you can't know that
until you've done the rest of the decode. But you'd be able to say that
no more than 14 prefix + 1 opc + 6 modrm+sib+ofs + 4 immediate = 25
bytes is used.

Which does fix the bug.

r~
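The 25-byte bound in this message is plain addition, and it can be written out as a small sketch. The constant and function names below are made up for illustration. The split of the "6 modrm+sib+ofs" term into 1 ModRM byte, 1 SIB byte and a 4-byte displacement is an assumption about what the figure stands for, not something stated in the thread.

```c
#include <assert.h>

enum {
    X86_MAX_INSN_LEN    = 15,   /* architectural instruction length limit */
    MAX_PREFIX_BYTES    = 14,   /* what a prefix-only check can enforce */
    OPCODE_BYTES        = 1,
    MODRM_SIB_OFS_BYTES = 6,    /* assumed: 1 ModRM + 1 SIB + 4 displacement */
    IMMEDIATE_BYTES     = 4
};

/* Worst-case number of bytes fetched for one instruction when only the
 * prefix run is capped, following the sum given in the message above. */
static int worst_case_fetch_bytes(void)
{
    /* The prefix cap must leave room for at least the opcode byte. */
    assert(MAX_PREFIX_BYTES + OPCODE_BYTES <= X86_MAX_INSN_LEN);

    return MAX_PREFIX_BYTES + OPCODE_BYTES
         + MODRM_SIB_OFS_BYTES + IMMEDIATE_BYTES;
}
```

The result is larger than the 15-byte architectural limit, which matches the point made above: the prefix check bounds how far the decoder can read, but it does not enforce the true limit.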
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On Wed, Mar 22, 2017 at 11:21 AM, Peter Maydell wrote:
> On 22 March 2017 at 15:14, Pranith Kumar wrote:
>> On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell wrote:
>>> This doesn't look right because it means we'll check
>>> only after we've emitted all the code to do the
>>> instruction operation, so the effect will be
>>> "execute instruction, then take illegal-opcode
>>> exception".
>>
>> The pc is restored to original address (s->pc = pc_start), so the
>> exception will overwrite the generated illegal instruction and will be
>> executed first.
>
> s->pc is the guest PC -- moving that backwards will
> not do anything about the generated TCG IR that's
> already been written. You'd need to rewind the
> write pointer in the IR stream, which there is
> no support for doing AFAIK.

Ah, OK. Thanks for the explanation. Maybe we should check the size of
the instruction while decoding the prefixes and error out once we
exceed the limit. We would not generate any IR code.

--
Pranith
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On 22 March 2017 at 15:14, Pranith Kumar wrote:
> On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell wrote:
>> This doesn't look right because it means we'll check
>> only after we've emitted all the code to do the
>> instruction operation, so the effect will be
>> "execute instruction, then take illegal-opcode
>> exception".
>
> The pc is restored to original address (s->pc = pc_start), so the
> exception will overwrite the generated illegal instruction and will be
> executed first.

s->pc is the guest PC -- moving that backwards will
not do anything about the generated TCG IR that's
already been written. You'd need to rewind the
write pointer in the IR stream, which there is
no support for doing AFAIK.

thanks
-- PMM
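The distinction drawn in this message, between the guest PC and the write pointer of the generated IR, can be illustrated with a toy model. Everything below is hypothetical and much simpler than TCG: `toy_ir`, `toy_emit` and the two `toy_translate_*` functions are invented for this sketch and are not QEMU APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IR: a forward-only list of ops. These names are NOT QEMU's TCG API. */
enum toy_op { TOY_OP_INSN_EFFECT, TOY_OP_RAISE_ILLEGAL };

#define TOY_IR_MAX 8

struct toy_ir {
    enum toy_op ops[TOY_IR_MAX];
    size_t n;                   /* write pointer, only ever moves forward */
};

static void toy_ir_init(struct toy_ir *ir)
{
    ir->n = 0;
}

static void toy_emit(struct toy_ir *ir, enum toy_op op)
{
    assert(ir->n < TOY_IR_MAX);
    ir->ops[ir->n++] = op;
}

/* Late check: the instruction's effect is emitted first, and only then is
 * the length inspected. Resetting the guest PC does not touch ir->n. */
static void toy_translate_late_check(struct toy_ir *ir, unsigned insn_len)
{
    unsigned guest_pc = 0;

    toy_emit(ir, TOY_OP_INSN_EFFECT);
    guest_pc += insn_len;
    if (guest_pc > 15) {
        guest_pc = 0;           /* like "s->pc = pc_start": IR is unchanged */
        toy_emit(ir, TOY_OP_RAISE_ILLEGAL);
    }
    (void)guest_pc;
}

/* Early check: the length is rejected before anything is emitted. */
static void toy_translate_early_check(struct toy_ir *ir, unsigned insn_len)
{
    if (insn_len > 15) {
        toy_emit(ir, TOY_OP_RAISE_ILLEGAL);
        return;
    }
    toy_emit(ir, TOY_OP_INSN_EFFECT);
}
```

For a 16-byte instruction the late check leaves two ops in the stream, the effect followed by the exception, which is the "execute instruction, then take illegal-opcode exception" behaviour described earlier in the thread. The early check leaves only the exception op.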
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On Wed, Mar 22, 2017 at 11:04 AM, Peter Maydell wrote:
>> How about doing the instruction size check as follows?
>>
>> diff --git a/target/i386/translate.c b/target/i386/translate.c
>> index 72c1b03a2a..94cf3da719 100644
>> --- a/target/i386/translate.c
>> +++ b/target/i386/translate.c
>> @@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
>>      default:
>>          goto unknown_op;
>>      }
>> +    if (s->pc - pc_start > 15) {
>> +        s->pc = pc_start;
>> +        goto illegal_op;
>> +    }
>>      return s->pc;
>>   illegal_op:
>>      gen_illegal_opcode(s);
>
> This doesn't look right because it means we'll check
> only after we've emitted all the code to do the
> instruction operation, so the effect will be
> "execute instruction, then take illegal-opcode
> exception".

The pc is restored to original address (s->pc = pc_start), so the
exception will overwrite the generated illegal instruction and will be
executed first.

But yes, it's better to follow the architecture manual.

Thanks,
--
Pranith
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On 22 March 2017 at 14:55, Pranith Kumar wrote:
> On Mon, Mar 20, 2017 at 10:46 AM, Peter Maydell wrote:
>> On 20 March 2017 at 14:36, Jann Horn wrote:
>>> This is an issue in QEMU's system emulation for X86 in TCG mode.
>>> The issue permits an attacker who can execute code in guest ring 3
>>> with normal user privileges to inject code into other processes that
>>> are running in guest ring 3, in particular root-owned processes.
>>
>>> I am sending this to qemu-devel because a QEMU security contact
>>> told me that QEMU does not consider privilege escalation inside a
>>> TCG VM to be a security concern.
>>
>> Correct; it's just a bug. Don't trust TCG QEMU as a security boundary.
>>
>> We should really fix the crossing-a-page-boundary code for x86.
>> I believe we do get it correct for ARM Thumb instructions.
>
> How about doing the instruction size check as follows?
>
> diff --git a/target/i386/translate.c b/target/i386/translate.c
> index 72c1b03a2a..94cf3da719 100644
> --- a/target/i386/translate.c
> +++ b/target/i386/translate.c
> @@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
>      default:
>          goto unknown_op;
>      }
> +    if (s->pc - pc_start > 15) {
> +        s->pc = pc_start;
> +        goto illegal_op;
> +    }
>      return s->pc;
>   illegal_op:
>      gen_illegal_opcode(s);

This doesn't look right because it means we'll check
only after we've emitted all the code to do the
instruction operation, so the effect will be
"execute instruction, then take illegal-opcode
exception".

We should check what the x86 architecture spec actually
says and implement that.

thanks
-- PMM
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On Mon, Mar 20, 2017 at 10:46 AM, Peter Maydell wrote:
> On 20 March 2017 at 14:36, Jann Horn wrote:
>> This is an issue in QEMU's system emulation for X86 in TCG mode.
>> The issue permits an attacker who can execute code in guest ring 3
>> with normal user privileges to inject code into other processes that
>> are running in guest ring 3, in particular root-owned processes.
>
>> I am sending this to qemu-devel because a QEMU security contact
>> told me that QEMU does not consider privilege escalation inside a
>> TCG VM to be a security concern.
>
> Correct; it's just a bug. Don't trust TCG QEMU as a security boundary.
>
> We should really fix the crossing-a-page-boundary code for x86.
> I believe we do get it correct for ARM Thumb instructions.

How about doing the instruction size check as follows?

diff --git a/target/i386/translate.c b/target/i386/translate.c
index 72c1b03a2a..94cf3da719 100644
--- a/target/i386/translate.c
+++ b/target/i386/translate.c
@@ -8235,6 +8235,10 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
     default:
         goto unknown_op;
     }
+    if (s->pc - pc_start > 15) {
+        s->pc = pc_start;
+        goto illegal_op;
+    }
     return s->pc;
  illegal_op:
     gen_illegal_opcode(s);

Thanks,
--
Pranith
Re: [Qemu-devel] [BUG] user-to-root privesc inside VM via bad translation caching
On 20 March 2017 at 14:36, Jann Horn wrote:
> This is an issue in QEMU's system emulation for X86 in TCG mode.
> The issue permits an attacker who can execute code in guest ring 3
> with normal user privileges to inject code into other processes that
> are running in guest ring 3, in particular root-owned processes.

> I am sending this to qemu-devel because a QEMU security contact
> told me that QEMU does not consider privilege escalation inside a
> TCG VM to be a security concern.

Correct; it's just a bug. Don't trust TCG QEMU as a security boundary.

We should really fix the crossing-a-page-boundary code for x86.
I believe we do get it correct for ARM Thumb instructions.

thanks
-- PMM