How about we talk about operating systems for a change? I've been trying
to learn a little about x86_64 and modern Linux userspace recently, and
I thought I'd summarize some of that for you guys.

In case you didn't know, x86 machine code is ... interesting (in a bad
way). I don't think Intel is too pleased that AMD got ahead of them with
the 64-bit extensions to the instruction set, so they don't play nice,
and they tend to create their own version of everything now. It doesn't
make for a compact instruction set. Sucks for AMD. Even just the name of
the architecture is a mess: AMD calls it AMD64, Intel calls it Intel 64,
Linux calls it x86_64, and Microsoft calls it x64.

In the old days, every syscall executed in the kernel, and INT $0x80 was
the instruction to get there. At some point, INT $0x80 got kind of slow
on processors of the day, so Intel introduced the faster
sysenter/sysexit pair of instructions. AMD has its own syscall/sysret
pair, which became the standard entry point with the AMD64 architecture
(apparently a better interface than Intel's, but whatever, I think that
Intel came up with a 64-bit sysenter/sysexit anyway).
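
For the curious, here is a minimal sketch of entering the kernel by hand
on x86_64 Linux, assuming GCC or Clang inline assembly. The helper name
raw_syscall0 is made up for illustration, and on anything that isn't
x86_64 Linux it just falls back to libc's syscall() wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid and friends */
#include <unistd.h>        /* getpid(), syscall() */

/* Hypothetical helper: issue a system call that takes no arguments.
 * On x86_64 the call number goes in rax and the result comes back in
 * rax; the syscall instruction itself clobbers rcx and r11.
 * Note that the raw instruction returns -errno on failure, unlike the
 * libc wrapper, which returns -1 and sets errno. */
static long raw_syscall0(long number)
{
#if defined(__x86_64__) && defined(__linux__)
    long result;
    __asm__ volatile ("syscall"
                      : "=a"(result)
                      : "a"(number)
                      : "rcx", "r11", "memory");
    return result;
#else
    /* Other architectures: let libc pick the right instruction. */
    return syscall(number);
#endif
}
```

Calling raw_syscall0(SYS_getpid) should give you the same number as
getpid().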

Meanwhile the kernel people noticed that certain syscalls don't require
any special permission, and so they wouldn't need to execute in kernel
space at all. For example, a process doesn't need any special permission
to get the time of day, so if only the gettimeofday routine and
corresponding variable location were mapped into user space of every
process, it would be much faster. This they did.
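
To make that concrete, here is a rough sketch that reads the time of day
twice: once through libc, which on a typical 64-bit Linux system is
routed to the userspace routine, and once by forcing a real trip into
the kernel. The function names are made up for illustration, and the
claim about which path libc takes is an assumption about your libc and
kernel, not something the code checks.

```c
#define _GNU_SOURCE
#include <stddef.h>        /* NULL */
#include <sys/syscall.h>   /* SYS_gettimeofday */
#include <sys/time.h>      /* gettimeofday(), struct timeval */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: ask libc for the time. On most 64-bit Linux
 * systems this ends up in the userspace routine and never enters the
 * kernel. */
static int time_via_libc(struct timeval *tv)
{
    return gettimeofday(tv, NULL);
}

/* Hypothetical helper: force a genuine system call for the same
 * information, bypassing the userspace fast path. Assumes a 64-bit
 * system where the kernel and libc agree on the timeval layout. */
static int time_via_kernel(struct timeval *tv)
{
    return (int)syscall(SYS_gettimeofday, tv, NULL);
}
```

Both should return 0 and fill in nearly the same time. Call each one a
few million times in a loop and the libc path should come out well
ahead.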

So, given some combination of syscall, processor version, and kernel
version, how do you know if you should do INT $0x80, sysenter, syscall,
or just use some userspace routine? Well, you would probably just go
indirectly through glibc, but without that, you could do it indirectly
through an entry in a table called vsyscall, which is statically located
in every process's memory. That entry can either do the work in userland
(in the case of gettimeofday and friends) or use the best syscall
instruction for your architecture.

Sounds pretty good, but the story can't just end here, can it?

The vsyscall page is at a well-known static location, is executable, and
has (had, actually) live data (the time of day) in it, so a bad guy in
your process could wait until the right time, then jump there and
execute an arbitrary instruction (for the subset of instructions that
matched some current time). That is a bummer, although I'd say you were
screwed already if the bad guy can get your process to jump to the
time-of-day data. Oh, and security people also don't like the static
addresses at all. To make a long story short, these days vsyscall just
slows you down instead of speeding you up: it patches through to other
mechanisms.

The real table is the virtual Dynamic Shared Object (vDSO). It is a
genuine ELF object (with instructions tailored for your processor
architecture), and it is mapped at a randomized address in every user
process. Oh, and the data isn't executable (aren't, for you grammar
police).

I should probably attach a bunch of blog posts, kernel mailing list
messages, and man pages here, but could you just search for them yourself?

PS. HOMEWORK: How can you find the address of the vDSO?
PPS. And how else?

-- 
Anthony Carrico
