Hi Paul,

On Tue Apr 12, 2022 at 01:09:40 +0200, Paul Boddie wrote:
> On Monday, 11 April 2022 01:02:37 CEST Adam Lackorzynski wrote:
> > Hi Paul,
> > 
> > On Sun Apr 10, 2022 at 18:58:10 +0200, Paul Boddie wrote:
> > > I finally got round to experimenting with L4Re again, but in attempting to
> > > investigate task creation, I seem to have some difficulties understanding
> > > the mechanism by which tasks are typically created and how the
> > > l4_task_map function might be used in the process.
> > > 
> > > After looking at lots of different files in the L4Re distribution, my
> > > understanding of the basic mechanism is as follows:
> > > 
> > > 1. Some memory is reserved for the UTCB of a new task, perhaps using the
> > > l4re_ma_alloc_align function (or equivalent) to obtain a dataspace.
> > 
> > No, for UTCBs there's a dedicated call l4_task_add_ku_mem in case one
> > needs more UTCB memory than has been initially created with
> > l4_factory_create_task().
> 
> OK, I did see that function being used, too, but I also found plenty of other 
> things in my perusal of the different files. Obviously, being able to extend 
> the UTCB memory is an important consideration.

It is, because one might not know up front how many threads a task
will have; in particular, the component that creates the task often
does not know.
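As a minimal sketch (the capability slot name and the target address are made up for illustration; the call itself is l4_task_add_ku_mem from l4/sys/task.h), extending the UTCB area of an already created task might look like this:

```cpp
// Hedged sketch: extend a task's kernel-user memory (UTCB area)
// after creation. 'new_task' and the address 0xb3000000 are
// assumptions for illustration, not values from the thread.
#include <l4/sys/task.h>
#include <l4/sys/err.h>

void add_more_utcbs(l4_cap_idx_t new_task)
{
  // Ask the kernel to provide another page of UTCB memory at a
  // chosen, still free virtual address inside the new task.
  l4_fpage_t ku = l4_fpage(0xb3000000, L4_PAGESHIFT, L4_FPAGE_RW);
  l4_msgtag_t tag = l4_task_add_ku_mem(new_task, ku);
  if (l4_error(tag))
    { /* handle failure, e.g. region already occupied */ }
}
```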

> > > 2. A task is created using l4_factory_create_task, indicating the UTCB
> > > flexpage, with this being defined as...
> > > 
> > >   l4_factory_create_task(l4re_env()->factory, new_task,
> > >   
> > >       l4_fpage(utcb_start, utcb_log2size, L4_FPAGE_RW))
> > 
> > Yes. Here the flexpage defines where memory usable for UTCBs shall be
> > created.
> 
> Right. I see that the factory function actually sends the flexpage in the IPC 
> call (using l4_factory_create_add_fpage_u), thus mapping it in the task. I 
> find it hard to follow where this message is actually handled (I presume that 
> Moe acts as the factory) or what the factory actually does with the flexpage, 
> but I presume that it ultimately causes it to be mapped in the new task.

It is handled in Fiasco.

> [Thread creation and initiation]
> 
> > > The expectation is that the thread will immediately fault because there is
> > > no memory mapped at the instruction pointer location. However, it seems
> > > to me that it should be possible to use l4_task_map to make a memory
> > > region available within the task's address space, although I don't ever
> > > see this function used in L4Re for anything.
> > > 
> > > (The C++ API makes it difficult to perform ad-hoc searches for such
> > > low-level primitives, in my view, so perhaps I am missing use of the
> > > equivalent methods.)
> > 
> > Indeed. L4::Task::map is used, for example to map some initial
> > capabilities, and typically not memory.
> 
> Yes, I see a lot of these map operations operating on capabilities. For 
> example:
> 
> pkg/l4re-core/libloader/include/remote_app_model
> 
> However, I wonder about the "chicken and egg" situation in new tasks. It 
> seems 
> to me that the way things work is that a new task in L4Re is typically 
> populated with the l4re binary containing the region mapper/manager (RM). 
> This 
> seems to be initiated here (in launch_loader):

Yes, the l4re binary loads the application and then serves as its pager.

> pkg/l4re-core/ned/server/src/lua_exec.cc
> 
> This RM is then able to handle the page fault when an attempt is made to load 
> and run a new program. But one cannot rely on the RM when it isn't already 
> installed in a task, so there must be a way of mapping it into the new task 
> so 
> that it can be present. I assumed that using l4_task_map might be one way of 
> doing so.
> 
> Otherwise, I thought that perhaps an existing task could provide a kind of RM 
> to act as the new task's pager in the bootstrapping phase, so that page 
> faults 
> would be directed towards the existing task's RM and mappings established to 
> get the new task's RM up and running. However, in that case, since the usual 
> IPC traffic between RM and dataspaces does not involve sending flexpages to 
> the new task (and thus implicitly mapping them in the task, as I understand 
> it), it seems that the existing task's RM would also need to explicitly map 
> flexpages in the new task, again using something like l4_task_map.

That's how it works. Moe also has region managers that are used to
page the l4re binary. When a page fault is resolved, someone sends
memory via a flexpage to the task in question; in our case it's the
dataspace manager, which sends the memory via a 'map' call. Here it
does not matter whether the l4re binary or the application faulted,
because in the end the task is the receiver of the flexpage, not the
particular program (both run in the same task).
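At the lowest level, such a fault-resolving reply is just an IPC message carrying one typed map item. This is only a sketch (the function and variable names are made up; the server-loop plumbing is elided), assuming the pager replies via the usual l4_ipc_reply_and_wait() path:

```cpp
// Hedged sketch: build the reply that resolves a page fault by
// mapping memory into the faulting task. Names are illustrative.
#include <l4/sys/ipc.h>
#include <l4/sys/utcb.h>

l4_msgtag_t build_map_reply(l4_addr_t fault_addr, l4_addr_t backing_page)
{
  l4_msg_regs_t *mr = l4_utcb_mr();

  // One typed map item: a control word naming the send base in the
  // receiver, followed by the flexpage describing the pager's memory.
  mr->mr[0] = l4_map_control(fault_addr, 0, L4_MAP_ITEM_MAP);
  mr->mr[1] = l4_fpage(backing_page, L4_PAGESHIFT, L4_FPAGE_RW).raw;

  // Tag: 0 untyped words, 1 typed item; the pager sends this back,
  // e.g. via l4_ipc_reply_and_wait() in its server loop.
  return l4_msgtag(0, 0, 1, 0);
}
```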

> 
> I think I understand the usual mechanism between a task's RM and dataspaces, 
> at least enough to have implemented paging with dataspaces myself, but I 
> don't 
> follow what is actually being done here.
> 
> > > Tentatively, I would imagine that something like this might work:
> > >   l4_task_map(new_task, L4RE_THIS_TASK_CAP,
> > >   
> > >               l4_fpage(program_start, program_log2size, L4_FPAGE_RX),
> > >               task_program_start)
> > > 
> > > Here, the program payload would be loaded into the creating task at
> > > program_start, but the new task would be receiving the payload at
> > > task_program_start, with the configured instruction pointer location
> > > occurring within the receive window (after task_program_start, in other
> > > words).
> > Yes, this would work.
> 
> As far as I have seen, the "send base" with l4_task_map is effectively 
> defining the location where the flexpage is mapped, since the receive window 
> is "the whole address space" of the destination task according to the 
> documentation. Some testing with existing tasks appears to confirm this, but 
> then I also seem to experience issues with memory coherency or something 
> resembling it, meaning that I write to several pages, map the memory to 
> another task, and yet the mapped region does not reflect the mapped memory 
> (and it is indeed mapped, since I can neglect to map it and get a page fault).
> 
> I just spent some time mapping memory by sending flexpages from one task to 
> another, defining the receive window using the recipient's buffer registers 
> so 
> that the mappings are established by the kernel (again, as I understand how 
> this actually ends up working), and this does establish the mappings, but I 
> still see a lack of coherency. Maybe I am just failing to call the 
> appropriate 
> functions to let the recipient see all of the changed memory contents, 
> however.

Jdb has facilities to inspect what the address spaces look like,
exactly to debug issues like the one you describe. You can press 's'
to see all the tasks in the system, navigate onto one of them, and
then press 'p' for the page-table view. There you can walk the page
tables and verify that the pages at some virtual address actually
point to the physical location they should point at. For a particular
physical address (page frame number) you can also show the mapping
hierarchy via the 'm' key.
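If Jdb shows the mappings are correct, one thing worth trying is explicit cache maintenance before the other task reads the memory; this matters mostly on architectures like ARM, not on x86. A hedged sketch (function name and the choice of mapping the region at the same address are assumptions), using l4_cache_clean_data from l4/sys/cache.h:

```cpp
// Hedged sketch: push dirty cache lines of a written range to
// memory before mapping it into another task with l4_task_map.
#include <l4/sys/cache.h>
#include <l4/sys/task.h>
#include <l4/re/env.h>

void map_written_region(l4_cap_idx_t dest_task, l4_addr_t start,
                        unsigned log2size)
{
  // Make the writes visible in memory, not just in this CPU's cache.
  l4_cache_clean_data(start, start + (1UL << log2size));

  // Map the region into dest_task at the same virtual address.
  l4_task_map(dest_task, L4RE_THIS_TASK_CAP,
              l4_fpage(start, log2size, L4_FPAGE_RW),
              l4_map_control(start, 0, L4_MAP_ITEM_MAP));
}
```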


> [...]
> 
> > > In any case, I wonder if there are any resources that describe the use of
> > > l4_task_map and the details of the program environment within tasks.
> > 
> > l4_task_map() has documentation:
> > https://l4re.org/doc/group__l4__task__api.html#ga8ed2ff7ba204de7c01311c2241
> > 2a2063 and is a direct API to the kernel for mapping resources, defined by
> > l4sys. At this level, there is not really a definition of how a program
> > environment looks like. However, as Fiasco needs to supply its initial
> > programs some capabilities, those are defined
> > (https://l4re.org/doc/group__l4__cap__api.html#gaa7801b63edba351bad9ea802643
> > 2b5c4). What moe and ned do, is similar, but not necessarily the same, as
> > they provide a more powerful interface to this
> > (https://l4re.org/doc/group__api__l4re__env.html) and also provide all
> > the functionality normal programs enjoy, like argument lists,
> > environment variables, etc.
> 
> I've been spending plenty of time looking at the documentation. However, I 
> really feel that the fundamentals of the system are often not readily 
> documented, at least in such reference documentation. So, I've also spent 
> time 
> looking at teaching materials related to L4Re and Fiasco, some of which are 
> helpful, but they obviously do not go into much detail.
> 
> That leaves the code, which is not always very easy to follow. Part of the 
> reason I just decided to implement my own IPC library and interface 
> description language was that the IPC support in L4Re is incoherent, with 
> different approaches used in different places, and sometimes convoluted, also 
> not relating obviously to the low-level libraries, thus making it difficult 
> to 
> identify common areas of functionality.

For sure this is an area where the code is pretty involved. Back in
the old days we had an IDL compiler that grew and grew and was in the
end not easy to maintain. When we switched to the capability system,
and thus changed all the APIs, we had the choice of either adapting
the IDL compiler or doing something different. Back then it was a
major hassle to parse the input, because eventually one wants the IDL
compiler to understand the whole language so that all sorts of types
can be used (of course one could make compromises there). With C/C++
that was not easy, at least back then. Now there's LLVM, which is a
major improvement in this area, but the actual tool would still need
to be implemented and maintained. As you can see, we opted for the
"do something else" option: with C++ as our main language and the
possibilities it offers, the idea was to implement the "IDL thing"
purely in C++, directly in the code. That's what we have now. For me,
all the code around it is the IDL compiler; abstractly, if it were
not in header files it would sit somewhere else, but it would exist
in one form or another.
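To make that concrete, here is a minimal sketch of the style (the interface name, operation, and protocol number 0x44 are made up for illustration; the L4::Kobject_t / L4_INLINE_RPC pattern is the one used throughout l4re-core):

```cpp
// Hedged sketch of an "IDL purely in C++" interface definition.
// The opcode is derived from the operation's position in the
// Rpcs list; marshalling and stub code come from the templates.
#include <l4/sys/cxx/ipc_iface>
#include <l4/sys/capability>

struct Calc : L4::Kobject_t<Calc, L4::Kobject, 0x44 /* assumed proto */>
{
  // Pointer parameters are treated as outputs by the framework.
  L4_INLINE_RPC(long, add, (l4_uint32_t a, l4_uint32_t b,
                            l4_uint32_t *result));
  typedef L4::Typeid::Rpcs<add_t> Rpcs;
};
```

A client then simply calls `calc_cap->add(a, b, &res)`, while the server dispatches incoming calls against the same definition.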
 
> This might just sound like me complaining, but I also have some concerns 
> about 
> being able to verify the behaviour of some of the code. For example, I 
> recently found that my dataspace implementation was getting requests from a 
> region mapper/manager with an opcode of 0x100000000, which doesn't make any 
> sense to me at all, given that the dataspace interface code in L4Re 
> implicitly 
> defines opcodes that are all likely to be very small integers. At first I 
> obviously blamed my own code, but then I found that in the IPC call 
> implementation found here...
> 
> pkg/l4re-core/l4sys/include/cxx/ipc_iface
> 
> ...if I explicitly cleared the first message register before this statement...
> 
>   int send_bytes =
>     Args::template write_op<Do_in_data>(mrs->mr, 0, Mr_bytes,
>                                         Opt::Opcode, a...);
> 
> ...then the opcode was produced as expected again.

Which does not fully make sense to me, because the message registers
seem to be written starting from register 0. Anyway, do you maybe
have an example?

> I suppose what I am trying 
> to communicate is that some of the organisation of the code is not conducive 
> to inspection, nor does it readily reflect the mechanisms involved. Although 
> the availability of software to do a task arguably diminishes the need to be 
> familiar with what the software does, that arrangement only works if the 
> software is usable, extensible and does what it is supposed to.

Indeed, I cannot agree more.

> In case you might be wondering why I am doing any of this (as I sometimes do 
> myself), I am attempting to integrate a filesystem into L4Re, but this also 
> leads to the matter of running programs from the filesystem. And so, it 
> becomes interesting to try and create tasks and populate them with those 
> programs.

Yes, for sure!



Adam

_______________________________________________
l4-hackers mailing list
l4-hackers@os.inf.tu-dresden.de
https://os.inf.tu-dresden.de/mailman/listinfo/l4-hackers
