On 01/11/17 15:09, Scott D Phillips wrote:
Lionel Landwerlin <lionel.g.landwer...@intel.com> writes:

On 31/10/17 23:04, Scott D Phillips wrote:
Lionel Landwerlin <lionel.g.landwer...@intel.com> writes:

On 31/10/17 20:54, Scott D Phillips wrote:
Lionel Landwerlin <lionel.g.landwer...@intel.com> writes:

We want to introduce a reader interface for accessing memory, so that
later on we can use different ways of storing the content of the GTT
address space that don't involve a pointer to a linear buffer.
I'm kinda sceptical that this is the best way to achieve what you want
here. It strikes me as code that we'll look at in a year and wonder
what's going on.

If I'm understanding, it seems like the essence of what you're going for
here is in the one place where you're using the sub_struct_reader. Maybe
instead of plumbing the reader object through everywhere, you can add a
callback just in gen_print_group for fixing up offsets to pointers, and
then leave everywhere else assuming contiguous memory blocks as today.
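
As a rough illustration of that callback idea (hypothetical names, not the actual gen_decoder API), the hook could look something like:

#include <stdint.h>
#include <stddef.h>

struct decode_ctx;

/* Resolve a graphics/GTT address to host memory (or NULL if unmapped). */
typedef const uint32_t *(*resolve_addr_fn)(struct decode_ctx *ctx,
                                           uint64_t gfx_addr);

struct decode_ctx {
   resolve_addr_fn resolve;  /* only used when chasing an offset/address */
   void *user_data;
};

/* Adjacent dwords keep using plain pointer math; only indirections
 * go through the callback. */
static const uint32_t *
follow_pointer(struct decode_ctx *ctx, uint64_t gfx_addr)
{
   return ctx->resolve ? ctx->resolve(ctx, gfx_addr) : NULL;
}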
First, thanks for your time reviewing this!

I should have stated that in patch 33 I introduce a sparse memory object
that isn't contiguous.
It's based on the data structure described here :
https://en.wikipedia.org/wiki/Hash_array_mapped_trie

The idea is to split the memory into 4KiB chunks while still making it
look like a 64-bit address space.
The trie structure allows pages to be reused at different points in time
without keeping an actual copy of the whole address space.
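
Roughly, the per-page indirection could look like the sketch below. It's a
simplified fixed-fanout radix trie covering a 48-bit address space rather
than the HAMT the patch actually uses, and all the names are made up:

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12                     /* 4 KiB pages */
#define PAGE_SIZE  (1ull << PAGE_SHIFT)
#define LVL_BITS   9                      /* 4 levels x 9 bits of page index
                                           * -> a 48-bit address space */
#define LVL_MASK   ((1u << LVL_BITS) - 1)
#define LEVELS     4

struct trie_node {
   void *slots[1u << LVL_BITS];   /* interior levels: child nodes;
                                   * last level: 4 KiB pages */
};

struct sparse_mem {
   struct trie_node *root;
};

/* Walk (and optionally allocate) the path down to the 4 KiB page containing
 * 'addr'.  Only pages that were actually written ever get allocated;
 * everything else stays NULL. */
static void *
sparse_mem_page(struct sparse_mem *mem, uint64_t addr, bool alloc)
{
   uint64_t page = addr >> PAGE_SHIFT;
   void **slot = (void **)&mem->root;

   for (int level = LEVELS - 1; level >= 0; level--) {
      if (*slot == NULL) {
         if (!alloc)
            return NULL;
         *slot = calloc(1, sizeof(struct trie_node));
      }
      struct trie_node *node = *slot;
      slot = &node->slots[(page >> (level * LVL_BITS)) & LVL_MASK];
   }

   if (*slot == NULL && alloc)
      *slot = calloc(1, PAGE_SIZE);

   return *slot;
}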
What I meant was that most dword reads will really be adjacent in a
piece of memory, and leaving the simple pointer math there is
clearer. You only need a callback for indirection when you're
chasing an offset or an address.

For example, a couple of pages might have been written by relocations
associated with the first batch buffer, and then 10 batches later you
overwrite them.
The amount of memory we need to allocate for storing 2 snapshots is just
the modified pages (plus ~12 nodes in the trie, but those are less than
300 bytes).
That allows the UI to decode 2 batches at the same time, along with all
the associated memory, at a small cost.
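
Continuing the trie sketch above (still hypothetical, not the patch 33 code),
snapshot sharing boils down to cloning the root and aliasing everything
underneath it; a later write would then clone the nodes along the affected
path (classic copy-on-write, not shown here):

#include <string.h>

/* The copy's slots alias the same child nodes/pages as the original,
 * so taking a snapshot is cheap and shares all unmodified pages. */
static struct trie_node *
node_clone(const struct trie_node *src)
{
   struct trie_node *copy = calloc(1, sizeof(*copy));
   if (src)
      memcpy(copy->slots, src->slots, sizeof(copy->slots));
   return copy;
}

static struct sparse_mem
sparse_mem_snapshot(const struct sparse_mem *mem)
{
   return (struct sparse_mem) { .root = node_clone(mem->root) };
}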
Really there's no need to manage any memory for the buffers themselves;
they're immutably stored in the aub file. If you mmap the entire file,
then you would just need a map of gfx addrs to file addrs to direct your
decoding.
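
A rough sketch of that idea, with made-up names: mmap the whole file, record
(gfx address, size, file offset) ranges while parsing the memory-write blocks,
and translate GTT addresses into pointers inside the mapping (linear scan here
for clarity; a sorted table with binary search would scale better):

#include <stdint.h>
#include <stddef.h>

struct aub_range {
   uint64_t gfx_addr;
   uint64_t size;
   uint64_t file_offset;
};

struct aub_mapping {
   const uint8_t *map;        /* mmap() of the entire aub file */
   struct aub_range *ranges;  /* built while parsing the aub stream */
   size_t num_ranges;
};

/* Translate a graphics address into a pointer inside the file mapping,
 * or NULL if nothing in the aub file covers that address. */
static const void *
aub_translate(const struct aub_mapping *aub, uint64_t gfx_addr)
{
   for (size_t i = 0; i < aub->num_ranges; i++) {
      const struct aub_range *r = &aub->ranges[i];
      if (gfx_addr >= r->gfx_addr && gfx_addr < r->gfx_addr + r->size)
         return aub->map + r->file_offset + (gfx_addr - r->gfx_addr);
   }
   return NULL;
}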

Thanks, I'll try that.
Thinking more about it, I remember that intel_aubdump will break buffers
up into 32KiB chunks, so buffers bigger than 32KiB would be a problem for
this idea. We could try simply not doing that splitting in aubdump and see
whether it has any other adverse effects.

I gave your approach a try and it seems to work, but I'm still dealing with bugs everywhere :(
Still, I like the idea of the trie :)

