On 11-Oct-18 11:07 AM, Shreyansh Jain wrote:
On Thursday 11 October 2018 03:32 PM, Shreyansh Jain wrote:
On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote:
On 09-Oct-18 11:45 AM, Shreyansh Jain wrote:
On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote:
Hello Anatoly,

On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote:
On 25-Sep-18 1:54 PM, Shreyansh Jain wrote:
A common library, valid for the dpaaX drivers, which is used to
maintain a local copy of PA->VA translations.

In the case of physical addressing mode (one of the options for FSLMC,
and the only option for the DPAA bus), the addresses of Rx'd
descriptors are physical. These need to be converted into the
equivalent VA for rte_mbuf and other similar calls.

Using rte_mem_virt2iova or rte_mem_virt2phy is expensive. This
library is an attempt to reduce the overall cost associated with
this translation.

A small table is maintained, containing consecutive entries that
together represent a contiguous physical range. Each of these entries
stores the equivalent VA, which is populated during mempool creation
or through the memory allocation/deallocation callbacks.
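
To illustrate the idea, a minimal sketch of such a table, assuming
fixed 2MB-sized entries (names and layout below are only indicative,
not the actual dpaax_iova_table API):

#include <stdint.h>
#include <stddef.h>

/* Sketch only: one contiguous physical range, split into fixed-size
 * (2MB) entries, each holding the equivalent VA (0 if not yet known). */
#define ENTRY_LEN (2UL * 1024 * 1024)

struct pa_va_range {
	uint64_t pa_base;     /* physical base of the covered range */
	size_t nb_entries;    /* number of 2MB entries in the range */
	uint64_t *vaddrs;     /* VA of each 2MB chunk, 0 if unset */
};

/* Populate an entry; called from mempool creation or the memory
 * allocation callbacks, where both VA and PA are known. */
static inline void
pa_va_update(struct pa_va_range *r, uint64_t paddr, void *vaddr)
{
	uint64_t off = paddr - r->pa_base;

	if (paddr >= r->pa_base && off / ENTRY_LEN < r->nb_entries)
		r->vaddrs[off / ENTRY_LEN] =
			(uint64_t)(uintptr_t)vaddr - (off % ENTRY_LEN);
}

/* O(1) PA->VA translation, avoiding a full walk of the memory maps. */
static inline void *
pa_to_va(const struct pa_va_range *r, uint64_t paddr)
{
	uint64_t off = paddr - r->pa_base;

	if (paddr < r->pa_base || off / ENTRY_LEN >= r->nb_entries ||
	    r->vaddrs[off / ENTRY_LEN] == 0)
		return NULL;
	return (void *)(uintptr_t)(r->vaddrs[off / ENTRY_LEN] +
				   (off % ENTRY_LEN));
}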


[...]



Also, a couple of nitpicks below.

  config/common_base                          |   5 +
  config/common_linuxapp                      |   5 +
  drivers/common/Makefile                     |   4 +
  drivers/common/dpaax/Makefile               |  31 ++
  drivers/common/dpaax/dpaax_iova_table.c     | 509 ++++++++++++++++++
  drivers/common/dpaax/dpaax_iova_table.h     | 104 ++++
  drivers/common/dpaax/dpaax_logs.h           |  39 ++
  drivers/common/dpaax/meson.build            |  12 +

<snip>

+    DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p),"
+            " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset,
+            vaddr, paddr, length);
+    return 0;
+}
+
+int
+dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused)

len is not unused.

I will fix this.
Actually, this function itself is mostly useless - it exists more for symmetry.
Callers would either simply update the table or ignore it completely. But, yes, it is indeed wrong that I marked the parameter as unused.


Actually, I was wrong in my first reply. In the case of dpaax_iova_table_del(), len is indeed redundant. This is because the mapping is always for a complete page (of minimum 2MB size), even if the request is for a smaller length. So, only a single entry (of fixed size) would ever be removed.

In fact, while on this topic, I think deleting a PA->VA entry is itself incorrect (not just useless). A single entry (~2MB equivalent) can represent multiple users (each working on an rte_malloc'd area, for example). So, effectively, it is always an update - never an add or a del.

I'm not sure what you mean here. If you got a mem event about a memory area being freed, it's guaranteed to *not* have any users - neither malloc, nor any other memory. And len is always page-aligned.

Ok. Maybe I am getting this wrong, but consider this (a sketch follows the list):

1) hugepage size = 2MB
2) a = malloc(1M)
   This pins an entry in the table for a block starting at VA=(a) and PA=(a'). Each entry covers 2MB - that means, even if someone were to look up the equivalent PA for a+1048577, they would get one (though that is an incorrect access).
3) b = malloc(1M)
   This *might* lead to a case where the same 2MB page is used, so VA(b) == VA(a)+1M. Being hugepage backed, PA(b) = PA(a)+1M. After b, the PA-VA table has a single 2MB entry representing both allocations. It can be used for translation by any thread requesting the PA of a or b.
4) free(a)
   - This would attempt to remove one 2MB entry from the PA-VA table. But 'b' is still valid, and a lookup of get_pa(VA(b)) should still return PA(b).
   - 'len' is not even used, as each entry in the PA-VA table is of a fixed size.
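
A rough code equivalent of the above, assuming both allocations happen
to land on the same 2MB hugepage (possible, though not guaranteed):

#include <rte_malloc.h>

/* Sketch of steps 2-4 above; 'a' and 'b' may end up sharing one 2MB
 * hugepage. */
static void
shared_page_example(void)
{
	void *a = rte_malloc(NULL, 1024 * 1024, 0); /* pins one 2MB PA-VA entry */
	void *b = rte_malloc(NULL, 1024 * 1024, 0); /* may reuse the same page  */

	rte_free(a); /* 'b' still lives on that page, so the 2MB entry must
		      * remain valid for translating addresses of 'b'; the
		      * page itself is not released, so no mem event fires. */
	rte_free(b); /* only now can the page (and thus the entry) go away */
}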

Just to add to this:
- If we are talking about the mem_event callback, it definitely won't be a case where the same page is still being served under another rte_malloc. But delete calls can also come from users of the PA-VA table based on their own rte_free().

And your comment makes me think - I should probably delete entries from the table only when the mem_event callback is received.

Mem events are not triggered on rte_free(), they're triggered on page deallocation. A call to rte_free/rte_memzone_free/rte_mempool_free etc. *might* trigger a page deallocation, but *only* if the memory area being freed encompasses an entire page. If you rte_malloc() 64 bytes and then rte_free() those 64 bytes, you won't get a mem event *unless* these were the only 64 bytes allocated on a particular page, and the entire page is no longer used by anything else.
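
For reference, a sketch of tying removal to the allocator's memory
events rather than to individual rte_free() calls (the callback
function name below is made up; rte_mem_event_callback_register() and
the RTE_MEM_EVENT_* events are the existing EAL interfaces):

#include <rte_common.h>
#include <rte_memory.h>

#include "dpaax_iova_table.h"

/* Sketch: drop PA->VA entries only when whole pages are actually
 * deallocated, i.e. on RTE_MEM_EVENT_FREE, never on plain rte_free(). */
static void
dpaax_mem_event_cb(enum rte_mem_event type, const void *addr, size_t len,
		   void *arg __rte_unused)
{
	if (type != RTE_MEM_EVENT_FREE)
		return;

	/* addr/len describe complete pages with no remaining users, so
	 * removing the corresponding fixed-size entries is safe here. */
	dpaax_iova_table_del(rte_mem_virt2phy(addr), len);
}

/* Registration, e.g. during bus or driver initialization:
 *	rte_mem_event_callback_register("dpaax_iova", dpaax_mem_event_cb, NULL);
 */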



In the above, (3) is an assumption I am making based on my understanding of how the mem allocator works. Is that wrong?

Basically, this is a restriction of this table - it has a minimum chunk size of 2MB, even for 1GB hugepages, and hence it is not possible to honor deletes. I know this is convoluted logic, but it keeps the table simple and usable without much performance impact.
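
Just to make the granularity point concrete (numbers only, the macro
names here are illustrative, nothing dpaax-specific):

/* Fixed 2MB granularity: a delete request never affects less than one
 * entry, and a single 1GB hugepage is tracked as many 2MB entries. */
#define ENTRY_LEN	(2UL * 1024 * 1024)		/* 2MB per entry */
#define HUGEPAGE_1G	(1UL * 1024 * 1024 * 1024)	/* 1GB hugepage  */

/* HUGEPAGE_1G / ENTRY_LEN == 512 entries for one 1GB page, while a del
 * request for e.g. 64KB would still map to exactly one 2MB entry. */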

[...]





--
Thanks,
Anatoly
