On 10/18/2011 12:25 AM, Alexis Berlemont wrote:
> Hi,
>
> 2011/10/12 Fernando Herrero Carrón <[email protected]>:
>> On 12 October 2011 16:13, Fernando Herrero Carrón <[email protected]>
>> wrote:
>>>
>>> On 11 October 2011 19:12, Alexis Berlemont
>>> <[email protected]> wrote:
>>> [...]
>>>
>>>>
>>>> I took some time to compare both versions of code (comedi and
>>>> analogy). I did not find anything interesting in mite.c. I was about
>>>> to ask you to increase verbosity (debug + a specific patch) when I got
>>>> a glimpse on the allocation of the asynchronous buffer on the comedi
>>>> side.
>>>>
>>>> The methods are not the same at that level:
>>>> - comedi: n * dma_alloc_coherent + a vmap at the end
>>>> - analogy: a big vmalloc + n * page_to_phys(vmalloc_to_page(vaddr))
>>>
>>> Hmmm, quoting
>>> http://www.mjmwired.net/kernel/Documentation/DMA-mapping.txt:
>>>
>>> If you acquired your memory via the page allocator
>>> (i.e. __get_free_page*()) or the generic memory allocators
>>> (i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
>>> that memory using the addresses returned from those routines.
>>>
>>> This means specifically that you may _not_ use the memory/addresses
>>> returned from vmalloc() for DMA. It is possible to DMA to the
>>> _underlying_ memory mapped into a vmalloc() area, but this requires
>>> walking page tables to get the physical addresses, and then
>>> translating each of those pages back to a kernel address using
>>> something like __va(). [ EDIT: Update this when we integrate
>>> Gerd Knorr's generic code which does this. ]
>>>
>>> So, I guess analogy indeed took the walking approach mentioned there? If I
>>> understand it right, the following loop in "a4l_buf_alloc()":
>>>
>>>     for (vaddr = vabase; vaddr < vabase + buf_desc->size;
>>>          vaddr += PAGE_SIZE)
>>>         buf_desc->pg_list[(vaddr - vabase) >> PAGE_SHIFT] =
>>>             (unsigned long) page_to_phys(vmalloc_to_page(vaddr));
>>>
>>> does exactly this: it records the physical address of every page of the
>>> buffer, even though those pages may not be physically contiguous. Then the
>>> MITE is able to scatter data across the ring descriptors calculated in
>>> a4l_mite_buf_change()? What is the benefit of using vmalloc?
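The loop quoted above can be tried outside the kernel. What follows is only a
sketch: demo_translate() is a made-up stand-in for
page_to_phys(vmalloc_to_page(vaddr)), and build_page_list() and the SKETCH_*
constants are illustrative names, not Analogy code. It shows how the index
(vaddr - vabase) >> PAGE_SHIFT yields one list entry per page, wherever the
pages happen to sit physically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for page_to_phys(vmalloc_to_page(vaddr)). */
typedef uint64_t (*translate_fn)(uintptr_t vaddr);

/*
 * Made-up translation used only for the demonstration: consecutive
 * virtual pages map to physical pages in reverse order inside a
 * 16-page window, so the result is visibly non-contiguous.
 */
static uint64_t demo_translate(uintptr_t vaddr)
{
    uint64_t page = (vaddr >> SKETCH_PAGE_SHIFT) & 15u;
    return 0x79000000ULL + ((15u - page) << SKETCH_PAGE_SHIFT);
}

/*
 * Record one physical address per page of [vabase, vabase + size).
 * Returns the number of entries written to pg_list.
 */
static size_t build_page_list(uintptr_t vabase, size_t size,
                              translate_fn translate,
                              uint64_t *pg_list, size_t max_entries)
{
    size_t n = 0;

    assert(pg_list != NULL && translate != NULL);

    for (uintptr_t vaddr = vabase; vaddr < vabase + size;
         vaddr += SKETCH_PAGE_SIZE) {
        size_t idx = (vaddr - vabase) >> SKETCH_PAGE_SHIFT;

        if (idx >= max_entries)
            break; /* never write past the caller's array */
        pg_list[idx] = translate(vaddr);
        n++;
    }
    return n;
}
```

Calling build_page_list(0x10000, 4 * SKETCH_PAGE_SIZE, demo_translate, list, 16)
fills four entries, one per page of the virtually contiguous range.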
>
> A vmalloc'ed area is made of pages that do not have to be physically
> contiguous: the kernel's page table is filled so that scattered
> physical pages are reachable through a virtually contiguous area.
> This is a great advantage when the OS cannot allocate a physically
> contiguous area because of fragmentation (free 4KB pages scattered
> among used pages).
>
> - On the device side, if your DMA controller can work with a
> non-contiguous buffer, you just have to give it the address of each page.
> - On the CPU side, you work with a virtually contiguous buffer (so
> it is really easy to manipulate).
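As an illustration of the device side, here is a minimal sketch of a
descriptor ring with one entry per page. The layout of struct sketch_desc and
the name build_ring() are assumptions made for the example; the real MITE
descriptor format and a4l_mite_buf_change() may differ. Fields are kept in
host byte order here, whereas a driver would convert them for the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical link-chained descriptor; not the actual MITE layout. */
struct sketch_desc {
    uint32_t count; /* bytes to transfer for this chunk */
    uint32_t addr;  /* bus address of the data page */
    uint32_t next;  /* bus address of the next descriptor */
};

/*
 * Fill a ring of n descriptors, one per page. ring_bus is the bus
 * address at which the device sees ring[0]; the last descriptor links
 * back to the first so the transfer can wrap around.
 */
static void build_ring(struct sketch_desc *ring, uint32_t ring_bus,
                       const uint32_t *pages, size_t n, uint32_t chunk)
{
    assert(ring != NULL && pages != NULL && n > 0);

    for (size_t i = 0; i < n; i++) {
        ring[i].count = chunk;
        ring[i].addr = pages[i];
        ring[i].next = ring_bus +
                       (uint32_t)(((i + 1) % n) * sizeof(*ring));
    }
}
```

The CPU keeps using the virtually contiguous mapping; only this ring needs to
know where each page really is.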
>
> I did not use vmap because I did not know about it...
>
>>> Is copying from/to
>>> user space easier that way?
>>>
>>> According to my previous test, the calculated addresses are indeed all
>>> larger than 2^32. This makes sense as well, since this machine appears to
>>> have 6GB of memory:
>>>
>>> [ 0.000000] Memory: 5992084k/7208960k available (5325k kernel code,
>>> 919428k absent, 297448k reserved, 3285k data, 920k init)
>>>
>>> The comedi drivers and kernel were not installed by myself, so
>>> reinstalling them is somewhat more involved. If you still feel it would be
>>> useful to check them out I will reinstall them, but this looks to me like
>>> the possible source of the problem.
>>>
>>
>> I got it working!!! Simple test: remove two of the three RAM modules. Now
>> the machine is working with 2GB of memory:
>>
>> [ 0.000000] Memory: 1988808k/2095680k available (5325k kernel code, 452k
>> absent, 106420k reserved, 3285k data, 920k init)
>>
>> Now "cmd_read" is properly acquiring the input signal. Output of dmesg now:
>>
>> [ 109.389613] Analogy: sizeof(dma_addr_t) = 8
>> [ 109.389614] Analogy: ring->descriptors_dma_addr = 7a279000
>> [ 109.389615] Analogy: cpu_to_le32(ring->descriptors_dma_addr) = 7a279000
>> [ 109.389617] Analogy: buf->pg_list[0] = 79322000
>> [ 109.389618] Analogy: buf->pg_list[1] = 799bf000
>> [ 109.389619] Analogy: buf->pg_list[2] = 79b67000
>> [ 109.389620] Analogy: buf->pg_list[3] = 79303000
>> [ 109.389621] Analogy: buf->pg_list[4] = 79015000
>> [ 109.389622] Analogy: buf->pg_list[5] = 7997f000
>> [ 109.389623] Analogy: buf->pg_list[6] = 792c1000
>> [ 109.389625] Analogy: buf->pg_list[7] = 792a7000
>> [ 109.389626] Analogy: buf->pg_list[8] = 7a087000
>> [ 109.389627] Analogy: buf->pg_list[9] = 792c0000
>> [ 109.389628] Analogy: buf->pg_list[10] = 79b36000
>> [ 109.389629] Analogy: buf->pg_list[11] = 792b6000
>> [ 109.389630] Analogy: buf->pg_list[12] = 792d0000
>> [ 109.389631] Analogy: buf->pg_list[13] = 7999d000
>> [ 109.389632] Analogy: buf->pg_list[14] = 7a1f7000
>> [ 109.389634] Analogy: buf->pg_list[15] = 791e0000
>>
>> with all pg_list[] entries below 2^32!!
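The cpu_to_le32() call in the log above suggests that the addresses handed to
the device are 32 bits wide. Assuming that is the case, a page located above
4GB loses its upper bits when it is stored, and the device is pointed at the
wrong page. The sketch below shows the arithmetic; fits_in_u32() and
first_high_page() are illustrative helpers, not part of Analogy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Non-zero if addr can be stored in a 32-bit field without loss. */
static int fits_in_u32(uint64_t addr)
{
    return addr <= UINT32_MAX;
}

/*
 * Return the index of the first page address that does not fit in
 * 32 bits, or n if every entry is safe to hand to a 32-bit device.
 */
static size_t first_high_page(const uint64_t *pg_list, size_t n)
{
    assert(pg_list != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
        if (!fits_in_u32(pg_list[i]))
            return i;
    return n;
}
```

For instance, 0x79322000 passes the check, while a page at 0x179322000 (just
above 4GB) does not: truncated to 32 bits it becomes 0x79322000 again, a
different page.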
>>
>> For now this is enough for us, since we can live with a 4GB machine. I think
>> the "vmalloc()" approach in analogy should be reworked, but my knowledge of
>> Linux's memory-handling internals is very limited. Please let me know if
>> I can contribute by testing any patches.
>>
>
> Over the last few days, I did not have enough time to rework the buffer
> allocation system the way Comedi does it. So, I implemented a quick
> workaround that Gilles suggested to me.
>
> Could you validate it on your 64-bit machine (with more than 4GB of RAM)?
>
> The code is available here:
> git://[email protected]/xenomai-abe.git
> branch: analogy.
I will pull this for -rc5.
--
Gilles.
_______________________________________________
Xenomai-help mailing list
[email protected]
https://mail.gna.org/listinfo/xenomai-help