Roland Dreier <[EMAIL PROTECTED]> wrote:
>
> ...
> +/*
> + * This isn't perfect, but it's close enough for timing work. We want this
> + * to work on systems where the cycle counter isn't the same as the clock
> + * frequency.  The one msec spin is OK, since we execute this only once
> + * when first loaded.  We don't use CURRENT_TIME because on some systems
> + * it only has jiffy resolution; we just assume udelay is well calibrated
> + * and that we aren't likely to be rescheduled.  Do it multiple times,
> + * with a yield in between, to try to make sure we get the "true minimum"
> + * value.
> + * _ipath_pico_per_cycle isn't going to lead to completely accurate
> + * conversions from timestamps to nanoseconds, but it's close enough
> + * for our purposes, which is mainly to allow people to show events with
> + * nsecs or usecs if desired, rather than cycles.
> + */
> +void ipath_init_picotime(void)
> +{
> +     int i;
> +     u_int64_t ts, te, delta = -1ULL;
> +
> +     for (i = 0; i < 5; i++) {
> +             ts = get_cycles();
> +             udelay(250);
> +             te = get_cycles();
> +             if ((te - ts) < delta)
> +                     delta = te - ts;
> +             yield();
> +     }
> +     _ipath_pico_per_cycle = 250000000 / delta;
> +}

hm, I hope this is debug code which is going away.  If not, we should take
a look at what it's trying to do here.
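
If it does need to stay, it could at least avoid trusting the delay-loop
calibration by timing get_cycles() against the clock directly.  Rough,
untested sketch (assumes a 64-bit arch so the u64 divide links, and that
getnstimeofday() resolution is much better than the 250usec window):

void ipath_init_picotime(void)
{
	struct timespec ts, te;
	u64 cs, ce, nsecs;

	getnstimeofday(&ts);
	cs = get_cycles();
	udelay(250);
	ce = get_cycles();
	getnstimeofday(&te);

	/* elapsed wall time in nanoseconds */
	nsecs = (te.tv_sec - ts.tv_sec) * NSEC_PER_SEC +
		(te.tv_nsec - ts.tv_nsec);
	/* picoseconds per cycle = elapsed ns * 1000 / elapsed cycles */
	_ipath_pico_per_cycle = (nsecs * 1000) / (ce - cs);
}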


> +/*
> + * Our version of the kernel mlock function.  This function is no longer
> + * exposed, so we need to do it ourselves.  It takes a given start page
> + * (page aligned user virtual address) and pins it and the following specified
> + * number of pages.
> + * For now, num_pages is always 1, but that will probably change at some
> + * point (because caller is doing expected sends on a single virtually
> + * contiguous buffer, so we can do all pages at once).
> + */
> +int ipath_mlock(unsigned long start_page, size_t num_pages, struct page **p)
> +{
> +     int n;
> +
> +     _IPATH_VDBG("pin %lx pages from vaddr %lx\n", num_pages, start_page);
> +     down_read(&current->mm->mmap_sem);
> +     n = get_user_pages(current, current->mm, start_page, num_pages, 1, 1,
> +                        p, NULL);
> +     up_read(&current->mm->mmap_sem);
> +     if (n != num_pages) {
> +             _IPATH_INFO
> +                 ("get_user_pages (0x%lx pages starting at 0x%lx) failed with %d\n",
> +                  num_pages, start_page, n);
> +             if (n < 0)      /* it's an errno */
> +                     return n;
> +             return -ENOMEM; /* no way to know actual error */
> +     }
> +
> +     return 0;
> +}

OK.  It's perhaps not a very well named function.
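
For what it's worth, here's a hypothetical caller (illustrative only, not
from the patch) showing how it pairs with ipath_munlock over the pinned
page's lifetime:

static int pin_user_buffer(unsigned long uaddr, struct page **pages)
{
	int ret;

	/* uaddr must be page aligned, per the comment above */
	ret = ipath_mlock(uaddr, 1, pages);
	if (ret)
		return ret;

	/* ... DMA to/from the pinned page ... */

	ipath_munlock(1, pages);
	return 0;
}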

> +/*
> + * this is similar to ipath_mlock, but it's always one page, and we mark
> + * the page as locked for i/o, and shared.  This is used for the user process
> + * page that contains the destination address for the rcvhdrq tail update,
> + * so we need to have the vma.  If we don't do this, the page can be taken
> + * away from us on fork, even if the child never touches it, and then
> + * the user process never sees the tail register updates.
> + */
> +int ipath_mlock_nocopy(unsigned long start_page, struct page **p)
> +{
> +     int n;
> +     struct vm_area_struct *vm = NULL;
> +
> +     down_read(&current->mm->mmap_sem);
> +     n = get_user_pages(current, current->mm, start_page, 1, 1, 1, p, &vm);
> +     up_read(&current->mm->mmap_sem);
> +     if (n != 1) {
> +             _IPATH_INFO("get_user_pages for 0x%lx failed with %d\n",
> +                         start_page, n);
> +             if (n < 0)      /* it's an errno */
> +                     return n;
> +             return -ENOMEM; /* no way to know actual error */
> +     }
> +     vm->vm_flags |= VM_SHM | VM_LOCKED;
> +
> +     return 0;
> +}

I don't think we want to be setting the user's VMA's vm_flags in this
manner.  This is purely to retain the physical page across fork?
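
If so, VM_DONTCOPY seems like the more targeted flag - the VMA simply
isn't copied into the child at fork(), so there's no COW to lose the
page to.  Untested sketch (note it takes mmap_sem for write, since
vm_flags is being modified):

int ipath_mlock_nocopy(unsigned long start_page, struct page **p)
{
	struct vm_area_struct *vm = NULL;
	int n;

	down_write(&current->mm->mmap_sem);
	n = get_user_pages(current, current->mm, start_page, 1, 1, 1,
			   p, &vm);
	if (n == 1)
		/* keep this VMA out of the child at fork() */
		vm->vm_flags |= VM_DONTCOPY;
	up_write(&current->mm->mmap_sem);

	if (n == 1)
		return 0;
	return n < 0 ? n : -ENOMEM;
}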

> +/*
> + * Our version of the kernel munlock function.  This function is no longer
> + * exposed, so we need to do it ourselves.  It unpins the specified
> + * number of previously pinned pages from the given page array.
> + */
> +int ipath_munlock(size_t num_pages, struct page **p)
> +{
> +     int i;
> +
> +     for (i = 0; i < num_pages; i++) {
> +             _IPATH_MMDBG("%u/%lu put_page %p\n", i, num_pages, p[i]);
> +             SetPageDirty(p[i]);
> +             put_page(p[i]);
> +     }
> +     return 0;
> +}

Nope, SetPageDirty() doesn't tell the VM that the page is dirty - it'll
never get written out.  Use set_page_dirty_lock().
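
i.e. something like this (untested):

int ipath_munlock(size_t num_pages, struct page **p)
{
	int i;

	for (i = 0; i < num_pages; i++) {
		_IPATH_MMDBG("%u/%lu put_page %p\n", i, num_pages, p[i]);
		/* marks the page dirty in a way the VM will notice */
		set_page_dirty_lock(p[i]);
		put_page(p[i]);
	}
	return 0;
}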

