On Wed, 26 Dec 2012 17:59:52 -0800
Kent Overstreet <koverstr...@google.com> wrote:

> Previously, aio_read_event() pulled a single completion off the
> ringbuffer at a time, locking and unlocking each time.  Changed it to
> pull off as many events as it can at a time, and copy them directly to
> userspace.
> 
> This also fixes a bug where if copying the event to userspace failed,
> we'd lose the event.
> 
> Also convert it to wait_event_interruptible_hrtimeout(), which
> simplifies it quite a bit.
> 
> ...
>
> -static int aio_read_evt(struct kioctx *ioctx, struct io_event *ent)
> +static int aio_read_events_ring(struct kioctx *ctx,
> +                             struct io_event __user *event, long nr)
>  {
> -     struct aio_ring_info *info = &ioctx->ring_info;
> +     struct aio_ring_info *info = &ctx->ring_info;
>       struct aio_ring *ring;
> -     unsigned long head;
> -     int ret = 0;
> +     unsigned head, pos;
> +     int ret = 0, copy_ret;
> +
> +     if (!mutex_trylock(&info->ring_lock)) {
> +             __set_current_state(TASK_RUNNING);
> +             mutex_lock(&info->ring_lock);
> +     }

You're not big on showing your homework, I see :(

I agree that calling mutex_lock() in state TASK_[UN]INTERRUPTIBLE is at
least poor practice.  Assuming this is what the code is trying to do. 
But if aio_read_events_ring() is indeed called in state
TASK_[UN]INTERRUPTIBLE then the effect of the above code is to put the
task into an *unknown* state.

IOW, I don't have the foggiest clue what you're trying to do here and
you owe us all a code comment.  At least.


>       ring = kmap_atomic(info->ring_pages[0]);
> -     pr_debug("h%u t%u m%u\n", ring->head, ring->tail, ring->nr);
> +     head = ring->head;
> +     kunmap_atomic(ring);
> +
> +     pr_debug("h%u t%u m%u\n", head, info->tail, info->nr);
>  
> -     if (ring->head == ring->tail)
> +     if (head == info->tail)
>               goto out;
>  
> -     spin_lock(&info->ring_lock);
> -
> -     head = ring->head % info->nr;
> -     if (head != ring->tail) {
> -             struct io_event *evp = aio_ring_event(info, head);
> -             *ent = *evp;
> -             head = (head + 1) % info->nr;
> -             smp_mb(); /* finish reading the event before updatng the head */
> -             ring->head = head;
> -             ret = 1;
> -             put_aio_ring_event(evp);
> +     __set_current_state(TASK_RUNNING);
> +
> +     while (ret < nr) {
> +             unsigned i = (head < info->tail ? info->tail : info->nr) - head;
> +             struct io_event *ev;
> +             struct page *page;
> +
> +             if (head == info->tail)
> +                     break;
> +
> +             i = min_t(int, i, nr - ret);
> +             i = min_t(int, i, AIO_EVENTS_PER_PAGE -
> +                       ((head + AIO_EVENTS_OFFSET) % AIO_EVENTS_PER_PAGE));

min_t() is kernel shorthand for "I screwed up my types".  Methinks
`ret' should have long type.  Or, better, unsigned (negative makes no
sense).  And when a C programmer sees an variable called "i" he thinks
it has type "int", so that guy should be renamed.

Can we please clean all this up?

> +             pos = head + AIO_EVENTS_OFFSET;
> +             page = info->ring_pages[pos / AIO_EVENTS_PER_PAGE];
> +             pos %= AIO_EVENTS_PER_PAGE;
> +
> +             ev = kmap(page);
> +             copy_ret = copy_to_user(event + ret, ev + pos, sizeof(*ev) * i);
> +             kunmap(page);
> +
> +             if (unlikely(copy_ret)) {
> +                     ret = -EFAULT;
> +                     goto out;
> +             }
> +
> +             ret += i;
> +             head += i;
> +             head %= info->nr;
>       }
> -     spin_unlock(&info->ring_lock);
>  
> -out:
> +     ring = kmap_atomic(info->ring_pages[0]);
> +     ring->head = head;
>       kunmap_atomic(ring);
> -     pr_debug("%d  h%u t%u\n", ret, ring->head, ring->tail);
> +
> +     pr_debug("%d  h%u t%u\n", ret, head, info->tail);
> +out:
> +     mutex_unlock(&info->ring_lock);
> +
>       return ret;
>  }
> 
> ...
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to