Hi Ying,

I have a new patch sets which fixes this issue using fixes from your 
patches. It deviates from your patch the following way:
In my solution, the subscription refcount keeps track of a subscription 
with or without timer. I do not increment refcount for timer, and use 
the subscriber lock plus the del_timer to find outstanding timers. I 
will send the series shortly.

I applied your series and ran some tests with it. If I run test against 
each patch individually, they seem to introduce new warnings/panics. I 
think every patch should be as correct as possible. I tried to moved 
around the patches in this series to get every patch correct, which led 
me to my series above.

1. tipc: advance the time of deleting subscription from
     subscriber->subscrp_list
I get several warnings like:
WARNING: CPU: 15 PID: 356 at lib/refcount.c:128 
refcount_sub_and_test+0x5e/0x70
refcount_t: underflow; use-after-free.
Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet 
virtio_net virtio_pci virtio_ring virtio
CPU: 15 PID: 356 Comm: topsrv_tester Tainted: G  W    4.10.0+ #111
Hardware name: Ericsson AB CXC1060320/ROJ208840/5, BIOS 4.6.5 11/07/2013
Call Trace:
  dump_stack+0x4d/0x72
  __warn+0xd1/0xf0
  warn_slowpath_fmt+0x4f/0x60
  ? conn_put+0x12/0x30 [tipc]
  refcount_sub_and_test+0x5e/0x70
  refcount_dec_and_test+0x11/0x20
  tipc_subscrp_put+0x12/0x30 [tipc]
  tipc_subscrp_report_overlap+0xa7/0xc0 [tipc]
  tipc_nameseq_remove_publ+0x179/0x210 [tipc]
  tipc_nametbl_remove_publ+0x59/0xf0 [tipc]
  tipc_nametbl_withdraw+0x6c/0x130 [tipc]
  tipc_sk_withdraw+0xc0/0x100 [tipc]
  tipc_release+0x56/0x290 [tipc]
  sock_release+0x1f/0x80
  sock_close+0x12/0x20
  __fput+0xbe/0x200
  ____fput+0xe/0x10
  task_work_run+0x7e/0xb0
  do_exit+0x3fd/0xc10
  ? __do_page_fault+0x27a/0x500
  do_group_exit+0x43/0xc0
  SyS_exit_group+0x14/0x20
  entry_SYSCALL_64_fastpath+0x13/0x94
RIP: 0033:0x7fcc289d22e9
RSP: 002b:00007ffc804f0ef8 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fcc289d22e9
RDX: 0000000000000000 RSI: 00007fcc28cbe323 RDI: 0000000000000000
RBP: 00007fcc28cb9860 R08: 000000000000003c R09: 00000000000000e7
R10: ffffffffffffff90 R11: 0000000000000246 R12: 00007fcc28cb9860
R13: 00007fcc28cbec60 R14: 0000000000000000 R15: 0000000000000000

And, the one below as you forgot to delete the subscription list at 
subscription timeout.
general protection fault: 0000 [#1] SMP
Modules linked in: tipc ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 9pnet 
virtio_net virtio_pci virtio_ring virtio
CPU: 17 PID: 222 Comm: kworker/u40:2 Tainted: G        W       4.10.0+ #111
Hardware name: Ericsson AB CXC1060320/ROJ208840/5, BIOS 4.6.5 11/07/2013
Workqueue: tipc_send tipc_send_work [tipc]
task: ffff880c444eef00 task.stack: ffffc90007734000
RIP: 0010:tipc_subscrb_subscrp_delete+0x6c/0x110 [tipc]
RSP: 0018:ffffc90007737d48 EFLAGS: 00010246
RAX: dead000000000200 RBX: ffff880c3cebd480 RCX: 0000000000000101
RDX: dead000000000100 RSI: 0000000000000001 RDI: ffff880c3cebd480
RBP: ffffc90007737d70 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000011 R11: 0000000000000000 R12: dead0000000000a8
R13: 0000000000000000 R14: ffff880c44638088 R15: ffff880c446380a0
FS:  0000000000000000(0000) GS:ffff880c7f440000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000084f0c0 CR3: 0000000001c0b000 CR4: 00000000001406e0
Call Trace:
  tipc_subscrb_release_cb+0x17/0x30 [tipc]
  tipc_close_conn+0x80/0x90 [tipc]
  tipc_send_work+0x1a7/0x1b0 [tipc]
  process_one_work+0x145/0x410
  worker_thread+0x69/0x4c0
  kthread+0x128/0x160
  ? process_one_work+0x410/0x410
  ? kthread_create_on_node+0x40/0x40
  ret_from_fork+0x29/0x40
Code: 49 89 c4 4d 85 ed 74 18 48 8d b3 80 00 00 00 ba 1c 00 00 00 4c 89 
ef e8 e3 51 2e e1 85 c0 75 76 48 8b 53 58 48 8b 43 60 48 89 df <48> 89 
42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 43 58
RIP: tipc_subscrb_subscrp_delete+0x6c/0x110 [tipc] RSP: ffffc90007737d48

2. tipc: adjust policy that sub->timer holds subscription kref
My testcases fail, i think the waitq should fix it. So these commits 
should be squashed together.
- create max_subscriptions for a non existent subscriber
- wait for subscription timeout
- create again the same max subscription and cancel them.
They fail as:
[16761.924321] Subscription rejected, limit reached (65535)
[16761.929661] Subscription rejected, limit reached (65535)
[16761.934966] Subscription rejected, limit reached (65535)
[16761.940281] Subscription rejected, limit reached (65535)

And with one of the your patches (the last one "tipc: delete expired 
subscription" which fixed it), I got:
general protection fault: 0000 [#1] SMP
Modules linked in: tipc(-) ip6_udp_tunnel udp_tunnel 9pnet_virtio 9p 
9pnet virtio_net virtio_pci virtio_ring virtio [last unloaded: 
ip6_udp_tunnel]
CPU: 15 PID: 8105 Comm: modprobe Tainted: G        W       4.10.0+ #111
Hardware name: Ericsson AB CXC1060320/ROJ208840/5, BIOS 4.6.5 11/07/2013
task: ffff880c3c4fe2c0 task.stack: ffffc9002227c000
RIP: 0010:tipc_nametbl_stop+0xf9/0x190 [tipc]
RSP: 0018:ffffc9002227fe00 EFLAGS: 00010286
RAX: 0200000000000000 RBX: ffff880c4d8f5700 RCX: ffff880c3c5b8ab0
RDX: ffffffff81c56cc0 RSI: 00000000000000c3 RDI: ffffea002ad57750
RBP: ffffc9002227fe50 R08: 00000000ffff880c R09: 00000000ffffffff
R10: 00000000ffffffff R11: 0000000000000000 R12: 01ffffffffffff90
R13: ffffffff81d02e00 R14: ffff880c4bba1360 R15: 01ffffffffffff90
FS:  0000000000852880(0000) GS:ffff880c7f3c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000878038 CR3: 0000000c3bda5000 CR4: 00000000001406e0
Call Trace:
  tipc_exit_net+0x2a/0x40 [tipc]
  ops_exit_list.isra.8+0x38/0x60
  unregister_pernet_operations+0x90/0xe0
  unregister_pernet_subsys+0x21/0x30
  tipc_exit+0x15/0xc6d [tipc]
  SyS_delete_module+0x192/0x200
  ? SyS_read+0x49/0xb0
  entry_SYSCALL_64_fastpath+0x13/0x94
RIP: 0033:0x4abc07
RSP: 002b:00007ffd9adbfe18 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: 0000000000877e20 RCX: 00000000004abc07
RDX: 000000000000002e RSI: 0000000000000880 RDI: 00007ffd9adbfe20
RBP: 0000000000000110 R08: 0000000000000004 R09: 0000000000000000
R10: 000000000084cd30 R11: 0000000000000246 R12: 000000000084cce0
R13: 0000000000877f30 R14: 0000000000000200 R15: 0000000000000110
Code: 45 8b 4c 24 18 45 8b 44 24 14 4c 89 ef e8 40 ee ff ff 49 8d bc 24 
80 00 00 00 be 80 00 00 00 4d 89 fc e8 3b 16 05 e1 49 8d 47 70 <49> 8b 
57 70 4c 39 f0 4c 8d 7a 90 75 bb 48 8b 43 20 48 85 c0 74
RIP: tipc_nametbl_stop+0xf9/0x190 [tipc] RSP: ffffc9002227fe00

/Partha

On 03/09/2017 03:37 PM, Ying Xue wrote:
> To be honest, I don't very like this solution. But I cannot think one
> solution is better than this one although I tried at least the following
> two methods to solve the issue.
>
> - Introduce one timeout list to subscriber. When subscription is
> expired, it will be moved to its timeout list. When name service is
> deleted or inserted, we are going to release all subscriptions linked in
> timeout list. After I tried to do the experiment, it's found we have to
> make a big change for current code;
>
> - In subscription's timeout handler, when subsription timeout event is
> sent to server through tipc_subscrp_send_event(), we can add a new flag
> in the message. When server sends the timeout message to user space send
> workqueue, it can send back a TIPC_SUB_CANCEL message to subscriber
> through tipc_subscrb_rcv_cb() if that new flag is set. When the
> TIPC_SUB_CANCEL message arrives at subscriber, the expired subscription
> can be deleted at the moment. The approach is still infeasible due to
> the same reason as previous one.
>
> In all, this solution is a bit elegant compared to above two.
>
> If you have any suggestion, please let me know.
>
> Thanks,
> Ying
>
> On 03/09/2017 10:21 PM, Ying Xue wrote:
>> After the design policy of subscription timeout is adjusted,
>> subscription is still present in name table after it's expired, and
>> hence the subscriber will receive new events for this subscription
>> until the subscriber terminates.
>>
>> Now when subscription is expired, it will be first deleted from name
>> table and its subscriber, and then schedule its work. Finally the
>> subscription will be destroyed in workqueue asynchronously.
>>
>> Signed-off-by: Ying Xue <ying....@windriver.com>
>> ---
>>  net/tipc/server.h |  2 ++
>>  net/tipc/subscr.c | 32 ++++++++++++++++++++++++++++++++
>>  net/tipc/subscr.h |  1 +
>>  3 files changed, 35 insertions(+)
>>
>> diff --git a/net/tipc/server.h b/net/tipc/server.h
>> index 34f8055..7d5588f 100644
>> --- a/net/tipc/server.h
>> +++ b/net/tipc/server.h
>> @@ -59,6 +59,7 @@
>>   * @name: server name
>>   * @imp: message importance
>>   * @type: socket type
>> + * @wq: workqueue
>>   */
>>  struct tipc_server {
>>      struct idr conn_idr;
>> @@ -78,6 +79,7 @@ struct tipc_server {
>>      char name[TIPC_SERVER_NAME_LEN];
>>      int imp;
>>      int type;
>> +    struct workqueue_struct *wq;
>>  };
>>
>>  int tipc_conn_sendmsg(struct tipc_server *s, int conid,
>> diff --git a/net/tipc/subscr.c b/net/tipc/subscr.c
>> index ea67cce..3dcb009 100644
>> --- a/net/tipc/subscr.c
>> +++ b/net/tipc/subscr.c
>> @@ -137,14 +137,34 @@ void tipc_subscrp_report_overlap(struct 
>> tipc_subscription *sub, u32 found_lower,
>>  static void tipc_subscrp_timeout(unsigned long data)
>>  {
>>      struct tipc_subscription *sub = (struct tipc_subscription *)data;
>> +    struct tipc_subscriber *subscriber = sub->subscriber;
>> +    struct tipc_net *tn = net_generic(sub->net, tipc_net_id);
>>
>>      /* Notify subscriber of timeout */
>>      tipc_subscrp_send_event(sub, sub->evt.s.seq.lower, sub->evt.s.seq.upper,
>>                              TIPC_SUBSCR_TIMEOUT, 0, 0);
>>
>> +    spin_lock_bh(&subscriber->lock);
>> +    list_del(&sub->subscrp_list);
>> +    tipc_nametbl_unsubscribe(sub);
>> +    spin_unlock_bh(&subscriber->lock);
>> +
>> +    queue_work(tn->topsrv->wq, &sub->work);
>> +
>>      tipc_subscrp_put(sub);
>>  }
>>
>> +static void tipc_subscrp_work(struct work_struct *work)
>> +{
>> +    struct tipc_subscription *sub = container_of(work,
>> +                                    struct tipc_subscription, work);
>> +    struct tipc_subscriber *subscriber = sub->subscriber;
>> +
>> +    spin_lock_bh(&subscriber->lock);
>> +    tipc_subscrp_delete(sub);
>> +    spin_unlock_bh(&subscriber->lock);
>> +}
>> +
>>  static void tipc_subscrb_kref_release(struct kref *kref)
>>  {
>>      kfree(container_of(kref,struct tipc_subscriber, kref));
>> @@ -282,6 +302,9 @@ static struct tipc_subscription 
>> *tipc_subscrp_create(struct net *net,
>>      memcpy(&sub->evt.s, s, sizeof(*s));
>>      atomic_inc(&tn->subscription_count);
>>      kref_init(&sub->kref);
>> +
>> +    INIT_WORK(&sub->work, tipc_subscrp_work);
>> +
>>      return sub;
>>  }
>>
>> @@ -379,6 +402,13 @@ int tipc_topsrv_start(struct net *net)
>>      topsrv->tipc_conn_release       = tipc_subscrb_release_cb;
>>
>>      strncpy(topsrv->name, name, strlen(name) + 1);
>> +    topsrv->wq = create_singlethread_workqueue(topsrv->name);
>> +    if (!topsrv->wq) {
>> +            kfree(saddr);
>> +            kfree(topsrv);
>> +            return -ENOMEM;
>> +    }
>> +
>>      tn->topsrv = topsrv;
>>      atomic_set(&tn->subscription_count, 0);
>>
>> @@ -391,6 +421,8 @@ void tipc_topsrv_stop(struct net *net)
>>      struct tipc_server *topsrv = tn->topsrv;
>>
>>      tipc_server_stop(topsrv);
>> +    flush_workqueue(topsrv->wq);
>> +    destroy_workqueue(topsrv->wq);
>>      kfree(topsrv->saddr);
>>      kfree(topsrv);
>>  }
>> diff --git a/net/tipc/subscr.h b/net/tipc/subscr.h
>> index ee52957..50fc64b 100644
>> --- a/net/tipc/subscr.h
>> +++ b/net/tipc/subscr.h
>> @@ -65,6 +65,7 @@ struct tipc_subscription {
>>      struct list_head subscrp_list;
>>      int swap;
>>      struct tipc_event evt;
>> +    struct work_struct work;
>>  };
>>
>>  int tipc_subscrp_check_overlap(struct tipc_name_seq *seq, u32 found_lower,
>>

------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
tipc-discussion mailing list
tipc-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/tipc-discussion

Reply via email to