Re: [capnproto] TSAN faiure

2020-10-13 Thread 'Kenton Varda' via Cap'n Proto
On Tue, Oct 13, 2020 at 5:15 AM Erin Shepherd  wrote:

> > I think that ThreadSanitizer is having trouble recognizing that the
> initialization of `brokenCapFactory` is thread-safe, due to the awkward way
> in which it is initialized. It may end up being initialized by multiple
> threads, but all threads will initialize it to the same value, hence no
> atomics are necessary when reading it later.
>
> This doesn't hold unless every thread reading it also once wrote it.
>

That is the case here.

-Kenton

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to capnproto+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/capnproto/CAJouXQn5Q%2Bz6Z2B9cuNUX6-tWF5E80EKMhB0wBqFAprkbuVBPw%40mail.gmail.com.


Re: [capnproto] TSAN faiure

2020-10-13 Thread Erin Shepherd
One option I see which would sidestep this entirely is to default 
initialise the brokenCapFactory to something that performs a read with a 
stronger load. You'd still need to use explicit atomic loads to not trigger 
UB, of course, but you could have something like


class UninitializedBrokenCapFactoryImpl: public _::BrokenCapFactory { 
public: 
kj::Own newBrokenCap(kj::StringPtr description) override { 
  auto realFactory = atomic_load(brokenCapFactory, acquire);
  KJ_REQUIRE(realFactory != this, "must initialise...");
  return realFactory->newBrokenCap(description);
} 
...
} 

static UninitializedBrokenCapFactoryImpl uninitializedBrokenCapFactory;
static BrokenCapFactory* brokenCapFactory = &uninitializedBrokenCapFactory;


On Tuesday, October 13, 2020 at 12:15:26 PM UTC+2 Erin Shepherd wrote:

> > I think that ThreadSanitizer is having trouble recognizing that the 
> initialization of `brokenCapFactory` is thread-safe, due to the awkward way 
> in which it is initialized. It may end up being initialized by multiple 
> threads, but all threads will initialize it to the same value, hence no 
> atomics are necessary when reading it later.
>
> This doesn't hold unless every thread reading it also once wrote it. If it 
> might be written by T1 and read (but never initialised) by T2, then 
> appropriate write/read barriers need to be in place (e.g. acquire/release).
>
> Erin
>
> On Friday, October 9, 2020 at 6:48:06 PM UTC+2 ken...@cloudflare.com 
> wrote:
>
>> I think that ThreadSanitizer is having trouble recognizing that the 
>> initialization of `brokenCapFactory` is thread-safe, due to the awkward way 
>> in which it is initialized. It may end up being initialized by multiple 
>> threads, but all threads will initialize it to the same value, hence no 
>> atomics are necessary when reading it later.
>>
>> Maybe we should use atomic reads anyway, to make ThreadSanitizer happy. 
>> Doing so should be free, at least on x86.
>>
>> (The reason for the weird initialization is that the factory is 
>> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so 
>> -- but only if RPC is in use.)
>>
>> -Kenton
>>
>> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:
>>
>>> I'm trying to do a basic in-process connection of the servers & running 
>>> that under TSAN but I feel like this is some nuance of the APIs I'm missing 
>>> or using the API totally incorrectly outright, so I would appreciate some 
>>> feedback if possible. I'm sure the bug must be in my code rather than 
>>> cap'n'proto. I don't have a helpful standalone piece of code to demonstrate 
>>> this issue at the moment (but if it's really critical I can work on 
>>> providing that). I've provided brief snippets of what the threads do (the 
>>> threads themselves are really that simple, it's just the schema & the 
>>> threads using various helper internal libraries that make it harder to post 
>>> a working standalone example).
>>>
>>> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>>>
>>> For context, the main thread does 
>>> auto ioContext = kj::setupAsyncIo();
>>> auto serverThread = ioContext.provider->newPipeThread(...);
>>> auto serverPtr = ;
>>> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
>>> auto client = rpcClient.bootstrap().castAs();
>>> auto connectedResponse = 
>>> client.connectRequest().send().wait(ioContext.waitScope); *<-- this is 
>>> the problematic TSAN line *
>>>
>>> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope& 
>>> waitScope):
>>> capnp::TwoPartyVatNetwork network(stream, 
>>> capnp::rpc::twoparty::Side::SERVER);
>>> auto myServer = kj::heap();
>>> auto myServerPtr = myServer.get()
>>> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
>>>   
>>> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>>>
>>> SUMMARY: ThreadSanitizer: data race 
>>> capnproto/c++/src/capnp/layout.c++:2188:5 in 
>>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*, 
>>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>>>
>>>   Read of size 8 at 0x016673e0 by main thread:
>>> #0 
>>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*, 
>>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int) 
>>> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
>>> #1 capnp::_::PointerReader::getCapability() const 
>>> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
>>> #2 
>>> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr>> const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so
>>> +0x7ff4a7)
>>> #3 capnp::_::(anonymous 
>>>
>>>   Previous atomic write of size 8 at 0x016673e0 by thread T1:
>>> #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
>>> #1 
>>> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&)
>>>  
>>> capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12)
>>> #2 
>>> capnp::Reader

Re: [capnproto] TSAN faiure

2020-10-13 Thread Erin Shepherd
> I think that ThreadSanitizer is having trouble recognizing that the 
initialization of `brokenCapFactory` is thread-safe, due to the awkward way 
in which it is initialized. It may end up being initialized by multiple 
threads, but all threads will initialize it to the same value, hence no 
atomics are necessary when reading it later.

This doesn't hold unless every thread reading it also once wrote it. If it 
might be written by T1 and read (but never initialised) by T2, then 
appropriate write/read barriers need to be in place (e.g. acquire/release).

Erin

On Friday, October 9, 2020 at 6:48:06 PM UTC+2 ken...@cloudflare.com wrote:

> I think that ThreadSanitizer is having trouble recognizing that the 
> initialization of `brokenCapFactory` is thread-safe, due to the awkward way 
> in which it is initialized. It may end up being initialized by multiple 
> threads, but all threads will initialize it to the same value, hence no 
> atomics are necessary when reading it later.
>
> Maybe we should use atomic reads anyway, to make ThreadSanitizer happy. 
> Doing so should be free, at least on x86.
>
> (The reason for the weird initialization is that the factory is 
> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so 
> -- but only if RPC is in use.)
>
> -Kenton
>
> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:
>
>> I'm trying to do a basic in-process connection of the servers & running 
>> that under TSAN but I feel like this is some nuance of the APIs I'm missing 
>> or using the API totally incorrectly outright, so I would appreciate some 
>> feedback if possible. I'm sure the bug must be in my code rather than 
>> cap'n'proto. I don't have a helpful standalone piece of code to demonstrate 
>> this issue at the moment (but if it's really critical I can work on 
>> providing that). I've provided brief snippets of what the threads do (the 
>> threads themselves are really that simple, it's just the schema & the 
>> threads using various helper internal libraries that make it harder to post 
>> a working standalone example).
>>
>> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>>
>> For context, the main thread does 
>> auto ioContext = kj::setupAsyncIo();
>> auto serverThread = ioContext.provider->newPipeThread(...);
>> auto serverPtr = ;
>> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
>> auto client = rpcClient.bootstrap().castAs();
>> auto connectedResponse = 
>> client.connectRequest().send().wait(ioContext.waitScope); *<-- this is 
>> the problematic TSAN line *
>>
>> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope& 
>> waitScope):
>> capnp::TwoPartyVatNetwork network(stream, 
>> capnp::rpc::twoparty::Side::SERVER);
>> auto myServer = kj::heap();
>> auto myServerPtr = myServer.get()
>> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
>>   
>> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>>
>> SUMMARY: ThreadSanitizer: data race 
>> capnproto/c++/src/capnp/layout.c++:2188:5 in 
>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*, 
>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>>
>>   Read of size 8 at 0x016673e0 by main thread:
>> #0 
>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*, 
>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int) 
>> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
>> #1 capnp::_::PointerReader::getCapability() const 
>> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
>> #2 
>> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr> const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so+0x7ff4a7)
>> #3 capnp::_::(anonymous 
>>
>>   Previous atomic write of size 8 at 0x016673e0 by thread T1:
>> #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
>> #1 
>> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&) 
>> capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12)
>> #2 
>> capnp::ReaderCapabilityTable::ReaderCapabilityTable(kj::Array
>>  
>> > >) capnproto/c++/src/capnp/capability.c++:951:3 (ld-2.26.so+0x6f32cc)
>> #3 capnp::_::(anonymous 
>> namespace)::RpcConnectionState::RpcCallContext::RpcCallContext(capnp::_::(anonymous
>>  
>> namespace)::RpcConnectionState&, unsigned int, 
>> kj::Own&&, 
>> kj::Array > >, 
>> capnp::AnyPointer::Reader const&, bool, kj::Own 
>> >&&, unsigned long, unsigned short) capnproto/c++/src/capnp/rpc.c++:2144:11 
>> (ld-2.26.so+0x754bea)
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Cap'n Proto" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to capnproto+...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/capnproto/19bac31b-6f11-43d1-886e-93d6bc32557an%40googlegroups.com
>>  
>> 

Re: [capnproto] TSAN faiure

2020-10-10 Thread Vitali Lovich
Sounds good. Relaxed is what I went with and it seemed to make TSAN happy.
I uploaded a pull request for this (and the UBSAN issue).

Thanks for your guidance on this.

On Sat, Oct 10, 2020 at 12:14 PM Kenton Varda  wrote:

> An atomic read with "relaxed" ordering should be free. Even "acquire"
> ordering (if TSAN insists on it) is free on x86. So I'd just have it always
> be atomic.
>
> -Kenton
>
> On Fri, Oct 9, 2020 at 4:23 PM Vitali Lovich  wrote:
>
>> Ok. That seems to have made TSAN happy (I still don't understand the
>> exact reason it's safe but I'll trust you're higher level reasoning about
>> the internals of capnp :D).
>>
>> Should this atomic read happen only when running under TSAN or just bite
>> the bullet & do it regardless? It's unclear to me if this is a hotpath
>> that's carefully tuned
>>
>> On Fri, Oct 9, 2020 at 10:00 AM 'Kenton Varda' via Cap'n Proto <
>> capnproto@googlegroups.com> wrote:
>>
>>> It looks like the writes are done atomically, but the reads aren't. TSAN
>>> is right to be suspicious of this. In practice there is no bug here, but
>>> showing that requires higher-level reasoning that TSAN wouldn't be expected
>>> to figure out.
>>>
>>> -Kenton
>>>
>>> On Fri, Oct 9, 2020 at 11:52 AM Vitali Lovich  wrote:
>>>
 So you're saying just change the read to atomic read?

 Is it possible there's a legitimate read-before-write race condition &
 that's what TSAN is complaining about? It doesn't seem to be complaining
 about concurrent writes but I also haven't investigated this codepath to
 understand yet as I assumed the problem was in my code.

 On Fri, Oct 9, 2020 at 9:48 AM Kenton Varda 
 wrote:

> I think that ThreadSanitizer is having trouble recognizing that the
> initialization of `brokenCapFactory` is thread-safe, due to the awkward 
> way
> in which it is initialized. It may end up being initialized by multiple
> threads, but all threads will initialize it to the same value, hence no
> atomics are necessary when reading it later.
>
> Maybe we should use atomic reads anyway, to make ThreadSanitizer
> happy. Doing so should be free, at least on x86.
>
> (The reason for the weird initialization is that the factory is
> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so
> -- but only if RPC is in use.)
>
> -Kenton
>
> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich 
> wrote:
>
>> I'm trying to do a basic in-process connection of the servers &
>> running that under TSAN but I feel like this is some nuance of the APIs 
>> I'm
>> missing or using the API totally incorrectly outright, so I would
>> appreciate some feedback if possible. I'm sure the bug must be in my code
>> rather than cap'n'proto. I don't have a helpful standalone piece of code 
>> to
>> demonstrate this issue at the moment (but if it's really critical I can
>> work on providing that). I've provided brief snippets of what the threads
>> do (the threads themselves are really that simple, it's just the schema &
>> the threads using various helper internal libraries that make it harder 
>> to
>> post a working standalone example).
>>
>> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>>
>> For context, the main thread does
>> auto ioContext = kj::setupAsyncIo();
>> auto serverThread = ioContext.provider->newPipeThread(...);
>> auto serverPtr = ;
>> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
>> auto client = rpcClient.bootstrap().castAs();
>> auto connectedResponse =
>> client.connectRequest().send().wait(ioContext.waitScope); *<-- this
>> is the problematic TSAN line *
>>
>> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope&
>> waitScope):
>> capnp::TwoPartyVatNetwork network(stream,
>> capnp::rpc::twoparty::Side::SERVER);
>> auto myServer = kj::heap();
>> auto myServerPtr = myServer.get()
>> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
>> 
>> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>>
>> SUMMARY: ThreadSanitizer: data race
>> capnproto/c++/src/capnp/layout.c++:2188:5 in
>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>>
>>   Read of size 8 at 0x016673e0 by main thread:
>> #0
>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
>> #1 capnp::_::PointerReader::getCapability() const
>> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
>> #2
>> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr> const>) const capnproto

Re: [capnproto] TSAN faiure

2020-10-10 Thread 'Kenton Varda' via Cap'n Proto
An atomic read with "relaxed" ordering should be free. Even "acquire"
ordering (if TSAN insists on it) is free on x86. So I'd just have it always
be atomic.

-Kenton

On Fri, Oct 9, 2020 at 4:23 PM Vitali Lovich  wrote:

> Ok. That seems to have made TSAN happy (I still don't understand the exact
> reason it's safe but I'll trust you're higher level reasoning about
> the internals of capnp :D).
>
> Should this atomic read happen only when running under TSAN or just bite
> the bullet & do it regardless? It's unclear to me if this is a hotpath
> that's carefully tuned
>
> On Fri, Oct 9, 2020 at 10:00 AM 'Kenton Varda' via Cap'n Proto <
> capnproto@googlegroups.com> wrote:
>
>> It looks like the writes are done atomically, but the reads aren't. TSAN
>> is right to be suspicious of this. In practice there is no bug here, but
>> showing that requires higher-level reasoning that TSAN wouldn't be expected
>> to figure out.
>>
>> -Kenton
>>
>> On Fri, Oct 9, 2020 at 11:52 AM Vitali Lovich  wrote:
>>
>>> So you're saying just change the read to atomic read?
>>>
>>> Is it possible there's a legitimate read-before-write race condition &
>>> that's what TSAN is complaining about? It doesn't seem to be complaining
>>> about concurrent writes but I also haven't investigated this codepath to
>>> understand yet as I assumed the problem was in my code.
>>>
>>> On Fri, Oct 9, 2020 at 9:48 AM Kenton Varda 
>>> wrote:
>>>
 I think that ThreadSanitizer is having trouble recognizing that the
 initialization of `brokenCapFactory` is thread-safe, due to the awkward way
 in which it is initialized. It may end up being initialized by multiple
 threads, but all threads will initialize it to the same value, hence no
 atomics are necessary when reading it later.

 Maybe we should use atomic reads anyway, to make ThreadSanitizer happy.
 Doing so should be free, at least on x86.

 (The reason for the weird initialization is that the factory is
 implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so
 -- but only if RPC is in use.)

 -Kenton

 On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:

> I'm trying to do a basic in-process connection of the servers &
> running that under TSAN but I feel like this is some nuance of the APIs 
> I'm
> missing or using the API totally incorrectly outright, so I would
> appreciate some feedback if possible. I'm sure the bug must be in my code
> rather than cap'n'proto. I don't have a helpful standalone piece of code 
> to
> demonstrate this issue at the moment (but if it's really critical I can
> work on providing that). I've provided brief snippets of what the threads
> do (the threads themselves are really that simple, it's just the schema &
> the threads using various helper internal libraries that make it harder to
> post a working standalone example).
>
> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>
> For context, the main thread does
> auto ioContext = kj::setupAsyncIo();
> auto serverThread = ioContext.provider->newPipeThread(...);
> auto serverPtr = ;
> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
> auto client = rpcClient.bootstrap().castAs();
> auto connectedResponse =
> client.connectRequest().send().wait(ioContext.waitScope); *<-- this
> is the problematic TSAN line *
>
> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope&
> waitScope):
> capnp::TwoPartyVatNetwork network(stream,
> capnp::rpc::twoparty::Side::SERVER);
> auto myServer = kj::heap();
> auto myServerPtr = myServer.get()
> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
> 
> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>
> SUMMARY: ThreadSanitizer: data race
> capnproto/c++/src/capnp/layout.c++:2188:5 in
> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>
>   Read of size 8 at 0x016673e0 by main thread:
> #0
> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
> #1 capnp::_::PointerReader::getCapability() const
> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
> #2
> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so
> +0x7ff4a7)
> #3 capnp::_::(anonymous
>
>   Previous atomic write of size 8 at 0x016673e0 by thread T1:
> #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
> #1
> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&)
> capnproto/c++/src/capnp/l

Re: [capnproto] TSAN faiure

2020-10-09 Thread Vitali Lovich
Ok. That seems to have made TSAN happy (I still don't understand the exact
reason it's safe but I'll trust you're higher level reasoning about
the internals of capnp :D).

Should this atomic read happen only when running under TSAN or just bite
the bullet & do it regardless? It's unclear to me if this is a hotpath
that's carefully tuned

On Fri, Oct 9, 2020 at 10:00 AM 'Kenton Varda' via Cap'n Proto <
capnproto@googlegroups.com> wrote:

> It looks like the writes are done atomically, but the reads aren't. TSAN
> is right to be suspicious of this. In practice there is no bug here, but
> showing that requires higher-level reasoning that TSAN wouldn't be expected
> to figure out.
>
> -Kenton
>
> On Fri, Oct 9, 2020 at 11:52 AM Vitali Lovich  wrote:
>
>> So you're saying just change the read to atomic read?
>>
>> Is it possible there's a legitimate read-before-write race condition &
>> that's what TSAN is complaining about? It doesn't seem to be complaining
>> about concurrent writes but I also haven't investigated this codepath to
>> understand yet as I assumed the problem was in my code.
>>
>> On Fri, Oct 9, 2020 at 9:48 AM Kenton Varda 
>> wrote:
>>
>>> I think that ThreadSanitizer is having trouble recognizing that the
>>> initialization of `brokenCapFactory` is thread-safe, due to the awkward way
>>> in which it is initialized. It may end up being initialized by multiple
>>> threads, but all threads will initialize it to the same value, hence no
>>> atomics are necessary when reading it later.
>>>
>>> Maybe we should use atomic reads anyway, to make ThreadSanitizer happy.
>>> Doing so should be free, at least on x86.
>>>
>>> (The reason for the weird initialization is that the factory is
>>> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so
>>> -- but only if RPC is in use.)
>>>
>>> -Kenton
>>>
>>> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:
>>>
 I'm trying to do a basic in-process connection of the servers & running
 that under TSAN but I feel like this is some nuance of the APIs I'm missing
 or using the API totally incorrectly outright, so I would appreciate some
 feedback if possible. I'm sure the bug must be in my code rather than
 cap'n'proto. I don't have a helpful standalone piece of code to demonstrate
 this issue at the moment (but if it's really critical I can work on
 providing that). I've provided brief snippets of what the threads do (the
 threads themselves are really that simple, it's just the schema & the
 threads using various helper internal libraries that make it harder to post
 a working standalone example).

 Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU

 For context, the main thread does
 auto ioContext = kj::setupAsyncIo();
 auto serverThread = ioContext.provider->newPipeThread(...);
 auto serverPtr = ;
 capnp::TwoPartyClient rpcClient(*serverThread.pipe);
 auto client = rpcClient.bootstrap().castAs();
 auto connectedResponse =
 client.connectRequest().send().wait(ioContext.waitScope); *<-- this is
 the problematic TSAN line *

 T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope&
 waitScope):
 capnp::TwoPartyVatNetwork network(stream,
 capnp::rpc::twoparty::Side::SERVER);
 auto myServer = kj::heap();
 auto myServerPtr = myServer.get()
 auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
 
 network.onDisconnect().wait(waitScope);* <- problematic TSAN line*

 SUMMARY: ThreadSanitizer: data race
 capnproto/c++/src/capnp/layout.c++:2188:5 in
 capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
 capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)

   Read of size 8 at 0x016673e0 by main thread:
 #0
 capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
 capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
 capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
 #1 capnp::_::PointerReader::getCapability() const
 capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
 #2
 capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr>>> const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so
 +0x7ff4a7)
 #3 capnp::_::(anonymous

   Previous atomic write of size 8 at 0x016673e0 by thread T1:
 #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
 #1
 capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&)
 capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12)
 #2
 capnp::ReaderCapabilityTable::ReaderCapabilityTable(kj::Array
 > >) capnproto/c++/src/capnp/capability.c++:951:3 (ld-2.26.so+0x6f32cc)
 #3 capnp::_::(anonymous
 namespace)::RpcConnectionState::RpcCallContext::RpcCallContext(capnp::_::(anonymous
 namespace)::RpcConnectionS

Re: [capnproto] TSAN faiure

2020-10-09 Thread 'Kenton Varda' via Cap'n Proto
It looks like the writes are done atomically, but the reads aren't. TSAN is
right to be suspicious of this. In practice there is no bug here, but
showing that requires higher-level reasoning that TSAN wouldn't be expected
to figure out.

-Kenton

On Fri, Oct 9, 2020 at 11:52 AM Vitali Lovich  wrote:

> So you're saying just change the read to atomic read?
>
> Is it possible there's a legitimate read-before-write race condition &
> that's what TSAN is complaining about? It doesn't seem to be complaining
> about concurrent writes but I also haven't investigated this codepath to
> understand yet as I assumed the problem was in my code.
>
> On Fri, Oct 9, 2020 at 9:48 AM Kenton Varda  wrote:
>
>> I think that ThreadSanitizer is having trouble recognizing that the
>> initialization of `brokenCapFactory` is thread-safe, due to the awkward way
>> in which it is initialized. It may end up being initialized by multiple
>> threads, but all threads will initialize it to the same value, hence no
>> atomics are necessary when reading it later.
>>
>> Maybe we should use atomic reads anyway, to make ThreadSanitizer happy.
>> Doing so should be free, at least on x86.
>>
>> (The reason for the weird initialization is that the factory is
>> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so
>> -- but only if RPC is in use.)
>>
>> -Kenton
>>
>> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:
>>
>>> I'm trying to do a basic in-process connection of the servers & running
>>> that under TSAN but I feel like this is some nuance of the APIs I'm missing
>>> or using the API totally incorrectly outright, so I would appreciate some
>>> feedback if possible. I'm sure the bug must be in my code rather than
>>> cap'n'proto. I don't have a helpful standalone piece of code to demonstrate
>>> this issue at the moment (but if it's really critical I can work on
>>> providing that). I've provided brief snippets of what the threads do (the
>>> threads themselves are really that simple, it's just the schema & the
>>> threads using various helper internal libraries that make it harder to post
>>> a working standalone example).
>>>
>>> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>>>
>>> For context, the main thread does
>>> auto ioContext = kj::setupAsyncIo();
>>> auto serverThread = ioContext.provider->newPipeThread(...);
>>> auto serverPtr = ;
>>> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
>>> auto client = rpcClient.bootstrap().castAs();
>>> auto connectedResponse =
>>> client.connectRequest().send().wait(ioContext.waitScope); *<-- this is
>>> the problematic TSAN line *
>>>
>>> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope&
>>> waitScope):
>>> capnp::TwoPartyVatNetwork network(stream,
>>> capnp::rpc::twoparty::Side::SERVER);
>>> auto myServer = kj::heap();
>>> auto myServerPtr = myServer.get()
>>> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
>>> 
>>> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>>>
>>> SUMMARY: ThreadSanitizer: data race
>>> capnproto/c++/src/capnp/layout.c++:2188:5 in
>>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
>>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>>>
>>>   Read of size 8 at 0x016673e0 by main thread:
>>> #0
>>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
>>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>>> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
>>> #1 capnp::_::PointerReader::getCapability() const
>>> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
>>> #2
>>> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr>> const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so
>>> +0x7ff4a7)
>>> #3 capnp::_::(anonymous
>>>
>>>   Previous atomic write of size 8 at 0x016673e0 by thread T1:
>>> #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
>>> #1
>>> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&)
>>> capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12)
>>> #2
>>> capnp::ReaderCapabilityTable::ReaderCapabilityTable(kj::Array
>>> > >) capnproto/c++/src/capnp/capability.c++:951:3 (ld-2.26.so+0x6f32cc)
>>> #3 capnp::_::(anonymous
>>> namespace)::RpcConnectionState::RpcCallContext::RpcCallContext(capnp::_::(anonymous
>>> namespace)::RpcConnectionState&, unsigned int,
>>> kj::Own&&,
>>> kj::Array > >,
>>> capnp::AnyPointer::Reader const&, bool, kj::Own
>>> >&&, unsigned long, unsigned short) capnproto/c++/src/capnp/rpc.c++:2144:11
>>> (ld-2.26.so+0x754bea)
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Cap'n Proto" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to capnproto+unsubscr...@googlegroups.com.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/capnpr

Re: [capnproto] TSAN faiure

2020-10-09 Thread Vitali Lovich
So you're saying just change the read to atomic read?

Is it possible there's a legitimate read-before-write race condition &
that's what TSAN is complaining about? It doesn't seem to be complaining
about concurrent writes but I also haven't investigated this codepath to
understand yet as I assumed the problem was in my code.

On Fri, Oct 9, 2020 at 9:48 AM Kenton Varda  wrote:

> I think that ThreadSanitizer is having trouble recognizing that the
> initialization of `brokenCapFactory` is thread-safe, due to the awkward way
> in which it is initialized. It may end up being initialized by multiple
> threads, but all threads will initialize it to the same value, hence no
> atomics are necessary when reading it later.
>
> Maybe we should use atomic reads anyway, to make ThreadSanitizer happy.
> Doing so should be free, at least on x86.
>
> (The reason for the weird initialization is that the factory is
> implemented in libcapnp-rpc.so, yet needs to be callable from libcapnp.so
> -- but only if RPC is in use.)
>
> -Kenton
>
> On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:
>
>> I'm trying to do a basic in-process connection of the servers & running
>> that under TSAN but I feel like this is some nuance of the APIs I'm missing
>> or using the API totally incorrectly outright, so I would appreciate some
>> feedback if possible. I'm sure the bug must be in my code rather than
>> cap'n'proto. I don't have a helpful standalone piece of code to demonstrate
>> this issue at the moment (but if it's really critical I can work on
>> providing that). I've provided brief snippets of what the threads do (the
>> threads themselves are really that simple, it's just the schema & the
>> threads using various helper internal libraries that make it harder to post
>> a working standalone example).
>>
>> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>>
>> For context, the main thread does
>> auto ioContext = kj::setupAsyncIo();
>> auto serverThread = ioContext.provider->newPipeThread(...);
>> auto serverPtr = ;
>> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
>> auto client = rpcClient.bootstrap().castAs();
>> auto connectedResponse =
>> client.connectRequest().send().wait(ioContext.waitScope); *<-- this is
>> the problematic TSAN line *
>>
>> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope&
>> waitScope):
>> capnp::TwoPartyVatNetwork network(stream,
>> capnp::rpc::twoparty::Side::SERVER);
>> auto myServer = kj::heap();
>> auto myServerPtr = myServer.get()
>> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
>> 
>> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>>
>> SUMMARY: ThreadSanitizer: data race
>> capnproto/c++/src/capnp/layout.c++:2188:5 in
>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>>
>>   Read of size 8 at 0x016673e0 by main thread:
>> #0
>> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
>> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
>> #1 capnp::_::PointerReader::getCapability() const
>> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
>> #2
>> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr> const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so+0x7ff4a7)
>> #3 capnp::_::(anonymous
>>
>>   Previous atomic write of size 8 at 0x016673e0 by thread T1:
>> #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
>> #1
>> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&)
>> capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12)
>> #2
>> capnp::ReaderCapabilityTable::ReaderCapabilityTable(kj::Array
>> > >) capnproto/c++/src/capnp/capability.c++:951:3 (ld-2.26.so+0x6f32cc)
>> #3 capnp::_::(anonymous
>> namespace)::RpcConnectionState::RpcCallContext::RpcCallContext(capnp::_::(anonymous
>> namespace)::RpcConnectionState&, unsigned int,
>> kj::Own&&,
>> kj::Array > >,
>> capnp::AnyPointer::Reader const&, bool, kj::Own
>> >&&, unsigned long, unsigned short) capnproto/c++/src/capnp/rpc.c++:2144:11
>> (ld-2.26.so+0x754bea)
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Cap'n Proto" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to capnproto+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/capnproto/19bac31b-6f11-43d1-886e-93d6bc32557an%40googlegroups.com
>> 
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to capnproto+unsubs

Re: [capnproto] TSAN faiure

2020-10-09 Thread 'Kenton Varda' via Cap'n Proto
I think that ThreadSanitizer is having trouble recognizing that the
initialization of `brokenCapFactory` is thread-safe, due to the awkward way
in which it is initialized. It may end up being initialized by multiple
threads, but all threads will initialize it to the same value, hence no
atomics are necessary when reading it later.

Maybe we should use atomic reads anyway, to make ThreadSanitizer happy.
Doing so should be free, at least on x86.

(The reason for the weird initialization is that the factory is implemented
in libcapnp-rpc.so, yet needs to be callable from libcapnp.so -- but only
if RPC is in use.)

-Kenton

On Fri, Oct 9, 2020 at 8:37 AM Vitali Lovich  wrote:

> I'm trying to do a basic in-process connection of the servers & running
> that under TSAN but I feel like this is some nuance of the APIs I'm missing
> or using the API totally incorrectly outright, so I would appreciate some
> feedback if possible. I'm sure the bug must be in my code rather than
> cap'n'proto. I don't have a helpful standalone piece of code to demonstrate
> this issue at the moment (but if it's really critical I can work on
> providing that). I've provided brief snippets of what the threads do (the
> threads themselves are really that simple, it's just the schema & the
> threads using various helper internal libraries that make it harder to post
> a working standalone example).
>
> Full TSAN (to avoid spamming the list): https://pastebin.com/hSkVDgsU
>
> For context, the main thread does
> auto ioContext = kj::setupAsyncIo();
> auto serverThread = ioContext.provider->newPipeThread(...);
> auto serverPtr = ;
> capnp::TwoPartyClient rpcClient(*serverThread.pipe);
> auto client = rpcClient.bootstrap().castAs();
> auto connectedResponse =
> client.connectRequest().send().wait(ioContext.waitScope); *<-- this is
> the problematic TSAN line *
>
> T1 (taking AsyncIoProvider&, AsyncIoStream& stream, kj::WaitScope&
> waitScope):
> capnp::TwoPartyVatNetwork network(stream,
> capnp::rpc::twoparty::Side::SERVER);
> auto myServer = kj::heap();
> auto myServerPtr = myServer.get()
> auto rpcServer = capnp::makeRpcServer(network, kj::mv( myServer));
> 
> network.onDisconnect().wait(waitScope);* <- problematic TSAN line*
>
> SUMMARY: ThreadSanitizer: data race
> capnproto/c++/src/capnp/layout.c++:2188:5 in
> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
>
>   Read of size 8 at 0x016673e0 by main thread:
> #0
> capnp::_::WireHelpers::readCapabilityPointer(capnp::_::SegmentReader*,
> capnp::_::CapTableReader*, capnp::_::WirePointer const*, int)
> capnproto/c++/src/capnp/layout.c++:2188:5 (ld-2.26.so+0x809e60)
> #1 capnp::_::PointerReader::getCapability() const
> capnproto/c++/src/capnp/layout.c++:2705 (ld-2.26.so+0x809e60)
> #2
> capnp::AnyPointer::Reader::getPipelinedCap(kj::ArrayPtr const>) const capnproto/c++/src/capnp/any.c++:53:18 (ld-2.26.so+0x7ff4a7)
> #3 capnp::_::(anonymous
>
>   Previous atomic write of size 8 at 0x016673e0 by thread T1:
> #0 __tsan_atomic64_store  (ld-2.26.so+0x6911ee)
> #1
> capnp::_::setGlobalBrokenCapFactoryForLayoutCpp(capnp::_::BrokenCapFactory&)
> capnproto/c++/src/capnp/layout.c++:45:3 (ld-2.26.so+0x800d12)
> #2
> capnp::ReaderCapabilityTable::ReaderCapabilityTable(kj::Array
> > >) capnproto/c++/src/capnp/capability.c++:951:3 (ld-2.26.so+0x6f32cc)
> #3 capnp::_::(anonymous
> namespace)::RpcConnectionState::RpcCallContext::RpcCallContext(capnp::_::(anonymous
> namespace)::RpcConnectionState&, unsigned int,
> kj::Own&&,
> kj::Array > >,
> capnp::AnyPointer::Reader const&, bool, kj::Own
> >&&, unsigned long, unsigned short) capnproto/c++/src/capnp/rpc.c++:2144:11
> (ld-2.26.so+0x754bea)
>
> --
> You received this message because you are subscribed to the Google Groups
> "Cap'n Proto" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to capnproto+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/capnproto/19bac31b-6f11-43d1-886e-93d6bc32557an%40googlegroups.com
> 
> .
>

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to capnproto+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/capnproto/CAJouXQmfGxyTG%3Dt8c38aTPJ5JHTJFmn0LHvFZaTPVu9Pr8dy3Q%40mail.gmail.com.