I think there are more data races in your product

http://paste.ubuntu.com/23946008/

On Tuesday, 7 February 2017 16:42:28 UTC+11, Jason E. Aten wrote:
>
> the race is fixed in the latest push; 
> a572f570c52f70c9518bc1b3e3319ff9e2424885; it was an artifact of adding 
> logging levels, and would have affected only the tests.
>
> I did manage to catch the processing running away again, this time at 
> 300%, and I got some output with gctrace=1 as you suggested. I'm not sure 
> how to read the lines though; could you advise Dave?
>
>
> runaway cpu behavior happening from at least:
>  Tue Feb  7 05:26:24 UTC 2017  - 2017/02/07 05:37:05.587710
>
> [jaten@host]$ GODEBUG=gctrace=1 go/bin/vhaline.go1.8rc3 -info -addr 
> localhost:9001 -parent localhost:9000 -cpu
> 2017/02/07 05:14:07 mid node 'a7fb658c' started on localhost:9001. my 
> parent is 'localhost:9000'
> 2017/02/07 05:14:07 mid node 'a7fb658c' using ttl=3s and beat=1s.
> 2017/02/07 05:14:07.136012 client.go:85: [pid 86063] replica (a7fb658c) 
> client dialing parent at 'localhost:9000'
> 2017/02/07 05:14:07.136266 client.go:97: [pid 86063] replica (a7fb658c) 
> client reached parent at 'localhost:9000'
> gc 1 @0.007s 1%: 0.18+0.39+0.093 ms clock, 0.36+0.23/0.56/0.23+0.18 ms 
> cpu, 17->17->16 MB, 18 MB goal, 16 P
> gc 2 @0.008s 2%: 0.063+0.43+0.078 ms clock, 0.44+0.033/0.69/0.28+0.55 ms 
> cpu, 32->32->32 MB, 33 MB goal, 16 P
> 2017/02/07 05:14:07.139570 replica.go:511: [pid 86063] replica (a7fb658c) 
> sees from parent: ToChildConnectAck.
> 2017/02/07 05:14:07 pprof profiler providing web server on 'localhost:6061'
> 2017/02/07 05:14:07 cmd/vhaline/main.go: contacted parent event.
> 2017/02/07 05:14:09.139731 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(/).
> 2017/02/07 05:14:09.139759 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000176418s/never
> gc 3 @52.176s 0%: 0.065+0.24+0.082 ms clock, 0.58+0.15/0.62/0.47+0.74 ms 
> cpu, 49->49->48 MB, 65 MB goal, 16 P
> gc 4 @52.177s 0%: 0.041+0.55+0.094 ms clock, 0.502017/02/07 
> 05:14:59.307973 replica.go:742: [pid 86063] replica (a7fb658c) 
> NewChildConnect from child 4040ba06' at '127.0.0.1:46194'
> +0.054/0.54/0.40+1.1 ms cpu, 64->64->64 MB, 97 MB goal, 16 P
> 2017/02/07 05:14:59 cmd/vhaline/main.go: contacted child event.
> 2017/02/07 05:15:16.160638 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:15:16.160673 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000166804s/1.000144038s
> 2017/02/07 05:16:24.182488 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:16:24.182518 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000148515s/1.000133515s
> scvg0: inuse: 68, idle: 0, sys: 69, released: 0, consumed: 69 (MB)
> GC forced
> gc 5 @172.183s 0%: 0.089+0.30+0.086 ms clock, 1.2+0/0.74/1.0+1.2 ms cpu, 
> 67->67->64 MB, 129 MB goal, 16 P
> 2017/02/07 05:17:24.201369 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:17:24.201404 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000158566s/1.000172823s
> 2017/02/07 05:18:29.222380 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:18:29.222413 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000154109s/1.000142059s
> GC forced
> gc 6 @292.191s 0%: 0.18+0.31+0.10 ms clock, 2.7+0/0.72/0.89+1.6 ms cpu, 
> 67->67->64 MB, 129 MB goal, 16 P
> scvg1: inuse: 67, idle: 2, sys: 69, released: 0, consumed: 69 (MB)
> 2017/02/07 05:19:35.243248 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:19:35.243274 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000142593s/1.000160351s
> 2017/02/07 05:20:45.265426 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:20:45.265455 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000176146s/1.000179458s
> GC forced
> gc 7 @412.200s 0%: 0.027+0.28+0.097 ms clock, 0.43+0/0.78/0.69+1.5 ms cpu, 
> 67->67->64 MB, 129 MB goal, 16 P
> scvg2: inuse: 67, idle: 1, sys: 69, released: 0, consumed: 69 (MB)
> 2017/02/07 05:21:54.287241 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(4040ba06/127.0.0.1:46194).
> 2017/02/07 05:21:54.287275 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000167178s/2.000483842s
> 2017/02/07 05:21:54.287396 replica.go:278: [pid 86063] replica (a7fb658c) 
> child connection lost
> 2017/02/07 05:21:55.287620 replica.go:613: [pid 86063] replica (a7fb658c) 
> it's been 3.000864314s (> ttl == 3s) since last child contact, declaring 
> child '4040ba06' at '127.0.0.1:46194' to have failed.
> 2017/02/07 05:21:55 cmd/vhaline/main.go:child failed event.
> 2017/02/07 05:21:55.821703 replica.go:742: [pid 86063] replica (a7fb658c) 
> NewChildConnect from child 96bfad81' at '127.0.0.1:46534'
> 2017/02/07 05:22:01.289558 replica.go:278: [pid 86063] replica (a7fb658c) 
> child connection lost
> 2017/02/07 05:22:01.865167 replica.go:374: [pid 86063] replica (a7fb658c) 
> rejecting new child 'b305026f/127.0.0.1:46542' b/c already have '96bfad81/
> 127.0.0.1:46534'
> gc 8 @474.734s 0%: 0.031+0.51+0.11 ms clock, 0.50+0.093/0.99/0.020+1.9 ms 
> cpu, 130->130->128 MB, 131 MB goal, 16 P
> 2017/02/07 05:22:03.290073 replica.go:613: [pid 86063] replica (a7fb658c) 
> it's been 4.001142473s (> ttl == 3s) since last child contact, declaring 
> child '96bfad81' at '127.0.0.1:46534' to have failed.
> 2017/02/07 05:22:03 cmd/vhaline/main.go:child failed event.
> 2017/02/07 05:22:05.038889 replica.go:742: [pid 86063] replica (a7fb658c) 
> NewChildConnect from child 753dd7fe' at '127.0.0.1:46550'
> 2017/02/07 05:23:04.309740 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:23:04.309767 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000190244s/1.000179191s
> GC forced
> gc 9 @594.743s 0%: 0.025+0.48+0.12 ms clock, 0.40+0/1.0/0.29+2.0 ms cpu, 
> 163->163->161 MB, 257 MB goal, 16 P
> scvg3: inuse: 163, idle: 2, sys: 165, released: 0, consumed: 165 (MB)
> 2017/02/07 05:24:12.331876 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:24:12.331911 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000161445s/1.000345842s
> 2017/02/07 05:25:16.352877 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:25:16.352912 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000113785s/1.000117716s
> GC forced
> gc 10 @714.754s 0%: 0.019+0.34+0.12 ms clock, 0.31+0/1.0/0.52+2.0 ms cpu, 
> 164->164->161 MB, 322 MB goal, 16 P
> 2017/02/07 05:26:20.373956 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:26:20.373982 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000188576s/1.000199715s
> scvg4: inuse: 164, idle: 2, sys: 166, released: 0, consumed: 166 (MB)
>
> ### observing the 300% cpu spike from here on...
>
> 2017/02/07 05:27:26.395974 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:27:26.396009 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000216294s/1.000187739s
> GC forced
> gc 11 @834.754s 0%: 0.047+0.35+0.11 ms clock, 0.76+0/1.0/0.55+1.8 ms cpu, 
> 164->164->161 MB, 323 MB goal, 16 P
> 2017/02/07 05:28:36.418341 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:28:36.418369 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000121649s/1.000160242s
> scvg5: inuse: 164, idle: 1, sys: 166, released: 0, consumed: 166 (MB)
> 2017/02/07 05:29:38.438866 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:29:38.438893 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000151323s/1.000168023s
> GC forced
> gc 12 @954.763s 0%: 0.025+0.36+0.11 ms clock, 0.40+0/1.0/0.54+1.8 ms cpu, 
> 166->166->162 MB, 323 MB goal, 16 P
> 2017/02/07 05:30:42.459870 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:30:42.459896 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000071789s/1.000163893s
> scvg6: inuse: 166, idle: 1, sys: 168, released: 0, consumed: 168 (MB)
> 2017/02/07 05:31:46.480874 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:31:46.480901 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000216787s/1.000181407s
> GC forced
> gc 13 @1074.772s 0%: 0.034+0.39+0.12 ms clock, 0.54+0/1.0/0.37+2.0 ms cpu, 
> 165->165->162 MB, 325 MB goal, 16 P
> 2017/02/07 05:32:52.502406 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:32:52.502432 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000112204s/1.00011439s
> 2017/02/07 05:33:54.523284 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:33:54.523312 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000156022s/1.000152209s
> GC forced
> gc 14 @1194.775s 0%: 0.024+0.32+0.12 ms clock, 0.39+0/0.98/0.50+1.9 ms 
> cpu, 169->169->164 MB, 325 MB goal, 16 P
> scvg7: inuse: 167, idle: 4, sys: 171, released: 0, consumed: 171 (MB)
> 2017/02/07 05:34:57.544262 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:34:57.544288 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000198885s/1.000148081s
> GC forced
> gc 15 @1314.779s 0%: 0.031+0.29+0.11 ms clock, 0.49+0/0.99/0.47+1.7 ms 
> cpu, 167->167->164 MB, 329 MB goal, 16 P
> 2017/02/07 05:36:02.566141 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:36:02.566168 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.00034139s/1.000175237s
> scvg8: 0 MB released
> scvg8: inuse: 167, idle: 4, sys: 171, released: 0, consumed: 171 (MB)
> 2017/02/07 05:37:05.587678 replica.go:587: [pid 86063] replica (a7fb658c) 
> line status: parent(bbda6890/localhost:9000) -> me(a7fb658c/localhost:9001) 
> -> child(753dd7fe/127.0.0.1:46550).
> 2017/02/07 05:37:05.587710 replica.go:598: [pid 86063] replica (a7fb658c) 
> last-ping (parent/child): 1.000171768s/1.000188289s
>
> ### the 300% cpu peg is still in progress at this point.
>
> none of the go1.7.4 processes are showing the cpu peg.
>
>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to