[go-nuts] Long time blocking CGO call affect and CGO/Go.syscall/Go.net/C IO performance

2018-09-17 Thread changkun
Hi there,

I have recently been working with a data-parsing C library in a cgo 
program for long-running network IO.

The data flow is as follows:

user socket <--> socketpair[0] <--> socketpair[1] <--> data store
|---------- Go domain ----------|---------- C domain ----------|



With such a data flow, I need to pass a file descriptor to the C side 
for a long-blocking cgo call (the cgo call returns only when the 
connection is broken); the cgo call runs in its own goroutine.

Thus, my first question is: given the scheduling strategy for 
goroutines, will the Go-side IO performance between the user socket 
and socketpair[0] suffer from such a blocking cgo call?


My second set of questions is regarding IO performance.

My benchmarks show that the pure Go syscall.Write and syscall.Read are 
roughly 15% slower than the C system calls, and that net package IO 
performance is roughly equal to cgo call performance, as shown below:

[image: Cgo, Go and C in system call (3).png]

Tested with Go 1.11; machine: MacBook Pro 2014 Retina; 
data: 
https://docs.google.com/spreadsheets/d/1DwtZmP8fKKr3pOQWVJrD30DSOzv4_qB5KZvsd-DQ1KA/edit?usp=sharing

So, my questions are: what did I do wrong in these benchmarks? I'm 
currently using the syscall.Write() and syscall.Read() approach in 
separate write/read goroutines. Is there any way to get performance 
closer to the C system calls (ideally within 5%) for such 
long-running network IO?

Thank you in advance.

Benchmarks:

package syscall

import (
	"net"
	"os"
	"syscall"
	"testing"
)

const message = "hello, world!"

var buffer = make([]byte, 13)

func writeAll(fd int, buf []byte) error {
	for len(buf) > 0 {
		n, err := syscall.Write(fd, buf)
		if err != nil {
			return err
		}
		buf = buf[n:]
	}
	return nil
}

func BenchmarkReadWriteCgoCalls(b *testing.B) {
	fds, _ := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	for i := 0; i < b.N; i++ {
		CwriteAll(fds[0], []byte(message))
		Cread(fds[1], buffer)
	}
}

func BenchmarkReadWriteGoCalls(b *testing.B) {
	fds, _ := syscall.Socketpair(syscall.AF_UNIX, syscall.SOCK_STREAM, 0)
	for i := 0; i < b.N; i++ {
		writeAll(fds[0], []byte(message))
		syscall.Read(fds[1], buffer)
	}
}

func BenchmarkReadWriteNetCalls(b *testing.B) {
	cs, _ := socketpair()
	for i := 0; i < b.N; i++ {
		cs[0].Write([]byte(message))
		cs[1].Read(buffer)
	}
}

func socketpair() (conns [2]net.Conn, err error) {
	fds, err := syscall.Socketpair(syscall.AF_LOCAL, syscall.SOCK_STREAM, 0)
	if err != nil {
		return
	}
	conns[0], err = fdToFileConn(fds[0])
	if err != nil {
		return
	}
	conns[1], err = fdToFileConn(fds[1])
	if err != nil {
		conns[0].Close()
		return
	}
	return
}

func fdToFileConn(fd int) (net.Conn, error) {
	f := os.NewFile(uintptr(fd), "")
	defer f.Close()
	return net.FileConn(f)
}

=

package syscall

/*
#include <unistd.h>

// write_all loops until the whole buffer is written; it returns 0 on
// success and -1 on error.
int write_all(int fd, void *buffer, size_t length) {
	char *p = buffer;
	while (length > 0) {
		ssize_t written = write(fd, p, length);
		if (written < 0)
			return -1;
		length -= written;
		p += written;
	}
	return 0;
}
int read_call(int fd, void *buffer, size_t length) {
	return read(fd, buffer, length);
}
*/
import "C"
import (
	"unsafe"
)

// CwriteAll is a cgo call for write
func CwriteAll(fd int, buf []byte) error {
	_, err := C.write_all(C.int(fd), unsafe.Pointer(&buf[0]), C.size_t(len(buf)))
	return err
}

// Cread is a cgo call for read
func Cread(fd int, buf []byte) (int, error) {
	ret, err := C.read_call(C.int(fd), unsafe.Pointer(&buf[0]), C.size_t(len(buf)))
	return int(ret), err
}

C:

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <time.h>

int write_all(int fd, void *buffer, size_t length) {
	char *p = buffer;
	while (length > 0) {
		ssize_t written = write(fd, p, length);
		if (written < 0)
			return -1;
		length -= written;
		p += written;
	}
	return 0;
}

int read_call(int fd, void *buffer, size_t length) {
	return read(fd, buffer, length);
}

struct timespec timer_start() {
	struct timespec start_time;
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start_time);
	return start_time;
}

long timer_end(struct timespec start_time) {
	struct timespec end_time;
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end_time);
	long diffInNanos = (end_time.tv_sec - start_time.tv_sec) * (long)1e9 +
	                   (end_time.tv_nsec - start_time.tv_nsec);
	return diffInNanos;
}

int main() {
	int i = 0;
	int N = 50;
	int fds[2];
	char message[14] = "hello, world!";
	char buffer[14] = {0};

	socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
	struct timespec vartime = timer_start();
	for (i = 0; i < N; i++) {
		write_all(fds[0], message, sizeof(message));
		read_call(fds[1], buffer, 14);
	}
	long time_elapsed_nanos = timer_end(vartime);
	printf("BenchmarkReadWritePureCCalls\t%d\t%ld ns/op\n", N,
	       time_elapsed_nanos / N);
	return 0;
}





-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [go-nuts] Long time blocking CGO call affect and CGO/Go.syscall/Go.net/C IO performance

2018-09-17 Thread changkun
Hi Robert,

I checked your benchmark, and it answers the second question: there is 
no way to optimize this further. Thank you :)

Best,
changkun

On Monday, September 17, 2018 at 6:19:33 PM UTC+2, Robert Engels wrote:
>
> You can run my tests at github.com/robaho/go-network-test. You will see 
> that both Java and Go are 10% slower than pure C. In Java’s case it is 
> mostly due to the ‘security checks’ that are run. In Go’s case I believe it 
> is just the overhead of the barrier from Go to C, as there are many more 
> function calls (you can review the source and compare it to the recvfrom 
> source in the libc library). The Go runtime needs to know which goroutines 
> are in syscalls, etc., so there is going to be overhead.
>
> TL;DR; don’t worry about it
>
>
> On Sep 17, 2018, at 3:41 AM, changkun wrote:
>

Re: [go-nuts] Long time blocking CGO call affect and CGO/Go.syscall/Go.net/C IO performance

2018-09-17 Thread changkun
Hi Ian,

Thank you for your post.

The overhead you mentioned comes from the goroutine scheduler's 
bookkeeping when entering a system call.
Apparently there is no mechanism in Go to submit an IO "transaction": 
when two or more syscalls follow each other closely, couldn't they be 
kept on the OS thread without returning to the goroutine, saving the 
redundant bookkeeping?

Then, what about the net package? Socketpair FDs are pollable and can 
therefore be converted to net.Conn. What happens to them, and why is 
their performance even much "worse" than pure syscalls and close to 
cgo calls? Is the benchmark measured in a wrong way? Do all IO calls 
via the net package suffer from this?

Best,
changkun



On Monday, September 17, 2018 at 8:51:06 PM UTC+2, Ian Lance Taylor wrote:
>
> On Mon, Sep 17, 2018 at 1:41 AM, changkun wrote: 
> > 
> > Thus, my first question is: Considering the scheduling strategy of 
> goroutines in Go universe, will the Go universe IO performance between a 
> user socket and socketpair[0] get suffering by such a blocking CGO call? 
>
> No. 
>
>
> > The second question set is regarding IO performance. 
> > 
> > My benchmark shows the pure Go syscall Write and Read is roughly 15% 
> slower than C system call, and net package IO performance is roughly equal 
> to CGO call performance, as shown as follows: 
> > 
> > Test in go 1.11; Machine: MacBook Pro 2014 Retina; Data: 
> https://docs.google.com/spreadsheets/d/1DwtZmP8fKKr3pOQWVJrD30DSOzv4_qB5KZvsd-DQ1KA/edit?usp=sharing
>  
> > 
> > 
> > So, questions are, what did I do wrong regarding these benchmarks? I'm 
> currently using syscall.Write() and syscall.Read() approach in different 
> write/read goroutines, is there any way to achieve performance closer to C 
> system call (expect down to 5%) for such a long time network IO? 
>
> For the pure Go program, this is most likely due to Go scheduler 
> overhead.  When your program enters a potentially blocking system call 
> like Write or Read, the Go program has to note that the goroutine 
> might block for an indefinite period of time.  I would guess that that 
> bookkeeping is what you are seeing.  If you have a single-threaded 
> program that does nothing other than read and write system calls, it's 
> going to be hard for Go to match the same performance of C. 
>
> When using cgo there is similar scheduler bookkeeping overhead; on 
> average it's a tiny bit worse with cgo because cgo requires changing 
> calling conventions. 
>
> Ian 
>



Re: [go-nuts] Long time blocking CGO call affect and CGO/Go.syscall/Go.net/C IO performance

2018-09-17 Thread changkun
Your answers are very helpful; they resolve all my doubts. Thank you 
very much!

I believe I need to spend some time reading how the runtime scheduler 
works to understand all the "magic" more deeply.

Cheers,
changkun

On Monday, September 17, 2018 at 9:25:20 PM UTC+2, Ian Lance Taylor wrote:
>
> On Mon, Sep 17, 2018 at 12:17 PM, changkun wrote: 
> > 
> > Those overhead you mentioned come from goroutine scheduler for entering 
> a 
> > system call. 
> > Apparently, there is no such mechanism in Go to submit an IO 
> "transaction" 
> > if there are two 
> > or more syscall close to each other, syscall can be held in OS thread 
> > without exit back to goroutine 
> > and save redundant bookkeeping, isn't? 
>
> That is correct: there is no such mechanism in Go. 
>
>
> > Then, what about the net package? Socket pair FDs are pollable thus can 
> > converted to net.Conn. 
> > What happened to them and why it's performance even much "worse" than 
> pure 
> > syscall and close to 
> > Cgo calls? Is the benchmark measured in a wrong way? Are all I/Os call 
> via 
> > net package suffering it? 
>
> The net package is quite different from direct syscalls.  The net 
> package is optimized for handling intermittent I/O on many thousands 
> of different descriptors.  The net package uses a runtime poller so 
> that each separate descriptor does not require its own dedicated 
> thread.  That introduces more bookkeeping and more overhead.  It's the 
> right choice for an HTTP server but will not be the best choice for 
> the highest possible I/O throughput on a single descriptor. 
>
> Ian 
>



[go-nuts] Debug Go program with GDB on macOS shows nothing

2018-09-27 Thread changkun
Debugging with GDB works fine on Linux: it shows everything at a 
breakpoint. On macOS, however, nothing appears.

A simple Go program, say `main.go`:

package main

func main() {
println("hello, world!")
}


Then build with 

go build -gcflags "-N -l" -o main main.go


Using GDB:

$ gdb main
GNU gdb (GDB) 8.2
(...)
Reading symbols from main...(no debugging symbols found)...done.
Loading Go Runtime support.
(gdb) source /usr/local/Cellar/go/1.11/libexec/src/runtime/runtime-gdb.py
Loading Go Runtime support.
(gdb) info files
Symbols from "/Users/changkun/Desktop/demo/main".
Local exec file:
`/Users/changkun/Desktop/demo/main', file type mach-o-x86-64.
Entry point: 0x1049e20
0x01001000 - 0x0104dfcf is .text
0x0104dfe0 - 0x01077344 is __TEXT.__rodata
(...)
(gdb) b *0x1049e20
Breakpoint 1 at 0x1049e20
(gdb)


There is no `at` in the GDB outputs, the version of Go is `go version 
go1.11 darwin/amd64` and:

$ ls -al /usr/local/bin | grep go
lrwxr-xr-x  1 changkun  admin  24 Aug 25 16:37 go -> ../Cellar/go/1.11/bin/go



==

Same process in linux environment:

docker run -it --rm --name golang golang:1.11 bash

then, inside the container, install `gdb`:

root@1326d3f1a957:/# gdb main
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
(...)
(gdb) info files
Symbols from "/main".
Local exec file:
`/main', file type elf64-x86-64.
Entry point: 0x44a2e0
0x00401000 - 0x0044ea8f is .text
(...)
(gdb) b *0x44a2e0
Breakpoint 1 at 0x44a2e0: file 
/usr/local/go/src/runtime/rt0_linux_amd64.s, line 8.
(gdb)


On Linux, GDB resolves the breakpoint to a source location: 
`Breakpoint 1 at 0x44a2e0: file 
/usr/local/go/src/runtime/rt0_linux_amd64.s, line 8.`

What did I do wrong on macOS? How can I perform low-level debugging for Go 
programs on macOS with GDB?



[go-nuts] Re: Debug Go program with GDB on macOS shows nothing

2018-09-27 Thread changkun
Works, thanks! I hope this information can be added to golang.org/doc/gdb.
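For reference, the workaround described in David's reply boils down to 
one extra linker flag:

```shell
# Build with uncompressed DWARF so macOS gdb can read the debug info.
go build -gcflags "-N -l" -ldflags=-compressdwarf=false -o main main.go

# Or set it once for every build in this shell:
export GOFLAGS="-ldflags=-compressdwarf=false"
```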

On Thursday, September 27, 2018 at 8:02:20 PM UTC+2, David Chase wrote:
>
> You did nothing wrong, in 1.11 we started compressing the debug 
> information to reduce binary size, and gdb on the Mac does not understand 
> compressed DWARF.
> We had hoped that the several speedbumps involved in running gdb on modern 
> OSX would have caused most users to move to Delve, which handles compressed 
> DWARF, but obviously this was overoptimistic.
>
> The workaround is to also specify "-ldflags=-compressdwarf=false" which 
> does exactly what it claims.
> If you wanted to do this generally (so you did not need to remember), 
>
> export GOFLAGS="-ldflags=-compressdwarf=false"
>
> You might try not specifying "-N -l" and see whether optimized binaries 
> have adequate debugging information; it's not perfect, but we are hoping to 
> get it good enough that core dumps and probes from/on running applications 
> will yield useful information.
>
> Sorry for the inconvenience.
>
>
> On Thursday, September 27, 2018 at 9:38:23 AM UTC-4, changkun wrote:
>



[go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-28 Thread changkun
Hi there,

I am encountering a problem (most likely cgo-related, though I am 
unsure) while migrating an old C library to Go.

A cgo call calls a component of the C library, which 
internally creates an OS thread and then calls libpango 
(pango_cairo_font_map_get_default() call).

The original pure C code is able to go through the pango call:

printf("before call\n");
font_map = pango_cairo_font_map_get_default();
printf("after call\n");

Outputs:

before call
after call

After involving Go, the code gets stuck at the Pango call after its 
first invocation, and GLib prints the following warning and critical 
messages:

before call
(process:1): GLib-GObject-WARNING **: cannot register existing type 
'PangoFontMap'

(process:1): GLib-CRITICAL **: g_once_init_leave: assertion 'result != 0' 
failed

(process:1): GLib-GObject-CRITICAL **: g_type_register_static: assertion 
'parent_type > 0' failed

(process:1): GLib-CRITICAL **: g_once_init_leave: assertion 'result != 0' 
failed

(process:1): GLib-GObject-CRITICAL **: g_type_register_static: assertion 
'parent_type > 0' failed

(process:1): GLib-GObject-WARNING **: cannot register existing type 
'PangoCairoFontMap'

Call chain: 

round 1: go code --> goroutine --> cgo --> C --> dlopen --> back to Go --> 
hold a pointer to C memory in Go --> goroutine --> cgo --> pass the 
pointer to C --> pthread_create --> pango_font_map_get_default() --> success!

round 2: same call chain --> fail! (messages above)

My understanding of pango_font_map_get_default() is that the call 
should always succeed because it is thread-safe (since v1.36 the 
internal font map managed by pango is per-thread) and it can be 
called multiple times.

Thus, my suspicion is that somehow cgo or the Go memory allocator 
influences the C side.

My questions are:

1. Is my suspicion reasonable and correct?
2. Why does the error occur only when cgo is involved?
3. Is the memory of a non-Go thread completely isolated from Go 
memory, for each non-Go thread?
4. Does C memory allocated in a cgo call always live in the same 
shared heap, even across threads?
5. Any other theories, and how could the problem possibly be solved?
6. If I am asking the wrong question, could you suggest any ideas for 
solving the issue described above?

Thank you very much in advance.

Best,
changkun



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-28 Thread changkun


On Friday, September 28, 2018 at 4:12:31 PM UTC+2, Ian Lance Taylor wrote:
>
> On Fri, Sep 28, 2018 at 7:08 AM, changkun wrote: 
> > 
> > 1. Is my suspicion reasonable and correct? 
>
> That wouldn't be my first guess.  You say that pango memory is 
> per-thread.  That suggests that you need to always call pango on a 
> consistent thread. 


Not really: the original C code calls pango in every newly created thread.

Involving cgo doesn’t change anything about this calling behavior.

The pango call is per-thread; according to the documentation it is 
literally a per-thread singleton, allocated only on the first call in 
each different thread, if I read it correctly:
https://developer.gnome.org/pango/stable/pango-Cairo-Rendering.html#pango-cairo-font-map-get-default

 
 

>  That will not happen by default.  You likely need 
> to arrange to make all your pango calls from a single goroutine, and 
> have that goroutine first call runtime.LockOSThread. 
>

Tried it; it doesn’t work. A goroutine is created for the cgo call, 
which immediately enters the C side; when the C code creates a thread, 
it then calls pango inside that thread.
 

>
> > 2. Why an error when involving cgo? 
>
> My first guess would be that it is because you are calling the cgo 
> code from different threads. 
>

Nope. It all goes from Go to C, and then C does something without 
sending data back to Go. I am not calling Go from C. 


> > 3. Are memories in a non-Go thread completely isolated to Go 
> > memory per-non-Go thread? 
>
> Sorry, I don't understand this question. 
>
>
I was trying to ask whether the memory of every non-Go thread is 
isolated from that of the others.

The warning from pango:

(process:1): GLib-GObject-WARNING **: cannot register existing type 
'PangoFontMap'

Looks like the pango call is registering the font-map object twice.
If memory is shared across all non-Go threads, that could be an 
issue, couldn’t it?

Best,
changkun



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-28 Thread changkun


On Friday, September 28, 2018 at 6:44:10 PM UTC+2, Ian Lance Taylor wrote:
>
> On Fri, Sep 28, 2018 at 7:45 AM, changkun wrote: 
> > 
> > On Friday, September 28, 2018 at 4:12:31 PM UTC+2, Ian Lance Taylor 
> wrote: 
> >> 
> >> On Fri, Sep 28, 2018 at 7:08 AM, changkun  wrote: 
> >> > 
> >> > 1. Is my suspicion reasonable and correct? 
> >> 
> >> I wouldn't be my first guess.  You say that pango memory is 
> >> per-thread.  That suggests that you need to always call pango on a 
> >> consistent thread. 
> > 
> > 
> > Not really, the original C code calls pango in every new created thread. 
> > 
> > After involving cgo, it doesn’t change anything of this calling 
> behavior. 
>
> Right, but what changes using cgo is that consecutive calls using the 
> same pango objects will be on different threads.  That likely does not 
> happen in the original C code. 
>

Yep. I understand that goroutines run on different threads. But they 
do not seem able to influence the C code.
 

>
>
> >>  That will not happen by default.  You likely need 
> >> to arrange to make all your pango calls from a single goroutine, and 
> >> have that goroutine first call runtime.LockOSThread. 
> > 
> > 
> > Tried, it doesn’t work. A goroutine is created for the cgo call purpose, 
> > which instantly entering C side, when C code creates a thread, 
> > then call pango inside that thread. 
>
> OK, sounds like I was wrong.  But I don't understand what it means for 
> pango to have per-thread data structures if you always make pango 
> calls in a newly created thread. 
>

That is actually the weirdest part. Indeed, the Pango call is 
always made in a newly created thread.

So my concern is that somehow, some way, cgo or Go influences the 
memory layout, which then causes the "double type registering" in 
pango's underlying call into glib?
 

>
>
> >> > 2. Why an error when involving cgo? 
> >> 
> >> My first guess would be that it is because you are calling the cgo 
> >> code from different threads. 
> > 
> > 
> > Nope. It is all about from Go to C then C doing something 
> > without sending data back to Go. I am not calling Go from C. 
>
> No, I didn't think you were. 
>
> > 
> > I was trying to express that, are memories in every non-Go thread 
> > are isolated one another? 
> > 
> > The warning from pango: 
> > 
> > (process:1): GLib-GObject-WARNING **: cannot register existing type 
> > 'PangoFontMap' 
> > 
> > Looks like that the pango call is double registering the font map 
> object. 
> > If the memory are shared across all non-Go thread, it cloud be an issue, 
> > isn’t it? 
>
> All threads in both C and Go share the same memory space. 
>
> But, obviously, they do not share the same per-thread data structures. 
>

Perhaps, to understand where the problem comes from, the only way is to 
read what the calls actually do: g_private_get() in GLib and 
pango_cairo_font_map_new() in Pango, respectively.
https://github.com/GNOME/pango/blob/2bb6262be19dc17f7b56d3ce36c0defdbdb55c97/pango/pangocairo-fontmap.c#L184

However, I would rather say: GTK libraries have been extremely stable 
for years; considering that the problem arises only when Go and cgo are 
involved, and I did not change any of the C part, it is most likely Go 
and cgo causing the problem :(

Best,
changkun



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-29 Thread changkun
As discussed before, every pango call is made by C in a newly created 
non-Go thread, so runtime.LockOSThread doesn't work.

On Saturday, September 29, 2018 at 1:24:21 PM UTC+2, Tamás Gulácsi wrote:
>
> You should LockOSThread and process every pango-facing call in that 
> goroutine.



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-29 Thread changkun

On Saturday, September 29, 2018 at 2:08:03 PM UTC+2, Tamás Gulácsi wrote:
>
> Yes, but is that a one and only goroutine?


No. A cgo call is made for every new incoming user request.

Let's summarize:

- Every network request creates a goroutine; the goroutine does not 
  send the processing result back to the user.
- The goroutine immediately makes a cgo call, and the cgo call creates 
  a non-Go thread; a Pango call, pango_font_map_get_default(), is then 
  invoked in that non-Go thread.
- According to its documentation, pango_font_map_get_default() holds a 
  static thread-safe variable.
- The original pure C code proceeds without problems when Go is not 
  involved, but gets stuck at the Pango call when cgo is involved.
- runtime.LockOSThread doesn't work:

go func() {
	runtime.LockOSThread()
	handleConnection(timeout)
	runtime.UnlockOSThread()
}()



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-29 Thread changkun


On Saturday, September 29, 2018 at 3:46:41 PM UTC+2, Robert Engels wrote:
>
> Your description makes no sense. Why are you creating another C thread? 
> You should just lock the OS thread the goroutine is on. Otherwise, if you 
> create another C thread, the cgo call needs to block that thread or 
> somehow communicate back into Go some other way - otherwise what is the 
> point?
>

As I mentioned at the top of the thread, I am migrating a very old pure 
C project to Go, which will eventually become a pure Go project. The 
original project was linked from different components, and I am 
replacing them one by one...

I know it doesn't make sense (before creating the non-Go thread there 
is some internal preprocessing that is irrelevant to Pango), but to 
achieve what you describe (everything in one goroutine) would consume a 
lot of time for reading, understanding, and implementation, and time is 
limited.



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-09-29 Thread changkun


On Saturday, September 29, 2018 at 4:03:42 PM UTC+2, Tamás Gulácsi wrote:
>
> Create One (1) goroutine with LockOSThread and let it get the todo and 
> response channel from a channel and send the response back when ready, and 
> make all other goroutines communicate with this one and only one 
> pango-dedicated goroutine. Which is locked to a thread.
>

Can't do that at the moment; see the reason above. 



[go-nuts] When will mexit be executed?

2018-10-13 Thread changkun
Hi golang nuts:

In "mstart", there is a call to "mexit" after "mstart1".
However, "mstart1" enters the scheduler loop and never returns.

So, my question is: when will "mexit" be executed? How?

mstart1()

// Exit this thread.
if GOOS == "windows" || GOOS == "solaris" || GOOS == "plan9" || GOOS == "darwin" || GOOS == "aix" {
// Window, Solaris, Darwin, AIX and Plan 9 always system-allocate
// the stack, but put it in _g_.stack before mstart,
// so the logic above hasn't set osStack yet.
osStack = true
}
mexit(osStack)



https://github.com/golang/go/blob/a5248acd91dcf0e90a68c1ff88ca389dc034557c/src/runtime/proc.go#L1181



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-10-13 Thread changkun
Hi,

sorry for the late response. Your solution was very inspiring! I solved the 
problem by letting the pango call enter the Go domain, lock the OS thread, and 
then return to the C domain.

Thank you very much!

cheers, changkun

On Saturday, September 29, 2018 at 6:16:10 PM UTC+2, Tamás Gulácsi wrote:
>
>
> 2018. szeptember 29., szombat 17:10:25 UTC+2 időpontban changkun a 
> következőt írta:
>>
>>
>>
>> On Saturday, September 29, 2018 at 4:03:42 PM UTC+2, Tamás Gulácsi wrote:
>>>
>>> Create One (1) goroutine with LockOSThread and let it get the todo and 
>>> response channel from a channel and send the response back when ready, and 
>>> make all other goroutines communicate with this one and only one 
>>> pango-dedicated goroutine. Which is locked to a thread.
>>>
>>
>> Can't do that at the moment, reason in above. 
>>
>
> I tought something like: 
>
> type todo struct {
>   FuncToCall string
>   Args []interface{}
>   Response chan<- response
> }
> type response struct {
>   ResultCode int
>   Err error
> }
>
> var todoCh = make(chan todo)
>
> go func(ch <-chan todo) {
> runtime.LockOSThread()
> defer runtime.UnlockOSThread()
> for t := range ch {
>   switch t.FuncToCall {
>   ...
>   }
>   t.Response <- response{Err:nil, ResultCode:0}
> }
> }(todoCh)
>
> func handleConnection(timeout time.Duration) error {
> ch := make(chan response, 1)
> ctx, cancel := context.WithTimeout(context.Background(), timeout)
> defer cancel()
> select {
> case <-ctx.Done():
>   // timeout
>   return ctx.Err()
> case todoCh <- todo{FuncToCall:..., Response:ch}:
>   select {
> case <-ctx.Done():
>   return ctx.Err()
> case resp := <-ch:
>   // success!
>   return nil
>   }
> }
> }
>  
>
> I don't feel that abstracting out the C (Pango) calls into an interface 
> with maybe several implementations (C or Go) would be such a great 
> undertaking,
> but would ease up writing correct code.
>
> And anyway you have to call that Pango thing from one thread.
>



Re: [go-nuts] Are C allocated memory completely isolated to Go memory in any cases?

2018-10-13 Thread changkun
Hi,

Thanks for your recommendations, very interesting implementation :)

I solved the problem with a callback from C to Go to C.

cheers, changkun

On Saturday, September 29, 2018 at 6:27:08 PM UTC+2, ohir wrote:
>
> On Fri, 28 Sep 2018 22:27:04 -0700 (PDT) 
> changkun > wrote: 
>
> > Not really, the original C code calls pango in every new created thread. 
> > [...] 
> > the Pango call is always called in a newly created thread. 
>
> IIUC its your old C code spinning a new thread for each Pango 
> call (?). If so, you should not bridge such code directly with Go. You 
> need 
> to make a wrapper on the C side (in C), init it from the **single** 
> goroutine 
> you pin to a single (go side) thread then communicate only via said 
> wrapper calls. 
>
> [Go] <-go-channels-> [comm goroutine] <-cgo-call(s)-> [wrapper] <-C-> 
> [library] 
>
> You can try to leverage existing C implementation of Go channels for 
> the wrapper: 
>
> https://github.com/tylertreat/chan 
>
> [IDNK if these are still functioning with go1.11, ofc] 
>
> Example: 
> https://github.com/matiasinsaurralde/cgo-channels 
>
> > Best, 
> > changkun 
> > 
>
> Hope this helps, 
>
> -- 
> Wojciech S. Czarnecki 
>  << ^oo^ >> OHIR-RIPE 
>
>



[go-nuts] Re: Are C allocated memory completely isolated to Go memory in any cases?

2018-10-13 Thread changkun
Thank you, I will probably use them in future migrations~

cheers, changkun

On Sunday, September 30, 2018 at 12:06:09 AM UTC+2, K Davidson wrote:
>
> Not sure if it would be of any help, but maybe you can gleem some insight 
> from the way these packages did things?
>
> https://github.com/mattn/go-gtk
> https://github.com/gotk3/gotk3
>
> -K
>



[go-nuts] Re: When will mexit be executed?

2018-10-13 Thread changkun
Just ignore this thread.

I've figured it out.

On Saturday, October 13, 2018 at 12:38:53 PM UTC+2, changkun wrote:
>
> Hi golang nuts:
>
> In "mstart", there is a call "mexit" after "mstart1".
> Howeverm "mstart1" will be entering sched loop and never returns.
>
> So, my question is when will "mexit" be executed? How?
>
> [snip: runtime code quoted in full above]
>



[go-nuts] How to plug git version when using "go get" to fetch a cli?

2020-11-25 Thread changkun
Hi golang-nuts,

As far as I know, there are two approaches to adding extra information at 
build time:
1. using -ldflags="-X path/pkg.Var=${ENV_VAR}" in a Makefile
2. using `go generate ./...` before `go build`

These approaches work well if I build my binary myself and distribute it 
to other users.
However, neither is possible if a user runs `go get 
github.com/user/pkg/cmd/x-cli`, because:
1. `go get` does not understand Makefiles
2. `go generate` is not executed automatically by `go build`.

What should I do in order to embed the extra information (in my case, the 
git version) if my user uses "go get"?



Re: [go-nuts] How to plug git version when using "go get" to fetch a cli?

2020-11-25 Thread changkun
That is really unfortunate. Is there any alternative solution so that I can 
at least embed the version information into the binary when a user uses `go 
get`?

On Wednesday, November 25, 2020 at 6:36:17 PM UTC+1 Jan Mercl wrote:

> On Wed, Nov 25, 2020 at 6:25 PM changkun  wrote:
>
> > As far as I know, there are two approaches to add extra information at 
> build time:
> > 1. using -ldflags="-X path/pkg.Var=${ENV_VAR}", in a Makefile
> > 2. using `go generate ./...`, before `go build`
> >
> > These approaches are good as it is if I build my binary and distribute 
> it to other users.
> > However, none of these are possible if a user uses `go get 
> github.com/user/pkg/cmd/x-cli` <http://github.com/user/pkg/cmd/x-cli>, 
> because:
> > 1. `go get` does not understand Makefile
> > 2. `go generate` does not execute with `go build` automatically.
> >
> > What should I do in order to plug the extra information (in my case, the 
> git version) if my user uses "go get"?
>
> I don't think that can be done. Also IINM, in module mode `go get` no
> longer uses git. It just downloads the zipped version of the
> appropriate tag via http from the hosting service.
>



[go-nuts] How to use atomic_int in cgo?

2021-02-15 Thread changkun
Hi golang-nuts,

I would like to call a C function from Go and get notified about its 
execution status in Go, say, have a goroutine wait until the C function 
finally makes some progress.
Initially, I thought about passing a channel to C, but the cgo documentation 
does not say anything about that.

Later, I thought about using an atomic variable to sync the status 
according to its value, but somehow I end up with this warning while 
compiling the cgo program:

package main

/*
#include <stdatomic.h>
void ainit(atomic_int *val) {
atomic_init(val, 0);
}
*/
import "C"

func main() {
var v C.atomic_int
C.ainit(&v)
}

cgo-gcc-prolog: In function ‘_cgo_8a67e594de48_Cfunc_ainit’:
cgo-gcc-prolog:49:14: warning: passing argument 1 of ‘ainit’ from 
incompatible pointer type [-Wincompatible-pointer-types]
./main.go:5:24: note: expected ‘_Atomic atomic_int *’ {aka ‘_Atomic int *’} 
but argument is of type ‘int *’
5 | void ainit(atomic_int *val) {
  |^~~

According to the warning and note, it seems that cgo does not translate the 
`_Atomic` qualifier of `atomic_int`? Did I do anything wrong? Or is there a 
better, preferred way to get notified from a C function?



Re: [go-nuts] How to use atomic_int in cgo?

2021-02-15 Thread changkun
Hi Ian,

Thanks for the hint, but I have some follow-up questions:


> Even if there were a way to do this, an atomic variable is not a good 
> synchronization mechanism, because the other side has to poll the 
> variable.

Indeed, the other side would be a spin loop polling the atomic variable, 
which ...

 

> You can do this as a last resort, but it doesn't sound like 
> you are at a last resort here. I suggest that you have your C 
> function call a Go function to write a value on a channel.

seems easier to write than this (?). 

Say a Go function foo is called from the C side; to be able to send on the 
corresponding channel, C must pass that channel to the Go function, which 
brings us back to the initial question: is passing a channel from Go to C 
supported at the moment (?)

How could a Go function send content to differently allocated 
channels in correspondingly invoked C functions?


Sincerely,
Changkun



Re: [go-nuts] How to use atomic_int in cgo?

2021-02-20 Thread changkun
Dear Ian, thanks for the inspiration and sorry for the late response. I 
just got a chance to test your suggestion.
But it turns out that the wrapping struct causes the following error:

panic: runtime error: cgo argument has Go pointer to Go pointer

If I understand the cgo pointer-passing constraints correctly, the panic 
prevents a potential error when the GC moves the Go pointer,
although we don't manipulate the Go pointer (to the channel) from the C 
side.

How could we avoid this?

On Monday, February 15, 2021 at 9:51:03 PM UTC+1 Ian Lance Taylor wrote:

> On Mon, Feb 15, 2021 at 12:39 PM changkun  wrote:
> >
> > Thanks for the hint, but I have some follow-up questions:
> >
> >>
> >> Even if there were a way to do this, an atomic variable is not a good
> >> synchronization mechanism, because the other side has to poll the
> >> variable.
> >
> > Indeed, it the other side will be a spin loop to poll the atomic 
> variable, which ...
> >
> >
> >>
> >> You can do this as a last resort, but it doesn't sound like
> >> you are at a last resort here. I suggest that you have your C
> >> function call a Go function to write a value on a channel.
> >
> > seems easy to write than this (?).
> >
> > Say a Go function foo is called from the C side, to be able to send to 
> the corresponding channel, C must pass that channel to the Go function, 
> which brings to the initial question: pass a channel from Go to C, is it 
> supported at the moment (?)
> >
> > How could a Go function be able to send content to differently allocated 
> channels in correspondingly invoked C functions?
>
> For example, on the Go side write
>
> type chanToUse struct {
> c chan int
> }
>
> //export MyGoFunction
> func MyGoFunction(u unsafe.Pointer, val int) {
> cu := (*chanToUse)(u)
> cu.c <- val
> }
>
> ch := make(chan int)
> ...
> C.MyCFunction(unsafe.Pointer(&chanToUse{ch}))
>
> and on the C side write
>
> void MyCFunction(void *goData) {
> ...
> MyGoFunction(goData, 0);
> }
>
> Yes, it's more work. But it's not a good idea to poll an atomic
> variable in Go. The Go scheduler reacts badly to busy loops.
>
> Ian
>



[go-nuts] A Question About Type Set and Parameter Design

2021-08-23 Thread changkun
Hi golang-nuts,

I am trying out the latest type parameter, and type sets design.
The purpose is to implement a Clamp function that works for numbers and vectors.

The version for numbers is straightforward and easy:

```go
// Number is a type set of numbers.
type Number interface {
~int | ~int8 | ~int32 | ~int64 | ~float32 | ~float64
}

// Clamp clamps a given value in [min, max].
func Clamp[N Number](n, min, max N) N {
if n < min { return min }
if n > max { return max }
return n
}
```

Everything is good so far. Then, let's define vector types:

```go
// Vec2 represents a 2D vector (x, y).
type Vec2[N Number] struct {
X, Y N
}

// Vec3 represents a 3D vector (x, y, z).
type Vec3[N Number] struct {
X, Y, Z N
}

// Vec4 represents homogeneous coordinates (x, y, z, w) that defines
// either a point (W=1) or a vector (W=0). Other cases of W need to apply
// a perspective division to get the actual coordinates of X, Y, Z.
type Vec4[N Number] struct {
X, Y, Z, W N
}
```

However, to declare a type set of all possible vectors, I tried
two possibilities:

```go
// Vec is a type set of vectors.
type Vec[N Number] interface {
Vec2[N] | Vec3[N] | Vec4[N] // ERROR: interface cannot have type parameters
}
```

```go
type Vec interface {
Vec2[N Number] | Vec3[N Number] | Vec4[N Number] // ERROR: interface cannot 
have type parameters
}
```

Let's enumerate all possibilities for the Vec type set:

```go
// Vec is a type set of vectors.
type Vec interface {
Vec2[float32] | Vec3[float32] | Vec4[float32] |
Vec2[float64] | Vec3[float64] | Vec4[float64]
}
```

However, with this definition, it remains very tricky to construct a
generic implementation for a clamp function:

```go
// ERROR: this function does not compile
func ClampVec[V Vec, N Number](v V, min, max N) V {
switch (interface{})(v).(type) {
case Vec2[float32]:
return Vec2[float32]{
Clamp[float32](v.X, min, max),
Clamp[float32](v.Y, min, max),
}
case Vec2[float64]:
return Vec2[float64]{
Clamp[float64](v.X, min, max),
Clamp[float64](v.Y, min, max),
}
case Vec3[float32]:
return Vec3[float32]{
Clamp[float32](v.X, min, max),
Clamp[float32](v.Y, min, max),
Clamp[float32](v.Z, min, max),
}
case Vec3[float64]:
return Vec3[float64]{
Clamp[float64](v.X, min, max),
Clamp[float64](v.Y, min, max),
Clamp[float64](v.Z, min, max),
}
case Vec4[float32]:
return Vec4[float32]{
Clamp[float32](v.X, min, max),
Clamp[float32](v.Y, min, max),
Clamp[float32](v.Z, min, max),
Clamp[float32](v.W, min, max),
}
case Vec4[float64]:
return Vec4[float64]{
Clamp[float64](v.X, min, max),
Clamp[float64](v.Y, min, max),
Clamp[float64](v.Z, min, max),
Clamp[float64](v.W, min, max),
}
default:
panic(fmt.Sprintf("unexpected type %T", v))
}
}
```

I wish I could converge to a version similar like this:

```go
func Clamp[N Number](n, min, max N) N {
if n < min { return min }
if n > max { return max }
return n
}

// ERROR: this functions does not compile
func ClampVec[N Number, V Vec[N]](v V[N], min, max N) V[N] {
switch (interface{})(v).(type) {
case Vec2[N]: // If V is Vec2[N], then return a Vec2[N].
return Vec2[N]{
Clamp[N](v.X, min, max),
Clamp[N](v.Y, min, max),
}
case Vec3[N]: // Similar
return Vec3[N]{
Clamp[N](v.X, min, max),
Clamp[N](v.Y, min, max),
Clamp[N](v.Z, min, max),
}
case Vec4[N]: // Similar
return Vec4[N]{
Clamp[N](v.X, min, max),
Clamp[N](v.Y, min, max),
Clamp[N](v.Z, min, max),
Clamp[N](v.W, min, max),
}
default:
panic(fmt.Sprintf("unexpected type %T", v))
}
}

// caller side:

Clamp[float32](256, 0, 255) // 255
ClampVec[float64, Vec3[float64]](Vec3[float64]{1, 2, 3}, 0, 1) // Vec3[float64]{1, 1, 1}
...
```

I found myself trapped and unable to proceed further. 

- Is the above code legal under the current design, and merely not 
implemented by the compiler yet?
- Any ideas on how the current design could produce something even 
simpler?
- If it is impossible to achieve a similar implementation, what could be the 
best practices for implementing a generic clamp function?

Thank you in advance for your read and help.


[go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-19 Thread changkun
I am comparing the performance regarding sync.Mutex and Go channels. Here 
is my benchmark: https://play.golang.org/p/zLjVtsSx9gd

The performance comparison visualization is as follows:

[image: sync.Mutex performance (1).png]
What are the reasons that 

1. sync.Mutex encounters a large performance drop when the number of 
goroutines goes higher than roughly 3400?
2. Go channels are pretty stable, but slower than sync.Mutex before the drop?



Raw bench data by benchstat (go test -bench=. -count=5):

MutexWrite/goroutines-2400-8  48.6ns ± 1%
MutexWrite/goroutines-2480-8  49.1ns ± 0%
MutexWrite/goroutines-2560-8  49.7ns ± 1%
MutexWrite/goroutines-2640-8  50.5ns ± 3%
MutexWrite/goroutines-2720-8  50.9ns ± 2%
MutexWrite/goroutines-2800-8  51.8ns ± 3%
MutexWrite/goroutines-2880-8  52.5ns ± 2%
MutexWrite/goroutines-2960-8  54.1ns ± 4%
MutexWrite/goroutines-3040-8  54.5ns ± 2%
MutexWrite/goroutines-3120-8  56.1ns ± 3%
MutexWrite/goroutines-3200-8  63.2ns ± 5%
MutexWrite/goroutines-3280-8  77.5ns ± 6%
MutexWrite/goroutines-3360-8   141ns ± 6%
MutexWrite/goroutines-3440-8   239ns ± 8%
MutexWrite/goroutines-3520-8   248ns ± 3%
MutexWrite/goroutines-3600-8   254ns ± 2%
MutexWrite/goroutines-3680-8   256ns ± 1%
MutexWrite/goroutines-3760-8   261ns ± 2%
MutexWrite/goroutines-3840-8   266ns ± 3%
MutexWrite/goroutines-3920-8   276ns ± 3%
MutexWrite/goroutines-4000-8   278ns ± 3%
MutexWrite/goroutines-4080-8   286ns ± 5%
MutexWrite/goroutines-4160-8   293ns ± 4%
MutexWrite/goroutines-4240-8   295ns ± 2%
MutexWrite/goroutines-4320-8   280ns ± 8%
MutexWrite/goroutines-4400-8   294ns ± 9%
MutexWrite/goroutines-4480-8   285ns ±10%
MutexWrite/goroutines-4560-8   290ns ± 8%
MutexWrite/goroutines-4640-8   271ns ± 3%
MutexWrite/goroutines-4720-8   271ns ± 4%

ChanWrite/goroutines-2400-8  158ns ± 3%
ChanWrite/goroutines-2480-8  159ns ± 2%
ChanWrite/goroutines-2560-8  161ns ± 2%
ChanWrite/goroutines-2640-8  161ns ± 1%
ChanWrite/goroutines-2720-8  163ns ± 1%
ChanWrite/goroutines-2800-8  166ns ± 3%
ChanWrite/goroutines-2880-8  168ns ± 1%
ChanWrite/goroutines-2960-8  176ns ± 4%
ChanWrite/goroutines-3040-8  176ns ± 2%
ChanWrite/goroutines-3120-8  180ns ± 1%
ChanWrite/goroutines-3200-8  180ns ± 1%
ChanWrite/goroutines-3280-8  181ns ± 2%
ChanWrite/goroutines-3360-8  183ns ± 2%
ChanWrite/goroutines-3440-8  188ns ± 3%
ChanWrite/goroutines-3520-8  190ns ± 2%
ChanWrite/goroutines-3600-8  193ns ± 2%
ChanWrite/goroutines-3680-8  196ns ± 3%
ChanWrite/goroutines-3760-8  199ns ± 2%
ChanWrite/goroutines-3840-8  206ns ± 2%
ChanWrite/goroutines-3920-8  209ns ± 2%
ChanWrite/goroutines-4000-8  206ns ± 2%
ChanWrite/goroutines-4080-8  209ns ± 2%
ChanWrite/goroutines-4160-8  208ns ± 2%
ChanWrite/goroutines-4240-8  209ns ± 3%
ChanWrite/goroutines-4320-8  213ns ± 2%
ChanWrite/goroutines-4400-8  209ns ± 2%
ChanWrite/goroutines-4480-8  211ns ± 1%
ChanWrite/goroutines-4560-8  213ns ± 2%
ChanWrite/goroutines-4640-8  215ns ± 1%
ChanWrite/goroutines-4720-8  218ns ± 3%




Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-19 Thread changkun
Hi Ian,

Thanks for the hint. I tried on a Mac mini with an i5-8500B, and it seems the 
unexpected performance drop still exists (with GOMAXPROCS set to 8), so the 
controlled variable here is the CPU:

[image: sync.Mutex performance (GOMAXPROCS == 8).png]


On Monday, August 19, 2019 at 9:15:50 PM UTC+2, Ian Lance Taylor wrote:
>
> On Mon, Aug 19, 2019 at 10:50 AM changkun  > wrote: 
> > 
> > I am comparing the performance regarding sync.Mutex and Go channels. 
> Here is my benchmark: https://play.golang.org/p/zLjVtsSx9gd 
>
> Might be interesting to try running your benchmark on a machine with 
> different hardware. 
>
> Ian 
>



Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-20 Thread changkun
Hi Robert,

Thanks for your explanation. But how could I "log the number of 
operations done per goroutine", and which particular debug settings are you 
referring to?
It is reasonable that sync.Mutex relies on the runtime scheduler while 
channels do not. However, it is unclear why such a significant performance 
drop appears. Is it possible to predict when the drop will appear?

Best,
Changkun

On Monday, August 19, 2019 at 10:27:19 PM UTC+2, Robert Engels wrote:
>
> I think you'll find the reason that the Mutex uses the Go scheduler. The 
> chan is controlled by a 'mutex' which eventually defers to the OS futex - 
> and the OS futex is probably more efficient at scheduling in the face of 
> large contention - although you would think it should be the other way 
> around.
>
> I am guessing that if you logged the number of operations done per Go 
> routine, you will see that the Mutex version is very fair, and the 
> chan/futex version is unfair - meaning many are starved.
>
> -Original Message- 
> From: changkun 
> Sent: Aug 19, 2019 12:50 PM 
> To: golang-nuts 
> Subject: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
>
> I am comparing the performance regarding sync.Mutex and Go channels. Here 
> is my benchmark: https://play.golang.org/p/zLjVtsSx9gd
>
> The performance comparison visualization is as follows:
>
> [image: sync.Mutex performance (1).png]
> What are the reasons that 
>
> 1. sync.Mutex encounter a large performance drop when the number of 
> goroutines goes higher than roughly 3400?
> 2. Go channels are pretty stable but slower than sync.Mutex before?
>
>
>
> Raw bench data by benchstat (go test -bench=. -count=5):
> [snip: benchmark data quoted in full above]
>
>

Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-20 Thread changkun
Hi Ian Davis, I read that issue before I posted this thread. 
I think the old issue is quite different from this one. This thread discusses 
a sudden performance drop and unexpected regression, 
while issue #5183 only experimented with a very limited number of 
goroutines, and Dmitry's answer is fair in that case.

Best, Changkun

On Tuesday, August 20, 2019 at 10:55:13 AM UTC+2, Ian Davis wrote:
>
>
> On Tue, 20 Aug 2019, at 9:33 AM, changkun wrote:
>
> Hi Robert,
>
> Thanks for your explanation. But how could I "logged the number of 
> operations done per Go routine", which particular debug settings you 
> referring to?
> It is reasonable that sync.Mutex rely on runtime scheduler but channels do 
> not. However, it is unclear why a significant performance drop appears. Is 
> it possible to determine when the performance will appear?
>
>
> This comment by Dmitry Vyukov on a very old issue might help (I have no 
> idea if it is still valid after 6 years though) 
>
> If you are interested why chan does not have the same issue, runtime handles 
> chan's in a
> special way (active/passive spinning + thread parking instead of goroutine 
> parking),
> because it knowns that the critical section is bounded and small. For 
> sync.Mutex it does
> not have any knowledge as to critical section size.
>
>
> See https://github.com/golang/go/issues/5183
>
>
>



Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-21 Thread changkun
"Less than N Go routines it fits in the L1 CPU cache": I am guessing that 
you are thinking of the scheduler's local run queues, whose size is 
restricted to 256 goroutines. However, in our case, the blocking 
goroutines don't go to the run queue; they are blocked and stored in semtable, 
which is a forest where each tree is an unlimited balanced tree. When a lock 
is released, only a single goroutine is detached and put into the local 
queue (so the scheduler only schedules a runq with a single goroutine, 
without contention on the global queue). 
How could an L1/L2 problem appear here? Do you think this is still some 
kind of "limited L1 cache to store a large amount of goroutines"?

What interests me is a newly created issue; I am not sure whether this 
question is related to https://github.com/golang/go/issues/33747
That issue talks about small contention with a large number of Ps, but a 
full scale of my benchmark is shown as follows:

[image: performance_ channel v.s. sync.Mutex v.s. atomic.png]



On Tuesday, August 20, 2019 at 6:10:32 PM UTC+2, Robert Engels wrote:
>
> I am assuming that there is an internal Go structure/process that when 
> there is less than N Go routines it fits in the L1 CPU cache, and beyond a 
> certain point it spills to the L2 or higher - thus the nearly order of 
> magnitude performance decrease, yet consistent times within a range.
>
> Since the worker code is so trivial, you are seeing this. Most worker code 
> is not as trivial so the overhead of the locking/scheduler constructs have 
> far less effect (or the worker is causing L1 evictions anyway - so you 
> never see the optimum performance possible of the scheduler).
>
> -Original Message- 
> From: changkun 
> Sent: Aug 20, 2019 3:33 AM 
> To: golang-nuts 
> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
>
> Hi Robert,
>
> Thanks for your explanation. But how could I "log the number of 
> operations done per Go routine"? Which particular debug settings are you 
> referring to?
> It is reasonable that sync.Mutex relies on the runtime scheduler while 
> channels do not. However, it is unclear why a significant performance drop 
> appears. Is it possible to determine when the drop will appear?
>
> Best,
> Changkun
>
> On Monday, August 19, 2019 at 10:27:19 PM UTC+2, Robert Engels wrote:
>>
>> I think you'll find the reason is that the Mutex uses the Go scheduler. The 
>> chan is controlled by a 'mutex' which eventually defers to the OS futex - 
>> and the OS futex is probably more efficient at scheduling in the face of 
>> large contention - although you would think it should be the other way 
>> around.
>>
>> I am guessing that if you logged the number of operations done per Go 
>> routine, you will see that the Mutex version is very fair, and the 
>> chan/futex version is unfair - meaning many are starved.
>>
>> -Original Message- 
>> From: changkun 
>> Sent: Aug 19, 2019 12:50 PM 
>> To: golang-nuts 
>> Subject: [go-nuts] sync.Mutex encounter large performance drop when 
>> goroutine contention more than 3400 
>>
>> I am comparing the performance regarding sync.Mutex and Go channels. Here 
>> is my benchmark: https://play.golang.org/p/zLjVtsSx9gd
>>
>> The performance comparison visualization is as follows:
>>
>> [image: sync.Mutex performance (1).png]
>> What are the reasons that 
>>
>> 1. sync.Mutex encounter a large performance drop when the number of 
>> goroutines goes higher than roughly 3400?
>> 2. Go channels are pretty stable but slower than sync.Mutex before?
>>
>>
>>
>> Raw bench data by benchstat (go test -bench=. -count=5):
>>
>> MutexWrite/goroutines-2400-8  48.6ns ± 1%
>> MutexWrite/goroutines-2480-8  49.1ns ± 0%
>> MutexWrite/goroutines-2560-8  49.7ns ± 1%
>> MutexWrite/goroutines-2640-8  50.5ns ± 3%
>> MutexWrite/goroutines-2720-8  50.9ns ± 2%
>> MutexWrite/goroutines-2800-8  51.8ns ± 3%
>> MutexWrite/goroutines-2880-8  52.5ns ± 2%
>> MutexWrite/goroutines-2960-8  54.1ns ± 4%
>> MutexWrite/goroutines-3040-8  54.5ns ± 2%
>> MutexWrite/goroutines-3120-8  56.1ns ± 3%
>> MutexWrite/goroutines-3200-8  63.2ns ± 5%
>> MutexWrite/goroutines-3280-8  77.5ns ± 6%
>> MutexWrite/goroutines-3360-8   141ns ± 6%
>> MutexWrite/goroutines-3440-8   239ns ± 8%
>> MutexWrite/goroutines-3520-8   248ns ± 3%
>> MutexWrite/goroutines-3600-8   254ns ± 2%
>> MutexWrite/goroutines-3680-8   256ns ± 1%
>> MutexWrite/goroutines-3760-8   261ns ± 2%
>> MutexWrite/goroutines-3840-8   266ns ± 3%
>> MutexWrite/gorouti
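The playground benchmark referenced above is not reproduced in the archive; a minimal self-contained sketch of such a contended sync.Mutex write workload (an assumed shape for illustration, not the exact benchmark code) could look like:

```go
package main

import (
	"fmt"
	"sync"
)

// contend spawns n goroutines that each acquire a shared mutex `iters`
// times, mimicking the contended-write workload discussed in this thread.
func contend(n, iters int) int {
	var mu sync.Mutex
	var counter int
	var wg sync.WaitGroup
	wg.Add(n)
	for g := 0; g < n; g++ {
		go func() {
			defer wg.Done()
			for i := 0; i < iters; i++ {
				mu.Lock()
				counter++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	// Every goroutine runs the same number of iterations, so the total
	// is exactly n*iters regardless of scheduling fairness.
	fmt.Println(contend(100, 1000)) // prints 100000
}
```

Wrapping the `contend` body in a `testing.B` loop, with `n` swept from 2400 to 4800, would reproduce the shape of the measurements quoted above.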

Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
Sorry for the late response. Do you mean the total number of executions 
was not the same? If so, that is not true; as you can see below, both 
benchmarks executed the same number of iterations:

goos: linux
goarch: amd64
BenchmarkMutexWrite/goroutines-2400-8   50000000    46.5 ns/op

Type: cpu
Time: Aug 26, 2019 at 6:19pm (CEST)
Duration: 2.50s, Total samples = 5.47s (218.62%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top20
Showing nodes accounting for 5.14s, 93.97% of 5.47s total
Dropped 15 nodes (cum <= 0.03s)
Showing top 20 nodes out of 48
  flat  flat%   sum%cum   cum%
 1.77s 32.36% 32.36%  1.77s 32.36%  runtime.procyield
 1.08s 19.74% 52.10%  2.20s 40.22%  sync.(*Mutex).Lock
 0.40s  7.31% 59.41%  1.34s 24.50%  sync.(*Mutex).Unlock
 0.32s  5.85% 65.27%  2.06s 37.66%  runtime.lock
 0.23s  4.20% 69.47%  1.53s 27.97%  runtime.findrunnable
 0.20s  3.66% 73.13%  0.25s  4.57%  runtime.unlock
 0.18s  3.29% 76.42%  0.19s  3.47%  runtime.pidleput
 0.13s  2.38% 78.79%  0.13s  2.38%  runtime.cansemacquire
 0.12s  2.19% 80.99%  0.12s  2.19%  runtime.futex
 0.10s  1.83% 82.82%  0.11s  2.01%  sync.runtime_nanotime
 0.09s  1.65% 84.46%  3.64s 66.54%  
_/home/changkun/dev/tests_test.BenchmarkMutexWrite.func1.1
 0.09s  1.65% 86.11%  0.09s  1.65%  runtime.casgstatus
 0.08s  1.46% 87.57%  0.08s  1.46%  runtime.(*semaRoot).dequeue
 0.08s  1.46% 89.03%  0.94s 17.18%  runtime.semrelease1
 0.07s  1.28% 90.31%  0.07s  1.28%  runtime.gopark
 0.06s  1.10% 91.41%  0.97s 17.73%  runtime.semacquire1
 0.04s  0.73% 92.14%  0.04s  0.73%  runtime.runqempty
 0.04s  0.73% 92.87%  0.04s  0.73%  sync.runtime_canSpin
 0.03s  0.55% 93.42%  0.03s  0.55%  runtime.(*guintptr).cas
 0.03s  0.55% 93.97%  0.03s  0.55%  runtime.gogo




goos: linux
goarch: amd64
BenchmarkMutexWrite/goroutines-4800-8   50000000    317 ns/op
PASS
ok  _/home/changkun/dev/tests   16.020s

Type: cpu
Time: Aug 26, 2019 at 6:18pm (CEST)
Duration: 16.01s, Total samples = 17.03s (106.35%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top20
Showing nodes accounting for 14640ms, 85.97% of 17030ms total
Dropped 19 nodes (cum <= 85.15ms)
Showing top 20 nodes out of 51
  flat  flat%   sum%cum   cum%
2130ms 12.51% 12.51% 2160ms 12.68%  runtime.gopark
1940ms 11.39% 23.90% 7210ms 42.34%  sync.(*Mutex).Lock
1640ms  9.63% 33.53% 1810ms 10.63%  sync.runtime_nanotime
1490ms  8.75% 42.28% 4560ms 26.78%  runtime.semacquire1
1130ms  6.64% 48.91% 1130ms  6.64%  runtime.casgstatus
 800ms  4.70% 53.61% 3740ms 21.96%  runtime.semrelease1
 590ms  3.46% 57.08%  590ms  3.46%  runtime.(*guintptr).cas
 560ms  3.29% 60.36%  560ms  3.29%  runtime.futex
 530ms  3.11% 63.48%  610ms  3.58%  runtime.lock
 520ms  3.05% 66.53%  520ms  3.05%  runtime.unlock
 440ms  2.58% 69.11%  440ms  2.58%  runtime.semroot
 430ms  2.52% 71.64%  430ms  2.52%  runtime.usleep
 430ms  2.52% 74.16% 4210ms 24.72%  sync.(*Mutex).Unlock
 370ms  2.17% 76.34% 1320ms  7.75%  runtime.ready
 370ms  2.17% 78.51% 4930ms 28.95%  sync.runtime_SemacquireMutex
 340ms  2.00% 80.50%  340ms  2.00%  runtime.(*semaRoot).dequeue
 270ms  1.59% 82.09%11820ms 69.41%  
_/home/changkun/dev/tests_test.BenchmarkMutexWrite.func1.1
 220ms  1.29% 83.38%  220ms  1.29%  runtime.cansemacquire
 220ms  1.29% 84.67%  290ms  1.70%  runtime.releaseSudog
 220ms  1.29% 85.97% 2490ms 14.62%  runtime.schedule




On Friday, August 23, 2019 at 5:44:46 AM UTC+2, robert engels wrote:
>
> As I expected, the test is not testing what you think it is. Many of the 
> Go routines created do not perform the same number of iterations. The 
> benchmark harness is only designed to try and perform enough iterations to 
> get a time per op while “running N routines”.
>
> You need to rework the test so that all Go routines run the same number of 
> iterations - and time the entire process - then you can see how the 
> concurrency/scheduling latency affects things (or the time of the operation 
> when fully contended). Then the time per op = total time / (iterations * 
> nRoutines)
>
> Here is the code that shows the number of iterations per routine: 
> https://play.golang.org/p/LkAvB39X3_Z
>
>
>
> On Aug 21, 2019, at 6:31 PM, Robert Engels  > wrote:
>
> I don't think you've posted code for the atomic version...
>
> Each Go routine has its own stack. So when you cycle through many Go 
> routines you will be destroying the cache as each touches N memory 
> addresses (that are obviously not shared).
>
> Tha

Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
And it looks like `semacquire1` executed `gopark` too many times, which 
indicates that `cansemacquire` failed a lot when there is too much 
contention.

On Monday, August 26, 2019 at 6:27:14 PM UTC+2, changkun wrote:
>
> Sorry for the late response. Do you mean the total number of executions 
> was not the same? If so, that is not true; as you can see below, both 
> benchmarks executed the same number of iterations:
>
> goos: linux
> goarch: amd64
> BenchmarkMutexWrite/goroutines-2400-8   5000
> 46.5 ns/op
>
> Type: cpu
> Time: Aug 26, 2019 at 6:19pm (CEST)
> Duration: 2.50s, Total samples = 5.47s (218.62%)
> Entering interactive mode (type "help" for commands, "o" for options)
> (pprof) top20
> Showing nodes accounting for 5.14s, 93.97% of 5.47s total
> Dropped 15 nodes (cum <= 0.03s)
> Showing top 20 nodes out of 48
>   flat  flat%   sum%cum   cum%
>  1.77s 32.36% 32.36%  1.77s 32.36%  runtime.procyield
>  1.08s 19.74% 52.10%  2.20s 40.22%  sync.(*Mutex).Lock
>  0.40s  7.31% 59.41%  1.34s 24.50%  sync.(*Mutex).Unlock
>  0.32s  5.85% 65.27%  2.06s 37.66%  runtime.lock
>  0.23s  4.20% 69.47%  1.53s 27.97%  runtime.findrunnable
>  0.20s  3.66% 73.13%  0.25s  4.57%  runtime.unlock
>  0.18s  3.29% 76.42%  0.19s  3.47%  runtime.pidleput
>  0.13s  2.38% 78.79%  0.13s  2.38%  runtime.cansemacquire
>  0.12s  2.19% 80.99%  0.12s  2.19%  runtime.futex
>  0.10s  1.83% 82.82%  0.11s  2.01%  sync.runtime_nanotime
>  0.09s  1.65% 84.46%  3.64s 66.54%  
> _/home/changkun/dev/tests_test.BenchmarkMutexWrite.func1.1
>  0.09s  1.65% 86.11%  0.09s  1.65%  runtime.casgstatus
>  0.08s  1.46% 87.57%  0.08s  1.46%  runtime.(*semaRoot).dequeue
>  0.08s  1.46% 89.03%  0.94s 17.18%  runtime.semrelease1
>  0.07s  1.28% 90.31%  0.07s  1.28%  runtime.gopark
>  0.06s  1.10% 91.41%  0.97s 17.73%  runtime.semacquire1
>  0.04s  0.73% 92.14%  0.04s  0.73%  runtime.runqempty
>  0.04s  0.73% 92.87%  0.04s  0.73%  sync.runtime_canSpin
>  0.03s  0.55% 93.42%  0.03s  0.55%  runtime.(*guintptr).cas
>  0.03s  0.55% 93.97%  0.03s  0.55%  runtime.gogo
>
>
> ----
>
> goos: linux
> goarch: amd64
> BenchmarkMutexWrite/goroutines-4800-8   5000   317 
> ns/op
> PASS
> ok  _/home/changkun/dev/tests   16.020s
>
> Type: cpu
> Time: Aug 26, 2019 at 6:18pm (CEST)
> Duration: 16.01s, Total samples = 17.03s (106.35%)
> Entering interactive mode (type "help" for commands, "o" for options)
> (pprof) top20
> Showing nodes accounting for 14640ms, 85.97% of 17030ms total
> Dropped 19 nodes (cum <= 85.15ms)
> Showing top 20 nodes out of 51
>   flat  flat%   sum%cum   cum%
> 2130ms 12.51% 12.51% 2160ms 12.68%  runtime.gopark
> 1940ms 11.39% 23.90% 7210ms 42.34%  sync.(*Mutex).Lock
> 1640ms  9.63% 33.53% 1810ms 10.63%  sync.runtime_nanotime
> 1490ms  8.75% 42.28% 4560ms 26.78%  runtime.semacquire1
> 1130ms  6.64% 48.91% 1130ms  6.64%  runtime.casgstatus
>  800ms  4.70% 53.61% 3740ms 21.96%  runtime.semrelease1
>  590ms  3.46% 57.08%  590ms  3.46%  runtime.(*guintptr).cas
>  560ms  3.29% 60.36%  560ms  3.29%  runtime.futex
>  530ms  3.11% 63.48%  610ms  3.58%  runtime.lock
>  520ms  3.05% 66.53%  520ms  3.05%  runtime.unlock
>  440ms  2.58% 69.11%  440ms  2.58%  runtime.semroot
>  430ms  2.52% 71.64%  430ms  2.52%  runtime.usleep
>  430ms  2.52% 74.16% 4210ms 24.72%  sync.(*Mutex).Unlock
>  370ms  2.17% 76.34% 1320ms  7.75%  runtime.ready
>  370ms  2.17% 78.51% 4930ms 28.95%  sync.runtime_SemacquireMutex
>  340ms  2.00% 80.50%  340ms  2.00%  runtime.(*semaRoot).dequeue
>  270ms  1.59% 82.09%11820ms 69.41%  
> _/home/changkun/dev/tests_test.BenchmarkMutexWrite.func1.1
>  220ms  1.29% 83.38%  220ms  1.29%  runtime.cansemacquire
>  220ms  1.29% 84.67%  290ms  1.70%  runtime.releaseSudog
>  220ms  1.29% 85.97% 2490ms 14.62%  runtime.schedule
>
>



Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
According to your formula, let's sample three points:

2400 goroutines: 2.508s/(50000000*2400) = 2.09 × 10^-11 s
3600 goroutines: 12.219s/(50000000*3600) = 6.7883 × 10^-11 s
4800 goroutines: 16.020s/(50000000*4800) = 6.6750 × 10^-11 s

One can observe that 3600 and 4800 are roughly equal to each other, but 
both are about three times slower than 2400.

goos: linux
goarch: amd64
BenchmarkMutexWrite/goroutines-2400-8   50000000    46.5 ns/op
PASS
ok  _/home/changkun/dev/tests   2.508s

goos: linux
goarch: amd64
BenchmarkMutexWrite/goroutines-3600-8   50000000    240 ns/op
PASS
ok  _/home/changkun/dev/tests   12.219s

goos: linux
goarch: amd64
BenchmarkMutexWrite/goroutines-4800-8   50000000    317 ns/op
PASS
ok  _/home/changkun/dev/tests   16.020s
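The per-op formula from earlier in the thread can be checked mechanically. The sketch below recomputes the three sample points; the iteration count of 50,000,000 is inferred from the posted totals and 10^-11 exponents, not stated explicitly in the archive:

```go
package main

import "fmt"

// perOp implements the formula discussed in this thread:
// time per operation = total wall time / (iterations * number of goroutines).
func perOp(totalSec float64, iters, goroutines int) float64 {
	return totalSec / (float64(iters) * float64(goroutines))
}

func main() {
	const iters = 50000000 // inferred; consistent with the 10^-11 results
	samples := []struct {
		n     int
		total float64 // total benchmark wall time in seconds
	}{
		{2400, 2.508},
		{3600, 12.219},
		{4800, 16.020},
	}
	for _, s := range samples {
		// 2400 comes out near 2.09e-11; 3600 and 4800 near 6.7e-11.
		fmt.Printf("%d goroutines: %.4g s/op\n", s.n, perOp(s.total, iters, s.n))
	}
}
```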




Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
Your cache theory is very reasonable, but this was already clear in the 
beginning post: "before or after the massive increase, performance drops 
linearly".
Your hypothesis is reasonable, but how can you prove it? By monitoring 
cache usage on the host machine?
Merely matching a concept is still not persuasive.

On Monday, August 26, 2019 at 8:08:27 PM UTC+2, Robert Engels wrote:
>
> Which is what I would expect - once the number of routines exhausts the 
> cache, it will take the next level (or never, since it's main memory) to 
> see a massive increase in time. 4800 is 30% slower than 3600 - so it is 
> increasing linearly with the number of Go routines.
>
>
> -----Original Message- 
> From: changkun 
> Sent: Aug 26, 2019 11:49 AM 
> To: golang-nuts 
> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
>
> According to your formula let's sample three points:
>
> 2400 goroutines: 2.508s/(50000000*2400) = 2.09 × 10^-11 s
> 3600 goroutines: 12.219s/(50000000*3600) = 6.7883 × 10^-11 s
> 4800 goroutines: 16.020s/(50000000*4800) = 6.6750 × 10^-11 s
>
> One can observe that 3600 and 4800 are roughly equal to each other, but 
> both are about three times slower than 2400.
>
> goos: linux
> goarch: amd64
> BenchmarkMutexWrite/goroutines-2400-8   50000000    46.5 ns/op
> PASS
> ok  _/home/changkun/dev/tests   2.508s
>
> goos: linux
> goarch: amd64
> BenchmarkMutexWrite/goroutines-3600-8   50000000    240 ns/op
> PASS
> ok  _/home/changkun/dev/tests   12.219s
>
> goos: linux
> goarch: amd64
> BenchmarkMutexWrite/goroutines-4800-8   50000000    317 ns/op
> PASS
> ok  _/home/changkun/dev/tests   16.020s
>
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to golan...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/golang-nuts/6dd6ec66-b0cc-4c8e-a341-94bff187214f%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/golang-nuts/6dd6ec66-b0cc-4c8e-a341-94bff187214f%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>
>
>



Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
Did I do anything wrong? The cache hit ratio decreases linearly; is that an 
expected result? I thought the cache hit ratio would show a significant 
drop:

[image: chart.png]
Raw data:

#goroutines cache-references cache-misses hit/(hit+miss)
2400 697103572 17641686 0.9753175194
3200 798160789 54169784 0.9364451004
3360 1387972473 148415678 0.9033996208
3600 1824541062 272166355 0.8701934506
4000 2053779401 393586501 0.8391795437
4800 1885622275 461872899 0.8032486268
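For reference, the ratio column above can be reproduced directly from the two counters; note that, as posted, it treats `cache-references` as hits, i.e. refs/(refs+misses). A small sketch:

```go
package main

import "fmt"

// hitRatio reproduces the thread's ratio column, which treats the
// cache-references counter as hits: refs / (refs + misses).
func hitRatio(refs, misses uint64) float64 {
	return float64(refs) / float64(refs+misses)
}

func main() {
	// Three rows from the raw data above.
	data := []struct {
		goroutines   int
		refs, misses uint64
	}{
		{2400, 697103572, 17641686},
		{3200, 798160789, 54169784},
		{4800, 1885622275, 461872899},
	}
	for _, d := range data {
		fmt.Printf("%d: %.10f\n", d.goroutines, hitRatio(d.refs, d.misses))
	}
}
```

Running this yields 0.9753175194 for the 2400-goroutine row, matching the table, which confirms how the column was computed.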
On Monday, August 26, 2019 at 9:26:05 PM UTC+2, Robert Engels wrote:
>
> You can run the process under 'perf' and monitor the cpu cache hit/miss 
> ratio.
>
> -Original Message- 
> From: changkun 
> Sent: Aug 26, 2019 2:23 PM 
> To: golang-nuts 
> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
>
> Your cache theory is very reasonable, but this was clear in the beginning 
> post:  "before or after the massive increase, performance drops linearly".
> Your hypothesis is reasonable, but how can you prove your hypothesis? By 
> host machine cache usage monitoring? 
> Matching of a concept is still not persuasive.
>
> On Monday, August 26, 2019 at 8:08:27 PM UTC+2, Robert Engels wrote:
>>
>> Which is what I would expect - once the number of routines exhaust the 
>> cache, it will take the next level (or never since its main memory) to see 
>> an massive increase in time. 4800 is 30% slower than 3600 - so it is 
>> increasing linearly with the number of Go routines.
>>
>>
>> -Original Message- 
>> From: changkun 
>> Sent: Aug 26, 2019 11:49 AM 
>> To: golang-nuts 
>> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
>> goroutine contention more than 3400 
>>
>> According to your formula let's sample three points:
>>
>> 2400 goroutines: 2.508s/(5000*2400) = 2.09 × 10^-11 s
>> 3600 goroutines: 12.219s/(5000*3600) = 6.7883 × 10-11 seconds
>> 4800 goroutines: 16.020s/(5000*4800) = 6.67500 × 10^-11 s
>>
>> One can observe that 3600 and 4800 mostly equal to each other, but they 
>> both three times slower than 2400.
>>
>> goos: linux
>> goarch: amd64
>> BenchmarkMutexWrite/goroutines-2400-8   5000    
>> 46.5 ns/op
>> PASS
>> ok  _/home/changkun/dev/tests   2.508s
>>
>> goos: linux
>> goarch: amd64
>> BenchmarkMutexWrite/goroutines-3600-8   5000  
>>  240 ns/op
>> PASS
>> ok  _/home/changkun/dev/tests   12.219s
>>
>> goos: linux
>> goarch: amd64
>> BenchmarkMutexWrite/goroutines-4800-8   5000  
>>  317 ns/op
>> PASS
>> ok  _/home/changkun/dev/tests   16.020s
>>
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "golang-nuts" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to golan...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/golang-nuts/6dd6ec66-b0cc-4c8e-a341-94bff187214f%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/golang-nuts/6dd6ec66-b0cc-4c8e-a341-94bff187214f%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>>
>>
>>
>> -- 
> You received this message because you are subscribed to the Google Groups 
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to golan...@googlegroups.com .
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/golang-nuts/495e22e8-4a5f-4a1d-88f8-59ff2b0a4006%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/golang-nuts/495e22e8-4a5f-4a1d-88f8-59ff2b0a4006%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
>
>
>
>



Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
Based on the pprof graph, I would rather believe that the massive 
performance drop happens because of the `semacquire1` implementation.
When the number of goroutines is small, most `semacquire1` calls succeed 
via the `cansemacquire` fast path, or via a middle path where a lock is 
required but `cansemacquire` then succeeds again.
The drop happens when goroutines fail both the fast path and the middle 
path and therefore need to be parked, which involves runtime scheduling 
costs.
How do you refute this argument?
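For readers following along: the fast path in question is essentially a CAS loop that decrements the semaphore count only while it is positive. A simplified sketch modeled on the runtime's cansemacquire (not the actual runtime source, which uses internal atomics):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cansemacquire mirrors the runtime's fast path: atomically decrement
// the semaphore count if it is positive; otherwise report failure, in
// which case the real runtime falls through to parking (gopark).
func cansemacquire(addr *uint32) bool {
	for {
		v := atomic.LoadUint32(addr)
		if v == 0 {
			return false // contended: caller would have to park
		}
		if atomic.CompareAndSwapUint32(addr, v, v-1) {
			return true // fast path succeeded
		}
		// CAS lost a race with another goroutine; retry.
	}
}

func main() {
	sema := uint32(1)
	fmt.Println(cansemacquire(&sema)) // true: count was 1
	fmt.Println(cansemacquire(&sema)) // false: count is 0, would park
}
```

Under heavy contention many goroutines find the count at zero (or keep losing the CAS), which is exactly the regime where parking, and hence scheduler cost, dominates.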

On Monday, August 26, 2019 at 10:56:21 PM UTC+2, changkun wrote:
>
> I also tested many times with `go tool pprof`, and it reproducibly 
> reports the following difference:
>
> Here is for 2400 goroutines:
>
> [image: 2400.png]
>
> Here is for 4800 goroutines:
>
> [image: 4800.png]
>
> The difference here is: 4800 goroutines heavily call `gopark`, while 2400 
> goroutines heavily call runtime.procyield. Have you noticed this 
> difference? Is it normal?
> In attachment, you find the SVG graphs.
>
> On Monday, August 26, 2019 at 10:41:42 PM UTC+2, Robert Engels wrote:
>>
>> You might want to try 'perf mem' to report the access delays - it may be 
>> contention on the memory controller as well.
>>
>> Thinking about it again, I wouldn't expect a large jump if things were 
>> fair - for example, if at 100 they all fit in the cache, at 110, some are 
>> still in the cache, but some operations are slower, etc. so I would expect 
>> a jump but not as large as you see.
>>
>> Still, most linux context switches are 3-4 us, and you are talking about 
>> 300 ns, so you're still doing pretty good, and at approx 40 ns, there are 
>> so many aspects that come into play, i'm not sure you or anyone has the 
>> time to figure out - maybe the HFT guys are interested...
>>
>> Like I said, on my OSX machine the times are very similar with both 
>> approaches, so it is OS dependent, and probably OS and hardware 
>> configuration dependent - so I think I've probably reached the end of being 
>> able to help.
>>
>> And finally, it probably doesn't matter at all - if the Go routine is 
>> doing anything of value, 300 ns is probably an insignificant cost.
>>
>>
>> -Original Message- 
>> From: changkun 
>> Sent: Aug 26, 2019 3:15 PM 
>> To: golang-nuts 
>> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
>> goroutine contention more than 3400 
>>
>> Did I do anything wrong, the cache hint ratio decrease linearly, is it an 
>> expected result? I thought the cache hint ratio would have a significant 
>> drop:
>>
>> [image: chart.png]
>> Raw data:
>>
>> #goroutines cache-references cache-misses hint/(hint+miss)
>> 2400 697103572 17641686 0.9753175194
>> 3200 798160789 54169784 0.9364451004
>> 3360 1387972473 148415678 0.9033996208
>> 3600 1824541062 272166355 0.8701934506
>> 4000 2053779401 393586501 0.8391795437
>> 4800 1885622275 461872899 0.8032486268
>> On Monday, August 26, 2019 at 9:26:05 PM UTC+2, Robert Engels wrote:
>>>
>>> You can run the process under 'perf' and monitor the cpu cache hit/miss 
>>> ratio.
>>>
>>> -Original Message- 
>>> From: changkun 
>>> Sent: Aug 26, 2019 2:23 PM 
>>> To: golang-nuts 
>>> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
>>> goroutine contention more than 3400 
>>>
>>> Your cache theory is very reasonable, but this was clear in the 
>>> beginning post:  "before or after the massive increase, performance drops 
>>> linearly".
>>> Your hypothesis is reasonable, but how can you prove your hypothesis? By 
>>> host machine cache usage monitoring? 
>>> Matching of a concept is still not persuasive.
>>>
>>> On Monday, August 26, 2019 at 8:08:27 PM UTC+2, Robert Engels wrote:
>>>>
>>>> Which is what I would expect - once the number of routines exhaust the 
>>>> cache, it will take the next level (or never since its main memory) to see 
>>>> an massive increase in time. 4800 is 30% slower than 3600 - so it is 
>>>> increasing linearly with the number of Go routines.
>>>>
>>>>
>>>> -Original Message- 
>>>> From: changkun 
>>>> Sent: Aug 26, 2019 11:49 AM 
>>>> To: golang-nuts 
>>>> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
>>>> goroutine contention more than 3400 
>>>>
&g

Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-26 Thread changkun
Hi Robert, you misunderstood my point. Your first response was talking 
about the difference between the chan and mutex implementations; here I am 
comparing mutex performance across different numbers of goroutines.
Basically, what you suspected doesn't match what was observed in the 
statistics.

On Tuesday, August 27, 2019 at 12:34:25 AM UTC+2, Robert Engels wrote:
>
> I said in my very first response to you that the mechanisms of the 
> implementations are different, with the in-kernel futex of the channel 
> implementation faster than the Go one. Much of this is probably because 
> the thread is dedicated at this point. All that means is that up to a 
> certain point the CAS works, but then due to contention that path no 
> longer works.
>
> So you get far better performance up to N routines and slightly worse 
> performance after. Seems like a decent design decision for a lot of 
> workloads.
>
> Still, you keep ignoring this aspect - in the context of actual workloads 
> the difference is negligible.
>
> -Original Message- 
> From: changkun 
> Sent: Aug 26, 2019 4:08 PM 
> To: golang-nuts 
> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
> goroutine contention more than 3400 
>
> Based on the pprof graph, I would rather believe that the massive 
> performance drop happens because of the `semacquire1` implementation.
> When the number of goroutines is small, most of the `semacquire1` success 
> in the `cansemacquire ` fast path, or a middle path where a lock was 
> required but then `cansemacquire` success again.
> The drop happens in the case that goroutines are failed for fast path and 
> middle path, and therefore needs to be parked, which involves runtime 
> schedule costs.
> How do you refute to this argument?
>
> On Monday, August 26, 2019 at 10:56:21 PM UTC+2, changkun wrote:
>>
>> I also tested many times with `go tool pprof`, and it 
>> reproducible reports the following difference:
>>
>> Here is for 2400 goroutines:
>>
>> [image: 2400.png]
>>
>> Here is for 4800 goroutines:
>>
>> [image: 4800.png]
>>
>> The difference here is: 4800 goroutines heavily call `gopark` and  2400 
>> goroutines heavily calls runtime.procyield, have you notice this 
>> difference? Are they normal?
>> In attachment, you find the SVG graphs.
>>
>> On Monday, August 26, 2019 at 10:41:42 PM UTC+2, Robert Engels wrote:
>>>
>>> You might want to try 'perf mem' to report the access delays - it may be 
>>> contention on the memory controller as well.
>>>
>>> Thinking about it again, I wouldn't expect a large jump if things were 
>>> fair - for example, if at 100 they all fit in the cache, at 110, some are 
>>> still in the cache, but some operations are slower, etc. so I would expect 
>>> a jump but not as large as you see.
>>>
>>> Still, most linux context switches are 3-4 us, and you are talking about 
>>> 300 ns, so you're still doing pretty good, and at approx 40 ns, there are 
>>> so many aspects that come into play, i'm not sure you or anyone has the 
>>> time to figure out - maybe the HFT guys are interested...
>>>
>>> Like I said, on my OSX machine the times are very similar with both 
>>> approaches, so it is OS dependent, and probably OS and hardware 
>>> configuration dependent - so I think I've probably reached the end of being 
>>> able to help.
>>>
>>> And finally, it probably doesn't matter at all - if the Go routine is 
>>> doing anything of value, 300 ns is probably an insignificant cost.
>>>
>>>
>>> -Original Message- 
>>> From: changkun 
>>> Sent: Aug 26, 2019 3:15 PM 
>>> To: golang-nuts 
>>> Subject: Re: [go-nuts] sync.Mutex encounter large performance drop when 
>>> goroutine contention more than 3400 
>>>
>>> Did I do anything wrong, the cache hint ratio decrease linearly, is it 
>>> an expected result? I thought the cache hint ratio would have a significant 
>>> drop:
>>>
>>> [image: chart.png]
>>> Raw data:
>>>
>>> #goroutines cache-references cache-misses hint/(hint+miss)
>>> 2400 697103572 17641686 0.9753175194
>>> 3200 798160789 54169784 0.9364451004
>>> 3360 1387972473 148415678 0.9033996208
>>> 3600 1824541062 272166355 0.8701934506
>>> 4000 2053779401 393586501 0.8391795437
>>> 4800 1885622275 461872899 0.8032486268
>>> On Monday, August 26, 2019 at 9:26:05 PM UTC+2, Robert Engels wrote:
>>>>
>>>> You can run the pr

[go-nuts] How to do a forward compatibility support for go 1.13?

2019-09-06 Thread changkun
I have upgraded my code to Go 1.13 with the newly introduced errors APIs.

However, I am not able to upgrade the Go version in the production 
environment, which is Go 1.11. This means I have to run Go 1.13 code in a 
Go 1.11 environment.

What is the way to keep my builds happy on both the local and production 
environments with a single code base?



[go-nuts] What is the memory order when select on multiple channels that one is closing and the other is receiving?

2019-09-08 Thread changkun
Hi, golang nuts,

Let's think about this snippet: https://play.golang.org/p/3cNGho3gWTG

In the code snippet, one channel (a ticker) is firing while another is 
being closed. The two events can happen concurrently and result in two 
different outcomes: either the ticker case is executed first, or the other 
way around.
This is because of the pseudo-randomization of the select statement.

Intuitively, the close statement should happen before the select statement 
starts choosing which case to execute, and select should pick a closed 
channel with the highest priority, to prevent another receive case from 
being executed once more.

My questions are:
Has the Go team or anybody else thought about this memory order before? If 
so, what was the reasoning for not adopting it?
If not, is it worth defining within the language? It requires no language 
changes and only affects the runtime implementation.
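For what it's worth, the pseudo-randomization can be observed directly: when a closed channel and another ready channel are both selectable, the spec chooses one ready case uniformly at random, so the closed channel gets no priority. A small demonstration (not the original playground snippet, which is not reproduced in the archive):

```go
package main

import "fmt"

// countSelections runs a two-case select `trials` times with BOTH cases
// ready: one channel is closed, the other holds a buffered value that is
// refilled each round. Since the runtime picks uniformly among ready
// cases, both counters end up nonzero.
func countSelections(trials int) (closedCase, valueCase int) {
	done := make(chan struct{})
	close(done) // receiving from a closed channel is always ready
	vals := make(chan int, 1)
	for i := 0; i < trials; i++ {
		vals <- i // make the second case ready too
		select {
		case <-done:
			closedCase++
			<-vals // drain so the buffer is empty next round
		case <-vals:
			valueCase++
		}
	}
	return
}

func main() {
	c, v := countSelections(1000)
	fmt.Println(c > 0, v > 0) // both cases get chosen
}
```

This is exactly the behavior the proposal above would change: today neither readiness source is ordered ahead of the other once both are selectable.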





Re: [go-nuts] What is the memory order when select on multiple channels that one is closing and the other is receiving?

2019-09-08 Thread changkun
Hi Robert,

The provided code snippet can produce different outputs on my machine, 
which basically shows that the cases can occur in any order.

The randomization mechanism in the select statement makes verification 
hard. Logically, though, my argument is rigorous.

On Sunday, September 8, 2019 at 5:31:49 PM UTC+2, robert engels wrote:
>
> You need to code it knowing that either can occur in any order.
>



Re: [go-nuts] What is the memory order when select on multiple channels that one is closing and the other is receiving?

2019-09-08 Thread changkun
Hi Kurtis,

I am aware that you are talking about the happens-before relation, which is 
basically the vector clock.

However, this discussion aims at the following proposal:

"the close statement should happen before the select statement starts 
choosing which case should be executed, and the select should pick a closed 
channel with the highest priority to prevent another receive case from 
being executed once more."

We should not start writing any code before we confirm that it is worthwhile.

On Sunday, September 8, 2019 at 5:46:43 PM UTC+2, Kurtis Rader wrote:
>
> On Sun, Sep 8, 2019 at 8:40 AM changkun > 
> wrote:
>
>> The provided code snipped on my machine can result in different outputs, 
>> which basically shows that it could occur in any order.
>>
>
> Yes
>  
>
>> The randomization mechanism in select statement made the verification 
>> hard. Logically, my argument is rigorous
>>
>
> No, it isn't. You need to learn a lot more about concurrency and race 
> conditions.
>
> -- 
> Kurtis Rader
> Caretaker of the exceptional canines Junior and Hank
>



Re: [go-nuts] What is the memory order when select on multiple channels that one is closing and the other is receiving?

2019-09-09 Thread changkun
Hi Alex,

Thank you for giving such a great clarification. Your thought is more 
sophisticated than mine.

First: The close *does* happen-before the case clause of the select for the 
> closed channel, as per the memory model. 
>

Indeed, no question here.
 

> Second: If you actually establish an ordering "close happens-before the 
> select statement, which happens-before the ticker firing", you should 
> actually get the result you want, in that only the close-case is 
> deterministically chosen, AIUI. So we are only talking about the case you 
> tried to enforce, where you *don't* have that ordering - that is, where the 
> close happens concurrently with the ticker firing and both happen 
> concurrently to the select starting to execute.
>

Here come the more interesting points. My "intuition" comes from this 
scenario: 
I have an unbuffered channel "closedCh" that indicates whether a network 
connection is closed, and a ticker that handles heartbeat detection and 
polls some data; if the connection is closed, the loop returns an error.

At some point, I close "closedCh" to terminate the previous 
for-select loop.
I initially thought that the heartbeat detection case would not be executed 
and that the closedCh case would safely return and terminate the loop.
However, the loop sometimes returns "false negative" results, 
because select can still pick the heartbeat case if the heartbeat and the 
close arrive concurrently.
See the following code (TL;DR):

// goroutine1
closedCh := make(chan struct{})

// do some preparation
...

for {
	select {
	case <-ticker.C:
		// it is possible that this case can still be executed
		// one more time if closedCh arrives.

		// heartbeat detection, data processing...
		...

	case <-closedCh:
		return
	}
}


// goroutine2
close(closedCh)

To fix the issue I encountered, I have to use an atomic value to double-check
whether the channel is closed inside the ticker case:

// "global" data
closedCh := make(chan struct{})
closed := uint32(0)

// goroutine1

// do some preparation
...

for {
select {
case <- ticker.C:
// now it is ok
if atomic.LoadUint32(&closed) == 1 {
return
}

// heartbeat detection, data processing..
...

case <- closedCh
return
}



// goroutine2
if !atomic.CompareAndSwapUint32(&closed, 0, 1) {
return ErrClosed
}
close(closedCh)

which does not look like an ideal solution (not sure if there is a better 
way?), because to shut down one ticker loop I need an atomic value and an 
unbuffered channel, appearing in three different places.
 

>
> Now, ISTM that the simplest intuitive interpretation of what "event A and 
> B happen concurrently" (defined abstractly as "are not ordered in the 
> happens-before partial order) is that the order that A and B happen in real 
> time in is unclear and could happen either way in practice. And under 
> that intuitive understanding, I don't know how you conclude that the select 
> should prioritize either or not execute the ticker case at all. After all - 
> you can never know whether the ticker *has actually fired* at the concrete 
> real point in time the select executed.
>
Sorry for the lack of a problem statement; see above.
 

>
> Note, by the way, that the fact that you observe either result doesn't 
> actually say anything about the pseudo-randomness or lack thereof of the 
> select statement: You can't, from the output of the program, distinguish 
> between "both cases where ready to proceed and select flipped a coin coming 
> up close" from "only close was ready to proceed, because select executed in 
> between the closing/firing" (and the same for the timer firing). The 
> pseudo-randomness of select is irrelevant for your case, it's the fact that 
> these events are concurrent, both as specified and IMO as intuitively 
> obvious.
>
You are right, it is not a suitable example for the question I have. I am 
sorry about that, and I hope it did not cause you trouble when reading.
In fact, I am curious: if select works with random selection, is it 
possible that a case will never be executed?
How can the select statement provide fairness similar to the FIFO semantics 
of a channel (https://github.com/golang/go/issues/11506)?
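On the starvation question: because select chooses uniformly at random among ready cases, a given case loses each round with probability about 1/2, so being starved for n consecutive rounds has probability about 2^-n; never impossible in a finite run, but it vanishes quickly. A sketch (my own construction) that makes the distribution visible:

```go
package main

import "fmt"

// pick2 repeatedly makes both channels ready and records which
// case select chooses each round.
func pick2(n int) (aN, bN int) {
	a := make(chan int, 1)
	b := make(chan int, 1)
	for i := 0; i < n; i++ {
		a <- 1
		b <- 1
		select {
		case <-a:
			aN++
			<-b // drain the other so the next round starts fresh
		case <-b:
			bN++
			<-a
		}
	}
	return
}

func main() {
	aN, bN := pick2(10000)
	// Uniform random choice among ready cases: each case wins about
	// half the time, so no case is starved in expectation even
	// though no per-round FIFO order is guaranteed.
	fmt.Println(aN+bN == 10000, aN > 4000 && aN < 6000)
}
```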
 



Re: [go-nuts] What is the memory order when select on multiple channels that one is closing and the other is receiving?

2019-09-09 Thread changkun
Sincerely sorry for the typo in your name, Axel :(



[go-nuts] Why runtime.sigtrampgo calls runtime.morestack?

2019-12-08 Thread changkun
As we all know, a program interrupted by a signal enters kernel space and 
then switches to a userspace signal handler. After the signal handler 
completes, it re-enters kernel space and then switches back to where it 
was interrupted.

I have recently been reading the newly implemented asynchronous preemption 
in Go 1.14, which uses an OS signal to interrupt a "non-preemptive" user 
goroutine. I am debugging a very simple program:

package main


import (
"runtime"
"time"
)


func tightloop() {
for {
}
}


func main() {
runtime.GOMAXPROCS(1)
go tightloop()


time.Sleep(time.Millisecond)
println("OK")
runtime.Gosched()
}


In Go 1.14, when a preempt signal arrives, the `tightloop` will be 
interrupted by the OS and entering the pre-configured signal handler 
`runtime·sigtramp`:

TEXT runtime·sigtramp(SB),NOSPLIT,$72
MOVQ DX, ctx-56(SP)
MOVQ SI, info-64(SP)
MOVQ DI, signum-72(SP)
MOVQ $runtime·sigtrampgo(SB), AX
CALL AX
RET


and `sigtrampgo` eventually calls `sighandler`.

//go:nosplit
//go:nowritebarrierrec
func sigtrampgo(sig uint32, info *siginfo, ctx unsafe.Pointer) {
(...)
setg(g.m.gsignal)
(...)
sighandler(sig, info, ctx, g)
setg(g)
(...)
}


As far as I read the `sighandler` function, it calls `doSigPreempt`, 
modifies the `ctx` passed from the kernel, and sets `rip` to the prologue 
of `runtime.asyncPreempt`.

//go:nowritebarrierrec
func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) {
_g_ := getg()
c := &sigctxt{info, ctxt}


(...)
if sig == sigPreempt {
doSigPreempt(gp, c)
}
}
func doSigPreempt(gp *g, ctxt *sigctxt) {
if canPreempt {
// here modifies the rip and rsp
ctxt.pushCall(funcPC(asyncPreempt))
}


(...)
}


However, I noticed that asyncPreempt is not immediately executed when
the signal handler completes; instead:

1. `morestack` or `morestack_noctxt` is called after `sighandler` 
**returns** (entering neither the epilogue nor the prologue), which calls 
`newstack`, checks the preempt flag, enters the schedule loop, and 
therefore schedules the main goroutine to finish the async preemption.

2. "OK" is printed before `asyncPreempt` executes.

Here are my inserted print logs in runtime:

mstart1 call schedule()
enter schedule()
park_m call schedule()
enter schedule()
mstart1 call schedule()
enter schedule()
mstart1 call schedule()
enter schedule()
park_m call schedule()
enter schedule()
park_m call schedule()
enter schedule()
park_m call schedule()
enter schedule()
mstart1 call schedule()
enter schedule()
park_m call schedule()
enter schedule()
rip: 17149264 eip: 824634034136
before pushCall asyncPreempt
after pushCall asyncPreempt
rip: 17124704 eip: 824634034128  // rip points to asyncPreempt
calling newstack: m0, g0 // how could newstack be called?
newstack call gopreempt_m
gopreempt_m call goschedImpl
goschedImpl call schedule()
enter schedule()
OK
gosched_m call goschedImpl
goschedImpl call schedule()
enter schedule()
asyncPreempt2
asyncPreempt2
asyncPreempt2
asyncPreempt2
preemptPark
gopreempt_m call goschedImpl
goschedImpl call schedule()
enter schedule()


While checking the dumped assembly code, I found no stack split check
in either `asyncPreempt` or `sigtramp`.

Sorry for the long story; my questions are:

- When, by whom, and how does the runtime call `morestack` after 
`sighandler`? What did I miss?
- Does modifying `ctx` make the program jump to the modified `rip` 
instruction after the signal handler finishes?

Thank you very much for reading the question, and thanks to the Go team 
for building such a brilliant feature.



Re: [go-nuts] Why runtime.sigtrampgo calls runtime.morestack?

2019-12-08 Thread changkun
Dear Ian,

Thank you so much for your hint; I think I've figured it out.
The root cause is similar to the "uncertainty principle".

As an observer, adding a `println` call in `asyncPreempt` 
and `asyncPreempt2` influences the actual behavior 
after signal handling. The `println` involves a stack split check, 
which calls `morestack`.

It took me a while to realize that `morestack` stores its caller's 
pc in `g.m.morebuf.pc`, since `getcallerpc` in `newstack` 
always returns the pc from `morestack`, which doesn't tell 
much by itself.

//go:nosplit
func asyncPreempt2() {
 // println("asyncPreempt2 is called") // comment here omits calling 
morestack.
 gp := getg()
 gp.asyncSafePoint = true
 if gp.preemptStop {
 mcall(preemptPark)
 } else {
 mcall(gopreempt_m)
 }
 println("asyncPreempt2 finished")
 gp.asyncSafePoint = false
}

On Sunday, December 8, 2019 at 7:37:12 PM UTC+1, Ian Lance Taylor wrote:

Re: [go-nuts] Why runtime.sigtrampgo calls runtime.morestack?

2019-12-08 Thread changkun
That's a really useful workaround! It makes much more sense now.

Conclusion: when `sigtramp` returns, nothing is called before entering 
`asyncPreempt`.

Thank you so much for clearing up my confusion :)

sigtramp is returned
asyncPreempt is called
OK


 CALL runtime·sigtrampgo(SB)
 CALL runtime·printmsg(SB)


TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0
 CALL runtime·printmsg2(SB)


var msg1 = []byte("sigtramp is returned\n")
var msg2 = []byte("asyncPreempt is called\n")


//go:nosplit
func printmsg() {
 write(2, unsafe.Pointer(&msg1[0]), int32(len(msg1)))
}


//go:nosplit
func printmsg2() {
 write(2, unsafe.Pointer(&msg2[0]), int32(len(msg2)))
}




On Monday, December 9, 2019 at 12:42:42 AM UTC+1, Ian Lance Taylor wrote:
>
> On Sun, Dec 8, 2019 at 1:10 PM changkun > 
> wrote: 
> > 
> > Thank you so much for your hint, I think I've figured it out. 
> > The root cause seems similar to the "uncertainty principle". 
> > 
> > As an observer, by adding a `println` call in `asyncPreempt` 
> > as well as `asyncPreempt2` influences the actual behavior 
> > after signal handling. The `println` involves stack split check, 
> > which calls the `morestack`. 
>
> Ah, of course.  One workaround is to use write, as in 
>
> // Package-scope variable 
> var myMsg = []byte("my message") 
>
> // In function. 
> write(2, unsafe.Pointer(&myMsg[0]), int32(len(myMsg))) 
>
> This works because the various implementations of write are marked 
> nosplit. 
>
> Ian 
>



[go-nuts] Re: why p's local run queue size is 256?

2020-01-26 Thread changkun
The 256-entry run queue size is designed for the work-stealing scheduler, 
to prevent false sharing.

A 128-entry run queue exactly fits cache lines. Since other Ps can steal 
half of the run queue from its tail, a 256-entry run queue prevents false 
sharing when work stealing happens on different cores.

On Sunday, January 26, 2020 at 6:31:27 PM UTC+8, jin wang wrote:
>
> why p's local run queue size is 256, not 128 or 512?
>



[go-nuts] Is it considered to separate runtime and cmd/compile irrelevant standard libraries out from the core language repo and hosted in a different module, eg. std?

2020-01-27 Thread changkun
Dear golang-nuts,

As https://github.com/golang/go/issues/27151, 
https://github.com/golang/go/issues/6853 and many relevant issues 
discussed, Go download is huge.

The Go download contains everything in the main repo. Since Go modules are 
now a success, has it been considered to separate all runtime-irrelevant 
libraries (meaning those that need no runtime support to implement; e.g., 
the net poller needs runtime support because there is a scheduler) out of 
the main repo?

For instance, the following packages could be separated into a different 
repository, called the std module:

std/archive
std/bufio
std/bytes
std/compress
std/container
std/context
std/crypto
std/database
std/encoding
std/errors
std/expvar
std/flag
std/go
std/hash
std/html
std/image
std/index
std/io
std/log
std/math
std/mime
std/path
std/plugin
std/regexp
std/sort
std/strconv
std/strings
std/syscall
std/testdata
std/testing
std/text
std/unicode

Say we have a separate repository called golang/std. Importing these 
packages could keep the same import path: instead of "std/io", Go modules 
could rewrite the import path exactly as they do for 
"github.com/user/mod/v2", by setting the imported std module in the go.mod 
file and importing it as "io" directly.

Thanks for reading.
Changkun



[go-nuts] Re: Is it considered to separate runtime and cmd/compile irrelevant standard libraries out from the core language repo and hosted in a different module, eg. std?

2020-01-27 Thread changkun

>
> but the others are needed for "go build" & co.
>
This can be solved by first releasing a std module, then using Go 
modules to import these packages for go build.

The idea is to separate the compiler and runtime and have a minimal Go core 
distribution. The Go development process could then speed up, with std 
separated from the core language, just like the tools in the golang/tools repo.

Moreover, the issue tracker is currently centralized in the golang/go 
repo, which is verbose for repo subscribers; a separate repo would also 
make a decentralized issue tracker possible.


 



Re: [go-nuts] Re: Is it considered to separate runtime and cmd/compile irrelevant standard libraries out from the core language repo and hosted in a different module, eg. std?

2020-01-28 Thread changkun
uld be useful in 
> some ways, but it is also much harder to support.  Considering how 
> much trouble we have making a single unified release, it's not obvious 
> to me that the Go team has the bandwidth to handle separate release 
> cycles that need to work together. 
>
> Ian 
>
>
>
> > On Mon, Jan 27, 2020 at 3:31 PM Volker Dobler  > wrote: 
> >> 
> >> On Monday, 27 January 2020 12:27:35 UTC+1, changkun wrote: 
> >>> 
> >>> Dear golang-nuts, 
> >>> 
> >>> As https://github.com/golang/go/issues/27151, 
> https://github.com/golang/go/issues/6853 and many relevant issues 
> discussed, Go download is huge. 
> >> 
> >> 
> >> Neither of these issues benefits from splitting the stdlib from the 
> >> compiler and/or the "runtime" (whatever you mean by that). 
> >> Separating the compiler from its stdlib seems strange at least 
> >> and does not help, neither to increase development speed nor 
> >> to increase convenience. 
> >> 
> >> V. 



Re: [go-nuts] Re: Is it considered to separate runtime and cmd/compile irrelevant standard libraries out from the core language repo and hosted in a different module, eg. std?

2020-01-30 Thread changkun
Dear Ian,

Don't worry, I didn't mean to give suggestions about how the Go team should 
work; your past decade of experience obviously counts for more than any 
whimsical idea of mine.
Besides, I believe the Go team's decisions are more sophisticated and 
better informed. I posted this casually to see whether any not-yet-detailed 
plans might surface; sorry for any inappropriate wording.

 Changkun

On Tuesday, January 28, 2020 at 11:38:20 PM UTC+8, Ian Lance Taylor wrote:
>
> On Tue, Jan 28, 2020 at 12:27 AM changkun  > wrote: 
> > 
> > - the main repo team focuses on the core language distribution (meaning 
> a compiler that implements language spec, a runtime package that supports 
> all kinds of runtime mechanism support), provides the minimum "std-only" 
> APIs to support all standard library use cases. Also, a dedicated issue 
> tracker, remove all kinds of sub repo, std issues, better managing language 
> evolution history. 
> > - the std repo team (or "hand over to Go community" but may never be 
> possible), as one of the Go user groups, provides use case and experiences 
> reports to the main repo, similar to the regular regression reports from 
> kubernetes project, versioned std evolves without bothering the main repo 
> team. Go users can tracking std issues without bothered by whatever cmd/* 
> has proposed to change. 
>
> You just took one team and turned it into two teams.  Where are those 
> extra people going to come from? 
>
> Again, the single team can barely manage to do a Go release today. 
> We're late every time, and we find it just as frustrating as everyone 
> else.  Increasing the amount of work may be more scalable, but making 
> something more scalable only helps if you have idle people who want to 
> help but have nothing to do. 
>
> Ian 
>



[go-nuts] A question about Type Sets

2021-08-23 Thread Changkun Ou
Hi golang-nuts,

I am trying out the latest type parameter and type sets design.
The purpose is to implement a Clamp function that works for numbers and
vectors.

The version for numbers is straightforward and easy:

```go
// Number is a type set of numbers.
type Number interface {
~int | ~int8 | ~int32 | ~int64 | ~float32 | ~float64
}

// Clamp clamps a given value in [min, max].
func Clamp[N Number](n, min, max N) N {
if n < min { return min }
if n > max { return max }
return n
}
```
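As a sanity check, this Number-based version compiles and runs as-is on a Go 1.18+ toolchain; a minimal usage sketch (the call sites below are my own examples):

```go
package main

import "fmt"

// Number is a type set of numbers, as defined above.
type Number interface {
	~int | ~int8 | ~int32 | ~int64 | ~float32 | ~float64
}

// Clamp clamps a given value to [min, max].
func Clamp[N Number](n, min, max N) N {
	if n < min {
		return min
	}
	if n > max {
		return max
	}
	return n
}

func main() {
	fmt.Println(Clamp(256, 0, 255))    // int, inferred: prints 255
	fmt.Println(Clamp(-0.5, 0.0, 1.0)) // float64, inferred: prints 0
}
```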

Everything is good so far. Then, let's define vector types:

```go
// Vec2 represents a 2D vector (x, y).
type Vec2[N Number] struct {
X, Y N
}

// Vec3 represents a 3D vector (x, y, z).
type Vec3[N Number] struct {
X, Y, Z N
}

// Vec4 represents homogeneous coordinates (x, y, z, w) that defines
// either a point (W=1) or a vector (W=0). Other case of W need to apply
// a perspective division to get the actual coordinates of X, Y, Z.
type Vec4[N Number] struct {
X, Y, Z, W N
}
```

However, in order to declare a type set of all possible vectors, I tried
two possibilities:

```go
// Vec is a type set of vectors.
type Vec[N Number] interface {
Vec2[N] | Vec3[N] | Vec4[N] // ERROR: interface cannot have type parameters
}
```

```go
type Vec interface {
Vec2[N Number] | Vec3[N Number] | Vec4[N Number] // ERROR: interface cannot
have type parameters
}
```

Let's just enumerate all the possibilities for the Vec type set:

```go
// Vec is a type set of vectors.
type Vec interface {
Vec2[float32] | Vec3[float32] | Vec4[float32] |
Vec2[float64] | Vec3[float64] | Vec4[float64]
}
```

However, with this definition, it remains very tricky to construct a
generic implementation of a clamp function:

```go
// ERROR: this function does not compile
func ClampVec[V Vec, N Number](v V, min, max N) V {
switch (interface{})(v).(type) {
case Vec2[float32]:
return Vec2[float32]{
Clamp[float32](v.X, min, max),
Clamp[float32](v.Y, min, max),
}
case Vec2[float64]:
return Vec2[float64]{
Clamp[float64](v.X, min, max),
Clamp[float64](v.Y, min, max),
}
case Vec3[float32]:
return Vec3[float32]{
Clamp[float32](v.X, min, max),
Clamp[float32](v.Y, min, max),
Clamp[float32](v.Z, min, max),
}
case Vec3[float64]:
return Vec3[float64]{
Clamp[float64](v.X, min, max),
Clamp[float64](v.Y, min, max),
Clamp[float64](v.Z, min, max),
}
case Vec4[float32]:
return Vec4[float32]{
Clamp[float32](v.X, min, max),
Clamp[float32](v.Y, min, max),
Clamp[float32](v.Z, min, max),
Clamp[float32](v.W, min, max),
}
case Vec4[float64]:
return Vec4[float64]{
Clamp[float64](v.X, min, max),
Clamp[float64](v.Y, min, max),
Clamp[float64](v.Z, min, max),
Clamp[float64](v.W, min, max),
}
default:
panic(fmt.Sprintf("unexpected type %T", v))
}
}
```

I wish I could converge to a version similar to this:

```go
func Clamp[N Number](n, min, max N) N {
if n < min { return min }
if n > max { return max }
return n
}

// ERROR: this functions does not compile
func ClampVec[N Number, V Vec[N]](v V[N], min, max N) V[N] {
switch (interface{})(v).(type) {
case Vec2[N]: // If V is Vec2[N], then return a Vec2[N].
return Vec2[N]{
Clamp[N](v.X, min, max),
Clamp[N](v.Y, min, max),
}
case Vec3[N]: // Similar
return Vec3[N]{
Clamp[N](v.X, min, max),
Clamp[N](v.Y, min, max),
Clamp[N](v.Z, min, max),
}
case Vec4[N]: // Similar
return Vec4[N]{
Clamp[N](v.X, min, max),
Clamp[N](v.Y, min, max),
Clamp[N](v.Z, min, max),
Clamp[N](v.W, min, max),
}
default:
panic(fmt.Sprintf("unexpected type %T", v))
}
}

// caller side:

Clamp[float32](256, 0, 255) // 255
ClampVec[float64, Vec2[float64]](Vec2[float64]{-1, 2}, 0, 1) // Vec2[float64]{0, 1}
...
```

I found myself stuck and unable to proceed further. Is the above code legal
under the current design, and merely not implemented by the compiler yet?
Any ideas on how the current design could produce something even simpler?
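For what it's worth, one workaround that does compile under the current design is to give each vector type its own Clamp method and constrain on that method, instead of switching on types inside a generic body. This is only a sketch; the Number constraint and the vector types are assumed to be defined roughly as in the thread:

```go
package main

import "fmt"

// Assumed definitions, roughly matching the thread.
type Number interface{ ~float32 | ~float64 }

type Vec2[N Number] struct{ X, Y N }
type Vec3[N Number] struct{ X, Y, Z N }

func Clamp[N Number](n, min, max N) N {
	if n < min {
		return min
	}
	if n > max {
		return max
	}
	return n
}

// Per-type methods avoid any type switch inside a generic body.
func (v Vec2[N]) Clamp(min, max N) Vec2[N] {
	return Vec2[N]{Clamp(v.X, min, max), Clamp(v.Y, min, max)}
}

func (v Vec3[N]) Clamp(min, max N) Vec3[N] {
	return Vec3[N]{Clamp(v.X, min, max), Clamp(v.Y, min, max), Clamp(v.Z, min, max)}
}

// Clampable constrains V to vector types that can clamp themselves.
type Clampable[N Number, V any] interface {
	Clamp(min, max N) V
}

func ClampVec[N Number, V Clampable[N, V]](v V, min, max N) V {
	return v.Clamp(min, max)
}

func main() {
	fmt.Println(ClampVec[float32](Vec2[float32]{-1, 2}, 0, 1))   // {0 1}
	fmt.Println(ClampVec[float64](Vec3[float64]{0.5, -3, 9}, 0, 1)) // {0.5 0 1}
}
```

The self-referential constraint `V Clampable[N, V]` lets ClampVec return the concrete vector type rather than an interface, at the cost of writing one small method per vector type.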

Thank you in advance for your read and help.

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CAP%2BmWpJO3xE4Wwg22cXAeY-JKFxk%3D3uneOTEKwLrWjcqLt_Y9A%40mail.gmail.com.


[go-nuts] Is introducing ... to parameter list a breaking change? How is it considered in the proposal process?

2022-05-03 Thread Changkun Ou
Hi gophers,

I wonder how the Go project defines a breaking change, specifically for the 
case where we have an existing API but want to add ... to its parameter 
list for customizations.

For instance, we have an API:

package foo
func Bar() {}

And the current code uses this function as:

foo.Bar()

Now, suppose we add a ... to Bar(), giving it a new function signature:

func Bar(args ...any) {}

Technically, the language allows the existing users to continue to work:

foo.Bar() // still OK.

As long as we don't change how Bar() behaves by default, code that 
previously called foo.Bar() still seems valid and unbroken. Is this type of 
API signature change considered breaking in the standard library?
If we propose API changes like this to the existing standard library, will 
they be treated differently from an outright breaking change?

Thanks!
Changkun



Re: [go-nuts] Is introducing ... to parameter list a breaking change? How is it considered in the proposal process?

2022-05-03 Thread Changkun Ou
Hi Axel and Ian,

Thanks for the elaborate answers! I was too naive in my consideration of the 
case and didn't look closely enough at existing practice. Assignability seems 
to create far more complicated compatibility issues.

Regarding the compatibility promises to users, interpretation by developers 
and users always seems less effective in this area. I wonder what a good way 
to make a real impact here might be. Is there existing practice to draw on, 
especially in the Go project itself? :)

Also, Jonathan's fantastic talk is very illustrative and potentially saves me a 
lot of time, as I was about to make an early attempt at tackling the problem once again.

Changkun

On Tuesday, May 3, 2022 at 9:38:35 PM UTC+2 Ian Lance Taylor wrote:

> On Tue, May 3, 2022 at 12:28 PM Changkun Ou  wrote:
> >
> > I wonder how the Go project defines a breaking change, specifically for 
> the case where we have an existing API but want to add ... to its parameter 
> list for customizations.
> >
> > For instance, we have an API:
> >
> > package foo
> > func Bar() {}
> >
> > And the current code uses this function as:
> >
> > foo.Bar()
> >
> > Now, we added a ... to the Bar() and having a new function signature:
> >
> > func Bar(args ...any) {}
> >
> > Technically, the language allows the existing users to continue to work:
> >
> > foo.Bar() // still OK.
> >
> > As long as we don't change how Bar() behaves by default, it seems that 
> the code that previously used foo.Bar() is still considered valid and not 
> breaking. Is introducing this type of API signature change considered a 
> breaking change in the standard library?
> > What if we propose API changes like this to the existing standard 
> library, will it be considered differently compared to an actual breaking 
> change?
>
> In the standard library we would consider that a breaking change,
> because it would break code like
>
> var f = []func() { Bar }
>
> You might be interested in Jonathan Amsterdam's Gophercon talk:
> https://www.youtube.com/watch?v=JhdL5AkH-AQ
>
> Ian
>
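A minimal, self-contained sketch of the breakage Ian describes (the hypothetical foo package is inlined here for illustration):

```go
package main

import "fmt"

// Bar's signature after the hypothetical change from func Bar() {}.
func Bar(args ...any) {}

func main() {
	// Call sites keep compiling unchanged:
	Bar()

	// But code that stored Bar as a func() value no longer compiles,
	// because func(...any) is not assignable to func():
	//
	//   var f []func() = []func(){Bar}
	//   // ERROR: cannot use Bar (value of type func(args ...any)) as func() value
	//
	// Only assignments using the new type still work:
	var g func(...any) = Bar
	g(1, "two")
	fmt.Println("call sites keep compiling; func() assignments break")
}
```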



Re: [go-nuts] About Go 1.19 memory model clarifications on atomic operations.

2022-08-15 Thread Changkun Ou
I think the new memory model does not guarantee this program always prints 1:

1. There is no synchronization guarantee between line 13 and line 14,
since these atomic operations act on different memory locations.
2. The compiler is *not* prohibited from swapping line 13 and line
14 (as I read section https://go.dev/ref/mem#badcompiler), for the
above reason; also, there is no order between line 13 and
line 20. So this is possible: line 14 < line 18 < line 20 < line 13.
3. Depending on the memory layout of a and b, if they happen to share
a cache line, then the program will always print 1.


On Mon, Aug 15, 2022 at 8:48 AM 'Axel Wagner' via golang-nuts
 wrote:
>
> Why wouldn't it?
>>
>> If the effect of an atomic operation A is observed by atomic operation B, 
>> then A is synchronized before B.
>
> To me, it seems pretty clear that it will. Line 13 is synchronized before 
> line 14, which is synchronized before any load observing its effects (i.e. 
> any execution of line 18 which runs into the branch) - and such a load is 
> synchronized before the load in line 20.
>
> Therefore, the store in Line 13 is synchronized before the load in line 20.
>
>
> On Mon, Aug 15, 2022 at 8:37 AM tapi...@gmail.com  wrote:
>>
>> By the latest version of Go Memory Model article: https://go.dev/ref/mem, 
>> will the following program always print 1?
>>
>> https://go.dev/play/p/RICYGip5y8M
>>


Re: [go-nuts] About Go 1.19 memory model clarifications on atomic operations.

2022-08-15 Thread Changkun Ou
> The memory operations in each goroutine must correspond to a correct 
> sequential execution of that goroutine, given the values read from and 
> written to memory. That execution must be consistent with the sequenced 
> before relation, defined as the partial order requirements set out by the Go 
> language specification for Go's control flow constructs as well as the order 
> of evaluation for expressions.

The wording of this rule seems a bit unclear to me. These questions arise:
1. What does "a correct sequential execution" mean? What defines
correctness here? Is it implied by the order in which the variables
are written?
2. Does the rule apply across multiple variables in written order, or
only to each variable separately?

In the goroutine containing lines 13 and 14, there seems to be no
control flow construct or spec-mandated order of evaluation. The most
relaxed reading would permit the compiler to swap the statements.


On Mon, Aug 15, 2022 at 9:32 AM Axel Wagner
 wrote:
>
>
>
> On Mon, Aug 15, 2022 at 9:06 AM Changkun Ou  wrote:
>>
>> I think the new memory model does not guarantee this program always prints 1:
>>
>> 1. There is no synchronization guarantee between line 13 and line 14
>> as these atomic operations are manipulated on the different memory
>> locations.
>
>
> Yes, there is:
>>
>> The memory operations in each goroutine must correspond to a correct 
>> sequential execution of that goroutine, given the values read from and 
>> written to memory. That execution must be consistent with the sequenced 
>> before relation, defined as the partial order requirements set out by the Go 
>> language specification for Go's control flow constructs as well as the order 
>> of evaluation for expressions.
>
>
>> 2. It is *not* prohibited for the compiler to switch line 13 and line
>> 14 (as I read from section https://go.dev/ref/mem#badcompiler) because
>> of the above reason, and also, there is no order between line 13 and
>> line 20. So this is possible: line 14 < line 18 < line 20 < line 13.
>> 3. Depending on the memory layout of a and b, if they are on the same
>> cache line, then the program will always print 1.
>>
>>
>> On Mon, Aug 15, 2022 at 8:48 AM 'Axel Wagner' via golang-nuts
>>  wrote:
>> >
>> > Why wouldn't it?
>> >>
>> >> If the effect of an atomic operation A is observed by atomic operation B, 
>> >> then A is synchronized before B.
>> >
>> > To me, it seems pretty clear that it will. Line 13 is synchronized before 
>> > line 14, which is synchronized before any load observing its effects (i.e. 
>> > any execution of line 18 which runs into the branch) - and such a load is 
>> > synchronized before the load in line 20.
>> >
>> > Therefore, the store in Line 13 is synchronized before the load in line 20.
>> >
>> >
>> > On Mon, Aug 15, 2022 at 8:37 AM tapi...@gmail.com  
>> > wrote:
>> >>
>> >> By the latest version of Go Memory Model article: https://go.dev/ref/mem, 
>> >> will the following program always print 1?
>> >>
>> >> https://go.dev/play/p/RICYGip5y8M
>> >>



--
Changkun Ou (he/him/his)
München, Deutschland
https://changkun.de/



Re: [go-nuts] About Go 1.19 memory model clarifications on atomic operations.

2022-08-15 Thread Changkun Ou
> Any other reading, to me, is trying to find an ambiguity for the sole sake of 
> finding an ambiguity.

A reader does not have to be motivated to find ambiguity. If a
sentence can be interpreted in different ways, different readers will
perceive it differently. On my first read, this sentence seemed
ambiguous as to whether it concerns a single memory location or
multiple locations. The posted example discusses a and b, two memory
locations.

The atomic operations on a and b are two different statements. It
remains unclear which sentence, exactly, is supposed to say this:
atomic operations on different memory locations obey the order of the
program's statements within a goroutine.

On Mon, Aug 15, 2022 at 10:16 AM Axel Wagner wrote:
>
> On Mon, Aug 15, 2022 at 10:02 AM Changkun Ou  wrote:
>>
>> > The memory operations in each goroutine must correspond to a correct 
>> > sequential execution of that goroutine, given the values read from and 
>> > written to memory. That execution must be consistent with the sequenced 
>> > before relation, defined as the partial order requirements set out by the 
>> > Go language specification for Go's control flow constructs as well as the 
>> > order of evaluation for expressions.
>>
>> This rule seems a bit unclear in its wording to me. These questions may 
>> occur:
>> 1. What does it mean by "a correct sequential execution"? What defines
>> correct in this case? Is it implied by the variables' written order?
>>
>> 2. Is the rule applicable to multiple variables by written order or
>> only on different variables separately?
>
>
> I think this is trying to pick nits. The rule seems very clear in its 
> intended outcome: A single goroutine should behave as if executing each 
> statement sequentially, and obeying the spec about order of evaluation within 
> a statement.
>
> Any other reading, to me, is trying to find an ambiguity for the sole sake of 
> finding an ambiguity. Generally, the Go spec and the memory model as well 
> doesn't try to be tricky just to be tricky. It expects the reader to apply a 
> certain amount of common sense and biasing themselves towards the most 
> sensible reading.
[go-nuts] Re: Is golang/mobile still active?

2022-09-12 Thread Changkun Ou
golang/mobile is still maintained. If you encounter any usage issues, feel 
free to report them here: https://github.com/golang/go/issues

On Monday, September 12, 2022 at 3:33:18 AM UTC+2 sytuv ccoxf wrote:

> Is golang/mobile still active?
>
> It seems like there have been no commits since Jul 23 2022.
>



Re: [go-nuts] sync.Mutex encounter large performance drop when goroutine contention more than 3400

2019-08-27 Thread Changkun Ou
Many thanks for the perf tool; it is pretty awesome.

> On 27. Aug 2019, at 13:36, Robert Engels  wrote:
> 
> Ok maybe it wasn’t the cache issue, so then try this, below a certain number 
> of go routines given the workload the spin and cas works, beyond a certain 
> point it is forced into the semaphore mode which is much slower - it is also 
> slower than the futex used with channels. 
> 
> I’ll repeat it is OS and hardware dependent so beyond this you’ll need to do 
> your own more detailed investigation. I referred you to some tools that may 
> help. Good luck. 
> 
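A rough harness to observe the contention behavior yourself. This is only a sketch; as Robert notes, the absolute numbers (and where any cliff appears) depend heavily on OS, hardware, and Go version:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// contend hammers a single mutex from the given number of goroutines and
// reports the total increment count and the elapsed wall-clock time.
func contend(goroutines, opsPerG int) (int, time.Duration) {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	start := time.Now()
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < opsPerG; i++ {
				mu.Lock()
				counter++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counter, time.Since(start)
}

func main() {
	// Bracket the ~3400-goroutine point discussed in the thread.
	for _, n := range []int{100, 1600, 3200, 6400} {
		total, d := contend(n, 100)
		fmt.Printf("goroutines=%-5d ops=%d took %v\n", n, total, d)
	}
}
```

Comparing per-operation time across the goroutine counts should make any transition from the spinning fast path to the semaphore slow path visible on a given machine.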
