hi,
On Thu, Oct 19, 2017 at 8:54 AM, mura <arumaka...@gmail.com> wrote: > Hi, > Can you publish the C++ test program as well ? I have the ROOT installed > and would like to explore the issue further. > here's everything: - https://cern.ch/binet/go-hep/rootio-perf/ -> read-data.go -> read-data.cxx -> perf-cxx.data (perf file) -> perf-cxx.data.txt (output of perf report) -> perf-go.data -> perf-go.data.txt nothing jumped at me when looking at the perf reports. (but I am a bit rusty with perf thanks to "go tool pprof" :) -s > > -mura > > > On Wednesday, October 18, 2017 at 6:21:17 PM UTC+8, mura wrote: >> >> Could you use `perf record ./test f64s.root` and then `perf report >> --stdio` for both the Go and C++ programs? (You may need to `echo 0 | sudo >> tee /proc/sys/kernel/kptr_restrict before perf record, supposing that you >> can test it on Linux) >> >> I am just interested in the performance comparison between C++ and Go >> too. :D >> >> -mura >> >> On Tuesday, October 17, 2017 at 6:48:43 PM UTC+8, Sebastien Binet wrote: >>> >>> hi, >>> >>> (thanks for having a look at this) >>> >>> On Tue, Oct 17, 2017 at 6:01 AM, mura <aruma...@gmail.com> wrote: >>> >>>> Hi, >>>> >>>> A sequential read with `dd if=f64s.root of=/dev/null bs=32` takes >>>> around 8.5 secs (10.4 secs using read-data.go) on my machine. It would even >>>> take up to 55 secs when bs = 1. >>>> Also I observe that the MaxRSS of the C++ program is much bigger than >>>> the Go version, so I suspect the C++ program probably does some kind of >>>> caching or preloading. >>>> >>> >>> most of that extra RSS is coming from the C++ libraries that are linked >>> against. >>> that said, it's true the C++ version does both caching and prefetching. >>> I am trying to devise a test case where both are disabled. >>> >>> but, if I were to ioutil.ReadFile("foo.root") and then use that []byte >>> slice as input (implementing all the io.Reader, io.ReaderAt ... needed >>> interfaces) I should be able to at least recoup the caching effect, right? >>> (that's in effect pretty much the same thing than mmap-ing the whole in >>> memory, isn't?) >>> >>> -s >>> >>> >>>> >>>> -mura >>>> >>>> >>>> On Monday, October 16, 2017 at 11:53:56 PM UTC+8, Sebastien Binet wrote: >>>>> >>>>> hi there, >>>>> >>>>> I have this Go program that reads a binary file format coming from >>>>> CERN. >>>>> I am trying to get it running almost as fast as its C++ counter-part. >>>>> >>>>> the profile and code are here: >>>>> - https://cern.ch/binet/go-hep/rootio.cpu.pprof >>>>> - https://cern.ch/binet/go-hep/read-data.go >>>>> >>>>> on my machine, I get the following timings for the C++ (g++-7.2.0) and >>>>> the Go-1.9.1 programs: >>>>> >>>>> === C++/ROOT === >>>>> real=6.84 user=6.29 sys=0.52 CPU=99% MaxRSS=258384 I/O=0/0 >>>>> real=6.99 user=6.42 sys=0.55 CPU=99% MaxRSS=258312 I/O=0/0 >>>>> real=6.68 user=6.12 sys=0.55 CPU=99% MaxRSS=257956 I/O=0/0 >>>>> real=6.88 user=6.33 sys=0.54 CPU=99% MaxRSS=258964 I/O=0/0 >>>>> real=6.87 user=6.30 sys=0.56 CPU=99% MaxRSS=258024 I/O=0/0 >>>>> real=6.89 user=6.32 sys=0.56 CPU=99% MaxRSS=260452 I/O=0/0 >>>>> real=6.70 user=6.14 sys=0.56 CPU=99% MaxRSS=258368 I/O=0/0 >>>>> real=7.02 user=6.47 sys=0.54 CPU=99% MaxRSS=258128 I/O=0/0 >>>>> real=7.46 user=6.44 sys=0.56 CPU=93% MaxRSS=257972 I/O=1840/0 >>>>> real=7.11 user=6.47 sys=0.55 CPU=98% MaxRSS=258148 I/O=3984/0 >>>>> >>>>> === go-hep/rootio === >>>>> real=13.06 user=12.43 sys=0.61 CPU=99% MaxRSS=40864 I/O=0/0 >>>>> real=13.05 user=12.46 sys=0.58 CPU=100% MaxRSS=40892 I/O=0/0 >>>>> real=13.05 user=12.46 sys=0.59 CPU=100% MaxRSS=40888 I/O=0/0 >>>>> real=13.07 user=12.48 sys=0.58 CPU=99% MaxRSS=40940 I/O=0/0 >>>>> real=13.08 user=12.48 sys=0.60 CPU=100% MaxRSS=40924 I/O=0/0 >>>>> real=13.09 user=12.46 sys=0.60 CPU=99% MaxRSS=40976 I/O=0/0 >>>>> real=13.01 user=12.41 sys=0.60 CPU=100% MaxRSS=40876 I/O=0/0 >>>>> real=13.09 user=12.52 sys=0.58 CPU=100% MaxRSS=40880 I/O=0/0 >>>>> real=13.01 user=12.41 sys=0.60 CPU=100% MaxRSS=40880 I/O=0/0 >>>>> real=12.99 user=12.41 sys=0.58 CPU=100% MaxRSS=40872 I/O=0/0 >>>>> >>>>> any idea on how to speed that up? >>>>> >>>>> I have also this rather interesting strace-log for the Go program: >>>>> - https://cern.ch/binet/go-hep/strace-read-go.txt >>>>> >>>>> ie: there is a clear pattern of "epoll_wait/pselect6" on one thread >>>>> followed by a bunch of "pread64" on another thread. >>>>> I suspect this is the runtime I/O layer (?) dispatching data to the >>>>> "worker" thread/goroutine ? >>>>> (I have tried with GOGC=off and it was still there, as well as the >>>>> same timings for my Go binary.) >>>>> >>>>> I have put a ~764Mb data file there: >>>>> - https://cern.ch/binet/go-hep/f64s.root >>>>> >>>>> if anybody wants to have a look. >>>>> >>>>> I initially thought it was a (lack of) bufio.Reader issue (as the file >>>>> format forces you to do a bunch of seeks or readat, using bufio.Reader >>>>> wasn't completely straightforward.) >>>>> but, mmap-ing the whole file in memory didn't help. >>>>> >>>>> here is the top50 from the CPU profile: >>>>> >>>>> (pprof) top50 >>>>> Showing nodes accounting for 12.90s, 99.15% of 13.01s total >>>>> Dropped 4 nodes (cum <= 0.07s) >>>>> flat flat% sum% cum cum% >>>>> 2.83s 21.75% 21.75% 3.21s 24.67% >>>>> go-hep.org/x/hep/rootio.(*LeafD).readBasket >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/leaf_gen.go >>>>> 1.35s 10.38% 32.13% 12.99s 99.85% >>>>> go-hep.org/x/hep/rootio.(*Scanner).Scan /home/binet/dev/go/gocode/src/ >>>>> go-hep.org/x/hep/rootio/scanner.go >>>>> 1.26s 9.68% 41.81% 1.26s 9.68% >>>>> go-hep.org/x/hep/rootio.(*tbranch).loadEntry >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/basket.go >>>>> 1.17s 8.99% 50.81% 2.01s 15.45% >>>>> go-hep.org/x/hep/rootio.(*tbranch).loadBasket >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/branch.go >>>>> 1.15s 8.84% 59.65% 7.15s 54.96% >>>>> go-hep.org/x/hep/rootio.(*tbranch).loadEntry >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/branch.go >>>>> 0.82s 6.30% 65.95% 3.06s 23.52% >>>>> go-hep.org/x/hep/rootio.(*tbranch).scan /home/binet/dev/go/gocode/src/ >>>>> go-hep.org/x/hep/rootio/branch.go >>>>> 0.76s 5.84% 71.79% 3.90s 29.98% >>>>> go-hep.org/x/hep/rootio.(*Basket).readLeaf >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/basket.go >>>>> 0.62s 4.77% 76.56% 1.55s 11.91% >>>>> go-hep.org/x/hep/rootio.(*LeafD).scan /home/binet/dev/go/gocode/src/ >>>>> go-hep.org/x/hep/rootio/leaf_gen.go >>>>> 0.56s 4.30% 80.86% 0.56s 4.30% syscall.Syscall6 >>>>> /usr/lib/go/src/syscall/asm_linux_amd64.s >>>>> 0.44s 3.38% 84.24% 0.93s 7.15% reflect.Indirect >>>>> /usr/lib/go/src/reflect/value.go >>>>> 0.38s 2.92% 87.16% 0.38s 2.92% reflect.Value.Elem >>>>> /usr/lib/go/src/reflect/value.go >>>>> 0.34s 2.61% 89.78% 0.34s 2.61% reflect.ValueOf >>>>> /usr/lib/go/src/reflect/value.go >>>>> 0.28s 2.15% 91.93% 0.28s 2.15% >>>>> go-hep.org/x/hep/rootio.(*tbranch).findBasketIndex >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/branch.go >>>>> 0.21s 1.61% 93.54% 0.21s 1.61% >>>>> go-hep.org/x/hep/rootio.(*LeafD).Len <autogenerated> >>>>> 0.18s 1.38% 94.93% 0.18s 1.38% >>>>> go-hep.org/x/hep/rootio.(*LeafD).UnmarshalROOT >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/rbuffer.go >>>>> 0.17s 1.31% 96.23% 0.17s 1.31% >>>>> go-hep.org/x/hep/rootio.(*tbranch).loadEntry >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/rbuffer.go >>>>> 0.11s 0.85% 97.08% 0.11s 0.85% reflect.Value.Elem >>>>> /usr/lib/go/src/reflect/type.go >>>>> 0.09s 0.69% 97.77% 0.09s 0.69% >>>>> go-hep.org/x/hep/rootio.(*Basket).readLeaf >>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/rbuffer.go >>>>> 0.09s 0.69% 98.46% 0.09s 0.69% >>>>> go-hep.org/x/hep/rootio.(*LeafD).readBasket >>>>> /usr/lib/go/src/encoding/binary/binary.go >>>>> 0.08s 0.61% 99.08% 0.08s 0.61% >>>>> go-hep.org/x/hep/rootio.(*tleaf).Offset /home/binet/dev/go/gocode/src/ >>>>> go-hep.org/x/hep/rootio/leaf.go >>>>> 0.01s 0.077% 99.15% 13.01s 100% main.main >>>>> /tmp/acat/read-data.go >>>>> 0 0% 99.15% 0.56s 4.30% >>>>> go-hep.org/x/hep/rootio.(*File).ReadAt /home/binet/dev/go/gocode/src/ >>>>> go-hep.org/x/hep/rootio/file.go >>>>> 0 0% 99.15% 0.11s 0.85% >>>>> go-hep.org/x/hep/rootio.(*Key).load /home/binet/dev/go/gocode/src/ >>>>> go-hep.org/x/hep/rootio/key.go >>>>> 0 0% 99.15% 0.56s 4.30% internal/poll.(*FD).Pread >>>>> /usr/lib/go/src/internal/poll/fd_unix.go >>>>> 0 0% 99.15% 0.11s 0.85% io.(*SectionReader).Read >>>>> /usr/lib/go/src/io/io.go >>>>> 0 0% 99.15% 0.11s 0.85% io.ReadAtLeast >>>>> /usr/lib/go/src/io/io.go >>>>> 0 0% 99.15% 0.11s 0.85% io.ReadFull >>>>> /usr/lib/go/src/io/io.go >>>>> 0 0% 99.15% 0.56s 4.30% os.(*File).ReadAt >>>>> /usr/lib/go/src/os/file.go >>>>> 0 0% 99.15% 0.56s 4.30% os.(*File).pread >>>>> /usr/lib/go/src/os/file_unix.go >>>>> 0 0% 99.15% 13.01s 100% runtime.main >>>>> /usr/lib/go/src/runtime/proc.go >>>>> 0 0% 99.15% 0.56s 4.30% syscall.Pread >>>>> /usr/lib/go/src/syscall/zsyscall_linux_amd64.go >>>>> >>>>> >>>>> anything obvious I missed ? >>>>> >>>>> cheers, >>>>> -s >>>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "golang-nuts" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to golang-nuts...@googlegroups.com. >>>> For more options, visit https://groups.google.com/d/optout. >>>> >>> >>> -- > You received this message because you are subscribed to the Google Groups > "golang-nuts" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to golang-nuts+unsubscr...@googlegroups.com. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.