Hi,
Can you publish the C++ test program as well ? I have the ROOT installed 
and would like to explore the issue further.

-mura

On Wednesday, October 18, 2017 at 6:21:17 PM UTC+8, mura wrote:
>
> Could you use `perf record ./test f64s.root` and then `perf report 
> --stdio` for both the Go and C++ programs? (You may need to `echo 0 | sudo 
> tee /proc/sys/kernel/kptr_restrict before perf record, supposing that you 
> can test it on Linux)
>
> I am just interested in the performance comparison between C++ and Go too. 
> :D
>
> -mura
>
> On Tuesday, October 17, 2017 at 6:48:43 PM UTC+8, Sebastien Binet wrote:
>>
>> hi,
>>
>> (thanks for having a look at this)
>>
>> On Tue, Oct 17, 2017 at 6:01 AM, mura <aruma...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> A sequential read with `dd if=f64s.root of=/dev/null bs=32` takes around 
>>> 8.5 secs (10.4 secs using read-data.go) on my machine. It would even take 
>>> up to 55 secs when bs = 1.
>>> Also I observe that the MaxRSS of the C++ program is much bigger than 
>>> the Go version, so I suspect the C++ program probably does some kind of 
>>> caching or preloading.
>>>
>>
>> most of that extra RSS is coming from the C++ libraries that are linked 
>> against.
>> that said, it's true the C++ version does both caching and prefetching.
>> I am trying to devise a test case where both are disabled.
>>
>> but, if I were to ioutil.ReadFile("foo.root") and then use that []byte 
>> slice as input (implementing all the io.Reader, io.ReaderAt ... needed 
>> interfaces) I should be able to at least recoup the caching effect, right? 
>> (that's in effect pretty much the same thing than mmap-ing the whole in 
>> memory, isn't?)
>>
>> -s
>>  
>>
>>>
>>> -mura
>>>
>>>
>>> On Monday, October 16, 2017 at 11:53:56 PM UTC+8, Sebastien Binet wrote:
>>>>
>>>> hi there,
>>>>
>>>> I have this Go program that reads a binary file format coming from CERN.
>>>> I am trying to get it running almost as fast as its C++ counter-part.
>>>>
>>>> the profile and code are here:
>>>> - https://cern.ch/binet/go-hep/rootio.cpu.pprof
>>>> - https://cern.ch/binet/go-hep/read-data.go
>>>>
>>>> on my machine, I get the following timings for the C++ (g++-7.2.0) and 
>>>> the Go-1.9.1 programs:
>>>>
>>>> === C++/ROOT ===
>>>> real=6.84 user=6.29 sys=0.52 CPU=99% MaxRSS=258384 I/O=0/0
>>>> real=6.99 user=6.42 sys=0.55 CPU=99% MaxRSS=258312 I/O=0/0
>>>> real=6.68 user=6.12 sys=0.55 CPU=99% MaxRSS=257956 I/O=0/0
>>>> real=6.88 user=6.33 sys=0.54 CPU=99% MaxRSS=258964 I/O=0/0
>>>> real=6.87 user=6.30 sys=0.56 CPU=99% MaxRSS=258024 I/O=0/0
>>>> real=6.89 user=6.32 sys=0.56 CPU=99% MaxRSS=260452 I/O=0/0
>>>> real=6.70 user=6.14 sys=0.56 CPU=99% MaxRSS=258368 I/O=0/0
>>>> real=7.02 user=6.47 sys=0.54 CPU=99% MaxRSS=258128 I/O=0/0
>>>> real=7.46 user=6.44 sys=0.56 CPU=93% MaxRSS=257972 I/O=1840/0
>>>> real=7.11 user=6.47 sys=0.55 CPU=98% MaxRSS=258148 I/O=3984/0
>>>>
>>>> === go-hep/rootio ===
>>>> real=13.06 user=12.43 sys=0.61 CPU=99%  MaxRSS=40864 I/O=0/0
>>>> real=13.05 user=12.46 sys=0.58 CPU=100% MaxRSS=40892 I/O=0/0
>>>> real=13.05 user=12.46 sys=0.59 CPU=100% MaxRSS=40888 I/O=0/0
>>>> real=13.07 user=12.48 sys=0.58 CPU=99%  MaxRSS=40940 I/O=0/0
>>>> real=13.08 user=12.48 sys=0.60 CPU=100% MaxRSS=40924 I/O=0/0
>>>> real=13.09 user=12.46 sys=0.60 CPU=99%  MaxRSS=40976 I/O=0/0
>>>> real=13.01 user=12.41 sys=0.60 CPU=100% MaxRSS=40876 I/O=0/0
>>>> real=13.09 user=12.52 sys=0.58 CPU=100% MaxRSS=40880 I/O=0/0
>>>> real=13.01 user=12.41 sys=0.60 CPU=100% MaxRSS=40880 I/O=0/0
>>>> real=12.99 user=12.41 sys=0.58 CPU=100% MaxRSS=40872 I/O=0/0
>>>>
>>>> any idea on how to speed that up?
>>>>
>>>> I have also this rather interesting strace-log for the Go program:
>>>> - https://cern.ch/binet/go-hep/strace-read-go.txt
>>>>
>>>> ie: there is a clear pattern of "epoll_wait/pselect6" on one thread 
>>>> followed by a bunch of "pread64" on another thread.
>>>> I suspect this is the runtime I/O layer (?) dispatching data to the 
>>>> "worker" thread/goroutine ?
>>>> (I have tried with GOGC=off and it was still there, as well as the same 
>>>> timings for my Go binary.)
>>>>
>>>> I have put a ~764Mb data file there:
>>>> - https://cern.ch/binet/go-hep/f64s.root
>>>>
>>>> if anybody wants to have a look.
>>>>
>>>> I initially thought it was a (lack of) bufio.Reader issue (as the file 
>>>> format forces you to do a bunch of seeks or readat, using bufio.Reader 
>>>> wasn't completely straightforward.)
>>>> but, mmap-ing the whole file in memory didn't help.
>>>>
>>>> here is the top50 from the CPU profile:
>>>>
>>>> (pprof) top50   
>>>> Showing nodes accounting for 12.90s, 99.15% of 13.01s total
>>>> Dropped 4 nodes (cum <= 0.07s)
>>>>       flat  flat%   sum%        cum   cum%
>>>>      2.83s 21.75% 21.75%      3.21s 24.67%  
>>>> go-hep.org/x/hep/rootio.(*LeafD).readBasket 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/leaf_gen.go
>>>>      1.35s 10.38% 32.13%     12.99s 99.85%  
>>>> go-hep.org/x/hep/rootio.(*Scanner).Scan /home/binet/dev/go/gocode/src/
>>>> go-hep.org/x/hep/rootio/scanner.go
>>>>      1.26s  9.68% 41.81%      1.26s  9.68%  
>>>> go-hep.org/x/hep/rootio.(*tbranch).loadEntry 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/basket.go
>>>>      1.17s  8.99% 50.81%      2.01s 15.45%  
>>>> go-hep.org/x/hep/rootio.(*tbranch).loadBasket 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/branch.go
>>>>      1.15s  8.84% 59.65%      7.15s 54.96%  
>>>> go-hep.org/x/hep/rootio.(*tbranch).loadEntry 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/branch.go
>>>>      0.82s  6.30% 65.95%      3.06s 23.52%  
>>>> go-hep.org/x/hep/rootio.(*tbranch).scan /home/binet/dev/go/gocode/src/
>>>> go-hep.org/x/hep/rootio/branch.go
>>>>      0.76s  5.84% 71.79%      3.90s 29.98%  
>>>> go-hep.org/x/hep/rootio.(*Basket).readLeaf 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/basket.go
>>>>      0.62s  4.77% 76.56%      1.55s 11.91%  
>>>> go-hep.org/x/hep/rootio.(*LeafD).scan /home/binet/dev/go/gocode/src/
>>>> go-hep.org/x/hep/rootio/leaf_gen.go
>>>>      0.56s  4.30% 80.86%      0.56s  4.30%  syscall.Syscall6 
>>>> /usr/lib/go/src/syscall/asm_linux_amd64.s
>>>>      0.44s  3.38% 84.24%      0.93s  7.15%  reflect.Indirect 
>>>> /usr/lib/go/src/reflect/value.go
>>>>      0.38s  2.92% 87.16%      0.38s  2.92%  reflect.Value.Elem 
>>>> /usr/lib/go/src/reflect/value.go
>>>>      0.34s  2.61% 89.78%      0.34s  2.61%  reflect.ValueOf 
>>>> /usr/lib/go/src/reflect/value.go
>>>>      0.28s  2.15% 91.93%      0.28s  2.15%  
>>>> go-hep.org/x/hep/rootio.(*tbranch).findBasketIndex 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/branch.go
>>>>      0.21s  1.61% 93.54%      0.21s  1.61%  
>>>> go-hep.org/x/hep/rootio.(*LeafD).Len <autogenerated>
>>>>      0.18s  1.38% 94.93%      0.18s  1.38%  
>>>> go-hep.org/x/hep/rootio.(*LeafD).UnmarshalROOT 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/rbuffer.go
>>>>      0.17s  1.31% 96.23%      0.17s  1.31%  
>>>> go-hep.org/x/hep/rootio.(*tbranch).loadEntry 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/rbuffer.go
>>>>      0.11s  0.85% 97.08%      0.11s  0.85%  reflect.Value.Elem 
>>>> /usr/lib/go/src/reflect/type.go
>>>>      0.09s  0.69% 97.77%      0.09s  0.69%  
>>>> go-hep.org/x/hep/rootio.(*Basket).readLeaf 
>>>> /home/binet/dev/go/gocode/src/go-hep.org/x/hep/rootio/rbuffer.go
>>>>      0.09s  0.69% 98.46%      0.09s  0.69%  
>>>> go-hep.org/x/hep/rootio.(*LeafD).readBasket 
>>>> /usr/lib/go/src/encoding/binary/binary.go
>>>>      0.08s  0.61% 99.08%      0.08s  0.61%  
>>>> go-hep.org/x/hep/rootio.(*tleaf).Offset /home/binet/dev/go/gocode/src/
>>>> go-hep.org/x/hep/rootio/leaf.go
>>>>      0.01s 0.077% 99.15%     13.01s   100%  main.main 
>>>> /tmp/acat/read-data.go
>>>>          0     0% 99.15%      0.56s  4.30%  
>>>> go-hep.org/x/hep/rootio.(*File).ReadAt /home/binet/dev/go/gocode/src/
>>>> go-hep.org/x/hep/rootio/file.go
>>>>          0     0% 99.15%      0.11s  0.85%  
>>>> go-hep.org/x/hep/rootio.(*Key).load /home/binet/dev/go/gocode/src/
>>>> go-hep.org/x/hep/rootio/key.go
>>>>          0     0% 99.15%      0.56s  4.30%  internal/poll.(*FD).Pread 
>>>> /usr/lib/go/src/internal/poll/fd_unix.go
>>>>          0     0% 99.15%      0.11s  0.85%  io.(*SectionReader).Read 
>>>> /usr/lib/go/src/io/io.go
>>>>          0     0% 99.15%      0.11s  0.85%  io.ReadAtLeast 
>>>> /usr/lib/go/src/io/io.go
>>>>          0     0% 99.15%      0.11s  0.85%  io.ReadFull 
>>>> /usr/lib/go/src/io/io.go
>>>>          0     0% 99.15%      0.56s  4.30%  os.(*File).ReadAt 
>>>> /usr/lib/go/src/os/file.go
>>>>          0     0% 99.15%      0.56s  4.30%  os.(*File).pread 
>>>> /usr/lib/go/src/os/file_unix.go
>>>>          0     0% 99.15%     13.01s   100%  runtime.main 
>>>> /usr/lib/go/src/runtime/proc.go
>>>>          0     0% 99.15%      0.56s  4.30%  syscall.Pread 
>>>> /usr/lib/go/src/syscall/zsyscall_linux_amd64.go
>>>>
>>>>
>>>> anything obvious I missed ?
>>>>
>>>> cheers,
>>>> -s
>>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "golang-nuts" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to golang-nuts...@googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to