On Fri, 2020-08-28 at 10:30 +0200, Christian Lindig wrote:
> +let find_opt k t =
> 
> +       (* avoid raising exceptions, they can be expensive *)
> 
> +       if mem k t then Some (find k t) else None
> 
> 
> 
> I disagree with this argument. Exceptions in OCaml are cheap because
> they don't walk the stack and cut to the exception handler directly.
> Is there a reason why they could be expensive here? In any case, the
> code is correct.
> 

Interesting question, it is best to measure.

The answer depends on whether the key is found or not in the map.
I'll change the impl to the one with find+catch, I think we  might be
looking up keys that are present more often than those that are missing
(although the 4.06 series will make this moot, 4.06 version is faster
than both approaches, regardless whether key is present or not).

To sum up the measurements below:
If the key is not found then my approach is faster (takes only 80% of
time), if the key is found then find+catch is faster (again an approx
80% of the time taken).

I based my comment on the documentation of raise_notrace, which says:
"A faster version raise which does not record the backtrace.",
which implies that recording the backtrace has a measurable perf
impact.
One could argue that if performance matters backtraces should be turned
off in production, but I think the value of having backtraces when some
hard-to-reprodue bug occurs outweights any perf penalty.
We should try to use exceptions only for unexpected situations though,
not finding a value in a map doesn't qualify.

See the attachment for a small micro-benchmark:
$ dune exec --profile=release -- ./updatet.exe raise
Estimated testing time 20s (2 benchmarks x 10s). Change using '-quota'.
┌───────────────┬──────────┬────────────┐
│ Name          │ Time/Run │ Percentage │
├───────────────┼──────────┼────────────┤
│ raise         │  33.52ns │    100.00% │
│ raise_notrace │  19.16ns │     57.16% │
└───────────────┴──────────┴────────────┘

So raising with a backtrace is measurably slower, taking the backtrace
spends some CPU cycles.

$ dune exec --profile=release -- ./updatet.exe find-opt
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬────────────┐
│ Name                     │ Time/Run │ Percentage │
├──────────────────────────┼──────────┼────────────┤
│ find_opt 4.06:10         │  49.10ns │     24.06% │
│ find_opt 4.06:100        │ 115.38ns │     56.55% │
│ find_opt 4.06:1000       │ 161.27ns │     79.03% │
│ find_opt=mem+find:10     │  50.48ns │     24.74% │
│ find_opt=mem+find:100    │ 110.39ns │     54.10% │
│ find_opt=mem+find:1000   │ 162.48ns │     79.63% │
│ find_opt=find+catch:10   │  89.10ns │     43.67% │
│ find_opt=find+catch:100  │ 160.80ns │     78.80% │
│ find_opt=find+catch:1000 │ 204.04ns │    100.00% │
└──────────────────────────┴──────────┴────────────┘


4.06 and mem+find take 80% of the time of find+catch.

But of course if the key is actually found in the map then we have
this: 
edwin@edvin-tower:~/uex % dune exec --profile=release -- ./updatet.exe
find-opt-found
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌──────────────────────────┬──────────┬─────────┬────────────┐
│ Name                     │ Time/Run │ mWd/Run │ Percentage │
├──────────────────────────┼──────────┼─────────┼────────────┤
│ find_opt 4.06:10         │  38.38ns │   2.00w │     52.65% │
│ find_opt 4.06:100        │  20.66ns │   2.00w │     28.35% │
│ find_opt 4.06:1000       │  20.63ns │   2.00w │     28.30% │
│ find_opt=mem+find:10     │  72.89ns │   2.00w │    100.00% │
│ find_opt=mem+find:100    │  39.06ns │   2.00w │     53.59% │
│ find_opt=mem+find:1000   │  39.07ns │   2.00w │     53.60% │
│ find_opt=find+catch:10   │  49.54ns │   2.00w │     67.97% │
│ find_opt=find+catch:100  │  33.01ns │   2.00w │     45.29% │
│ find_opt=find+catch:1000 │  32.97ns │   2.00w │     45.23% │
└──────────────────────────┴──────────┴─────────┴────────────┘

In this case find+catch is faster.

And here is update for a key that is not present:
$ dune exec --profile=release -- ./updatet.exe update
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  79.96ns │  24.00w │     17.27% │
│ update 4.06:100                   │ 171.96ns │  48.00w │     37.15% │
│ update 4.06:1000                  │ 243.95ns │  66.00w │     52.70% │
│ update=find+catch+add/remove:10   │ 183.46ns │  24.00w │     39.63% │
│ update=find+catch+add/remove:100  │ 340.00ns │  48.00w │     73.45% │
│ update=find+catch+add/remove:1000 │ 462.89ns │  66.00w │    100.00% │
│ update=mem+find+add/remove:10     │ 126.06ns │  24.00w │     27.23% │
│ update=mem+find+add/remove:100    │ 274.79ns │  48.00w │     59.36% │
│ update=mem+find+add/remove:1000   │ 401.62ns │  66.00w │     86.76% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Here 4.06 is a clear win, and mem+add is faster than find+catch+add.
Estimated testing time 1m30s (9 benchmarks x 10s). Change using '-
quota'.
┌───────────────────────────────────┬──────────┬─────────┬────────────┐
│ Name                              │ Time/Run │ mWd/Run │ Percentage │
├───────────────────────────────────┼──────────┼─────────┼────────────┤
│ update 4.06:10                    │  72.76ns │  24.00w │     31.25% │
│ update 4.06:100                   │ 164.69ns │  48.00w │     70.74% │
│ update 4.06:1000                  │ 232.79ns │  66.00w │    100.00% │
│ update=find+catch+add/remove:10   │ 133.24ns │  23.00w │     57.24% │
│ update=find+catch+add/remove:100  │ 118.76ns │  35.00w │     51.02% │
│ update=find+catch+add/remove:1000 │ 161.22ns │  59.00w │     69.26% │
│ update=mem+find+add/remove:10     │ 156.29ns │  23.00w │     67.14% │
│ update=mem+find+add/remove:100    │ 122.98ns │  35.00w │     52.83% │
│ update=mem+find+add/remove:1000   │ 161.53ns │  59.00w │     69.39% │
└───────────────────────────────────┴──────────┴─────────┴────────────┘

Interestingly here the 4.06 implementation is actually slower and not
much difference between my other two.

Best regards,
--Edwin
(executable
 (name updatet)
 (libraries core_bench))
(lang dune 2.6)
module StringMap = Map.Make (String)

let find_opt_back k t =
  (* avoid raising exceptions, they can be expensive *)
  if StringMap.mem k t then Some (StringMap.find k t) else None

let find_opt_raise k t = try Some (StringMap.find k t) with Not_found -> None

let update_raise k f t =
  let r = find_opt_raise k t in
  let r' = f r in
  match (r, r') with
  | None, None ->
      t
  | Some _, None ->
      StringMap.remove k t
  | Some r, Some r' when r == r' ->
      t
  | _, Some r' ->
      StringMap.add k r' t

let update k f t =
  let r = find_opt_back k t in
  let r' = f r in
  match (r, r') with
  | None, None ->
      t
  | Some _, None ->
      StringMap.remove k t
  | Some r, Some r' when r == r' ->
      t
  | _, Some r' ->
      StringMap.add k r' t

let do_raise () = raise Not_found

let do_raise_notrace () = raise_notrace Not_found

let dummy = ref 0

let wrap f () =
  try if Sys.opaque_identity !dummy = 0 then f () with Not_found -> ()

(* open here to make sure we use stdlib impl above *)
open! Core
open Core_bench

let pre = "/local/domain/0"

let test_map = List.init 1000 (fun i -> pre ^ string_of_int i)

let args = [10; 100; 1000]

let mk_map_bench ~name f =
  Bench.Test.create_indexed ~name ~args (fun len ->
      (* this runs before the benchmark *)
      let m =
        List.init len (fun i -> (pre ^ string_of_int i, "value"))
        |> Caml.List.to_seq |> StringMap.of_seq
      in
      Staged.stage (fun () ->
          (* this is the benchmarked function, the benchmark framework takes care of
             running it multiple times, and avoiding the compiler optimizing it away
             by "using" the result *)
          f m))

let () =
  Printexc.record_backtrace true ;
  let key = pre ^ "/nonexistent" in
  let keyf = pre ^ "5" in
  let f = function Some x -> None | None -> Some "value2" in
  Command.run
    (Command.group ~summary:"exception handling benchmarkls"
       [ ( "raise"
         , Bench.make_command
             [ Bench.Test.create ~name:"raise" (wrap do_raise)
             ; Bench.Test.create ~name:"raise_notrace" (wrap do_raise_notrace)
             ] )
       ; ( "find-opt"
         , Bench.make_command
             [ mk_map_bench ~name:"find_opt 4.06" (fun m ->
                   StringMap.find_opt key m)
             ; mk_map_bench ~name:"find_opt=mem+find" (fun m ->
                   find_opt_back key m)
             ; mk_map_bench ~name:"find_opt=find+catch" (fun m ->
                   find_opt_raise key m) ] )
       ; ( "find-opt-found"
         , Bench.make_command
             [ mk_map_bench ~name:"find_opt 4.06" (fun m ->
                   StringMap.find_opt keyf m)
             ; mk_map_bench ~name:"find_opt=mem+find" (fun m ->
                   find_opt_back keyf m)
             ; mk_map_bench ~name:"find_opt=find+catch" (fun m ->
                   find_opt_raise keyf m) ] )
       ; ( "update"
         , Bench.make_command
             [ mk_map_bench ~name:"update 4.06" (fun m ->
                   StringMap.update key f m)
             ; mk_map_bench ~name:"update=find+catch+add/remove" (fun m ->
                   update_raise key f m)
             ; mk_map_bench ~name:"update=mem+find+add/remove" (fun m ->
                   update key f m) ]

             )
       ; ( "update-found"
         , Bench.make_command
             [ mk_map_bench ~name:"update 4.06" (fun m ->
                   StringMap.update key f m)
             ; mk_map_bench ~name:"update=find+catch+add/remove" (fun m ->
                   update_raise keyf f m)
             ; mk_map_bench ~name:"update=mem+find+add/remove" (fun m ->
                   update keyf f m) ] ) ])

Reply via email to