ChaiBapchya edited a comment on issue #17449: [Large Tensor] Implemented LT flag for OpPerf testing
URL: https://github.com/apache/incubator-mxnet/pull/17449#issuecomment-593053525

The full opperf suite was run initially (and is linked in the description), but was it re-run after the subsequent commits, i.e. the newly added ops and merges? Could you paste the opperf results after commit 56ad70?

Right now, on master (CUDA, cuDNN ON), the full opperf suite errors out on `lamb_update_phase1`:

```
Traceback (most recent call last):
  File "incubator-mxnet/benchmark/opperf/opperf.py", line 213, in <module>
    sys.exit(main())
  File "incubator-mxnet/benchmark/opperf/opperf.py", line 193, in main
    benchmark_results = run_all_mxnet_operator_benchmarks(ctx=ctx, dtype=dtype, profiler=profiler, int64_tensor=int64_tensor, warmup=warmup, runs=runs)
  File "incubator-mxnet/benchmark/opperf/opperf.py", line 111, in run_all_mxnet_operator_benchmarks
    mxnet_operator_benchmark_results.append(run_optimizer_operators_benchmarks(ctx=ctx, dtype=dtype, profiler=profiler, int64_tensor=int64_tensor, warmup=warmup, runs=runs))
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/nd_operations/nn_optimizer_operators.py", line 142, in run_optimizer_operators_benchmarks
    mx_optimizer_op_results = run_op_benchmarks(mx_optimizer_ops, dtype, ctx, profiler, int64_tensor, warmup, runs)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/benchmark_utils.py", line 210, in run_op_benchmarks
    warmup=warmup, runs=runs)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/benchmark_utils.py", line 177, in run_performance_test
    benchmark_result = _run_nd_operator_performance_test(op, inputs, run_backward, warmup, runs, kwargs_list, profiler)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/benchmark_utils.py", line 114, in _run_nd_operator_performance_test
    _, _ = benchmark_helper_func(op, warmup, **kwargs_list[0])
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/profiler_utils.py", line 200, in cpp_profile_it
    res = func(*args, **kwargs)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/ndarray_utils.py", line 97, in nd_forward_and_profile
    res = op(**kwargs_new)
  File "<string>", line 113, in lamb_update_phase1
  File "/home/ubuntu/incubator-mxnet/python/mxnet/_ctypes/ndarray.py", line 91, in _imperative_invoke
    ctypes.byref(out_stypes)))
  File "/home/ubuntu/incubator-mxnet/python/mxnet/base.py", line 246, in check_call
    raise get_last_ffi_error()
mxnet.base.MXNetError: MXNetError: Required parameter wd of float is not presented, in operator lamb_update_phase1(name="", t="1", rescale_grad="0.4", epsilon="1e-08", beta2="0.1", beta1="0.1")
*** Error in `python': corrupted double-linked list: 0x000055b58a93f6c0 ***
```

The PR that introduced `lamb_update_phase1` to opperf (https://github.com/apache/incubator-mxnet/pull/17542) worked with CUDA/cuDNN ON, but it no longer does.
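The MXNetError indicates that the default kwargs opperf passes to `lamb_update_phase1` no longer include the operator's required `wd` (weight decay) parameter. As a minimal plain-Python sketch of this failure mode (the operator name and the supplied parameters are taken from the error message above; the `check_required_params` helper is hypothetical, not opperf or MXNet code):

```python
def check_required_params(op_name, required, supplied):
    """Mimic the backend's required-parameter check: raise if any
    required operator parameter is missing from the supplied kwargs."""
    for param in required:
        if param not in supplied:
            raise ValueError(
                f"Required parameter {param} of float is not presented, "
                f"in operator {op_name}"
            )

# Parameters visible in the error message for lamb_update_phase1:
supplied = {"t": 1, "rescale_grad": 0.4, "epsilon": 1e-08,
            "beta2": 0.1, "beta1": 0.1}

# 'wd' is required but absent from the supplied defaults, so this raises:
try:
    check_required_params("lamb_update_phase1",
                          required=["beta1", "beta2", "epsilon", "t", "wd"],
                          supplied=supplied)
except ValueError as e:
    print(e)
```

If this is indeed the cause, the fix would be to restore `wd` to the default inputs that opperf generates for this operator.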