ChaiBapchya edited a comment on issue #17449: [Large Tensor] Implemented LT 
flag for OpPerf testing
URL: https://github.com/apache/incubator-mxnet/pull/17449#issuecomment-593053525
 
 
While the full opperf suite was run initially (and is linked in the description), was it run again after the subsequent commits, i.e. the newly added ops and the merges?

Could you paste the opperf results after commit 56ad70?

Because right now, on master (CUDA, cuDNN ON), the full opperf suite runs into an error for lamb_update_phase1:
```
Traceback (most recent call last):
  File "incubator-mxnet/benchmark/opperf/opperf.py", line 213, in <module>
    sys.exit(main())
  File "incubator-mxnet/benchmark/opperf/opperf.py", line 193, in main
    benchmark_results = run_all_mxnet_operator_benchmarks(ctx=ctx, dtype=dtype, profiler=profiler, int64_tensor=int64_tensor, warmup=warmup, runs=runs)
  File "incubator-mxnet/benchmark/opperf/opperf.py", line 111, in run_all_mxnet_operator_benchmarks
    mxnet_operator_benchmark_results.append(run_optimizer_operators_benchmarks(ctx=ctx, dtype=dtype, profiler=profiler, int64_tensor=int64_tensor, warmup=warmup, runs=runs))
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/nd_operations/nn_optimizer_operators.py", line 142, in run_optimizer_operators_benchmarks
    mx_optimizer_op_results = run_op_benchmarks(mx_optimizer_ops, dtype, ctx, profiler, int64_tensor, warmup, runs)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/benchmark_utils.py", line 210, in run_op_benchmarks
    warmup=warmup, runs=runs)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/benchmark_utils.py", line 177, in run_performance_test
    benchmark_result = _run_nd_operator_performance_test(op, inputs, run_backward, warmup, runs, kwargs_list, profiler)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/benchmark_utils.py", line 114, in _run_nd_operator_performance_test
    _, _ = benchmark_helper_func(op, warmup, **kwargs_list[0])
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/profiler_utils.py", line 200, in cpp_profile_it
    res = func(*args, **kwargs)
  File "/home/ubuntu/incubator-mxnet/benchmark/opperf/utils/ndarray_utils.py", line 97, in nd_forward_and_profile
    res = op(**kwargs_new)
  File "<string>", line 113, in lamb_update_phase1
  File "/home/ubuntu/incubator-mxnet/python/mxnet/_ctypes/ndarray.py", line 91, in _imperative_invoke
    ctypes.byref(out_stypes)))
  File "/home/ubuntu/incubator-mxnet/python/mxnet/base.py", line 246, in check_call
    raise get_last_ffi_error()
mxnet.base.MXNetError: MXNetError: Required parameter wd of float is not presented, in operator lamb_update_phase1(name="", t="1", rescale_grad="0.4", epsilon="1e-08", beta2="0.1", beta1="0.1")
*** Error in `python': corrupted double-linked list: 0x000055b58a93f6c0 ***
```
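As a sanity check, the operator can be called directly to confirm that wd is the missing piece. A minimal sketch, assuming the standard lamb_update_phase1 NDArray signature; shapes and hyperparameter values here are placeholders, not the suite's defaults:

```python
import mxnet as mx

# Placeholder state tensors purely for illustration.
weight = mx.nd.random.uniform(shape=(1024, 1024))
grad = mx.nd.random.uniform(shape=(1024, 1024))
mean = mx.nd.zeros((1024, 1024))
var = mx.nd.zeros((1024, 1024))

# Leaving out wd reproduces "Required parameter wd of float is not presented";
# passing it explicitly lets the call go through.
out = mx.nd.lamb_update_phase1(weight, grad, mean, var,
                               beta1=0.1, beta2=0.1, epsilon=1e-08,
                               t=1, wd=0.1, rescale_grad=0.4)
```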
   
The PR that introduced lamb_update_phase1 to opperf (https://github.com/apache/incubator-mxnet/pull/17542) worked with CUDA/cuDNN ON, but now it doesn't.
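If it helps narrow this down, the op can also be exercised in isolation through opperf's run_performance_test helper with wd supplied in the inputs dict. A rough sketch, assuming the helper passes scalar entries through as operator kwargs (as the traceback suggests); the shapes and values below are placeholders, not the defaults the suite uses:

```python
import mxnet as mx
from mxnet import nd
from benchmark.opperf.utils.benchmark_utils import run_performance_test

# Benchmark lamb_update_phase1 on its own, with the required wd passed explicitly.
res = run_performance_test(nd.lamb_update_phase1,
                           run_backward=False,
                           dtype='float32',
                           ctx=mx.cpu(),
                           inputs=[{"weight": (1024, 1024),
                                    "grad": (1024, 1024),
                                    "mean": (1024, 1024),
                                    "var": (1024, 1024),
                                    "beta1": 0.1,
                                    "beta2": 0.1,
                                    "epsilon": 1e-08,
                                    "t": 1,
                                    "wd": 0.1,
                                    "rescale_grad": 0.4}],
                           warmup=10, runs=25, profiler='native')
print(res)
```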

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
