matteosal opened a new issue #20465: URL: https://github.com/apache/incubator-mxnet/issues/20465
[sym.zip](https://github.com/apache/incubator-mxnet/files/6868645/sym.zip)

Shape inference on a big symbol (attached) fails from the C API but works from Python.

Python script is:

```
import mxnet as mx

json_path = 'path/to/sym.json'
sym = mx.sym.load(json_path)

shape_dict = {
    '.Inputs.Key': (32, 203, 256),
    '.Parameters.ScoringNet.Nodes.1.Arrays.Weights': (10, 256),
    '.Inputs.Query': (32, 203, 256),
    '.Parameters.ScoringNet.Nodes.2.Arrays.Weights': (10, 256),
    '.Parameters.ScoringNet.Nodes.4.Arrays.Weights': (1, 10),
    '.Inputs.Value': (32, 203, 256)
}

sym.infer_shape(**shape_dict)
print('done!')
```

C code is:

```
#include <cstdio>
#include <cstdlib>
#include <iostream>
#include <fstream>
#include <string>
#include <vector>

#include "mxnet/c_api.h"
#include "nnvm/c_api.h"

#define checkedMXCall(func, ...)                              \
  {                                                           \
    if (func(__VA_ARGS__) != 0) {                             \
      printf("MX call %s failed at line %d:\n%s",             \
            #func, __LINE__, MXGetLastError());               \
      exit(1);                                                \
    }                                                         \
  }

int main(int argc, char *argv[]) {

  /* Read JSON file */
  std::ifstream file("sym.json");
  std::string json(std::istreambuf_iterator<char>{file}, {});

  // Create symbol
  SymbolHandle sym;
  checkedMXCall(MXSymbolCreateFromJSON, json.c_str(), &sym);

  // Check argument ordering
  uint32_t n_args;
  const char **arg_names;
  checkedMXCall(MXSymbolListArguments, sym, &n_args, &arg_names);
  for (uint32_t i = 0; i < n_args; i++) {
    std::cout << arg_names[i] << "\n";
  }

  // Print symbol to file
  const char *raw_s;
  checkedMXCall(MXSymbolPrint, sym, &raw_s);
  std::ofstream out_file;
  out_file.open("symbol_print.txt");
  out_file << raw_s;
  out_file.close();

  // Run shape inference: shapes are passed positionally (keys == nullptr),
  // flattened into csr_data and delimited by the offsets in csr_indices
  std::vector<int> csr_data = {
    32, 203, 256,
    10, 256,
    32, 203, 256,
    10, 256,
    1, 10,
    32, 203, 256
  };
  std::vector<uint32_t> csr_indices = {0, 3, 5, 8, 10, 12, 15};
  uint32_t arg_shape_count = 0, out_shape_count = 0, aux_shape_count = 0;
  const int *arg_shape_ranks, *out_shape_ranks, *aux_shape_ranks;
  const int **arg_shape_dims, **out_shape_dims, **aux_shape_dims;
  int complete;

  checkedMXCall(MXSymbolInferShape,
    sym,
    6,
    nullptr,
    csr_indices.data(),
    csr_data.data(),
    &arg_shape_count,
    &arg_shape_ranks,
    &arg_shape_dims,
    &out_shape_count,
    &out_shape_ranks,
    &out_shape_dims,
    &aux_shape_count,
    &aux_shape_ranks,
    &aux_shape_dims,
    &complete
  );

  return 0;
}
```

The output of the C code is:

```
--------
Argument order is:
.Inputs.Key
.Parameters.ScoringNet.Nodes.1.Arrays.Weights
.Inputs.Query
.Parameters.ScoringNet.Nodes.2.Arrays.Weights
.Parameters.ScoringNet.Nodes.4.Arrays.Weights
.Inputs.Value
--------
MX call MXSymbolInferShape failed at line 60:
MXNetError: Error in operator 4:.$10: [14:23:04] /home/matteo/Git/mxnet-build/Build/Linux-x86-64/CUDA/mxnet/src/operator/tensor/./broadcast_reduce_op.h:520: Check failed: lhs_shape[copyfrom] == 1: Input axis 1 at dimension 0 cannot be broadcasted to 203
```

The printed arguments show that the ordering of the shapes in `csr_data` is consistent with the Python script, so the input to shape inference is the same in both cases. So where is the failure for operator `4:.$10` coming from?

By looking at the JSON file, the graph connectivity leading to operator `4:.$10` is the following:

[image]

I have followed the shape inference procedure with gdb. From Python, I could see all the operators preceding `4:.$10` infer their shapes before `4:.$10` itself, each one producing the correct shape as expected. But from C, none of the preceding nodes are called before `4:.$10`!

This led me to think that the symbol was somehow being changed from the original file content, but I have checked that running `MXSymbolPrint` and dumping the result to a file (see C code above) reports the expected connectivity.

So I'm stuck with what seems to be a really weird shape inference bug.
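For anyone double-checking the positional input, here is a minimal sketch of how `csr_data` and `csr_indices` in the C code above relate to the per-argument shapes. `CsrShapes` and `flatten_shapes` are hypothetical helper names introduced only for this illustration; they are not part of the MXNet C API, and the sketch only assumes the layout already visible in the hand-written arrays (dimensions concatenated in argument order, with offsets marking where each shape starts and ends).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper type (not part of the MXNet C API): holds the two
// flattened arrays that the C code above passes to MXSymbolInferShape.
struct CsrShapes {
    std::vector<int> data;          // all dimensions, concatenated in argument order
    std::vector<uint32_t> indices;  // shape i occupies data[indices[i]] .. data[indices[i + 1] - 1]
};

// Flatten a list of shapes (one per argument, in MXSymbolListArguments order)
// into the CSR-style layout used in the issue.
CsrShapes flatten_shapes(const std::vector<std::vector<int>>& shapes) {
    CsrShapes csr;
    csr.indices.push_back(0);  // the first shape always starts at offset 0
    for (const auto& shape : shapes) {
        csr.data.insert(csr.data.end(), shape.begin(), shape.end());
        // Record the end offset of this shape, which is the start of the next one.
        csr.indices.push_back(static_cast<uint32_t>(csr.data.size()));
    }
    return csr;
}
```

Calling `flatten_shapes` with the six shapes from the Python `shape_dict`, in the printed argument order, gives the same 15 values as `csr_data` and the same 7 offsets as `csr_indices`, so the hand-written arrays do match the Python input.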
