Another way to do this, which I am not sure makes sense, is to just add sizeof(mca_pml_ob1_hdr_t) to the btl_eager_limit passed in by the user. That would define the limit as applying specifically to the user data and not to the internal headers, which the user may not have any inkling about. However, it may also mean the user never realizes there is a man behind the curtain bumping up the limit for the internal headers.
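
A rough sketch of that idea follows. The helper name and the 72-byte header size are made up for illustration; the real sizeof(mca_pml_ob1_hdr_t) depends on the build, and this is not how the existing code registers the parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sizeof(mca_pml_ob1_hdr_t). */
#define HDR_SIZE_ASSUMED ((size_t)72)

/* Hypothetical helper: turn the user's payload-only limit into the
 * limit used internally (payload plus header). */
static size_t internal_eager_limit(size_t user_eager_limit, size_t hdr_size)
{
    /* The addition itself must not overflow size_t. */
    assert(user_eager_limit <= SIZE_MAX - hdr_size);
    return user_eager_limit + hdr_size;
}
```

With these assumed numbers, a user value of 64 becomes 136 internally, and subtracting the header size later gives back the 64 bytes of user data, so the subtraction can no longer wrap.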

--td

Terry Dontje wrote:
I was playing around with some really silly fragment sizes (sub 72 bytes) when I ran into some asserts in btl_openib_sendi. I traced the assert to mca_pml_ob1_send_request_start_btl(), which calculates the true eager_limit with the following line:

  size_t eager_limit = btl->btl_eager_limit - sizeof(mca_pml_ob1_hdr_t);

If btl_eager_limit ends up being less than sizeof(mca_pml_ob1_hdr_t), the unsigned subtraction wraps around, so the calculated eager_limit is a very large number, which triggers an assert later on in the stack.
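
The wraparound can be reproduced in isolation with a small sketch like the one below. The 72-byte header size and the function names are assumptions for illustration only, not values taken from the Open MPI source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for sizeof(mca_pml_ob1_hdr_t). */
#define HDR_SIZE_ASSUMED ((size_t)72)

/* Same shape as the subtraction quoted above, with no guard. */
static size_t naive_eager_limit(size_t btl_eager_limit)
{
    /* size_t is unsigned, so this wraps when btl_eager_limit < header size. */
    return btl_eager_limit - HDR_SIZE_ASSUMED;
}

/* Shows one sane case and one wrapped case. */
static void demo_wraparound(void)
{
    assert(naive_eager_limit(4096) == 4024);        /* normal case */
    assert(naive_eager_limit(64) == SIZE_MAX - 7);  /* 64 - 72 wraps around */
}
```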

It seems to me that it would be nice to insert some checks in mca_btl_base_param_register() to make sure btl_eager_limit is greater than sizeof(mca_pml_ob1_hdr_t). Am I missing a reason why this was not done in the first place?
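
Something along these lines is what I have in mind for the check. This is only a sketch: the function names and the 72-byte header size are hypothetical, and it does not use the actual signature of mca_btl_base_param_register().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for sizeof(mca_pml_ob1_hdr_t). */
#define HDR_SIZE_ASSUMED ((size_t)72)

/* Hypothetical check: returns 1 if the limit leaves room for at least
 * one byte of user data after the header, 0 otherwise. */
static int eager_limit_is_valid(size_t btl_eager_limit, size_t hdr_size)
{
    return btl_eager_limit > hdr_size;
}

/* The subtraction is only safe once the check above has passed. */
static size_t payload_limit(size_t btl_eager_limit, size_t hdr_size)
{
    assert(eager_limit_is_valid(btl_eager_limit, hdr_size));
    return btl_eager_limit - hdr_size;
}
```

A registration-time check like this would let a too-small value be rejected or reported when the parameter is read, rather than showing up as an assert deep in the send path.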

--td
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
