On Thu, Oct 15, 2009 at 10:40 AM, Michael Droettboom <md...@stsci.edu> wrote:

> I recently committed a regression test and bugfix for object pointers in
> record arrays of unaligned size (meaning where each record is not a
> multiple of sizeof(PyObject **)).
>
> For example:
>
>    a1 = np.zeros((10,), dtype=[('o', 'O'), ('c', 'c')])
>    a2 = np.zeros((10,), 'S10')
>    # This copying would segfault
>    a1['o'] = a2
>
> http://projects.scipy.org/numpy/ticket/1198
>
> Unfortunately, this unit test has opened up a whole hornet's nest of
> alignment issues on Solaris.

No surprise there. Good unit tests seem to routinely uncover hornet's nests
and Solaris is a platform that exercises the alignment part of the code. I
think it is great that you are finding these problems. We folks working on
Intel don't see them so much.

> The various reference counting functions
> (PyArray_INCREF etc.) in refcnt.c all fail on unaligned object pointers,
> for instance. Interestingly, there are comments in there saying
> "handles misaligned data" (eg. line 190), but in fact it doesn't, and
> doesn't look to me like it would. But I won't rule out a mistake in
> building it on my part.
>
> So, how to fix this?
>
> One obvious workaround is for users to pass "align=True" to the dtype
> constructor. This works if the dtype descriptor is a dictionary or
> comma-separated string. Is there a reason it couldn't be made to work
> with the string-of-tuples form that I'm missing? It would be marginally
> more convenient from my application, but that's just a finesse issue.
>
> However, perhaps we should try to fix the underlying alignment
> problems? Unfortunately, it's not clear to me how to resolve them
> without at least some performance penalty. You either do an alignment
> check of the pointer, and then memcpy if unaligned, or just always use
> memcpy. Not sure which is faster, as memcpy may have a fast path
> already. These are object arrays anyway, so there's plenty of overhead
> already, and I don't think this would affect regular numerical arrays.
>

I believe the memcpy approach is used for other unaligned parts of void
types. There is an inherent performance penalty there, but I don't see how it
can be avoided when using what are essentially packed structures. As to
memcpy, its performance seems to depend on the compiler and compiler
version; old versions of gcc had *horrible* implementations of memcpy. I
believe the situation has since improved. However, I'm not sure we should be
coding to compiler issues unless it is unavoidable or the gain is huge.

> If we choose not to fix it, perhaps we should try to warn when
> creating an unaligned recarray on platforms where it matters? I do
> worry about having something that works perfectly well on one platform
> fail on another.
>
> In the meantime, I'll just mark the new regression test to "skip on
> Solaris".
>

Chuck
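
For reference, here is a minimal sketch of what the align=True workaround discussed above changes, assuming a reasonably recent NumPy. It uses the dictionary form of the descriptor, since that is one of the forms said to accept align=True. The 'S1' field stands in for the 'c' field of the original example, and the variable names are only illustrative.

```python
import numpy as np

# Size of an object pointer on this platform (an illustrative helper name).
ptr_size = np.dtype('O').itemsize

# Packed record: an object pointer followed by a one-byte string.
# 'S1' stands in for the 'c' field of the original example.
fields = {'names': ['o', 'c'], 'formats': ['O', 'S1']}
packed = np.dtype(fields)

# Same fields, but ask NumPy to pad the record the way a C compiler would.
aligned = np.dtype(fields, align=True)

print("packed: ", packed.itemsize, packed.isalignedstruct)
print("aligned:", aligned.itemsize, aligned.isalignedstruct)

# The stride of the 'o' field is the record size, so in the packed case
# consecutive object pointers do not sit on pointer-sized boundaries.
a_packed = np.zeros((10,), dtype=packed)
a_aligned = np.zeros((10,), dtype=aligned)
print("strides:", a_packed['o'].strides, a_aligned['o'].strides)
```

In the packed case the record size is one byte more than a pointer, so every record after the first starts its object pointer at an odd offset. With align=True the record is padded, which costs memory but keeps each pointer on a boundary the platform can load directly.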
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion