I'm looking for a better way to map the characters of a unicode string to indexes into an array of geometry. The following code is functional, but it seems sub-optimal with all that numpy has to offer::

    textOrds = map(ord, text.encode('utf-8'))
    idx = indexMap[textOrds]
    textGeo = geometry[idx]

text is a simple Python string coming in. I then manually convert it to unicode ordinals. Those are then mapped through indexMap, which happens to be a 1-to-1 mapping between unicode ordinals and valid indexes into geometry. I then use the idx array to take a selection from geometry for the text.

As I mentioned before, this works all right; however, two things seem inefficient. First is the manual mapping to unicode ordinals. Is there a way to have numpy do that for me? Second is the mapping through indexMap, because it is only sparsely populated -- usually only 2-5 thousand entries out of the 64 thousand allocated. I've thought of using unicode.translate, but characters cannot be used for indexes in numpy.

What are your collective thoughts on making this cleaner and more efficient?

Thanks,
-Shane Holloway
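
For illustration only, here is a rough sketch of one direction both questions could take. It is not a tested answer to the post, and it assumes Python 3 with a recent NumPy. The helper names (text_to_codepoints, build_lookup, lookup_indexes) are hypothetical, and it further assumes that row i of geometry belongs to the i-th entry of a list of known code points. The ordinals come from viewing the UTF-32 encoding of the string as 32-bit integers, and the sparse 64K table is swapped for a binary search over only the code points that are actually present:

```python
import numpy as np


def text_to_codepoints(text):
    """Return the Unicode code points of text as an array of 32-bit integers."""
    # UTF-32 stores every code point in exactly four bytes, and the '-le'
    # variant writes no byte-order mark, so the encoded buffer can be viewed
    # directly as little-endian unsigned 32-bit integers.
    return np.frombuffer(text.encode('utf-32-le'), dtype='<u4')


def build_lookup(codepoints):
    """Sort the known code points once so they can be binary-searched.

    codepoints[i] is assumed to be the character whose geometry is stored
    in row i (an assumption made for this sketch).
    """
    known = np.asarray(codepoints, dtype=np.uint32)
    order = np.argsort(known)
    return known[order], order


def lookup_indexes(text, sorted_known, order):
    """Map each character of text to its row index in geometry."""
    ords = text_to_codepoints(text)
    # Binary search: position of each ordinal in the sorted table.
    pos = np.searchsorted(sorted_known, ords)
    # Ordinals larger than every known code point land one past the end;
    # clip them so the membership check below can index safely.
    pos = np.clip(pos, 0, len(sorted_known) - 1)
    if not np.all(sorted_known[pos] == ords):
        raise KeyError("text contains characters with no geometry")
    # Translate sorted positions back to the original row numbers.
    return order[pos]
```

With this layout, the selection would still be written as geometry[lookup_indexes(text, sorted_known, order)]. The table holds only the few thousand code points in use rather than 64 thousand slots, at the cost of a binary search per character instead of a direct index.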