2009/8/20 David Gowers <00a...@gmail.com>:
> On Thu, Aug 20, 2009 at 4:54 AM, Ralph Versteegen<teeem...@gmail.com> wrote:
>> Sorry for the slow response. I wanted to finish the little test
>> utility I was working on first, but that was a bad idea as it took a
>> while.
>>>
>>> No; an array is directly usable. It just can't be translated to yaml.
>>> It was never the intent to manipulate large amounts of objects using
>>> yaml -- the array interface is close to 95% efficient, and is more
>>> powerful for simple manipulations.
>>>
>>> I can see I need to provide more documentation -- and perhaps a better
>>> README. I'll work on that.
>>>
>>> Array2py needs a better name. What it does is translate between an array and
>>> {python lists and dicts (with appropriate scalar values in them)},
>>> which is easy to serialize to YAML. Hence it's in ohryaml.py
>>
>> I have since been learning and using a lot of numpy. I now see that
>> the memmaps/ndarrays are very powerful and awesome. I guess array2py
>> isn't really useful for much after all. Better documentation would be
> YAML is more useful for users than programmers, and yeah, array2py is
> really a transfer function for YAML (although it also means it's
> possible to serialize in a ream of other formats with fair ease too)
>
>> good, but I have figured out everything I needed myself. The key was
>> to just read numpy documentation; you didn't really stress that.
>
> Okay, thanks, I'll do that. I've put a skeleton list of some examples
> into the README too (locally: I'll push it to gitorious once I've
> corrected most or all of what you've kindly given feedback about. )
>
>>
>>>> Then why put it
>>>> in ohryaml.py? However, I immediately ran into a problem: array2py
>>>> takes forever on big lumps, so I guess I need a proper wrapper.
>>>
>>> big lumps? lumps that have many records, or lumps where each record is
>>> big? (or lumps with a complex dtype?)
>>> (logs are a big timesaver here -- consider using
>>> http://ipython.scipy.org and turning logging on so you can post logs
>>> of what you did.)
>>>
>>> A predicted typical usage pattern does not involve translation of
>>> large amounts of records. Clearly this is inaccurate, can you tell me
>>> what exactly you were aiming to do and how you went about it?
>>
>> I think I ran a tileset through it. Anyway, that was silly. I was
>> trying to write a little program to process all the graphics in a
>> game.
>
> Ah yeah, that is silly. kind of the sort of thing that numpy shines at.
>
>>
>>>>
>>>> Is 'old-ohrio-tests.rst' still any use as documentation? That's what I
>>>> attempted to read. Also, is it necessary to use linearize and
>>>> spliceattack on an RPG before being able to load those files?
>>>
>>> old-ohrio-tests.rst is only really useful as documentation on things
>>> like starting offsets (BLOAD etc).. once knowledge of that is
>>> incorporated in nohrio, old-ohrio-tests.rst will go away.
>>
>> I forgot to look at old-ohrio-tests.rst again, actually. Also, though
>> I read through ohrrpgce.py, I don't know what all of it is for. Can
>> you explain how/what textserialize_dtype_tweaks will do?
>> http://www.castleparadox.com/ohr/viewtopic.php?p=79701#79701
>
> IIRC I'm planning to use that mainly to get graphics to serialize in a
> neat, human-editable format rather than as a lump (haven't checked how
> they currently serialize; I would guess it is as a single escaped
> string, which probably looks pretty ugly / irregular) .
Oh, I think I see, assuming you mean reading/writing data fields such
as bitsets instead of graphics.

>>
>>> Currently nohrio only accepts linearized data (in cases where it was
>>> planar) and nohrio accepts only spliced attack data.
>>> This is mainly to do with issues relating to representing a single
>>> record as a single record (rather than 200 x, 200 y, etc). In future
>>> it will probably happen transparently
>>
>> Planar lumps other than tilesets/backdrops should be nearly trivial
>> thanks to numpy's strides.
> It's actually very easy, but extremely ugly.. well, think about door
> locations..
> does it make any sense to have 200 x coordinates then 200 y coords
> then 200 target door ids?
> the rearrangement of this is so we can serialize it in a way that is
> easy to export and lacks strange constraints (only exactly 200
> records? why?), with each x,y, target triple as a single record.

I think you misunderstood. I meant something like this:

    l = mmap('viking.l00', (INT, (5,300)), BLOAD_SIZE).transpose(0,2,1).ravel().view(dtypes['l.linear'])

This actually works, except it seems to do a copy as you can't use it
to write back to the file. However, there's a much easier way that
works in every way, letting you write to the file!

    l = np.memmap('viking.l00', dtype = dtypes['l.linear'], mode = 'r+', offset = BLOAD_SIZE, order='F')

James must be the reincarnation of an old Fortran hacker!

> For example, it's almost trivial to export tilemaps as grayscale PGM
> graphics once you have them linearized. I have a really strong focus
> on text serialization here, because once a thing is in a consistent
> textual format, it can be parsed easily (even dumbly), allowing you
> to, eg, find all attacks with "slash" in the name, find all 16-color
> palettes which refer to color index #128, using a simple tool like
> 'grep'
> (which lowers the learning requirement for data debugging)

Which is going to be pretty neat, I'll have a look at the YAML some
other time.
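
[Editor's sketch] The planar-versus-linear point above can be illustrated
with plain NumPy and no lump file at all. This is a minimal sketch, not
nohrio code: the record count and the toy values are hypothetical, and an
in-memory array stands in for the memmapped lump. It shows that
reshape-plus-transpose gives a writeable view of planar data, while a
following ravel() makes a copy, which is probably why the first one-liner
above could not write back to the file.

```python
import numpy as np

N_RECORDS = 4  # hypothetical; the thread mentions 200 door records per lump

# Planar layout: all x values, then all y values, then all target ids.
planar = np.arange(3 * N_RECORDS, dtype=np.int16)

# Reshape to (field, record) and transpose to (record, field).
# Both steps return views onto the same buffer, so nothing is copied.
linear = planar.reshape(3, N_RECORDS).T

# Each row is now one (x, y, target) triple.
first_door = linear[0].tolist()

# ravel() on the transposed (non-contiguous) view has to copy the data.
copied = linear.ravel()

# Writing through the view changes the original planar buffer.
linear[0, 1] = 99
```

Here linear[0, 1] is the y coordinate of record 0, which lives at index
N_RECORDS in the planar buffer.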
>> But how do you hope to transparently
>> rewrite attacks and til/mxs? With a wrapper object that writes back to
>> the memmap as it is modified? I'd like to hear about this, because
>> there are other things you could wrap, such as opening a lumped RPG.
> Sorry, you just blew my mind (although, what I was planning was sort
> of similar) . I'll get back to you on this subject.
>
>>
>> In particular, how would Gfx4bpp work?
> That's yet another 'scratch' class. I think I was intending it to
> allow you to access 4bpp pixels as if they were 8bpp (ie. not 2 pixels
> packed into one byte :)
> No guarantee that will stay around (although some 16color graphics
> nicety will be around)
>
>>
>> Also, I'd like to embed python in that wxwidgets wrapper around Custom
>> that I am writing, so that plugins or portions of editor could be
>> (re)written in it, in which it makes sense to use the nohrio api to
>> access game data. There might be differing requirements there.
>
> Yes, you would definitely want planar access there, I see that.
> Hmm, I must meditate on this.. :)

Planar access? Actually, I was thinking along the lines of possibly
being handed a buffer containing a single record of some type, perhaps
with INTs replaced with LONGs, instead of memmapping a lump.

>>
>> I'm sure I'll have many suggestions once I know how you intend all the
>> wrapping to work.
>
> The documentation is supposed to be in the relevant source files (and
> automatically extracted with pylit and built with sphinx to build html
> etc docs), but there is certainly a lot missing. This is mainly
> because I haven't worked it out yet myself: I tried to obey the
> principle of 'for a start, do the minimum that works', because
> "A complex system that works is invariably found to have evolved from
> a simple system that works."
>
> http://en.wikipedia.org/wiki/Systemantics

After previous experiences, I now know to always follow that rule :)

> This conversation is being very helpful in clarifying my vision of NOHRIO :)
>
>>
>>>>
>>>> I'll also mention that I had problems with installing it: I ended up
>>>> with a nohrio-0.33dev-py2.6.egg folder in my site-packages, which was
>>>> empty aside from EGGINFO and nohrio subdirectories. Python wouldn't
>>>> import ohrrpgce until I moved everything in nohrio up a directory. I
>>>> don't know a thing about installing python packages.
>>>
>>> In that case, I'll guess that you didn't have setuptools (or its fork
>>> Distribute) installed (as the directory structure you describe is
>>> normal and standard). Setuptools is fairly standard these days.
>>>
>>> http://pypi.python.org/pypi/setuptools/
>>>
>>> I'll need to add that to the dependencies; it never really occurred to
>>> me that someone might not have it.
>>
>> Well, the Python 2.6 Windows installer doesn't come with it. However,
>> when I found it was required to install nohrio, I installed it myself,
>> which was very painful: there is no windows installer for setuptools
>> for python 2.6! I had to download an egg and the source and
> ....
> well, not exactly.
>
> http://thinkhole.org/wp/2007/02/01/howto-install-setuptools-in-windows/

Hmm, I saw a guide very similar to that one, accompanied by complaints
that it didn't work, and an alternative less-automatic install
procedure.

> it's true there is not a 'windows installer' as in an exe with
> installshield and blah blah..
> but IMO, the above instructions are not too hard (and are about the
> same as what I need to do when installing setuptools under Linux
> myself)
>
> AFAICS you just need to run the bootstrap script and then make sure
> your PATH includes the right directory.
>
>> build/install that (what's the deal with python packages 'building'
>> themselves?).
>> Even then, I had to manually add the python scripts/
>
> 'building' themselves? it's designed to match the standard 'make; sudo
> make install' series of commands that you would use with projects that
> use Make (ie. 90%+ of Linux software :)

Yes, but what's the 'make' step for at all? I assumed python packages
would be like Linux binary packages.

> However, as far as I know, just like with 'make;sudo make install',
> there is a dependency of install on build, which means that if you
> just invoke install ('python setup.py install' on windows IIRC) it
> will be automatically built before installing. I like to do them both
> separately though, because if something goes subtly wrong in the build
> process you can find the errors more easily.
>
>> folder to my PATH, as the setuptools install didn't. I often think
>> that all these setup programs are useless and we should go back to
>> manual installation!
>
> haha:)
>
> Believe me, if you want to install lots of python packages, you will
> appreciate it -- easy_install makes installation as easy as just
> naming the package you want installed.
>
>>
>>> Thanks for all the feedback :D
>>>
>>> David
>>
>>
>> The bugs:
>> -It seems that you have to explicitly import sys in spliceattack under
>> windows.
>> -Also, 'rwb+', used in ohrrpgce.py, is not a valid mode, and throws an
>> exception under windows: I assume you mean 'wb+'.
>
> Urrghh, I think that's an obsolete bit of code.. let me check..
> uh, no (works on Linux). At this point, I say 'f**k you, and the
> dinosaur you rode in on too' to Windows.
>
> However, I've made the change you suggest (and tagged it, because I
> suspect it may break write buffering) (locally -- will appear next
> time I do a git push)

No no, it's definitely not a valid mode at all:
http://www.opengroup.org/onlinepubs/000095399/functions/fopen.html
And sorry: I meant 'rb+', not 'wb+', which truncates the file.

>>
>> I encountered a few quicks in the dtypes:
> I like quicks! quirks, not always so much.
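
[Editor's sketch] The file-mode point above can be checked with a
throwaway file. This is a minimal sketch under one loud assumption: it
runs on Python 3, whereas the thread used Python 2.6, where an invalid
mode string was handed to the platform's fopen and the result depended
on the C library. The file name is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file standing in for a lump.
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
with open(path, 'wb') as f:
    f.write(b'abcdef')

# 'r+b' opens for reading and writing and keeps the existing contents.
with open(path, 'r+b') as f:
    f.write(b'X')
with open(path, 'rb') as f:
    after_update = f.read()

# 'w+b' also allows reading and writing, but truncates the file on open.
with open(path, 'w+b') as f:
    after_truncate = f.read()

# Python 3 rejects a combined 'rwb+' mode before touching the file.
try:
    open(path, 'rwb+').close()
    combined_mode_accepted = True
except ValueError:
    combined_mode_accepted = False
```

For updating a lump in place, 'r+b' is the mode that preserves the data.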
>
>> I think that more things should be wrapped by the API.
>
> Me too!
>
> I'm aiming to eventually have basically just two methods that are used
> for virtually everything; load and save. This would definitely
> automatically handle all headers.

So no writeable memmaps?

> For now, I'm just trying to obey the maxim 'release early and often'
> by releasing as soon as my software could be useful.
>
>> For example, in
>> my RPGdir class, I had a list of lump headers. I'd also like to point
>
> I definitely plan to do that.
>
>> out that it's not necessary to treat single record files as arrays:
>> you can specify shape=() to memmap to get a plain record.
>
> This is for YAML serialization uniformity; if you look at
> tests/milestone1.yaml
> you can see that every record in a YAML file is built directly from
> the array dtype.
> We divide things up slightly differently in order to get sensible
> 'scalar' units. (for example, npcdef, not (array of 100 npcdefs in
> planar disorganization (as noted earlier in this email))
> And btw, what you describe is called a 'scalar array' (an array with
> ndim == 0 :);
> if you have an array of shape = (1,), you can get a scalar array just
> by assigning shape = () as you mentioned.
>
> Is there some specific difficulty this has caused you that I can
> address? I'm certainly willing to rethink this scheme in light of new
> evidence.
>
> (it's also possible I can specially handle scalar arrays, I think)

I just found the need to tag [0] on when accessing lumps like gen as
awkward as needing to supply the header size. However, I've realised
that scalar arrays don't behave as well as they should:

    >>> game.lump('.gen', shape=())['maxmap']
    memmap(48, dtype=int16)

Which is for example unhashable.

>>
>> I noticed that the NPC definitions are each inside an 'npc' field, eg.
>> mmap[npcno]['npc']['picture']. Why?
>
> see above -- nice record serialization.

Don't think I follow: I haven't seen this extra layer anywhere else.
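
[Editor's sketch] On the scalar-array behaviour mentioned above: a field
taken from a 0-d structured array is itself a 0-d array, which is why it
cannot be hashed. A minimal sketch, assuming a plain in-memory ndarray in
place of the memmapped .gen lump and a hypothetical one-field dtype; it
shows two standard NumPy ways to get a real scalar out.

```python
import numpy as np

# Hypothetical stand-in for the .gen record; only one field is modelled.
gen_dtype = np.dtype([('maxmap', '<i2')])
gen = np.array((48,), dtype=gen_dtype)  # a single tuple gives a 0-d array

field = gen['maxmap']  # still a 0-d ndarray, not an integer

try:
    hash(field)
    field_is_hashable = True
except TypeError:
    field_is_hashable = False

as_numpy_scalar = field[()]    # empty-tuple index yields a NumPy scalar
as_python_int = field.item()   # .item() yields a plain Python int

# Either scalar form can be used as a dictionary key.
lookup = {as_python_int: 'last map'}
```

Whether a wrapper should do this conversion automatically is a design
question for nohrio; the sketch only shows that the conversion is cheap.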
> Again, it's easy to discard the 'npc' bit using array.view
>
> nuview = arr.view (dtype = arr.dtype['npc'])
>
> Probably is what you were expecting.
>
> My bias towards neat serialization is evident (and will be addressed!).
> I've checked the code in colouruse.py and I can see how addressing it
> that way is weird.
>
>>
>> til.linear and mxs.linear aren't there, though I guess those are only
>> temporary hacks anyway?
>
> Okay, more docs needed, clearly :)
>
> tests/ dir doesn't include some files which are just transformations
> of other files
>
> http://gitorious.org/nohrio/nohrio/blobs/master/tests/bootstrap.sh
>
> creates those files :)

Yes, I saw. I meant that those dtypes aren't in ohrrpgce.py.

>>
>> There are some inconsistencies with field names, for example
>> battlesprite, battlepalette and portrait, portrait_palette in .dt0
>> against portraitpic and portraitpal in .say (I much prefer the latter)
>
> Okay, will fix (is just my rush to get this released -- in my
> experience, the importance of releasing and getting feedback quickly
> cannot be overestimated)
>
>>
>> Also, is it really necessary to treat all lumps as records instead of
>> arrays, even ones such as tilemaps?
>
> Does this inconvenience you? Remember you can get a view of them as
> arrays if you want:
>
> arr = arr_i_dont_like.view (dtype = ('B', (200, 320)))

I suppose that resorting to view() might not be as much of a hack as I
first thought...

> However, I take your point, that I will need to put more thought into
> supporting things other than YAML (currently the project is very YAML
> oriented) and also supporting both linear and planar dtypes (the one
> for readability, the other for speed).

I think you misunderstood me somewhere.
I don't see much point in providing planar dtypes: I can't think of
any reason that anyone might want a non-linearized view of lumps,
unless extra speed was that big of a benefit, assuming that you
otherwise manage to totally hide the real data layout on disk. And as
I showed above you can provide linearized views for most of such lumps
at no overhead. .til/.mxs will require some overhead, but everyone
except those counting pixels will want those lumps linearized.

> I would like to clarify also that external serializations (eg YAML,
> SQLite, PyTables) will always be in linear format, as I regard doing
> otherwise as a Crime Against Technology :) (Planar formats just fail
> at easy processing, which is super-important from my point of view --
> say you wanted to apply a spatial filter like gaussian blur to some of
> your graphics. scipy includes gaussian filtering and all sorts of
> other nice stuff, but you'd have to write your own if you wanted to
> process planar pixels like that.. ugh)
>
>>
>> So, onto the little utility I wrote. It scans through all the graphics
>> in a file, does computations, and then displays everything prettily
>> with pygame. I posted details here:
>> http://www.castleparadox.com/ohr/viewtopic.php?p=79701#79701
>>
>> It shows numpy is pretty efficient; it can count all the pixels in the
>> enormous number of graphics in Powerstick Man: Extended Edition in a
>> small fraction of a second. However, searching through all the other
>> lumps takes a long time.
>>
> memmapping is theoretically very good at exactly the kind of querying
> you are trying to do here... I suspect it's all in the specific way
> you do things here, you may be doing more looping in Python than is
> needed.
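
[Editor's sketch] The "less looping in Python" suggestion can be made
concrete with a boolean mask. This is a minimal sketch, not code from
colouruse.py: the three-field dtype and the sample records are
hypothetical stand-ins for the .say lump, and both function names are
made up for the illustration. It compares a per-record loop with a
masked selection followed by np.unique over rows.

```python
import numpy as np

# Hypothetical cut-down .say dtype and sample data.
say_dtype = np.dtype([('portraittype', '<i2'),
                      ('portraitpic', '<i2'),
                      ('portraitpal', '<i2')])
say = np.array([(1, 3, 7), (0, 5, 5), (1, 3, 7), (1, 2, 9)],
               dtype=say_dtype)

def pairs_with_loop(records):
    """Collect (pic, pal) pairs one record at a time."""
    found = set()
    for box in records:
        if box['portraittype'] == 1:  # fixed portrait
            found.add((int(box['portraitpic']), int(box['portraitpal'])))
    return found

def pairs_vectorised(records):
    """Select all fixed-portrait records at once, then deduplicate rows."""
    fixed = records[records['portraittype'] == 1]
    stacked = np.column_stack((fixed['portraitpic'], fixed['portraitpal']))
    return np.unique(stacked, axis=0)

loop_pairs = pairs_with_loop(say)
fast_pairs = pairs_vectorised(say)
```

Both versions find the same pairs; the second does the comparison and
the deduplication inside NumPy instead of in the interpreter loop.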
Hmm, yes, I could replace, for example:

    for box in game.lump('.say'):
        if box['portraittype'] == 1:  #fixed
            pic, pal = game.picnpal(8, box['portraitpic'], box['portraitpal'])
            sprpals[8][pic].add(pal)

with:

    say = game.lump('.say')
    pairs[8] = np.unique(np.append(pairs[8], say[say['portraittype'] == 1][['portraitpic', 'portraitpal']]))

Where sprpals have been ditched, and I look up all the default
palettes later.

>
> There are a few optimizations that could be done -- for example,
> there's no need to loop over til planes -- you don't care about planes
> and can address the pixels as a big (64000,) shaped lump of bytes
> using a view.

I don't loop over the planes, I use the .flat iterator. But I don't
know how the flatiter works, maybe it's more efficient to pass
tset.ravel() to bincount().

>> It also shows that nohrio+numpy is really awesome for rapid development!
>
> Why thank you, that was certainly my aim, and numpy certainly is
> pretty amazing :D
>
>>
>> It wasn't intended to be directly very useful, however I think it
>> contains a lot of useful code/examples. But unfortunately I really got
>> carried away adding polish to it. It would only take two minutes to
>> adapt it to do anything else with the graphics.
>>
>> I expect unpackpt can go straight into nohrio in some form. Are you
>> planning on adding something like my RPGdir class to nohrio? It was a
>> big simplification over using ohrrpgce.py directly.
> Yes I am, always have been. I was just wary of doing that prematurely
> (the soundness of the underlying layer being more important to get
> right first)
>
> unpackpt needs a few fixes for assumptions it makes,
> then I will be happy to put it into nohrio. (comments in the following email)
>
>
>>
>> I look forward to nohrio improvements so that I can fix the resulting
>> breakages in colouruse.py!
>
> One thing that clearly shows in colouruse.py is that gen needs proper
> field-based access (ie gen.masterpal rather than gen['masterpal'])
> For read-only access this is easy:
>
> gen2 = np.recarray(shape = (1,), dtype = gen.dtype)
> gen2[:] = gen
> gen2.shape = ()
> # now gen2.maxmap, gen2.masterpal, etc do what you would expect :)

Ah, wonderful, I completely missed recarray! But I'm not sure why you
wrote it that way, doesn't this preserve memmap writeability?:

    gen.shape = ()
    gen = gen.view(np.recarray)

Though as I mentioned, it still only does 98% of what I expect,
because I get scalar arrays instead of integers, and I wonder whether
that will cause problems.

> Is it okay if I use the content of these emails in NOHRIO
> documentation? Eventually I might get rid of them, but currently they
> can provide information that I haven't yet had the time to formally
> document.

Of course.

> David
_______________________________________________
Ohrrpgce mailing list
ohrrpgce@lists.motherhamster.org
http://lists.motherhamster.org/listinfo.cgi/ohrrpgce-motherhamster.org