Re: [Ohrrpgce] svn <-> git bridge, nohrio

Ralph Versteegen Fri, 21 Aug 2009 09:26:17 -0700

2009/8/20 David Gowers <00a...@gmail.com>:
> On Thu, Aug 20, 2009 at 4:54 AM, Ralph Versteegen<teeem...@gmail.com> wrote:
>> Sorry for the slow response. I wanted to finish the little test
>> utility I was working on first, but that was a bad idea as it took a
>> while.
>>>
>>> No; an array is directly usable. It just can't be translated to yaml.
>>> It was never the intent to manipulate large amounts of objects using
>>> yaml -- the array interface is close to 95% efficient, and is more
>>> powerful for simple manipulations.
>>>
>>> I can see I need to provide more documentation -- and perhaps a better
>>> README. I'll work on that.
>>>
>>> Array2py needs a better name. what it does is translate between an array and
>>> {python lists and dicts (with appropriate scalar values in them)},
>>> which is easy to serialize to YAML. Hence it's in ohryaml.py
>>
>> I have since been learning and using a lot of numpy. I now see that
>> the memmaps/ndarrays are very powerful and awesome. I guess array2py
>> isn't really useful for much after all. Better documentation would be
> YAML is more useful for users than programmers, and yeah, array2py is
> really a transfer function for YAML (although it also means it's
> possible to serialize in a ream of other formats with fair ease too)
>
>> good, but I have figured out everything I needed myself. The key was
>> to just read numpy documentation; you didn't really stress that.
>
> Okay, thanks, I'll do that. I've put a skeleton list of some examples
> into the README too (locally: I'll push it to gitorious once I've
> corrected most or all of what you've kindly given feedback about. )
>
>>
>>>> Then why put it
>>>> in ohryaml.py? However, I immediately ran into a problem: array2py
>>>> takes forever on big lumps, so I guess I need a proper wrapper.
>>>
>>> big lumps? lumps that have many records, or lumps where each record is
>>> big? (or lumps with a complex dtype?)
>>> (logs are a big timesaver here -- consider using
>>> http://ipython.scipy.org and turning logging on so you can post logs
>>> of what you did.)
>>>
>>> A predicted typical usage pattern does not involve translation of
>>> large amounts of records. Clearly this is inaccurate, can you tell me
>>> what exactly you were aiming to do and how you went about it?
>>
>> I think I ran a tileset through it. Anyway, that was silly. I was
>> trying to write a little program to process all the graphics in a
>> game.
>
> Ah yeah, that is silly. kind of the sort of thing that numpy shines at.
>
>>
>>>>
>>>> Is 'old-ohrio-tests.rst' still any use as documentation? That's what I
>>>> attempted to read. Also, is it necessary to use linearize and
>>>> spliceattack on an RPG before being able to load those files?
>>>
>>> old-ohrio-tests.rst is only really useful as documentation on things
>>> like starting offsets (BLOAD etc).. once knowledge of that is
>>> incorporated in nohrio, old-ohrio-tests.rst will go away.
>>
>> I forgot to look at old-ohrio-tests.rst again, actually. Also, though
>> I read through ohrrpgce.py, I don't know what all of it is for. Can
>> you explain how/what textserialize_dtype_tweaks will do? 
>> http://www.castleparadox.com/ohr/viewtopic.php?p=79701#79701
>
> IIRC I'm planning to use that mainly to get graphics to serialize in a
> neat, human-editable format rather than as a lump (haven't checked how
> they currently serialize; I would guess it is as a single escaped
> string, which probably looks pretty ugly / irregular) .


Oh, I think I see, assuming you mean reading/writing data fields such
as bitsets instead of graphics.

>>
>>> Currently nohrio only accepts linearized data (in cases where it was
>>> planar) and nohrio accepts only spliced attack data.
>>> This is mainly to do with issues relating to representing a single
>>> record as a single record (rather than 200 x, 200 y, etc). In future
>>> it will probably happen transparently
>>
>> Planar lumps other than tilesets/backdrops should be nearly trivial
>> thanks to numpy's strides.
> It's actually very easy, but extremely ugly.. well, think about door 
> locations..
> does it make any sense to have 200 x coordinates then 200 y coords
> then 200 target door ids?
> the rearrangement of this is so we can serialize it in a way that is
> easy to export and lacks strange constraints (only exactly 200
> records? why?), with each x,y, target triple as a single record.

I think you misunderstood. I meant something like this:

l = mmap('viking.l00', (INT, (5,300)),
         BLOAD_SIZE).transpose(0,2,1).ravel().view(dtypes['l.linear'])

This actually works, except it seems to do a copy as you can't use it
to write back to the file. However, there's a much easier way that
works in every way, letting you write to the file!

l = np.memmap('viking.l00', dtype = dtypes['l.linear'], mode = 'r+',
              offset = BLOAD_SIZE, order='F')

James must be the reincarnation of an old Fortran hacker!
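
To make the Fortran-order trick concrete, here is a minimal sketch on a made-up planar file rather than a real lump. The file name, the 7-byte HEADER_SIZE standing in for BLOAD_SIZE, and the plain int16 (200, 3) layout are all assumptions for illustration; the real dtypes['l.linear'] is not reproduced here.

```python
import os
import shutil
import tempfile

import numpy as np

HEADER_SIZE = 7   # assumed stand-in for BLOAD_SIZE
N_DOORS = 200     # records per lump, as in the door example

# Build a fake planar lump: header, then 200 x, 200 y, 200 target ids.
planar = np.arange(3 * N_DOORS, dtype='<i2')
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'fake.d00')   # hypothetical file name
with open(path, 'wb') as f:
    f.write(b'\0' * HEADER_SIZE)
    f.write(planar.tobytes())

# Fortran order makes the first index vary fastest on disk, so each row
# of this (200, 3) array is one (x, y, target) triple, with no copy.
doors = np.memmap(path, dtype='<i2', mode='r+', offset=HEADER_SIZE,
                  shape=(N_DOORS, 3), order='F')
first_triple = doors[5].tolist()      # x, y and target of door 5

# Writes go straight through to the planar file.
doors[5, 0] = 999
doors.flush()
del doors

with open(path, 'rb') as f:
    raw = np.frombuffer(f.read()[HEADER_SIZE:], dtype='<i2')
shutil.rmtree(tmpdir)
```

Door 5 picks up one value from each plane, and the modified x coordinate lands in the first plane of the file, which is the behaviour I was after.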

> For example, it's almost trivial to export tilemaps as grayscale PGM
> graphics once you have them linearized. I have a really strong focus
> on text serialization here, because once a thing is in a consistent
> textual format, it can be parsed easily (even dumbly), allowing you
> to, eg, find all attacks with "slash" in the name, find all 16-color
> palettes which refer to color index #128, using a simple tool like
> 'grep'
> (which lowers the learning requirement for data debugging)

Which is going to be pretty neat, I'll have a look at the YAML some other time.
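
As a rough idea of what the PGM export could look like, here is a sketch that writes a small 2-D array as ASCII PGM (the "P2" variant) and then does a grep-like search on the text. The function name and the tiny tilemap are invented for the example; nothing here is taken from nohrio.

```python
import numpy as np

def tilemap_to_pgm(tilemap):
    """Return a 2-D uint8 array as ASCII PGM (P2) text, one row per line."""
    height, width = tilemap.shape
    lines = ['P2', f'{width} {height}', '255']
    lines += [' '.join(str(int(v)) for v in row) for row in tilemap]
    return '\n'.join(lines) + '\n'

# Hypothetical 3x4 tilemap; a real one would come from a linearized lump.
tilemap = np.array([[0, 1, 2, 3],
                    [4, 128, 6, 7],
                    [8, 9, 10, 11]], dtype=np.uint8)
pgm_text = tilemap_to_pgm(tilemap)

# A grep-like search: which rows (1-based) mention tile 128?
rows_with_128 = [i for i, row in enumerate(pgm_text.splitlines()[3:], start=1)
                 if '128' in row.split()]
```

Because each map row becomes one line of text, line-oriented tools can find a tile index without knowing anything about the binary format.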

>> But how do you hope to transparently
>> rewrite attacks and til/mxs? With a wrapper object that writes back to
>> the memmap as it is modified? I'd like to hear about this, because
>> there are other things you could wrap, such as opening a lumped RPG.
> Sorry, you just blew my mind (although, what I was planning was sort
> of similar) . I'll get back to you on this subject.
>
>>
>> In particular, how would Gfx4bpp work?
> That's yet another 'scratch' class. I think I was intending it to
> allow you to access 4bpp pixels as if they were 8bpp (ie. not 2 pixels
> packed into one byte :)
> No guarantee that will stay around (although some 16color graphics
> nicety will be around)
>
>>
>> Also, I'd like to embed python in that wxwidgets wrapper around Custom
>> that I am writing, so that plugins or portions of editor could be
>> (re)written in it, in which it makes sense to use the nohrio api to
>> access game data. There might be differing requirements there.
>
> Yes, you would definitely want planar access there, I see that.
> Hmm, I must meditate on this.. :)

Planar access? Actually, I was thinking along the lines of possibly
being handed a buffer containing a single record of some type, perhaps
with INTs replaced with LONGs, instead of memmapping a lump.
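
Roughly what I have in mind, as a sketch only: the field names and the three-field record are made up, and the buffer is a bytearray pretending to come from the host program.

```python
import numpy as np

# Hypothetical on-disk record layout with 16-bit INT fields.
disk_dtype = np.dtype([('picture', '<i2'), ('palette', '<i2'), ('speed', '<i2')])

# The same fields widened to 32-bit LONGs, as an editor might hand them over.
long_dtype = np.dtype([(name, '<i4') for name in disk_dtype.names])

# Pretend this buffer came from the host program: one record, three LONGs.
buf = bytearray(np.array([12, 3, 4], dtype='<i4').tobytes())

# frombuffer wraps the buffer without copying; a bytearray stays writable.
record = np.frombuffer(buf, dtype=long_dtype, count=1).reshape(())
picture_before = int(record['picture'])
record['palette'] = 7                 # modifies buf in place

# Converting back to the on-disk layout is a field-by-field cast.
as_disk = np.zeros((), dtype=disk_dtype)
for name in disk_dtype.names:
    as_disk[name] = record[name]
```

The same dtype description drives both layouts, so no memmap is needed when the data already lives in somebody else's buffer.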

>>
>> I'm sure I'll have many suggestions once I know how you intend all the
>> wrapping to work.
>
> The documentation is supposed to be in the relevant source files (and
> automatically extracted with pylit and built with sphinx to build html
> etc docs), but there is certainly a lot missing. This is mainly
> because I haven't worked it out yet myself: I tried to obey the
> principle of 'for a start, do the minimum that works', because
> "A complex system that works is invariably found to have evolved from
> a simple system that works."
>
> http://en.wikipedia.org/wiki/Systemantics

After previous experiences, I now know to always follow that rule :)

> This conversation is being very helpful in clarifying my vision of NOHRIO :)
>
>>
>>>>
>>>> I'll also mention that I had problems with installing it: I ended up
>>>> with a nohrio-0.33dev-py2.6.egg folder in my site-packages, which was
>>>> empty aside from EGGINFO and nohrio subdirectories. Python wouldn't
>>>> import ohrrpgce until I moved everything in nohrio up a directory. I
>>>> don't know a thing about installing python packages.
>>>
>>> In that case, I'll guess that you didn't have setuptools (or its fork
>>> Distribute) installed (as the directory structure you describe is
>>> normal and standard). Setuptools is fairly standard these days.
>>>
>>> http://pypi.python.org/pypi/setuptools/
>>>
>>> I'll need to add that to the dependencies; it never really occurred to
>>> me that someone might not have it.
>>
>> Well, the Python 2.6 Windows installer doesn't come with it. However,
>> when I found it was required to install nohrio, I installed it myself,
>> which was very painful: there is no windows installer for setuptools
>> for python 2.6! I had to download an egg and the source and
> ....
> well, not exactly.
>
> http://thinkhole.org/wp/2007/02/01/howto-install-setuptools-in-windows/

Hmm, I saw a guide very similar to that one, accompanied by complaints
that it didn't work, and an alternative less-automatic install
procedure.

> it's true there is not a 'windows installer' as in an exe with
> installshield and blah blah..
> but IMO, the above instructions are not too hard (and are about the
> same as what I need to do when installing setuptools under Linux
> myself)
>
> AFAICS you just need to run the bootstrap script and then make sure
> your PATH includes the right directory.
>
>> build/install that (what's the deal with python packages 'building'
>> themselves?). Even then, I had to manually add the python scripts/
>
> 'building' themselves? it's designed to match the standard 'make; sudo
> make install' series of commands that you would use with projects that
> use Make (ie. 90%+ of Linux software :)

Yes, but what's the 'make' step for at all? I assumed python packages
would be like Linux binary packages.

> However, as far as I know, just like with 'make;sudo make install',
> there is a dependency of install on build, which means that if you
> just invoke install ('python setup.py install' on windows IIRC) it
> will be automatically built before installing. I like to do them both
> separately though, because if something goes subtly wrong in the build
> process you can find the errors more easily.
>
>> folder to my PATH, as the setuptools install didn't. I often think
>> that all these setup programs are useless and we should go back to
>> manual installation!
>
> haha:)
>
> Believe me, if you want to install lots of python packages, you will
> appreciate it -- easy_install makes installation as easy as just
> naming the package you want installed.
>
>>
>>> Thanks for all the feedback :D
>>>
>>> David
>>
>>
>> The bugs:
>> -It seems that you have to explicitly import sys in spliceattack under 
>> windows.
>> -Also, 'rwb+', used in ohrrpgce.py, is not a valid mode, and throws an
>> exception under windows: I assume you mean 'wb+'.
>
> Urrghh, I think that's an obsolete bit of code.. let me check..
> uh, no (works on Linux). At this point, I say 'f**k you, and the
> dinosaur you rode in on too' to Windows.
>
> However, I've made the change you suggest (and tagged it, because I
> suspect it may break write buffering) (locally -- will appear next
> time I do a git push)

No no, it's definitely not a valid mode at all:
http://www.opengroup.org/onlinepubs/000095399/functions/fopen.html

And sorry: I meant 'rb+', not 'wb+', which truncates the file.
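
A quick way to check how these modes behave, sketched against Python 3's built-in open() (the thread is about Python 2.6, where an invalid mode string is handed to the platform instead, so this is an assumption about a newer interpreter). The file name is made up.

```python
import os
import shutil
import tempfile

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'lump.bin')   # made-up file name
with open(path, 'wb') as f:
    f.write(b'ABCDEF')

# 'rb+' opens an existing file for update and leaves its contents alone.
with open(path, 'rb+') as f:
    f.seek(2)
    f.write(b'xy')
with open(path, 'rb') as f:
    after_rb_plus = f.read()

# 'wb+' also allows reading and writing, but empties the file on open.
with open(path, 'wb+') as f:
    after_wb_plus = f.read()

# 'rwb+' combines two primary modes, which Python 3 rejects outright.
try:
    open(path, 'rwb+')
    rwb_error = None
except ValueError as exc:
    rwb_error = exc

shutil.rmtree(tmpdir)
```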

>>
>> I encountered a few quicks in the dtypes:
> I like quicks! quirks, not always so much.
>
>> I think that more things should be wrapped by the API.
>
> Me too!
>
> I'm aiming to eventually have basically just two methods that are used
> for virtually everything; load and save. This would definitely
> automatically handle all headers.

So no writeable memmaps?

> For now, I'm just trying to obey the maxim 'release early and often'
> by releasing as soon as my software could be useful.
>
>> For example, in
>> my RPGdir class, I had a list of lump headers. I'd also like to point
>
> I definitely plan to do that.
>
>> out that it's not necessary to treat single record files as arrays:
>> you can specify shape=() to memmap to get a plain record.
>
> This is for YAML serialization uniformity; if you look at 
> tests/milestone1.yaml
> you can see that every record in a YAML file is built directly from
> the array dtype.
> We divide things up slightly differently in order to get sensible
> 'scalar' units (for example, npcdef, not an array of 100 npcdefs in
> planar disorganization, as noted earlier in this email).
> And btw, what you describe is called a 'scalar array' (an array with
> ndim == 0 :);
> if you have an array of shape = (1,), you can get a scalar array just
> by assigning shape = () as you mentioned.
>
> Is there some specific difficulty this has caused you that I can
> address? I'm certainly willing to rethink this scheme in light of new
> evidence.
>
> (it's also possible I can specially handle scalar arrays, I think)

I just found the need to tag [0] on when accessing lumps like gen as
awkward as needing to supply the header size. However, I've realised
that scalar arrays don't behave as well as they should:
>>> game.lump('.gen', shape=())['maxmap']
memmap(48, dtype=int16)

Which is for example unhashable.
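
The same behaviour can be reproduced without nohrio. This is a sketch with a plain 0-d structured array and a made-up two-field dtype, not the real .gen layout, plus two ways of getting a hashable value out.

```python
import numpy as np

# Hypothetical one-record lump; the real .gen dtype has many more fields.
gen = np.zeros((), dtype=[('maxmap', '<i2'), ('masterpal', '<i2')])
gen['maxmap'] = 48

field = gen['maxmap']          # a 0-d array, not a Python or NumPy integer
try:
    hash(field)
    hashable = True
except TypeError:
    hashable = False

as_int = int(field)            # plain Python int
as_scalar = gen[()]['maxmap']  # NumPy int16 scalar
lookup = {as_int: 'last map'}  # both of these work as dict keys
```

Indexing with an empty tuple first turns the 0-d array into a single record, and field access on that record yields a proper scalar.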

>>
>> I noticed that the NPC definitions are each inside an 'npc' field, eg.
>> mmap[npcno]['npc']['picture']. Why?
>
> see above -- nice record serialization.

Don't think I follow: I haven't seen this extra layer anywhere else.

> Again, it's easy to discard the 'npc' bit using array.view
>
> nuview = arr.view (dtype = arr.dtype['npc'])
>
> Probably is what you were expecting.
>
> My bias towards neat serialization is evident (and will be addressed!).
> I've checked the code in colouruse.py and I can see how addressing it
> that way is weird.
>
>>
>> til.linear and mxs.linear aren't there, though I guess those are only
>> temporary hacks anyway?
>
> Okay, more docs needed, clearly :)
>
> tests/ dir doesn't include some files which are just transformations
> of other files
>
> http://gitorious.org/nohrio/nohrio/blobs/master/tests/bootstrap.sh
>
> creates those files :)

Yes, I saw. I meant that those dtypes aren't in ohrrpgce.py.

>>
>> There are some inconsistencies with field names, for example
>> battlesprite, battlepalette and portrait, portrait_palette in .dt0
>> against portraitpic and portraitpal in .say (I much prefer the latter)
>
> Okay, will fix (is just my rush to get this released -- in my
> experience, the importance of releasing and getting feedback quickly
> cannot be overestimated)
>
>>
>> Also, is it really necessary to treat all lumps as records instead of
>> arrays, even ones such as tilemaps?
>
> Does this inconvenience you? Remember you can get a view of them as
> arrays if you want:
>
> arr = arr_i_dont_like.view (dtype = ('B', (200, 320)))

I suppose that resorting to view() might not be as much of a hack as I
first thought...
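
For the record, a small sketch of both uses of view() on invented dtypes (the field names and the 4x5 "tile" are placeholders, not the nohrio definitions). The second part goes through uint8 and reshape rather than a sub-array dtype, which is a variation on what was suggested above.

```python
import numpy as np

# Hypothetical NPC dtype with the extra 'npc' layer described above.
npc_inner = np.dtype([('picture', '<i2'), ('palette', '<i2')])
npc_outer = np.dtype([('npc', npc_inner)])
npcs = np.zeros(3, dtype=npc_outer)
npcs['npc']['picture'] = [10, 11, 12]

# Viewing with the inner dtype drops the wrapper without copying.
flat_npcs = npcs.view(npc_inner)
flat_npcs['palette'][1] = 5            # also visible through npcs

# A record holding one small "tile" of pixels, viewed as a plain array.
tile_dtype = np.dtype([('pixels', 'u1', (4, 5))])
tiles = np.zeros(2, dtype=tile_dtype)
pixels = tiles.view(np.uint8).reshape(2, 4, 5)
pixels[1, 2, 3] = 255                  # writes into the tiles records
```

In both cases the view shares memory with the original, so nothing is copied and writes show up on either side.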

> However, I take your point, that I will need to put more thought into
> supporting things other than YAML (currently the project is very YAML
> oriented) and also supporting both linear and planar dtypes (the one
> for readability, the other for speed).

I think you misunderstood me somewhere. I don't see much point in
providing planar dtypes: I can't think of any reason that anyone might
want a non-linearized view of lumps, unless extra speed was that big
of a benefit, assuming that you otherwise manage to totally hide the
real data layout on disk. And as I showed above you can provide
linearized views for most of such lumps at no overhead. .til/.mxs will
require some overhead, but everyone except those counting pixels will
want those lumps linearized.

> I would like to clarify also that external serializations (eg YAML,
> SQLite, PyTables) will always be in linear format, as I regard doing
> otherwise as a Crime Against Technology :) (Planar formats just fail
> at easy processing, which is super-important from my point of view --
> say you wanted to apply a spatial filter like gaussian blur to some of
> your graphics. scipy includes gaussian filtering and all sorts of
> other nice stuff, but you'd have to write your own if you wanted to
> process planar pixels like that.. ugh)
>
>>
>> So, onto the little utility I wrote. It scans through all the graphics
>> in a file, does computations, and then displays everything prettily
>> with pygame. I posted details here:
>> http://www.castleparadox.com/ohr/viewtopic.php?p=79701#79701
>>
>> It shows numpy is pretty efficient; it can count all the pixels in the
>> enormous number of graphics in Powerstick Man: Extended Edition in a
>> small fraction of a second. However, searching through all the other
>> lumps takes a long time.
>>
> memmapping is theoretically very good at exactly the kind of querying
> you are trying to do here... I suspect it's all in the specific way
> you do things here, you may be doing more looping in Python than is
> needed.

Hmm, yes, I could replace, for example:
    for box in game.lump('.say'):
        if box['portraittype'] == 1: #fixed
            pic, pal = game.picnpal(8, box['portraitpic'], box['portraitpal'])
            sprpals[8][pic].add(pal)
with:
    say = game.lump('.say')
    pairs[8] = np.unique(np.append(
        pairs[8],
        say[say['portraittype'] == 1][['portraitpic', 'portraitpal']]))

Where sprpals have been ditched, and I look up all the default palettes later.
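
A self-contained sketch of the same replacement, using a tiny invented array in place of game.lump('.say'). It stacks the two columns into a plain 2-D integer array and uses np.unique(axis=0), which is a swapped-in variant rather than the multi-field indexing above.

```python
import numpy as np

# Hypothetical slice of a .say-like lump; field names follow the email.
say_dtype = np.dtype([('portraittype', '<i2'),
                      ('portraitpic', '<i2'),
                      ('portraitpal', '<i2')])
say = np.array([(1, 4, 2), (0, 9, 9), (1, 4, 2), (1, 7, 0), (2, 1, 1)],
               dtype=say_dtype)

# Loop version: one Python-level iteration per text box.
loop_pairs = set()
for box in say:
    if box['portraittype'] == 1:       # fixed portrait
        loop_pairs.add((int(box['portraitpic']), int(box['portraitpal'])))

# Vectorized version: mask the rows, stack the two columns, deduplicate.
fixed = say[say['portraittype'] == 1]
columns = np.column_stack([fixed['portraitpic'], fixed['portraitpal']])
unique_pairs = np.unique(columns, axis=0)
```

Both versions end up with the same set of (picture, palette) pairs; the second one does its looping inside numpy.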


>
> There are a few optimizations that could be done -- for example,
> there's no need to loop over til planes -- you don't care about planes
> and can address the pixels as a big (64000,) shaped lump of bytes
> using a view.

I don't loop over the planes, I use the .flat iterator. But I don't
know how the flatiter works, maybe it's more efficient to pass
tset.ravel() to bincount().
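
Something like the following is what I would compare, on random data instead of a real tileset. The 160x20x20 shape is only an assumption chosen to give 64000 bytes.

```python
import numpy as np

# Hypothetical tileset: 160 tiles of 20x20 pixels, 8 bits per pixel.
rng = np.random.default_rng(0)
tset = rng.integers(0, 256, size=(160, 20, 20), dtype=np.uint8)

# ravel() gives a flat view for a contiguous array, so bincount sees one
# long run of bytes; minlength keeps a slot for every colour index.
counts = np.bincount(tset.ravel(), minlength=256)

# The same tally built by stepping through the .flat iterator.
counts_flat = np.bincount(np.fromiter(tset.flat, dtype=np.uint8),
                          minlength=256)
```

The two results are identical; the difference is only in how the pixels reach bincount, so timing both on a real lump would settle which is faster.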

>> It also shows that nohrio+numpy is really awesome for rapid development!
>
> Why thank you, that was certainly my aim, and numpy certainly is
> pretty amazing :D
>
>>
>> It wasn't intended to be directly very useful, however I think it
>> contains a lot of useful code/examples. But unfortunately I really got
>> carried away adding polish to it. It would only take two minutes to
>> adapt it to do anything else with the graphics.
>>
>> I expect unpackpt can go straight into nohrio in some form. Are you
>> planning on adding something like my RPGdir class to nohrio? It was a
>> big simplification over using ohrrpgce.py directly.
> Yes I am, always have been. I was just wary of doing that prematurely
> (the soundness of the underlying layer being more important to get
> right first)
>
> unpackpt needs a few fixes for assumptions it makes,
> then I will be happy to put it into nohrio. (comments in the following email)
>
>
>>
>> I look forward to nohrio improvements so that I can fix the resulting
>> breakages in colouruse.py!
>
> One thing that clearly shows in colouruse.py is that gen needs proper
> field-based access (ie gen.masterpal rather than gen['masterpal'])
> For read-only access this is easy:
>
> gen2 = np.recarray(shape = (1,), dtype = gen.dtype)
> gen2[:] = gen
> gen2.shape = ()
> # now gen.maxmap, gen.masterpal, etc do what you would expect :)

Ah, wonderful, I completely missed recarray! But I'm not sure why you
wrote it that way, doesn't this preserve memmap writeability?:

gen.shape = ()
gen = gen.view(np.recarray)

Though as I mentioned, it still only does 98% of what I expect,
because I get scalar arrays instead of integers, and I wonder whether
that will cause problems.
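
Here is a sketch of the check I would run, on a throwaway file with a made-up two-field dtype in place of the real .gen record. It uses reshape(()) instead of assigning to .shape, which should amount to the same view.

```python
import os
import shutil
import tempfile

import numpy as np

# Hypothetical two-field stand-in for the .gen record.
gen_dtype = np.dtype([('maxmap', '<i2'), ('masterpal', '<i2')])
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'fake.gen')   # made-up file name
np.array([(48, 3)], dtype=gen_dtype).tofile(path)

gen = np.memmap(path, dtype=gen_dtype, mode='r+', shape=(1,))

# reshape(()) and view(np.recarray) both return views, so attribute
# access still reads from, and writes to, the mapped file.
rec = gen.reshape(()).view(np.recarray)
maxmap_before = int(rec.maxmap)   # int() copes with a 0-d array result
rec.masterpal = 9
gen.flush()
del rec, gen

on_disk = np.fromfile(path, dtype=gen_dtype)
shutil.rmtree(tmpdir)
```

The value written through the attribute shows up when the file is read back, so the recarray view keeps the memmap writeable. Wrapping reads in int() sidesteps the scalar-array issue for now.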

> Is it okay if I use the content of these emails in NOHRIO
> documentation? Eventually I might get rid of them, but currently they
> can provide information that I haven't yet had the time to formally
> document.

Of course.

> David
_______________________________________________
Ohrrpgce mailing list
ohrrpgce@lists.motherhamster.org
http://lists.motherhamster.org/listinfo.cgi/ohrrpgce-motherhamster.org
