Re: [Ohrrpgce] svn <-> git bridge, nohrio

David Gowers Wed, 19 Aug 2009 20:53:11 -0700

On Thu, Aug 20, 2009 at 4:54 AM, Ralph Versteegen<teeem...@gmail.com> wrote:
> Sorry for the slow response. I wanted to finish the little test
> utility I was working on first, but that was a bad idea as it took a
> while.
>>
>> No; an array is directly usable. It just can't be translated to yaml.
>> It was never the intent to manipulate large amounts of objects using
>> yaml -- the array interface is close to 95% efficient, and is more
>> powerful for simple manipulations.
>>
>> I can see I need to provide more documentation -- and perhaps a better
>> README. I'll work on that.
>>
>> Array2py needs a better name. what it does is translate between an array and
>> {python lists and dicts (with appropriate scalar values in them)},
>> which is easy to serialize to YAML. Hence it's in ohryaml.py
>
> I have since been learning and using a lot of numpy. I now see that
> the memmaps/ndarrays are very powerful and awesome. I guess array2py
> isn't really useful for much after all. Better documentation would be
YAML is more useful for users than programmers, and yeah, array2py is
really a transfer function for YAML (although it also means it's
possible to serialize in a ream of other formats with fair ease too)
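
To make the "transfer function" idea concrete, here is a minimal sketch
of that kind of translation. It is not the actual array2py code: the
helper names and the door dtype are made up for illustration, and it
only assumes plain structured arrays.

```python
import numpy as np

def record_to_py(rec):
    """Turn one structured-array record into a dict of plain Python values."""
    out = {}
    for name in rec.dtype.names:
        value = rec[name]
        if isinstance(value, np.ndarray):   # subarray field -> (nested) list
            out[name] = value.tolist()
        elif value.dtype.names:             # nested record -> nested dict
            out[name] = record_to_py(value)
        else:                               # numpy scalar -> Python scalar
            out[name] = value.item()
    return out

def array_to_py(arr):
    """Translate a 1-D structured array into a list of dicts."""
    return [record_to_py(rec) for rec in arr]

# Hypothetical dtype, NOT the one nohrio uses.
door_dtype = np.dtype([('x', '<i2'), ('y', '<i2'), ('dest', '<i2')])
doors = np.zeros(2, dtype=door_dtype)
doors[0] = (3, 4, 1)
as_py = array_to_py(doors)
```

The result is ordinary lists and dicts, so any serializer that accepts
those (YAML, JSON, ...) can take it from there. It loops in Python once
per record, which is why this kind of translation is slow on big lumps.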


> good, but I have figured out everything I needed myself. The key was
> to just read numpy documentation; you didn't really stress that.

Okay, thanks, I'll do that. I've put a skeleton list of some examples
into the README too (locally: I'll push it to gitorious once I've
corrected most or all of what you've kindly given feedback about. )

>
>>> Then why put it
>>> in ohryaml.py? However, I immediately ran into a problem: array2py
>>> takes forever on big lumps, so I guess I need a proper wrapper.
>>
>> big lumps? lumps that have many records, or lumps where each record is
>> big? (or lumps with a complex dtype?)
>> (logs are a big timesaver here -- consider using
>> http://ipython.scipy.org and turning logging on so you can post logs
>> of what you did.)
>>
>> A predicted typical usage pattern does not involve translation of
>> large amounts of records. Clearly this is inaccurate, can you tell me
>> what exactly you were aiming to do and how you went about it?
>
> I think I ran a tileset through it. Anyway, that was silly. I was
> trying to write a little program to process all the graphics in a
> game.

Ah yeah, that is silly. Processing graphics is exactly the sort of
thing that numpy itself shines at.

>
>>>
>>> Is 'old-ohrio-tests.rst' still any use as documentation? That's what I
>>> attempted to read. Also, is it necessary to use linearize and
>>> spliceattack on an RPG before being able to load those files?
>>
>> old-ohrio-tests.rst is only really useful as documentation on things
>> like starting offsets (BLOAD etc).. once knowledge of that is
>> incorporated in nohrio, old-ohrio-tests.rst will go away.
>
> I forgot to look at old-ohrio-tests.rst again, actually. Also, though
> I read through ohrrpgce.py, I don't know what all of it is for. Can
> you explain how/what textserialize_dtype_tweaks will do? 
> http://www.castleparadox.com/ohr/viewtopic.php?p=79701#79701

IIRC I'm planning to use that mainly to get graphics to serialize in a
neat, human-editable format rather than as a lump (haven't checked how
they currently serialize; I would guess it is as a single escaped
string, which probably looks pretty ugly / irregular) .

>
>> Currently nohrio only accepts linearized data (in cases where it was
>> planar) and nohrio accepts only spliced attack data.
>> This is mainly to do with issues relating to representing a single
>> record as a single record (rather than 200 x, 200 y, etc). In future
>> it will probably happen transparently
>
> Planar lumps other than tilesets/backdrops should be nearly trivial
> thanks to numpy's strides.
It's actually very easy, but extremely ugly. Well, think about door
locations: does it make any sense to have 200 x coordinates, then 200
y coordinates, then 200 target door ids?
The point of the rearrangement is that we can serialize the data in a
way that is easy to export and lacks strange constraints (only exactly
200 records? why?), with each (x, y, target) triple as a single record.
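
A rough sketch of that rearrangement in plain numpy, using fake data.
The field names and the 16-bit integer type are assumptions for
illustration, not the real door lump definition:

```python
import numpy as np

N = 200  # records per lump in the planar layout

# Fake planar data: all x values, then all y values, then all targets.
planar = np.arange(3 * N, dtype='<i2')

# Hypothetical record dtype; the field names are made up.
door_dtype = np.dtype([('x', '<i2'), ('y', '<i2'), ('dest', '<i2')])

# reshape gives one row per plane; .T is a strided view with one row per
# door; ascontiguousarray copies it into x, y, dest, x, y, dest, ... order
# so that the bytes can be reinterpreted as records.
by_door = np.ascontiguousarray(planar.reshape(3, N).T)
linear = by_door.view(door_dtype).reshape(N)
```

After this, linear[5] is the single record for door 5, and linear['x']
still gives all the x coordinates at once if you want them.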

For example, it's almost trivial to export tilemaps as grayscale PGM
graphics once you have them linearized. I have a really strong focus
on text serialization here, because once a thing is in a consistent
textual format, it can be parsed easily (even dumbly), allowing you
to, e.g., find all attacks with "slash" in the name, or find all
16-color palettes which refer to color index #128, using a simple tool
like 'grep' (which lowers the learning requirement for data debugging).
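
As an illustration of the PGM point, a minimal sketch that encodes any
2-D array of bytes as a binary (P5) PGM image. The function name is
hypothetical and the tilemap here is fake data, not a loaded lump:

```python
import numpy as np

def pgm_bytes(tilemap):
    """Encode a 2-D array of 0..255 values as a binary (P5) PGM image."""
    tilemap = np.ascontiguousarray(tilemap, dtype=np.uint8)
    height, width = tilemap.shape
    # PGM header: magic number, width and height, maximum gray value.
    header = ('P5\n%d %d\n255\n' % (width, height)).encode('ascii')
    return header + tilemap.tobytes()

# Fake 4x3 tilemap standing in for a linearized one.
tiles = np.arange(12, dtype=np.uint8).reshape(3, 4)
data = pgm_bytes(tiles)
```

Writing data to a file opened in 'wb' mode gives something any image
viewer that understands PGM should open.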

> But how do you hope to transparently
> rewrite attacks and til/mxs? With a wrapper object that writes back to
> the memmap as it is modified? I'd like to hear about this, because
> there are other things you could wrap, such as opening a lumped RPG.
Sorry, you just blew my mind (although, what I was planning was sort
of similar) . I'll get back to you on this subject.

>
> In particular, how would Gfx4bpp work?
That's yet another 'scratch' class. I think I was intending it to
allow you to access 4bpp pixels as if they were 8bpp (i.e. not 2 pixels
packed into one byte) :)
No guarantee that it will stay around (although some 16-color graphics
nicety will be around).
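
Something along these lines is what I mean by treating 4bpp as 8bpp.
This is only a sketch, not Gfx4bpp itself: the function name is made
up, and it assumes the high nibble of each byte is the first pixel.

```python
import numpy as np

def unpack_4bpp(packed):
    """Expand bytes holding two 4-bit pixels each into one pixel per byte."""
    packed = np.asarray(packed, dtype=np.uint8)
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.uint8)
    out[..., 0::2] = packed >> 4     # high nibble: first pixel (assumed order)
    out[..., 1::2] = packed & 0x0F   # low nibble: second pixel
    return out

pixels = unpack_4bpp(np.array([0x12, 0xAB], dtype=np.uint8))
```

Note that this makes a copy, so changes to the unpacked pixels do not
reach the original data; a class that writes back would need a matching
repacking step.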

>
> Also, I'd like to embed python in that wxwidgets wrapper around Custom
> that I am writing, so that plugins or portions of editor could be
> (re)written in it, in which it makes sense to use the nohrio api to
> access game data. There might be differing requirements there.

Yes, you would definitely want planar access there, I see that.
Hmm, I must meditate on this.. :)

>
> I'm sure I'll have many suggestions once I know how you intend all the
> wrapping to work.

The documentation is supposed to be in the relevant source files (and
automatically extracted with pylit and built with sphinx to build html
etc docs), but there is certainly a lot missing. This is mainly
because I haven't worked it out yet myself: I tried to obey the
principle of 'for a start, do the minimum that works', because
"A complex system that works is invariably found to have evolved from
a simple system that works."

http://en.wikipedia.org/wiki/Systemantics

This conversation is being very helpful in clarifying my vision of NOHRIO :)

>
>>>
>>> I'll also mention that I had problems with installing it: I ended up
>>> with a nohrio-0.33dev-py2.6.egg folder in my site-packages, which was
>>> empty aside from EGGINFO and nohrio subdirectories. Python wouldn't
>>> import ohrrpgce until I moved everything in nohrio up a directory. I
>>> don't know a thing about installing python packages.
>>
>> In that case, I'll guess that you didn't have setuptools (or its fork
>> Distribute) installed (as the directory structure you describe is
>> normal and standard). Setuptools is fairly standard these days.
>>
>> http://pypi.python.org/pypi/setuptools/
>>
>> I'll need to add that to the dependencies; it never really occurred to
>> me that someone might not have it.
>
> Well, the Python 2.6 Windows installer doesn't come with it. However,
> when I found it was required to install nohrio, I installed it myself,
> which was very painful: there is no windows installer for setuptools
> for python 2.6! I had to download an egg and the source and
....
well, not exactly.

http://thinkhole.org/wp/2007/02/01/howto-install-setuptools-in-windows/

it's true there is not a 'windows installer' as in an exe with
installshield and blah blah..
but IMO, the above instructions are not too hard (and are about the
same as what I need to do when installing setuptools under Linux
myself)

AFAICS you just need to run the bootstrap script and then make sure
your PATH includes the right directory.

> build/install that (what's the deal with python packages 'building'
> themselves?). Even then, I had to manually add the python scripts/

'building' themselves? it's designed to match the standard 'make; sudo
make install' series of commands that you would use with projects that
use Make (ie. 90%+ of Linux software :)

However, as far as I know, just like with 'make;sudo make install',
there is a dependency of install on build, which means that if you
just invoke install ('python setup.py install' on windows IIRC) it
will be automatically built before installing. I like to do them both
separately though, because if something goes subtly wrong in the build
process you can find the errors more easily.

> folder to my PATH, as the setuptools install didn't. I often think
> that all these setup programs are useless and we should go back to
> manual installation!

haha:)

Believe me, if you want to install lots of python packages, you will
appreciate it -- easy_install makes installation as easy as just
naming the package you want installed.

>
>> Thanks for all the feedback :D
>>
>> David
>
>
> The bugs:
> -It seems that you have to explicitly import sys in spliceattack under 
> windows.
> -Also, 'rwb+', used in ohrrpgce.py, is not a valid mode, and throws an
> exception under windows: I assume you mean 'wb+'.

Urrghh, I think that's an obsolete bit of code.. let me check..
uh, no (works on Linux). At this point, I say 'f**k you, and the
dinosaur you rode in on too' to Windows.

However, I've made the change you suggest (and tagged it, because I
suspect it may break write buffering) (locally -- will appear next
time I do a git push)
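
For reference, a small sketch of how the two valid read/write binary
modes differ, using a throwaway temporary file. Which of them
ohrrpgce.py actually needs depends on whether existing lump contents
must survive the open, and that is not settled here.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lump.bin')
with open(path, 'wb') as f:
    f.write(b'\x01\x02\x03\x04')

# 'r+b' opens for reading and writing and keeps the existing contents.
with open(path, 'r+b') as f:
    f.seek(1)
    f.write(b'\xff')
with open(path, 'rb') as f:
    after_update = f.read()

# 'w+b' also allows reading and writing, but truncates the file first.
with open(path, 'w+b') as f:
    after_truncate = f.read()
```

So 'wb+' is fine for creating a lump from scratch, while 'r+b' is the
one to use for modifying an existing file in place.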


>
> I encountered a few quicks in the dtypes:
I like quicks! quirks, not always so much.

> I think that more things should be wrapped by the API.

Me too!

I'm aiming to eventually have basically just two methods that are used
for virtually everything; load and save. This would definitely
automatically handle all headers.

For now, I'm just trying to obey the maxim 'release early and often'
by releasing as soon as my software could be useful.

> For example, in
> my RPGdir class, I had a list of lump headers. I'd also like to point

I definitely plan to do that.

> out that it's not necessary to treat single record files as arrays:
> you can specify shape=() to memmap to get a plain record.

This is for YAML serialization uniformity; if you look at tests/milestone1.yaml
you can see that every record in a YAML file is built directly from
the array dtype.
We divide things up slightly differently in order to get sensible
'scalar' units (for example, a single npcdef, not an array of 100
npcdefs in planar disorganization, as noted earlier in this email).
And btw, what you describe is called a 'scalar array' (an array with
ndim == 0 :);
if you have an array of shape = (1,), you can get a scalar array just
by assigning shape = () as you mentioned.
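
A quick sketch of the difference, with a made-up two-field dtype (the
field names are only for illustration). It uses reshape(()) rather
than assigning to .shape, which gives the same kind of scalar array
without modifying the original:

```python
import numpy as np

# Hypothetical one-record dtype.
gen_dtype = np.dtype([('maxmap', '<i2'), ('masterpal', '<i2')])

one_record = np.zeros(1, dtype=gen_dtype)   # shape (1,): array of one record
one_record[0] = (7, 2)

scalar = one_record.reshape(())             # shape (): scalar array, same memory
scalar['maxmap'] = 9                        # writes through to one_record
```

With the scalar array you write scalar['maxmap'] instead of
one_record[0]['maxmap'], and both names refer to the same data.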

Is there some specific difficulty this has caused you that I can
address? I'm certainly willing to rethink this scheme in light of new
evidence.

(it's also possible I can specially handle scalar arrays, I think)

>
> I noticed that the NPC definitions are each inside an 'npc' field, eg.
> mmap[npcno]['npc']['picture']. Why?

see above -- nice record serialization.
Again, it's easy to discard the 'npc' bit using array.view

nuview = arr.view (dtype = arr.dtype['npc'])

That is probably what you were expecting.
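
A self-contained sketch of that view trick. The dtype below is a
made-up stand-in with only two inner fields, not the real NPC
definition:

```python
import numpy as np

# Hypothetical stand-in: a wrapper field 'npc' holding the actual record.
inner = np.dtype([('picture', '<i2'), ('palette', '<i2')])
wrapped = np.dtype([('npc', inner)])

arr = np.zeros(3, dtype=wrapped)
arr['npc']['picture'][1] = 5

# Same bytes, reinterpreted with the inner dtype: the 'npc' level is gone.
nuview = arr.view(dtype=arr.dtype['npc'])
nuview['palette'][0] = 9    # still the same memory as arr
```

This works because the wrapper record and the inner record have the
same item size, so the view changes only the field names, not the data.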

My bias towards neat serialization is evident (and will be addressed!).
I've checked the code in colouruse.py and I can see how addressing it
that way is weird.

>
> til.linear and mxs.linear aren't there, though I guess those are only
> temporary hacks anyway?

Okay, more docs needed, clearly :)

tests/ dir doesn't include some files which are just transformations
of other files

http://gitorious.org/nohrio/nohrio/blobs/master/tests/bootstrap.sh

creates those files :)
>
> There are some inconsistencies with field names, for example
> battlesprite, battlepalette and portrait, portrait_palette in .dt0
> against portraitpic and portraitpal in .say (I much prefer the latter)

Okay, will fix (it's just my rush to get this released -- in my
experience, the importance of releasing and getting feedback quickly
cannot be overestimated)

>
> Also, is it really necessary to treat all lumps as records instead of
> arrays, even ones such as tilemaps?

Does this inconvenience you? Remember you can get a view of them as
arrays if you want:

arr = arr_i_dont_like.view (dtype = ('B', (200, 320)))
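
Another way to spell the same idea, as a sketch: view the record as
raw bytes and then reshape. The record dtype here is a hypothetical
stand-in (one record holding a whole 320x200 8-bit image), not one of
nohrio's:

```python
import numpy as np

# Hypothetical record dtype: a single field of 64000 bytes.
rec_dtype = np.dtype([('pixels', 'B', (64000,))])
arr_i_dont_like = np.zeros(1, dtype=rec_dtype)

# Reinterpret the same bytes as plain uint8, then give them a 2-D shape.
arr = arr_i_dont_like.view(np.uint8).reshape(200, 320)
arr[10, 20] = 255   # row 10, column 20; writes through to the record
```

No data is copied here, so this costs next to nothing even on a
memmapped file.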

However, I take your point, that I will need to put more thought into
supporting things other than YAML (currently the project is very YAML
oriented) and also supporting both linear and planar dtypes (the one
for readability, the other for speed).

I would like to clarify also that external serializations (eg YAML,
SQLite, PyTables) will always be in linear format, as I regard doing
otherwise as a Crime Against Technology :) (Planar formats just fail
at easy processing, which is super-important from my point of view --
say you wanted to apply a spatial filter like gaussian blur to some of
your graphics. scipy includes gaussian filtering and all sorts of
other nice stuff, but you'd have to write your own if you wanted to
process planar pixels like that.. ugh)

>
> So, onto the little utility I wrote. It scans through all the graphics
> in a file, does computations, and then displays everything prettily
> with pygame. I posted details here:
> http://www.castleparadox.com/ohr/viewtopic.php?p=79701#79701
>
> It shows numpy is pretty efficient; it can count all the pixels in the
> enormous number of graphics in Powerstick Man: Extended Edition in a
> small fraction of a second. However, searching through all the other
> lumps takes a long time.
>
memmapping is theoretically very good at exactly the kind of querying
you are trying to do here. I suspect the slowness comes from the
specific way you do things: you may be doing more looping in Python
than is needed.

There are a few optimizations that could be done -- for example,
there's no need to loop over til planes -- you don't care about planes
and can address the pixels as a big (64000,) shaped lump of bytes
using a view.
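
For instance, counting how often each colour is used needs no Python
loop at all. A sketch with fake data standing in for one planar
backdrop (4 planes of 16000 bytes is an assumption about the layout):

```python
import numpy as np

# Fake stand-in for a planar backdrop: 4 planes of 16000 bytes each.
planes = np.zeros((4, 16000), dtype=np.uint8)
planes[2, :100] = 128

# The plane structure is irrelevant for counting, so flatten it away.
flat = planes.reshape(-1)                   # (64000,) view, no copy

# counts[c] is the number of pixels with colour index c.
counts = np.bincount(flat, minlength=256)
```

The same call works on a whole lump of many backdrops at once, since
bincount only cares about the values and not about where they sit.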

> It also shows that nohrio+numpy is really awesome for rapid development!

Why thank you, that was certainly my aim, and numpy certainly is
pretty amazing :D

>
> It wasn't intended to be directly very useful, however I think it
> contains a lot of useful code/examples. But unfortunately I really got
> carried away adding polish to it. It would only take two minutes to
> adapt it to do anything else with the graphics.
>
> I expect unpackpt can go straight into nohrio in some form. Are you
> planning on adding something like my RPGdir class to nohrio? It was a
> big simplification over using ohrrpgce.py directly.
Yes I am, always have been. I was just wary of doing that prematurely
(the soundness of the underlying layer being more important to get
right first)

unpackpt needs a few fixes for assumptions it makes,
then I will be happy to put it into nohrio. (comments in the following email)


>
> I look forward to nohrio improvements so that I can fix the resulting
> breakages in colouruse.py!

One thing that clearly shows in colouruse.py is that gen needs proper
attribute-based field access (i.e. gen.masterpal rather than
gen['masterpal']).
For read-only access this is easy:

import numpy as np
# gen: the existing one-record structured array
gen2 = np.recarray(shape = (1,), dtype = gen.dtype)
gen2[:] = gen
gen2.shape = ()
# now gen2.maxmap, gen2.masterpal, etc do what you would expect :)
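
The copy is what makes that read-only in effect. If write access is
wanted too, viewing the array as a recarray instead of copying into
one should do it. A sketch with a made-up dtype (the two fields are
assumptions standing in for the real gen layout):

```python
import numpy as np

# Hypothetical stand-in for the gen dtype.
gen_dtype = np.dtype([('maxmap', '<i2'), ('masterpal', '<i2')])
gen = np.zeros(1, dtype=gen_dtype)
gen[0] = (7, 2)

gen3 = gen.view(np.recarray)   # same memory, attribute access added
gen3.masterpal[0] = 5          # writes through to gen
```

Since gen3 is a view, anything assigned through it lands in gen (and so
in the memmap, if gen is one).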


Is it okay if I use the content of these emails in NOHRIO
documentation? Eventually I might get rid of them, but currently they
can provide information that I haven't yet had the time to formally
document.

David
_______________________________________________
Ohrrpgce mailing list
ohrrpgce@lists.motherhamster.org
http://lists.motherhamster.org/listinfo.cgi/ohrrpgce-motherhamster.org
