Re: [Ohrrpgce] svn <-> git bridge, nohrio

David Gowers Sun, 23 Aug 2009 18:13:37 -0700

On Mon, Aug 24, 2009 at 12:16 AM, Ralph Versteegen<teeem...@gmail.com> wrote:
> Yes, that's quite sensible.
> But I'm not sure that you're still talking about
> textserialize_dtype_tweaks. Anyway, I haven't looked at serialisation
> at all, so don't know how it happens. I'll get around to it.
oh, that. It doesn't do anything right now. I'm not sure if it's needed.


> Well, it's not a true memmap because I can't write to the file with
> it,
sorry, that's a gotcha: mmap is written to open with 'r' mode, not 'r+'

import numpy as np

def mmap (fname, dtype, offset = 0, shape = None):
    # mode = 'r' maps the file read-only; 'r+' would allow writes
    return np.memmap (fname, dtype = dtype, mode = 'r',
                      offset = offset, shape = shape)

http://gitorious.org/nohrio/nohrio/blobs/master/nohrio/ohrrpgce.py
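
To make the 'r' versus 'r+' difference concrete, here is a minimal sketch. It assumes a throwaway file of four 16-bit integers, and mmap_rw is a hypothetical name for a writeable variant of the helper, not something that exists in nohrio:

```python
import os
import tempfile

import numpy as np

# Hypothetical writeable variant of the helper; the name and the extra
# 'mode' parameter are assumptions, not nohrio's actual API.
def mmap_rw(fname, dtype, offset=0, shape=None, mode='r+'):
    return np.memmap(fname, dtype=dtype, mode=mode,
                     offset=offset, shape=shape)

# Throwaway file holding four little-endian 16-bit integers.
path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
np.array([1, 2, 3, 4], dtype='<i2').tofile(path)

readonly = np.memmap(path, dtype='<i2', mode='r')
try:
    readonly[0] = 99          # a mode 'r' memmap refuses writes
    write_failed = False
except ValueError:
    write_failed = True
del readonly

writable = mmap_rw(path, '<i2')
writable[0] = 99              # a mode 'r+' memmap writes through to the file
writable.flush()
del writable                  # release the mapping before re-reading

on_disk = np.fromfile(path, dtype='<i2')
```

With mode 'r' the assignment raises ValueError; with 'r+' the change ends up in the file itself.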

>> of course writeable memmaps.
>> It's just that you could get a writeable memmap, or a recarray,
>> depending on your requirements.
>
> What are/how will these load and save 'methods' work? Are we talking
> about methods on classes/objects for each lump? Or maybe on an
> RPG-file class?

An rpgfile class method sounds more like what I was thinking. Load() is
what you'd mainly use -- but you might want to use save() if you, say,
had some records that you wanted to write to a segment of the lump
without bothering to open a memmap for the purpose (mainly this
could happen when you serialize to YAML, later load that YAML back in,
and convert it to an array).

>> They behave better than you might be thinking, though:
>>>>> game.lump('.gen', shape=())['maxmap'] + 10
>> 58
>
> Something odd is going on here:
>>>> game.lump('.gen', shape=())['maxmap'] + 10
> memmap(58)
oh, whoops. Maybe there is a difference in behaviour between IPython
and normal Python console.
Anyway, repr(game.lump('.gen', shape=())['maxmap'] + 10)
produces the value you are expecting for me. str(..) produces '58'
IME you can indeed treat such memmaps as if they were integers except
in special circumstances (they don't like being pickled or serialized
to YAML -- basically anything that really truly wants to ask them 'are
you an integer?')

I think the only case you might run into is like foo += 10 (which of
course writes to the memmap). OTOH, I think it's actually pretty hard
to do that unintentionally.
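
For the cases that really do want a true integer (pickling, YAML), converting explicitly with int() sidesteps the problem. A minimal sketch, assuming a made-up one-field record in a temporary file; the real .gen layout in nohrio is much larger:

```python
import os
import pickle
import tempfile

import numpy as np

# Made-up one-field record, for illustration only.
gen_dtype = np.dtype([('maxmap', '<i2')])
path = os.path.join(tempfile.mkdtemp(), 'demo.gen')
np.array([(48,)], dtype=gen_dtype).tofile(path)

gen = np.memmap(path, dtype=gen_dtype, mode='r', shape=())

# Depending on the NumPy version, this is a 0-d memmap or a NumPy scalar.
value = gen['maxmap'] + 10

# int() gives a genuine Python int either way, which pickles cleanly.
plain = int(value)
restored = pickle.loads(pickle.dumps(plain))
```

The arithmetic result prints differently across consoles and NumPy versions, but int(value) is an ordinary integer in all of them.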

Oh btw, I had to fix RPGdir: it assumes that the lump name is case-insensitive
(when I run it on the test data in Linux, it complains 'no such file
../tests/data/Viking.gen'). IMO the correct choice here is to
lowercase what we get from archinym. (I'm considering modifying
archiNym to ensure that output is always lowercase and input always
gets converted to lowercase; but if you implement the change in
RPGdir, it will still work after I fix archiNym.)
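
A minimal sketch of the lowercasing fix. lump_path is a hypothetical helper, and RPGdir's real code and names may well differ:

```python
import os

# Hypothetical helper, for illustration only; not RPGdir's actual code.
def lump_path(rpgdir, archinym_prefix, extension):
    """Build a lump's path, lowercasing the prefix read from archinym."""
    return os.path.join(rpgdir, archinym_prefix.strip().lower() + extension)

path = lump_path('../tests/data', 'Viking', '.gen')
filename = os.path.basename(path)
```

Lowercasing at this point is harmless if archiNym later starts returning lowercase itself, since lower() on an already-lowercase string changes nothing.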

>
>>>>> game.lump('.gen', shape=())['maxmap'] == 48
>> True
>
>>>> game.lump('.gen', shape=())['maxmap'] == 48
> memmap(True, dtype=bool)
>
>>>>> "%d" % game.lump('.gen', shape=())['maxmap']
>> '48'
>>
>> the hash of X is X for any integer X, btw. If you treat the integer as
>> just a series of bytes, then hash generates something more like what
>> you might want -- in which case,
>
> Wow, are these the same hashes used internally in dicts? That can be a
> bad idea for hashmaps!

Technically it can, but I think you will find that in practice there is
only a tiny chance of collision, because the set of integers that are
likely to be used is actually fairly small (reflected in CPython's
choice to preallocate the integers -5..256 and allocate all
others at runtime as needed). Python's hasher (and I think all decent hashers)
tends to produce about equal entropy for all bits, versus typical
integers (which have 0 entropy in the higher bits).

Anyway, any decent implementation of hashmaps includes collision
handling, so if this happens, it is only a minor slowdown, not
disastrous.
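
Both points are easy to check in CPython. The sketch below uses a deliberately bad key class (Collider, invented here for illustration) whose instances all share one hash, to show that a dict still gives correct answers when every key collides:

```python
# In CPython, small non-negative ints hash to themselves.
identity = [hash(n) == n for n in (0, 1, 48, 58, 1000)]

# Deliberately bad hash: every key lands in the same bucket.
class Collider:
    def __init__(self, key):
        self.key = key

    def __hash__(self):
        return 0

    def __eq__(self, other):
        return isinstance(other, Collider) and self.key == other.key

# The dict falls back on __eq__ to tell colliding keys apart.
table = {Collider(i): i * i for i in range(100)}
lookup = table[Collider(7)]
```

Lookups get slower as collisions pile up, but they never return the wrong entry.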

>> true true; I want to say that your discovery of that way to get planar
>> stuff working nicely is AWESOME and SOOO helpful!
>>
>> btw, numpy also provides a facility that automatically converts
>> between fortran and C layout. it might be a method on arrays, not sure
>> currently. that should kill linearize dead, heh.
I looked this up but have forgotten. I'll check when I reply to the remainder.

> I'll look into it and see if I can improve unpack_4bpp_array further:
> at the moment it's actually returning the data with the last two axes
> transposed in memory, which isn't ideal. I couldn't manage to write
> unpackpt to write the data straight into that layout without doing a
> copy at the end, but Fortran layout might be the trick needed.
I've currently locally got unpack_4bpp_array basically as you wrote
it, but with a parameter (defaulting to True) controlling whether the
array is transposed before returning it.

After I've checked over this thread and tried to address any issues I
missed in nohrio re: that, I'll reply to the rest of your email.

AFAIK Fortran reverses striding. That is, in a C array [frames][y][x],
the first N bytes will be variations of x, then y, then frames,
whereas in a Fortran array [frames][y][x] the first N bytes will be
variations of frames, then y, then x.
So there probably is no direct way to avoid transposition, since you
want frames to vary slowest.

http://www.nersc.gov/vendor_docs/intel/f_ug1/pgwarray.htm

"In arrays of more than one dimension, Fortran varies the left-most
index the fastest, while C varies the right-most index the fastest.
These are sometimes called column-major order and row-major order,
respectively.

In C, the first four elements of an array declared as X[3][3] are:

  X[0][0] X[0][1] X[0][2] X[1][0]

In Fortran, the first four elements are:

  X(1,1) X(2,1) X(3,1) X(1,2)

The order of indexing extends to any number of dimensions you declare"
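
The same ordering can be seen from numpy. This is a minimal sketch, with the array sizes and dtypes picked only to keep the numbers small:

```python
import numpy as np

# x[i][j] == 3*i + j, stored as 2-byte integers.
x = np.arange(9, dtype='<i2').reshape(3, 3)

# First four elements in memory order for each layout.
c_first4 = x.ravel(order='C')[:4].tolist()   # X[0][0] X[0][1] X[0][2] X[1][0]
f_first4 = x.ravel(order='F')[:4].tolist()   # X(1,1) X(2,1) X(3,1) X(1,2)

# np.asfortranarray copies into column-major layout; indexing is
# unchanged, only the strides differ.
xf = np.asfortranarray(x)
same_values = bool((xf == x).all())
c_strides, f_strides = x.strides, xf.strides

# For a [frames][y][x] array of bytes, Fortran order makes the frames
# axis the fastest-varying one (stride 1), which is the opposite of
# what is wanted for frames.
frames_c = np.zeros((2, 4, 8), dtype='u1', order='C').strides
frames_f = np.zeros((2, 4, 8), dtype='u1', order='F').strides
```

In the 3-D case the frames axis has the largest stride in C order and the smallest in Fortran order, which is why switching layout alone does not remove the transposition.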
_______________________________________________
Ohrrpgce mailing list
ohrrpgce@lists.motherhamster.org
http://lists.motherhamster.org/listinfo.cgi/ohrrpgce-motherhamster.org
