Re: [Ohrrpgce] nohrio-based tool for importing/exporting data

Ralph Versteegen Sat, 19 May 2012 10:02:00 -0700

On 18 May 2012 13:13, David Gowers (kampu) <00a...@gmail.com> wrote:
> On Fri, May 18, 2012 at 1:18 AM, Ralph Versteegen <teeem...@gmail.com> wrote:
>>> One thing I'd like to mention at this point is that some parts of
>>> ndarray subclassing are non-obvious; for instance recarray-style
>>> access is not as good as it sounds because arrays come with a great
>>> many (100+?) methods, which will make things quite confusing when
>>> debugging (Tab-complete -> truckload of options even if there are only
>>> two subfields). I concluded that most of these methods need to be
>>> hidden so when you tab-complete or dir(), you get a useful result.
>>> That's complex to implement correctly.
>>
>> Why is it? Do you mean that only a hand-selected set of methods should be
>> shown?
>
> Well, you could even show NONE. Although some are very common and
> useful. The point was more that with attribute-style access,
> completion will not only show, say, arr.x and arr.y, but arr.sum,
> arr.take, arr.copy, arr.T, arr.all, arr.any,..... On the base ndarray
> class, there are 69 methods. On subclasses, of course you can't just
> delete these methods (though 99% of them can be achieved equivalently
> using identically named functions from the numpy module)..  after all,
> they don't exist in your subclass, only in the parent class. And MRO
> resolution will always find them.
>
> So you end up having to write your own __hasattr__ and __getattr__ (or
> was that __getattribute__? I forgot which does what), which have to:
> * do MRO lookup for all __* attributes, but not themselves /
> * do fake attribute lookup when handed a field name (eg. 'x'). /
> * do MRO lookup IFF the attribute name is whitelisted.
>
> The main problem is in those last two steps, and the potential for a
> method to become hidden by a field name. "all" is an obvious candidate
> there; "take" is another; view, size, max, min, base, data are other
> candidates (I'm selecting only from the list of attributes that I
> think are pretty essential to leave accessible as methods).
>
> It's wrong to say it's complex, actually; 'Irresolvable' is a much
> more accurate description.


I'm not clear as to what this has to do with __hasattr__. Is that used
by IPython when doing tab completion (or by other environments which do
so; I don't know of any others)? I thought you were referring to
overriding IPython TAB completion more directly, by defining a
trait_names method (though I couldn't find that actually documented
anywhere):

def trait_names(self):
    return self.dtype.names + WHITELIST  # WHITELIST: tuple of method names to keep visible

I tried it out, and it seems that trait_names(), if present, is
used for completing a partially typed method/member name, while
pressing TAB directly after the . still shows everything, even if
__dir__ is defined (the union of __dir__() and the class methods, I
believe).
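
For reference, a minimal sketch of what a __dir__ override does on its
own. Everything here is a stand-in: Base plays the role of ndarray,
Record the subclass, and _visible is a hypothetical whitelist. It only
shows what plain dir() returns; as noted above, a completer may still
merge in the class methods.

```python
class Base:
    """Stand-in for ndarray: a parent class with many methods."""

    def sum(self):
        return "Base.sum"

    def take(self):
        return "Base.take"

    def copy(self):
        return "Base.copy"


class Record(Base):
    # Hypothetical whitelist of inherited methods worth advertising.
    _visible = ("copy",)

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __dir__(self):
        # dir() sorts whatever this returns; nothing is actually removed
        # from the class, the names are just not listed.
        return list(self._fields) + list(self._visible)


r = Record(x=1, y=2)
print(dir(r))   # ['copy', 'x', 'y']
print(r.sum())  # Base.sum (hidden from dir(), but still callable)
```

So hiding is purely cosmetic here: the MRO still finds every inherited
method.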

Of course it would be bad to allow field names that conflict with
ndarray methods; I think such field names should be disallowed to
avoid confusion (even though actual methods would always have
precedence if you override __getattr__ rather than __getattribute__).
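
A small sketch of both points, again with plain stand-in classes rather
than a real ndarray subclass (Base, Record and the strict flag are all
made up for illustration). __getattr__ is only consulted after normal
lookup fails, so an inherited method wins over a same-named field, and
the constructor can simply refuse such names.

```python
class Base:
    """Stand-in for ndarray."""

    def all(self):
        return "Base.all"


class Record(Base):
    def __init__(self, fields, strict=True):
        # Reject field names that would be shadowed by inherited attributes.
        clashes = sorted(name for name in fields if hasattr(Base, name))
        if strict and clashes:
            raise ValueError("field names clash with methods: %s" % clashes)
        self._fields = dict(fields)

    def __getattr__(self, name):
        # Reached only when normal lookup (instance, class, MRO) has failed.
        try:
            return self.__dict__["_fields"][name]
        except KeyError:
            raise AttributeError(name) from None


r = Record({"x": 1, "all": 99}, strict=False)
print(r.x)      # 1 (field, found via __getattr__)
print(r.all())  # Base.all (the method wins; the field 'all' is unreachable)

try:
    Record({"all": 99})
except ValueError as exc:
    print("rejected:", exc)
```

Overriding __getattribute__ instead would intercept every lookup,
methods included, which is where the whitelist logic would have to go.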

>>
>>> So for now I opted to go for
>>> x['y'] notation, with an extension allowing pseudo-dot-access --
>>> x['y.z.foo.bar']
>>>
>>> Looks like that got left out of the previous commit, so updating will
>>> get you that set of files (nohrio/dtypes/* (but mainly 'general.py'
>>> right now; the others are somewhat experimental and really only
>>> demonstrate a possible subclassing/field access model))
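
The dotted-key idea quoted above could be sketched roughly like this.
This is not the nohrio implementation; get_path is a hypothetical
helper, and nested dicts stand in for nested record fields (anything
indexable by field name should behave the same way).

```python
from functools import reduce


def get_path(record, path):
    """Resolve 'a.b.c' as record['a']['b']['c']."""
    return reduce(lambda node, key: node[key], path.split("."), record)


x = {"y": {"z": {"foo": {"bar": 42}}}}
print(get_path(x, "y.z.foo.bar"))  # 42
print(get_path(x, "y"))            # {'z': {'foo': {'bar': 42}}}
```

A missing component raises KeyError at the point where the path breaks.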
>>
>> Looking again, I see that there's still something missing. ohrtype and
>> readrecords are defined anywhere.
> No, they aren't.

Right, what I meant of course :)

> I haven't written them yet. Probably cause that
> code's out of date.
> That submodule's kind of a mess of different ideas how to handle rich
> data type definition. Not all of them will even run.
>
>>
>> BTW you wrote '3188' instead of '3118' several times.
> Grep says twice; I've pushed a fix for those two.
>
>>
>>> Oh, also, there's machinery for mutable dtypes in nohrio. Useful for
>>> split-records like the attack data, BINSIZE controlled records, or
>>> where valid enumeration values vary across versions. You just freeze()
>>> it after you're done editing.
>>>
>>>>
>>>> Originally I wanted to create a metadata file format similar to
>>>> editedit definitions, except describing binary record (and RELOAD)
>>>> based file formats, and try to split that off from editedit menu
>>>> definitions (which would then specify much less: mostly just the order
>>>> in which fields are presented and conditions on each appearing/being
>>>> enabled). But clearly this would just create a whole lot of work, so
>>>> I'm not so keen on it anymore.
>>>
>>> (dtype objects are pickleable, and pickling preserves metadata, BTW)
>>>
>>> I eventually decided that this kind of idea was overengineering,
>>> because the only application that could do anything useful with that
>>> kind of information would be a generic data editor/filterer (there are
>>> commercial hex editors that provide this functionality).
>>> Despite that, I had a go at writing one; Hopefully it got into the
>>> recent commit. Though it's notably got problems creating the right
>>> widgets for nested fields, it's a decent prototype.
>>
>> It's not there, but that sounds very cool! SDHawk wrote something
>
> Actually it's in my i/o centered project here, not in nohrio at all:
>
> https://gitorious.org/bits/bits
>
> (in subdirectory 'tools')

Thanks. I'll poke around in this some more later; nice motivation to
learn a little GTK.

> I can't currently run it, because my Python 3 GObject-Introspection
> installation seems somehow broken. Ironic.
>
>> similar (pygtk I think) for user-definable data structures for the ika
>> engine (described as Python classes or nested dictionaries; I don't
>> remember). I'd actually like to do the same thing for the OHR, but for
>> RELOAD documents. I think that's something Mike intended from the
>> beginning.
> That's kind of different, right? Procedurally generated structures
> instead of fixed but possibly overlapping structures? I know an
> implementation of that exists in a commercial hex editor.. They
> provide a plug-in system that requires you to write in a c-ish
> language, haha.

Yes, it is different in quite a few ways that didn't occur to me until
afterwards. I've forgotten them again :)

>>
>>> Every other kind of application pretty much needs to have some
>>> hardcoded understanding of the data fields and their type, if not
>>> their structure; So I concluded this functionality doesn't belong in
>>> OHRRPGCE itself.
>>
>> Well, I agree it's overengineering, but not for the reasons you
>> described. On the contrary, Custom itself already has such a system,
>> partially implemented: the editor editor (except of course that it's
>> for RELOAD file formats only).
> Yeah, honestly I'm losing track of where the OHRRPGCE is at. I haven't
> really used OHRRPGCE in a while, only for some debugging stuff.
>
>> Of course editors will frequently have
>> additional hard-coded portions. I think the 'overengineering' part
>> would be to split the editor definitions into separate editor and file
>> format definitions.
>
> That's a kind of overengineering I hadn't imagined before; Thanks!
> _______________________________________________
> Ohrrpgce mailing list
> ohrrpgce@lists.motherhamster.org
> http://lists.motherhamster.org/listinfo.cgi/ohrrpgce-motherhamster.org
