You covered some of the issues with a data encapsulated class approach like yours.
The big issue for me is that your set verb returns 0 0 $0, but even if it returned the object reference, J is poor at compound expressions that operate on an object. You need to pass strings to what effectively becomes a dsl. The new j903 modifier trains get useful here, but it is still messy:

   d=: dict 'abc';1 2 3
   loc_z_=: (,&'_'@[ ,&'_'@, ":@>@])"1 0 boxopen
   in_z_ =: ([. loc ].)~
   d ('gf' in ]: + 'gf' (in d)) 'a' NB. parameterizing dictionary as an adverb for lhs of fork, and hard coding on rhs
2

But if set returned an object, having a verb that operated on that object would require explicit code (__y will work) to be simple.

Then there is the issue of a set operation where you don't want the "forced side effect" of permanently altering the object, and instead want a copy to use temporarily. The same goes for a filter/query operation that returns multiple "records".

Instead of a data encapsulated class, functions that operate on inverted tables would allow returning a new/subset of the "data". This adds extra work to save the result, but the alternative is the extra work of copying a class in order to modify only the copy, which amounts to predeciding that you would never want to overwrite the original dictionary. That decision seems above the paygrade of a function operating on inverted tables. You would also have to remember to destroy the copy in your code when it is supposed to be discarded (actually a hard problem that would need its own dsl to solve all "responsibility combinations"). And then J has unfriendly access problems when operating on an object parameter to a function, unless it is an explicit function.

J's strengths come from its functional approach. Returning a new copy of data is functional. It is very easy in J, especially in the console, to modify the previous line of code so that it assigns a new result value to existing or new variable names. Double checking that the function works properly before overwriting "production" or lesser data is a prudent approach I'd recommend 100% of the time.
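A minimal sketch of that copy-returning style, assuming the dictionary is held as a two-box inverted table (keys ; values). The verbs get and set below are hypothetical names for illustration only, not the types_dict API, and they handle one key at a time to keep the sketch short:

```j
NB. Sketch only: a dictionary as a 2-box inverted table, keys ; values.
NB. get and set are hypothetical names, not the types_dict API.
get =: 4 : '(1 {:: x) {~ (0 {:: x) i. < y'   NB. x: table, y: one key

set =: 4 : 0            NB. x: table, y: key ; value. Returns a NEW table.
 'ks vs' =. x
 'k val' =. y
 ix =. ks i. < k
 if. ix < # ks do.
  ks ; val ix} vs       NB. existing key: amend a copy of the values
 else.
  (ks , < k) ; vs , val NB. new key: append to both columns
 end.
)

d  =: (;: 'alpha beta gamma') ; 1 2 3
d2 =: d set 'beta' ; 20    NB. d itself is unchanged
d  get 'beta'              NB. 2
d2 get 'beta'              NB. 20
d  =: d2                   NB. the caller decides when to overwrite
```

Nothing inside set has a side effect; the last line is the only place where the original is replaced, and it is the caller's choice.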
J's impure functional approach is also the perfect functional approach: pure (never side effect) functions inside, but the last caller/user (outside) decides on what side effects to make. An inverted table argument makes it easy to write functions that operate on that y argument inverted table. An encapsulated class makes that difficult to extend.

I still think "keyed table" (a multi column dictionary, including potentially multicolumn keys uniquely identifying a record) is the right approach to a generalized dictionary, and most (90%+) column use cases would be uniformly typed. A defining property of dictionaries is access by full key match, which necessarily brings symbols as an optimization feature of fields. But even for a dictionary/keyed table, general query access is a nice to have, which you get with inverted tables, along with an ability to convert to/from symbols when "necessary".

A class based approach to keyed tables is possible and easiest to create.

I've mentioned a general datastructure framework, which is metadata about the data in one box, and data in the other. Metadata is a "property dictionary" where values are data or functions. A string encoding is possible, especially if there is a "class type" field that directs the encoding/decoding, but encoding values as boxed items to distinguish among different types/classes of values and functions is also an option. There is an easyish 1:1 mapping between a metadata structure about data and a class definition that references a DATA variable or, better yet, uses data that is expected to conform to the metadata understanding of the data as its y function parameter. This necessarily makes this approach exactly as easy as the first: write a class, and use it either as a class or as a metadata described structure (data), to be chosen by the user.

A third option, especially if it applies just to keyed tables, is having a dsl/description of the inverted table structure as an adverb parameter.
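To make the metadata-in-one-box, data-in-the-other idea a bit more concrete, here is a rough sketch. The CLASS and FIELDS property names, the 2-column name ; value layout of the metadata, and the verbs prop and col are all assumptions made up for this illustration:

```j
NB. Sketch only: box 0 holds metadata, box 1 holds the data.
NB. CLASS/FIELDS and the verbs prop and col are assumed names.
meta =: ('CLASS' ; 'keyedtable') ,: 'FIELDS' ; < ;: 'name qty'
data =: (;: 'apple pear fig') ; 5 7 2   NB. inverted table, one box per column
ds   =: meta ; < data                   NB. the 2-box datastructure

prop =: 4 : 0           NB. x: property name, y: datastructure
 mt =. 0 {:: y          NB. metadata table of name ; value rows
 r  =. ({."1 mt) i. < x NB. row of the requested property
 > (< r , 1) { mt       NB. open the value cell of that row
)

col =: 4 : 0            NB. x: column name, y: datastructure
 ix =. ('FIELDS' prop y) i. < x
 ix {:: 1 {:: y         NB. fetch that column from the data box
)

'CLASS' prop ds         NB. keyedtable
'qty' col ds            NB. 5 7 2
```

Both verbs take the whole structure as y and return plain data, so the same functions work on the original or on any copy of it.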
An adverb allows for optimization in the returned verb/modifier. To optimize get (the valuable feature of your dict class), you only need to know the table constraints/definitions. set, using a datastructure definition, can generate a (pre)validation of input, along with informative descriptions of why elements fail if they do.

A multi column dictionary description dsl would look like:

   key: ... value: ... NB. where ... is a list of fields with attributes (reserved words not allowed as field names) as follows:
   colname: u(nique): s(orted): type: or b(oxed):

(The type is optional if the first item determines it, but providing it in the dictionary description benefits optimization.) The potential for a single line definition is a huge convenience for both copy/edit coding and console simplicity.

So a generic get (by whole field match) is an adverb that first uses 'keyed table def' get, but then takes a column list (indexes or colnames) that permits an indexing optimization step on that index (m&i. where m is the column parameter). When a single column is passed, all keys in y are used to retrieve records (one for each key passed). When multiple columns are part of the final adverb parameter, y is expected to be boxed values for each column, and all records with a key match are retrieved. It is possible to choose (with an additional (named) adverb) that if only one record is in the dataset, just the raw values are returned instead of the full dictionary structure.

A metadata encoded datastructure seems superior to the adverb dsl processor, in that an adverb dsl processor could, with a preceding adverb, interpret any meta+data parameter using just the metadata portion, which allows it to operate on any other similarly structured/metadata'd data.

The end goal of an approach, IMO, should be to create improvements to J in terms of generic inverted table functions, with some specific improvements already identified in this thread:

   'column list' { meta-described-dictionary NB. use FIELDS metadata keyword that contains symbol data, to retrieve column indexes (or other potential use of FIELDS duck named variable specific to datastructure) referenced in string.

   &:: =: bind =: (& @: ;)

This is a new modifier train such that dyadic m&:: f and f &::n are (m&f)(@:;) or (f&n)(@:;). J already gives bound =: (f&n) or (m&f) a special dyadic interpretation as bound^:x y. The above enhancement would allow an interpretation as bound(@:;), which allows writing f for 3 arguments, i.e. with a compound x or y of 2 boxed arguments, while letting the user provide the compound part as dyadic unboxed arguments. Compounding &:: allows even more arguments. If x takes 3 (boxed) arguments, then arg0&::f&::y applied dyadically has x as arg1 and y as arg2. If applied monadically, then the 3rd x argument (arg2) to f would be missing, and f could/would deal with that. Compounding &:: calls would increase the arity of functions from 3 to higher than 3 parameters.

This feature would also allow optimizing inside f. If f is explicit, then any line that is varname =: f x (if m&f is bound) or f y (if f&n is bound) can be optimized, where an ideal structure is x =. f x or y =. f y internally, as proof that the original x can be discarded. If f is implicit, then any u@] or u@[ can be optimized away to a constant based on m&::f or f&::n, and if N V N occurs as a result of that optimization, then that too can be optimized into a constant.

What the above allows, beyond syntax sugar for verbs of more than 2 parameters, is not having to resort to self-written-code optimizations inside adverbs. Verbs can self optimize based on bound parameters (when, for example, (m i. ]) has the same optimization as m&i.).

> Lua table references

I've been thinking of k/q as the guiding model. Lua's variant (boxed) key and variant (boxed) value tables have the simplicity of storing every potential scenario, but as a dictionary implementation they would provide a strong incentive to avoid dictionaries for performance reasons.
If you wanted to use a dictionary as a key in J, you could use a linear representation of that dictionary in order to keep all keys as strings. But, sorry for repeating, a boxed/variant column type can coexist alongside uniformly typed columns.

Metadata (not at all the Lua interpretation) would instead specify types and attributes of inverted table columns in the case of keyed tables, but also (kinda like Lua) include optimized/specified functions related to the data.

In general, I'd also say that access_keys_ being limited to valid spaceless J naming conventions is not a huge sacrifice for access names. Extending to spaceless unicode strings is not an ease of use problem if the user wants unicode keys, though it would interfere with that 1:1 J locale/classname mapping of datastructure metadata.

On Sunday, February 6, 2022, 09:52:04 a.m. EST, Jan-Pieter Jacobs <janpieter.jac...@gmail.com> wrote:

Hi Pascal,

I responded inline below:

> A workaround is to optimize SET, ADD, UPDATE, DEL for bulk operations
> (multiple items processed at once (] F..) super useful), and after bulk
> operations, "redefine" (just repeat execution of same definition) GET such
> that any m&i. updates. Also update FILTER functions (GET multiple) if they
> gain from static binding optimization.
>

This is, if I get it correctly, exactly what my dict implementation (https://github.com/jpjacobs/types_dict) does: it allows setting/updating/removing multiple keys, and the lookup verbs used are updated only if there is a change in keys.

>
> An approach that just presumes key uniqueness instead of enforcing it, is
> for GET to be based on i: instead of i. and then any ADD with a duplicate
> key effectively will return the last updated/added values.
>

This would gather a lot of garbage and would lose the advantage of in-place updating.

>
> Back to generic datastructure, everything a class can do is possible
> within a datastructure.
> All administrative "properties" (names) and their
> associated values including functions can be encoded in a dictionary,
> including a string representation dsl for representing "name values" with
> ease as to function/data. What specializes a datastructure over a "mere"
> class is the concept of existential data held by the datastructure that a J
> user would want complete access to that data. In a class based
> implementation, a universal name data =: holds the core data that the J
> programmer would want access to. Usually, it is compound greater than
> atomic data that can be represented as inverted tables of "linked data".
> And part of the data specifying dsl's purpose is to include descriptions
> that permit any possible optimizations that include what k/q's attributes
> do (sorted, unique), but with extensible dsl, any other
> implications/constraints on the data can use/select a specific
> implementation of universally named "accessors"/functions
>
> So a datastructure contains 2 boxes: 1st holds the name of the
> datastructure class (for lookup value of any metadata of that classname),
> and all administrative properties, and specialized functions for
> GET/ADD/DELETE and other functions expected to have meaning relative to its
> "existential" data, and the 2nd box holds the (likely compound and so extra
> boxed) "data"
>
> An advantage of a compound datastructure over a class is the user gets to
> decide whether to overwrite the "permanent" data while still having access
> to SET/DEL/ADD functionality of their own copy they may want for their
> application/data needs.
> It is also possible for generic GET/ADD/DELETE to
> query the datastructure as to how it can best accomplish its integral
> functionality, should there not be a specialized version defined in the
> datastructure, and GET as an adverb that takes either '',
> datastructure_name, or a specific instance of datastructure can optimize
> itself as a first step, or one that can be bound to an optimized named
> function, or if '' is the adverb parameter to GET, then the generic verb
> "inspect y for datastructure properties" before selecting implementation is
> returned.
>

I think these ideas are pretty much what Lua implements with its tables (dictionaries that can contain anything as keys and values, joined by their metatables, i.e. tables that can contain functions to override e.g. indexing operations). These tables do everything: from working as locales (function environments), over separating modules (our addons), to implementing OOP (making liberal use of the __call metamethod, specifying what happens if you call a table as if you were calling a function, and __index, specifying what happens if you try to get a non-existent key in a table).

In my view, the problem with a locale-based dict implementation like mine is currently that you cannot nest dicts without losing generality. As numbered locales are referred to by boxed numbers, you could make a special case for these in your implementation, but you would evidently lose the possibility to store boxed numbers. Even when adding checks for whether a boxed number is a locale, one cannot be sure whether the user intended to refer to a locale or actually wanted to store a boxed number.

One could think of using the locales themselves as dicts, but there you'd have the problem that:
- only valid names can be keys
- referring to values is only possible with dict__key, which precludes doing so tacitly.
For such an implementation to work, one could (note, I have no clue about the implementation itself :p):
- make a datatype only for referring to locales
- implement indexing into that type with {:: following more or less the same idea as indexing with {::
- provide a verb to amend along the same lines
- have a conjunction DoneIn that allows something like verb DoneIn mylocale (could be called 'of' as well)
- allow any value as "name" in locales.

Like that, implementing a dict that allows storing arbitrary keys and values, nesting dicts and even self-reference, reference loops etc., using locales would become possible. In the end, I guess this would end up at about the same functionality as Lua has for tables… so I don't know what's more effort: implementing everything in J/C, or binding Lua. There's been a time I would have loved to have Lua instead of J's explicit language, but I guess that would end up as a different language :).

Jan-Pieter
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm