I'm not sure that eliminating syntax errors to get null words is a good idea.
--
Raul

On Fri, Apr 7, 2017 at 12:35 PM, 'Pascal Jasmin' via Programming <[email protected]> wrote:
> getting back to the idea of storing symbols by 3!:1 as delimited strings.
> This would both be an improvement in storage, and eliminate the error-prone
> dependence on 10 s:
>
> I've got the esc (uses ;:) method in the latest jpp (
> https://github.com/Pascal-J/jpp ) to get 2MB/s throughput using 2 passes, and
> an escape code to handle both embedded escapes and nulls, and null delimited
> data/symbols.
>
> Several improvements to ;: would make this significantly faster and more
> flexible:
>
> emit null when j=-1, and emitword issued (previously suggested): This allows
> null fields to easily be "parsed" (current method used is to use function
> code 2, and examine gaps in order to add nulls as a 2nd pass. function code
> 2 is slower than 0, and there is overhead in calculating gaps, and inserting
> nulls)
>
> Add an action code that suspends/pauses current word. Next start word will
> append to current word, skipping any characters that were scanned during
> pause. This would allow "deleting" items in the middle of a word in a single
> pass instead of using the 2 pass approach (with 2nd pass using function code
> 1). Alternatively, it could function like ev, but if ew is in same state, it
> discards the elements between startword's.
>
> A custom action code (one interpretation of Henry's inclination, though he
> may have thought of custom function codes) that has a way of inserting a
> character. This would allow building an escaped sequence by inserting the
> escape character prior to last seen.
>
> Custom action codes would need to return characters to include (if it is not
> an ew,ev class), newi, newj at least. A new function code would be a
> variation on 2, emit i (i-j), actioncode, though "characters to include"
> would interact directly with function codes 0 and 1.
>
> A powerful tool for nested structures (see parenw machine in fsm.ijs that
> builds trees from parentheses groups) would be emitwordandIncreaseDepth
> and emitwordandDecreaseDepth actions. So, as part of the return parameters
> for custom actions would be a code for the action: (noword, word,
> WordincreaseDepth, WordDecreaseDepth, vector)
>
>
> ________________________________
> From: 'Pascal Jasmin' via Programming <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Sunday, March 19, 2017 12:38 PM
> Subject: Re: [Jprogramming] Show cause hearing - (10 s: y)
>
> idea for double nullchars doesn't work as there's no way to know if a null is
> embedded at the end of one "string" or beginning of next string. Though null
> followed by a code of the number of consecutive nulls would work. If there
> are 255 nulls, the code 255 0 would be used. 510 consecutive nulls 255 255
> 0...
>
>
> ________________________________
> From: 'Pascal Jasmin' via Programming <[email protected]>
> To: "[email protected]" <[email protected]>
> Sent: Sunday, March 19, 2017 11:33 AM
> Subject: Re: [Jprogramming] Show cause hearing - (10 s: y)
>
> Assuming that this comes with some improvement for s: then it would be easy
> to favour that improvement.
>
> things not to like about a global symbol table are that every typo is
> included, and any "app"/set that is loaded joins that table. AFAIU,
> corruption happens if you create symbols, and then restore a table with
> 10&s:, and so any application that relies on 10&s: can crash another
> previously "loaded application"
>
> A problem is that 3!:1, or 3!:3 anyway, seems to just store indexes for
> symbols, which relies on 10 s: for actual persistence.
>
> A suggestion for 3!:1 of symbols would be to scan the array containing
> symbols for null (\0), then store 2&s: if not included, or 5&s: if there is a
> \0. AFAIU, utf8 is safe to not include 0 as an extended byte.
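A rough sketch of the scan-then-choose suggestion just above, written in Python rather than J so it can be run anywhere. It does not show how 3!:1 or s: behave. The names pack_symbols and unpack_symbols, the one-byte format tag, and the 4-byte little-endian length prefix are all assumptions made up for illustration. The null-terminated branch stands in for the 2&s: case of the suggestion, and the length-prefixed branch stands in for the fallback used when a \0 is present.

```python
def pack_symbols(names):
    """Serialize a list of str, choosing the format by scanning for NUL.

    Hypothetical layout (an assumption, not J's 3!:1 format):
      b"Z" + each name followed by one NUL      when no name contains NUL
      b"L" + (4-byte length + bytes) per name   otherwise
    """
    encoded = [name.encode("utf-8") for name in names]
    if not any(b"\x00" in item for item in encoded):
        # Compact form: NUL is free to act as the terminator.
        return b"Z" + b"".join(item + b"\x00" for item in encoded)
    # Fallback form: lengths carry the boundaries, so NUL is plain data.
    out = bytearray(b"L")
    for item in encoded:
        out += len(item).to_bytes(4, "little") + item
    return bytes(out)


def unpack_symbols(blob):
    """Inverse of pack_symbols for either hypothetical layout."""
    tag, body = blob[:1], blob[1:]
    if tag == b"Z":
        # Every name ends with NUL, so the last split piece is always empty.
        return [piece.decode("utf-8") for piece in body.split(b"\x00")[:-1]]
    names, pos = [], 0
    while pos < len(body):
        size = int.from_bytes(body[pos:pos + 4], "little")
        pos += 4
        names.append(body[pos:pos + size].decode("utf-8"))
        pos += size
    return names
```

On the utf8 remark: UTF-8 produces a zero byte only for the character U+0000 itself, so multi-byte characters never trigger the fallback branch on their own.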
>
> An alternative to 5&s: would be a new 8&s: where "data nulls" are encoded
> similar to embedded ' in strings. double nullchars encode a data nullchar.
> single nullchar encodes terminating 2&s: nullchar. This format c/would be
> used for 3!:1. 2&s: could be modified to be the 8&s: proposal.
>
> 10&s: could store in this new format for portability. But the problem of
> previously assigned symbols in session persists, and so a locale level symbol
> table would make the most sense for robustness. Also, an
> "application"/locale that just uses `true`false symbols (bad example but
> replace with small set of enums), would (presumably) be faster if it didn't
> share a symbol table with a very large symbol array principally used to avoid
> string fills.
>
> A question about symbols/3!:1... the documentation suggests that indexes are
> limited to 32bit values. Is that true for j64 too? Query (new) and query
> (old) are not completely clear in documentation either, and do they differ
> from i. or e. ?
>
>
> ________________________________
> From: Henry Rich <[email protected]>
> To: Programming forum <[email protected]>
> Sent: Sunday, March 19, 2017 12:14 AM
> Subject: [Jprogramming] Show cause hearing - (10 s: y)
>
> > Does anyone use (10 s: y)?
>
> > It is problematic in that the hash table (0 s: 4) may depend on the CPU
> > and the J release level.
>
> > I would rather decommit (10 s: y) and have the user reload the symbol
> > table de novo. Any objections?
>
> > Henry Rich
> > ----------------------------------------------------------------------
> > For information about J forums see http://www.jsoftware.com/forums.htm
