CSV is just an example everyone can relate to, and an important one. But the
issue is much broader in scope. To put it simply, SM is currently not
flexible enough: it bites one's hand almost every time one tries to apply
it.
The most pressing issue is with the domain of emit word:
1) you can't emi
Parsing csv seems like the motivation here.
If so, it would also be good to have a more complete test suite.
In particular, csv double quote handling --
https://stackoverflow.com/questions/66096193/having-multiple-double-quotes-inside-quoted-string-csv-file
for example -- means that your opcode 9
> parsing a csv file with 3 fields per record where any can be empty:
This is indeed an important application, and a missing capability of ;:.
discussion:
https://code.jsoftware.com/wiki/User:Pascal_Jasmin/sequential_machine_intro
A workaround to emit nulls is part of the jpp package (whole project m
I wonder if I'm the only one bothered by semicolon's assertion of strictly
i>j.
Generally, empty words can be used as markers to impose some additional
regularity on the output, to make it easier to process later.
An obvious example would be parsing a csv file with 3 fields per record
where any c
Thanks for the lead. I shall explore the idea.
On Wed, 11 Sep 2019, 01:36 Raul Miller, wrote:
> On Tue, Sep 10, 2019 at 3:51 PM 'Pascal Jasmin' via Programming
> wrote:
> > ;: can do this. Look at the handling for NB. in the J sentence example
> state machine (vocabulary entry).
>
> Sure, but
On Tue, Sep 10, 2019 at 3:51 PM 'Pascal Jasmin' via Programming
wrote:
> ;: can do this. Look at the handling for NB. in the J sentence example
> state machine (vocabulary entry).
Sure, but that approach requires expanding the states to represent
each intermediate point in the sequences.
--
;: can do this. Look at the handling for NB. in the J sentence example state
machine (vocabulary entry).
On Tuesday, September 10, 2019, 10:46:01 a.m. EDT, Raul Miller
wrote:
I think ;: could handle this if you first mapped your character
sequences such that the sequences you wanted
I think ;: could handle this if you first mapped your character
sequences such that the sequences you wanted to treat with a single
state were single characters.
But it might be easier to just use a loop.
Good luck,
--
Raul
On Tue, Sep 10, 2019 at 1:17 AM Arnab Chakraborty wrote:
>
> Dear all
Dear all,
I am trying to implement a state machine in J. I shall be happy if I can
manage to get mileage out of the J primitive ;: rather than use explicit
coding.
The input consists of a sequence of alphanumeric characters. However,
certain pairs and triples are considered as single entities
You can use , in the verb argument to / if you want to accumulate a
sequence of results. And you can use {. on the right argument to that
verb if you want the most recent of those results. (The rightmost noun
in the original sequence effectively being the first "example
result".)
FYI,
--
Raul
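A minimal sketch of the point above about , and {. inside the verb given to /. The verb name step and the running-sum task are made up for illustration and do not come from the thread.

```j
NB. step is a hypothetical verb: x is the next input, y is the list of results so far.
NB. {. y picks the most recent result; , puts the new result in front of the old ones.
step =: 4 : '(x + {. y) , y'

NB. The rightmost atom (0 here) acts as the initial "example result".
step/ 1 2 3 4 , 0
NB. gives 10 9 7 4 0, because / consumes the inputs from right to left

NB. To consume the inputs from left to right, reverse them first and reverse the result.
|. step/ (|. 1 2 3 4) , 0
NB. gives 0 1 3 6 10
```

The accumulated list grows by one item per input, so the final result carries every intermediate value as well as the last one.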
Hi Henry,
Indeed when I wrote the verb
sm=.4 :'s=:s sf y ] s of y'
I first wanted to write the verb so that
sm/ 2 3 4
runs the machine on input 2 3 4. But I could not manage this, because I
need to preserve the output from each cell, but only the final state. So I
seem to need u/ and u/\.
> >>>>
> >>>>group=: #~ (1 j. 0 ,~ 2 ~:/\ ])
> >>>>words=: ;:@:group
> >>>>substitute=: [: ; ('alpha'"_)`('beta'"_)`]@.('abc' i. {.)&.>
> >>>>
> >>>
The idiomatic way to pass state from execution on one cell to the
execution on the next is with u/ or u/\., depending on whether you need
the result from each cell or just the final result. You write
|. u/\. (|. array) , initialvalue
and u is repeatedly executed between (cell of y) and (previ
>>>>f=: [: substitute words
>>>>
>>>>I=:'aaaaccca'[O=:'alphabetaalphacccalphabeta'
>>>>(O -: f) I
>>>> 1
>>>>
>>>>
>>>> Realizing I had overlooked `cut',
~:/\ ])
>>>(O -: f) I
>>> 1
>>>
>>> We can mash it together
>>>g=: [: ; (<@(('alpha'"_)`('beta'"_)`]@.('abc'i.{.));.2~ (1,~2~:/\]))
>>> (O-:g) I
>>> 1
>>>
>>>
>> Or use no boxes, but this idea depends on your actual application. Accept
>> the fill and remove it later.
>>
>>h=: ' ' -.~ [: , ((('alpha'"_)`('beta'"_)`]@.('abc'i.{.));.2~
(1,~2~:/\]))
(O-:h) I
1
On 02/18/2018 07:00 AM, programming-requ...@forums.jsoftware.com wrote:
Date: Sun, 18 Feb 2018 14:08:45 +0530
From: Arnab Chakraborty
To:programm...@jsoftware.com
Subject: [Jprogramming] sequential machine
Message-ID:
Content-Type: text/plai
Huh... somehow Google did not show me Louis de Forcrand's reply to your
other message until *after* I came back to check that thread, after
replying to it. And yet it tells me he sent it three hours ago.
So, thinking about this, you've probably already figured this out.
Still, you did ask... Anyway
Hello,
I use state machines a lot in my programs (in other
languages). I am trying to understand how I can use J for
those purposes. I have read the Sequential Machines and
Huffman Coding labs. But I am unable to see how to solve this
toy problem (without using regexp):
Input al
For what it's worth, here was my implementation:
tokenize=:4 :0
 'ESC SEP'=. x
 E=. 18 b./\.&.|.ESC=y   NB. escape positions
 S=. (SEP=y)>_1}.0,E     NB. separator positions
 K=. -.E+.S              NB. keep positions
 T=. (#y){. 1,}.S        NB. token beginnings
 (T<;.1 K)#&.>T<;.1 y
)
'^|' tokenize 'one^|uno||t
A similar question on how to tokenize characters with an escape
character came up in the #jsoftware irc channel recently.
I extended that solution to solve the rest of it. I'm not sure if it's
possible to use a single sequential machine for it
charTokens =: (0;(3 2 2$(2 1 1 1 2 2 1 2 1 0 1 0));<<'
Sequential machine does not support empty tokens.
Also, sequential machine does not support deleting characters from
inside a token.
You can work around those issues with post-processing, but if you have
the necessary representation of the sequence to do that kind of
post-processing you don't real
^ escapes to the next character,
| separates tokens.
Can tokenize be written as an application of sequential machine?
tokenize 'one^|uno||three|four^^^|^cuatro|'
+---++---+++
|one|uno||three^^|four^|cuatro||
+---++---+++
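For comparison, the two rules above (^ escapes the next character, | separates tokens) can be sketched as an explicit loop with a single state flag. This is only an illustration, not an application of ;: and not the implementation from the thread. The name tokloop is hypothetical, and the sketch assumes that an escape at the very end of the input is simply dropped.

```j
NB. tokloop is a hypothetical name; x is the escape and separator pair, y is the input string.
tokloop =: 4 : 0
 'esc sep' =. x
 out =. 0 $ a:            NB. finished tokens, boxed
 cur =. ''                NB. token under construction
 lit =. 0                 NB. 1 when the previous character was an escape
 for_c. y do.
  if. lit do.
   cur =. cur , c         NB. escaped character is taken literally
   lit =. 0
  elseif. c = esc do.
   lit =. 1               NB. drop the escape, remember it
  elseif. c = sep do.
   out =. out , < cur     NB. close the current token, possibly empty
   cur =. ''
  elseif. 1 do.
   cur =. cur , c
  end.
 end.
 out , < cur              NB. the last token is closed by the end of input
)

'^|' tokloop 'one^|uno||three'
```

The call at the end should yield three boxes: one|uno, an empty token, and three. Empty tokens and characters deleted from inside a token both fall out of the loop directly.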
---
24 matches