I'm not sure why "whenever we want to work with the characters within a line, we'd have to unbox it" is considered such a drawback, especially given the "each" conjunction. I frequently work with character data and find it very useful to build my code to work on a string, then apply that code with "each" to an entire file broken into useful pieces, e.g.

   grab2After=: ] (] {~ 1 2 + [)~ ] i. [: < [
   'values' grab2After 'some';'test';'values';'hit0';'hit1';'nohit'
+----+----+
|hit0|hit1|
+----+----+

   rr=. (<'values') grab2After &.>"1 <;._1&>TAB,&.><;._2 ] LF (],[#~[~:[:{:]) CR-.~fread 'somefile.txt'

(where the "<;._1&>TAB..." stuff before the "fread" boxes lines on LFs, then
boxes items on each line based on tab-delimiters.)

But, as Raul says, it's hard to speculate usefully about undefined problems.

On Thu, Nov 5, 2015 at 1:30 AM, Raul Miller <rauldmil...@gmail.com> wrote:

> On Wed, Nov 4, 2015 at 4:29 PM, Joe Bogner <joebog...@gmail.com> wrote:
> > I think ragged lines in this context means delimited text (not fixed
> > width).
> >
> > It sounds like the author recognizes the options:
> >
> > "For J the choice is not quite as clear. One way of loading the data
> > would be to `box' each line and then create a vector of boxes to
> > represent all of the data in the file. This works fine, but whenever
> > we want to work with the characters within a line, we'd have to unbox
> > it before doing so. An alternative would be to load the file as a
> > character array, but this would necessitate `squaring up' the data,
> > padding each of the lines out to match the length of the longest
> > single line, thus producing a rectangular matrix. While either of
> > these choices could be made to work, they generally seem, to me at
> > least, to be somewhat cumbersome in comparison with K's much more
> > straightforward treatment."
> >
> > What's unclear to me is what the author means by this: "but whenever
> > we want to work with the characters within a line, we'd have to unbox
> > it before doing so." - specifically what "work with the characters"
> > means.
>
> Another option involves a one dimensional array of characters, and
> then working with indices and corresponding sequence lengths in that
> array...
>
> But it's hard to say anything useful about an unknown problem.
>
> Thanks,
>
> --
> Raul
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>

--
Devon McCormick, CFA
Quantitative Consultant
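
For readers following the thread, here is a minimal sketch of the two representations discussed above: the boxed lines used with "each", and the flat character vector with start indices and lengths that Raul mentions. The sample text and every name in it (text, boxed, lens1, revd, ends, starts, lens2, line1) are made up for illustration and do not come from any message in this thread. It assumes the standard library is loaded, so that "each" is defined as &.> .

```j
NB. Hypothetical sample data: three ragged lines, each terminated by LF.
text=: 'ab',LF,'cde',LF,'f',LF

NB. Boxed representation: cut on the trailing LF, one box per line.
boxed=: <;._2 text

NB. Work on the characters without unboxing by hand.
lens1=: # &> boxed          NB. line lengths: 2 3 1
revd=: |. each boxed        NB. reversed lines, still boxed: 'ba';'edc';'f'

NB. Flat representation: leave text alone, record where each line sits.
ends=: I. text = LF         NB. positions of the LFs: 2 6 8
starts=: 0 , }: >: ends     NB. first character of each line: 0 3 7
lens2=: ends - starts       NB. line lengths again: 2 3 1

NB. Pull line 1 (the second line) out of the flat vector: 'cde'
line1=: text {~ (1 { starts) + i. 1 { lens2
```

With boxes, the per-line verb is written for a plain string and "each" does the opening and re-boxing. With the flat vector nothing is boxed or padded, but every per-line operation has to carry the index arithmetic itself.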