On Nov 10, 2007 10:36 AM, danil osipchuk <[EMAIL PROTECTED]> wrote:
> of a problem. The real task I have at hand is to parse and then analyze huge
> log files (hundreds of megabytes). The parsing of each line is somewhat
> context sensitive: each line may modify the context and the result for the
> next line will be different. So I have to refer to and to modify the state
> when going through the file line-by-line. If I write it explicitly, I will
> basically get the performance of a shell script. Tacitly - well, this thread is
> about it. And this is not an uncommon problem, see this thread:
For this, I would probably use dyadic ;: to break the log file down into meaningful words (possibly several times, in a stepwise-refinement sort of way) until I had something I could handle with a simple expression.

That said, there are a few operations I wish ;: had which it currently does not:

[1] Emit a space character (where the space character can be defined in its left argument). This is important where I want character positions in my parsed result to correspond to the original character positions, but I also want to discard some data.

[2] An operation which simply tells me the state and nothing else. Currently, if I have a use for this (perhaps to decide which operation I will use at a higher level on the original text), I must use 5 as the initial value in my left argument and then extract the result state (the state after seeing each character) from the result. This creates quite a bit of overhead, both in memory and time.

Note that [2] might efficiently accomplish [1] if we had a fast character substitution operation (or special code supporting one of our current options, though I don't know which is the best candidate for special code).

FYI,

--
Raul
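
To make wishes [1] and [2] concrete, here is a minimal sketch in Python rather than J. It is not how ;: works internally and does not reproduce its left-argument format; the names `state_trace`, `blank_outside`, `classify` and `TABLE` are hypothetical, and the bracket-matching machine is an assumed toy example. `state_trace` returns only the state after each character (wish [2]), and `blank_outside` uses that trace to replace discarded characters with a fill character so that positions still line up with the original text (wish [1]).

```python
def classify(ch):
    # Character classes for the toy machine: 0 = other, 1 = '[', 2 = ']'.
    return {"[": 1, "]": 2}.get(ch, 0)


# TABLE[state][char_class] -> next state.
# State 0 = outside brackets, state 1 = inside brackets.
TABLE = [
    [0, 1, 0],  # outside: '[' enters, anything else stays outside
    [1, 1, 0],  # inside:  ']' leaves, anything else stays inside
]


def state_trace(text, table=TABLE, classify=classify, start=0):
    """Wish [2]: return only the machine state after each character."""
    states = []
    state = start
    for ch in text:
        state = table[state][classify(ch)]
        states.append(state)
    return states


def blank_outside(text, fill=" "):
    """Wish [1]: keep bracketed content, replace every other character
    with `fill`, so the result has the same length as `text`."""
    out = []
    prev = 0  # state before the current character
    for ch, st in zip(text, state_trace(text)):
        # A character is kept only if the machine was inside brackets
        # both before and after reading it (this drops the delimiters).
        out.append(ch if prev == 1 and st == 1 else fill)
        prev = st
    return "".join(out)


print(state_trace("ab[cd]e[f]"))         # [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]
print(blank_outside("ab[cd]e[f]", "."))  # ...cd...f.
```

The second function is just a character substitution driven by the state trace, which is the sense in which [2] plus a fast substitution could stand in for [1].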
