On Nov 10, 2007 10:36 AM, danil osipchuk <[EMAIL PROTECTED]> wrote:
> of a problem. The real task I have at hand is to parse and then analyze huge
> log files (hundreds of megabytes). The parsing of each line is somewhat
> context sensitive: each line may modify the context and the result for the
> next line will be different. So I have to refer to and modify the state
> when going through the file line-by-line. If I write explicitly basically I
> will get the performance of a shell script. Tacitly - well, this thread is
> about it. And this is not an uncommon problem, see this thread:

For this, I would probably use dyadic ;: to break the log file down into
meaningful words (possibly in several passes, as a kind of stepwise
refinement) until I had something I could handle with a simple
expression.

That said, there are a few operations I wish ;: had which it currently does not:

[1] emit space character (where the space character can be defined in
its left argument).  This is important where I want character position in
my parsed result to correspond to original character position, but I also
want to discard some data.

[2] I would like an operation which simply tells me the state and nothing
else.  Currently, if I have a use for this (perhaps in terms of defining what
operation I will be using at a higher level on the original text), I must
use 5 as my initial value in my left argument, then I must extract the
result state (state after seeing each character) from the result.  This
creates quite a bit of overhead, both in memory and time.

Note that [2] might efficiently accomplish [1] if we had a fast character
substitution operation (or special code supporting one of our
current options, though I don't know which is the best candidate
for special code).
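
To make the relationship between [1] and [2] concrete, here is a rough
sketch in Python rather than J. It is not how ;: works internally, and
the two-state "inside quotes / outside quotes" machine and the names
state_trace and blank_where are made up purely for illustration. The
first function plays the role of [2] (report only the state after each
character); the second plays the role of the fast substitution that
would turn that into [1].

```python
# Hypothetical two-state machine, for illustration only:
#   state 0 = outside a quoted field, state 1 = inside one.

def state_trace(text, quote='"'):
    """Return the state after seeing each character (wish [2])."""
    states = []
    state = 0
    for ch in text:
        if ch == quote:
            state = 1 - state  # a quote character toggles the state
        states.append(state)
    return states


def blank_where(text, states, drop_state, fill=' '):
    """Replace each character whose state is drop_state with fill.

    The result has the same length as text, so character positions
    still line up with the original (wish [1]).
    """
    return ''.join(fill if s == drop_state else ch
                   for ch, s in zip(text, states))


sample = 'a"bc"d'
trace = state_trace(sample)            # [0, 1, 1, 1, 0, 0]
blanked = blank_where(sample, trace, drop_state=1)
print(repr(blanked))                   # 'a   "d'
```

Note that the closing quote survives here only because the state is
recorded after each character is seen; a real state table would decide
that case explicitly.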

FYI,

-- 
Raul
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
