Hi Eliot,

Maybe RunArray helps here?

From the class comment of RunArray (in Pharo):
My instances provide space-efficient storage of data which tends to be
constant over long runs of the possible indices. Essentially repeated
values are stored singly and then associated with a "run" length that
denotes the number of consecutive occurrences of the value.
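
A minimal workspace sketch of what that could look like, assuming the log lines are already in a collection. The sample strings below are made up for illustration, and this is only a rough idea, not a tested solution for the strace log:

```smalltalk
| lines compressed |
"Hypothetical sample data standing in for lines read from the log file."
lines := #('sigreturn()' 'ioctl(8, A)' 'ioctl(8, A)' 'ioctl(8, A)' 'gettimeofday()').

"Append the lines one by one; equal neighbours are merged into a single run."
compressed := RunArray new.
lines do: [:each | compressed addLast: each].

compressed runs.    "run lengths, here 1, 3 and 1"
compressed values.  "one entry per run: sigreturn, ioctl, gettimeofday"
compressed size.    "still 5, the number of original lines"
```

Note that a RunArray only merges consecutive equal elements. The alternating pair of ioctl lines in your log would not collapse on its own; the repeating group would first have to be turned into a single element (for example an Array of the two lines).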

Cheers

Matthias

2010/2/9 Eliot Miranda <[email protected]>:
> Hi All,
>     I've just needed to make sense of a very long log file generated by
> strace.  The log file is full of entries like:
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744804, 491238}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> and my workspace script reduces these to e.g.
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 316183}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> NEXT 2 LINES REPEAT 715 TIMES
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 317189}, NULL) = 0
> sigreturn()                             = ? (mask now [])
>
> My question is: has anyone looked at this issue in any depth and perhaps come
> up with something less crude than the code below, possibly even recursive?
> I.e. the above would ideally be reduced to e.g.
> NEXT 7 LINES REPEAT 123456 TIMES
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 316183}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> NEXT 2 LINES REPEAT BETWEEN 500 AND 800 TIMES
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 317189}, NULL) = 0
> sigreturn()                             = ? (mask now [])
>
>
> Here's my quick hack that I ran in vw7.7nc:
> | f o lines maxrun repeats range |
> f := '../Cog/squeak.strace.log' asFilename readStream.
> o := 'compressed.log' asFilename writeStream.
> lines := OrderedCollection new.
> maxrun := 50.
> repeats := 0.
> range := nil.
> [[f atEnd] whileFalse:
> [lines size > maxrun ifTrue:
> [repeats > 0
> ifTrue:
> [1 to: range first - 1 do:
> [:i| o nextPutAll: (lines at: i); cr].
> o nextPutAll: 'NEXT '; print: range size; nextPutAll: ' LINES REPEAT ';
> print: repeats + 1; nextPutAll: ' TIMES'; cr.
> range do:
> [:i| o nextPutAll: (lines at: i); cr].
> lines removeFirst: range last.
> repeats := 0]
> ifFalse:
> [o nextPutAll: lines removeFirst; cr; flush].
> range := nil].
> lines addLast: (f upTo: Character cr).
> [:exit|
> 1 to: lines size do:
> [:i| | line repeat |
> line := lines at: i.
> repeat := lines nextIndexOf: line from: i + 1 to: lines size.
> (repeat ~~ nil
> and: [lines size >= (repeat - i * 2 + i)
> and: [(i to: repeat - 1) allSatisfy: [:j| (lines at: j) = (lines at: j - i + repeat)]]]) ifTrue:
> [repeats := repeats + 1.
> range isNil
> ifTrue: [range := i to: repeat - 1]
> ifFalse:
> [range = (i to: repeat - 1) ifTrue:
> [range do: [:ignore| lines removeAtIndex: repeat].
> exit value]]]]] valueWithExit]]
> ensure: [f close. o close].
> repeats
> Forgive the cross post.  I expect deep expertise in each newsgroup posted
> to.
> best
> Eliot
> _______________________________________________
> Pharo-project mailing list
> [email protected]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>

