You are on the right track, but you may need to do the replacement in chunks.
This works on my machine (uses 12 GB of RAM):
data =: a. {~ 65+ ?. 1e8 # (70)
data =: 30 # data
substr =: (];.0~ ,.)~"1
>./ a. i. (0 1e9 substr data)
134
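NB. I. returns indices relative to each chunk, so add the chunk's start offset
NB. (128 < leaves byte 128 untouched; use 127 < to blank every non-7-bit byte)
data=: ' ' (I. 128 < a. i. (0 1e9 substr data)) } data
data=: ' ' (1e9 + I. 128 < a. i. (1e9 1e9 substr data)) } data
data=: ' ' (2e9 + I. 128 < a. i. (2e9 1e9 substr data)) } data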
>./ a. i. (0 1e9 substr data)
128
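
The chunk-by-chunk replacement above can also be written as a loop. This is
only a minimal sketch: blankHigh is a hypothetical verb (it is not part of the
session above) whose left argument is the chunk size. It adds each chunk's
start offset to the chunk-relative indices that I. returns, and it tests
127 < so that byte 128 is blanked too. Note that it works on a copy of its
argument, so it needs more memory than amending a global name directly.

```j
NB. blankHigh (hypothetical name): x is the chunk size, y is the character data
blankHigh =: 4 : 0
  data =. y
  n =. # data
  for_start. x * i. >. n % x do.
    len   =. x <. n - start              NB. the last chunk may be shorter
    chunk =. (,. start , len) ];.0 data  NB. same subarray idiom as substr
    idx   =. start + I. 127 < a. i. chunk NB. chunk-relative -> absolute indices
    data  =. ' ' idx } data
  end.
  data
)
```

A small check with a chunk size of 4, so that the high bytes fall in two
different chunks:

```j
   4 blankHigh 'ab' , (200 { a.) , 'cd' , (255 129 { a.) , 'e'
ab cd  e
```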
That being said, I would probably prefer a different utility for this job,
such as tr.
On Fri, Sep 11, 2015 at 5:42 AM, Strale <[email protected]> wrote:
> Hello
>
> I have a very big file (3 GB) and I need to run some searches on it :(
> The data is 8-bit character data.
>
> J opens it without a problem,
> but I have trouble using "rxmatches" because of the non-7-bit ASCII chars
> (I presume).
>
> I have used a trick to delete the data above 7 bits, but it is very bad and
> leads to out of memory.
>
> I take the indexes of the ASCII chars above 127,
> and then I look inside the data for those indexes with the command i.
> Once found, I replace them with spaces using the command ' '
> (indexes) } data
>
>
> data <- is loaded with 3 G Bytes
> remove =.( 128 + i. 128 ) { a. NB. >7bit ASCII
>
> data =. ' ' ((128 > remove i. data) # i. $ data) } data NB. to 7bit ASCII
>
> |out of memory
>
>
>
> Is there a better way to do it?
>
>
>
> Thanks
>
> Paolo
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>