Re: [Jprogramming] Regex vs I./E. for pattern matching

Jon Hough Mon, 17 Aug 2015 07:30:34 -0700

I suppose it wasn't the most interesting observation, but it was just something 
that occurred to me during the regex lab: regex isn't the only, and maybe not 
the best, way to solve some of the problems.


> Date: Sun, 16 Aug 2015 15:00:15 +0000
> From: [email protected]
> To: [email protected]
> Subject: Re: [Jprogramming] Regex vs I./E. for pattern matching
> 
> note that there is an even faster version
> 
>   10 timespacex '( I. ''CTAG'' E. DNA)' 
> 2.656e_6 6016 
>   10 timespacex '(  ''CTAG'' I.@:E. DNA)' 
> 2.144e_6 6400 
> 
> 
> For Henry's example of 
> 'CTAG*ACTA'
> 
> E. offers some extra flexibility:
> 
> ('CTAG';'ACTA') I.@:E.each < DNA
> 
> One way to use the start indices is to get all of the "overlapping matches":
> 
> rangei =: [ + >:@] i.@- [
> ('CTAG';'ACTA') (] {~each a:-.~ [: ''"_`rangei@.</each [: ,@:(,.each/"0 1&>/) I.@:E.each) < DNA
> 
> ----- Original Message -----
> From: Jon Hough <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc: 
> Sent: Sunday, August 16, 2015 2:09 AM
> Subject: [Jprogramming] Regex vs I./E. for pattern matching
> 
> I recently went through the regex lab, and would like to know whether it is 
> more idiomatic for J users to use regex when matching simple patterns in a 
> string, or to use E. and similar verbs?
> For example, if I have an (imaginary) DNA sequence string:
> DNA=: 'CGATTGACTAGTCGATTGCTGATGCTCTAGTCGTGATGCTATACTAGTGCGTCGATGCTAGCGCTAGTCGCATTTGA'
> I want to find where 'CTAG' sequences exist in this string. Using regex, 
> 'CTAG' rxmatches DNA
> will give the 5 matches (start index and length) where the CTAG pattern is found.
> But I could equally do,
> I. 'CTAG' E. DNA
> which will give me the same start indices. And it seems the non-regex way is more 
> efficient (in time and space):
> 
> timespacex '( I. ''CTAG'' E. DNA)'
> gives 1.5e_5 3008
> 
> timespacex '( ''CTAG'' rxmatches DNA)'
> gives 0.001103 6720
> 
> Granted, this regex is as simple as possible, regex can do more complicated 
> matching than E. can, and rxmatches may gain efficiency over E. for much 
> longer DNA strings. But for simple matches E. seems the better choice.
> 
> 
> 
> 
>                           
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
