This is a problem from leetcode.com (similar to Project Euler):
https://leetcode.com/problems/repeated-dna-sequences/
The problem is to find all 10-letter substrings that occur more than once
in a DNA string (made of the characters C, G, A, T).
My solution:
   NB. select from the nub (~.), since the counts from #/.~ are in nub order
   func =: (I.@:(1&<)@:>@:(1&{)@:(~. ,: <"0@:(#/.~)) { ~.)@:(<"1@:(10&(]\)))
e.g.
   s =: 'AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT'   NB. see the link for this definition
   func s
┌──────────┬──────────┐
│AAAAACCCCC│CCCCCAAAAA│
└──────────┴──────────┘
It is not very pretty. Can anyone improve on it?
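One direction that might be tidier, offered only as a sketch: count the
occurrences of each window with key (#/.~) and use that count as a mask on
the nub, instead of laminating and indexing. The names subs, repeated and
func2 are made up here for illustration.

```j
NB. hypothetical names: subs, repeated, func2
s =: 'AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT'

NB. every 10-character window of y, one boxed string per window
subs =: <"1@:(10&(]\))

NB. keep each unique item whose occurrence count exceeds 1
repeated =: (1 < #/.~) # ~.

func2 =: repeated@:subs
func2 s
```

On the example s this should give the same two boxes as above. Because the
mask is applied to the nub, each repeated substring is reported once, in
order of first appearance.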
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm