[algogeeks] I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

wtx...@gmail.com Thu, 07 Aug 2014 18:27:06 -0700

I am developing a new suffix array construction algorithm that is not based 
on the KA, SA-IS or Skew algorithms. Its performance depends on Max(LCPs), 
the largest value in the longest-common-prefix (LCP) array of the suffix 
array. It will work for 8-bit character strings without any code change, 
but it needs some refinement to deal with genome data.
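
To make the Max(LCPs) quantity concrete, here is a minimal Python sketch. 
It is not the new algorithm described above: it builds the suffix array by 
naive sorting and computes the LCP array with Kasai's method, only to show 
what value is meant. The function names are illustrative assumptions.

```python
def suffix_array(s):
    # Naive construction for illustration: sort the suffix start
    # positions by the suffix text itself.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    # Kasai's method: lcp[r] is the length of the common prefix of the
    # suffixes at sa[r - 1] and sa[r]; lcp[0] is 0 by convention.
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def max_lcp(s):
    # Largest entry of the LCP array; 0 for an empty string.
    if not s:
        return 0
    return max(lcp_array(s, suffix_array(s)))
```

For a highly repetitive input such as "ACACACAC" the value is close to the 
string length, while for "ACGT" it is 0, which is why repeats matter for an 
algorithm whose cost grows with Max(LCPs).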


I would like to learn some specifics about the genome DNA sequence data 
used for testing. I know nothing about DNA sequences or biology.
 
1. Which are the best books about genome DNA sequence processing for 
someone who is developing a new suffix array construction algorithm and 
wants it to work well for DNA analysis? 
2. Is there any existing suffix array construction algorithm whose 
performance depends on Max(LCPs)?
3. A genome DNA test file contains only 4 characters: A, C, G and T. Is 
that right? I found another char, U, in RNA. Does such a file still contain 
only 4 characters? 
4. If the number of chars in a file is limited to 4, and all repeating 
patterns are known, I can design specific refinements to improve my 
algorithm's performance. I want to know: in addition to repetitions of 1, 
2, 3 and 4 chars, are repeated sequences of 5 or more chars common? And if 
they are common, how many different chars does the longest common repeated 
unit contain? 
1-char repetition: AAAAAAAA...
2-char repetition: ACACACACACACACA... 
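
The kind of repetition asked about in question 4 can be measured with a 
short Python sketch. It assumes a "k-char repetition" means the whole 
string has smallest period k, and it finds that period with the standard 
prefix function (KMP failure function). The names prefix_function and 
repeat_unit are illustrative assumptions, not part of any library.

```python
def prefix_function(s):
    # pi[i] is the length of the longest proper prefix of s[:i + 1]
    # that is also a suffix of it.
    pi = [0] * len(s)
    for i in range(1, len(s)):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


def repeat_unit(s):
    # The smallest period of s is len(s) minus its longest border.
    # The last copy of the unit may be cut short, as in "ACACA".
    if not s:
        return ""
    period = len(s) - prefix_function(s)[-1]
    return s[:period]


if __name__ == "__main__":
    for text in ("AAAAAAAA", "ACACACACACACACA", "ACGACGAC", "ACGT"):
        unit = repeat_unit(text)
        # Report the unit, its length, and how many distinct chars it uses.
        print(text, unit, len(unit), len(set(unit)))
```

A string with no repetition, such as "ACGT", is reported as its own unit.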

Thank you.

Weng

