Here is the suffix array benchmark results page on Google Sites:

https://sites.google.com/site/yuta256/sais

Does the test corpus contain DNA sequence data?

It has a file named chr22.dna. Is that the DNA of chromosome 22?
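
One way to check which symbols such a file actually contains is to count its byte values. This is only a minimal sketch: the function name `symbol_counts` is made up for illustration, the file path in the comment is an assumption, and it assumes the file is small enough to read into memory.

```python
from collections import Counter

def symbol_counts(data: bytes) -> Counter:
    """Count how often each byte value occurs, ignoring line breaks."""
    return Counter(b for b in data if b not in b"\r\n")

# Hypothetical usage on a real file (the path is an assumption):
# with open("chr22.dna", "rb") as f:
#     print(symbol_counts(f.read()).most_common())

# Small in-memory sample so the sketch runs on its own.
sample = b"ACGT\nACGA\n"
counts = symbol_counts(sample)
for byte_value, n in sorted(counts.items()):
    print(chr(byte_value), n)
```

If the result has more than four keys, the input is not limited to A, C, G and T.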

Weng

On Thursday, August 7, 2014 6:26:19 PM UTC-7, wtx...@gmail.com wrote:
>
> I am developing a new algorithm for constructing suffix arrays that is not
> based on the KA, SA-IS or Skew algorithms. Its performance depends on
> Max(LCP), the largest longest-common-prefix value in the suffix array. It
> will work on 8-bit character strings without any code change, but it needs
> some refinement to deal with genome data.
>
> I would like to learn some specifics about genome DNA test data. I
> know nothing about DNA sequences or biology.
>  
> 1. Which are the best books about genome DNA sequence processing for
> someone who is developing a new suffix array construction algorithm and
> wants it to work well for DNA analysis?
> 2. Is there any existing suffix array construction algorithm whose
> performance depends on Max(LCP)?
> 3. A genome DNA test file contains only 4 characters: A, C, G and T. Is
> that right? I found another character, U, in RNA. Does such a file still
> contain only 4 characters?
> 4. If the number of characters in a file is limited to 4, and all repeating
> patterns are known, I can design specific refinements to improve my
> algorithm's performance. In addition to repetitions of 1, 2, 3 and 4
> characters, are repeating sequences of 5 or more characters common? If
> so, how many different characters does the largest common repetition
> contain?
> 1-character repetition: AAAAAAAA...
> 2-character repetition: ACACACACACACACA...
>
> Thank you.
>
> Weng
>
>
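
To make the Max(LCP) quantity in the quoted message concrete, here is a small sketch that builds a suffix array by plain sorting and takes the largest LCP between adjacent suffixes. It is a naive illustration only: it is not the new algorithm described above, nor KA, SA-IS or Skew, and the function names are made up for this example.

```python
def suffix_array(s: str) -> list:
    """Naive suffix array: sort suffix start positions by the suffix text."""
    return sorted(range(len(s)), key=lambda i: s[i:])

def lcp(s: str, i: int, j: int) -> int:
    """Length of the longest common prefix of s[i:] and s[j:]."""
    n = 0
    while i + n < len(s) and j + n < len(s) and s[i + n] == s[j + n]:
        n += 1
    return n

def max_lcp(s: str) -> int:
    """Largest LCP between suffixes that are adjacent in the suffix array."""
    sa = suffix_array(s)
    return max((lcp(s, a, b) for a, b in zip(sa, sa[1:])), default=0)

print(max_lcp("AAAAAAAA"))  # 1-character repetition, length 8
print(max_lcp("ACACACAC"))  # 2-character repetition, length 8
print(max_lcp("ACGT"))      # no repeated character
```

For a string of length n that repeats a unit of p characters throughout, this value is n - p, so long repeats drive Max(LCP) close to the length of the input.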
