[algogeeks] Re: I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

2014-08-08 Thread wtx...@gmail.com
Odysseus,
What does the character 'N' mean in chr22.dna? I thought DNA has only 4 
characters: A, C, G and T.

Do other DNA/RNA files contain any other characters? I read on Wikipedia that 
'U' appears in tRNA.
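
One way to settle the alphabet question empirically is to count which symbols a file actually contains. The sketch below is only an illustration and not something posted in the thread: `symbol_counts` is a hypothetical helper name, and it assumes the file is plain single-byte text. In nucleotide sequence files, 'N' is the IUPAC code for an unknown or unspecified base, so it commonly appears next to A, C, G and T.

```python
from collections import Counter


def symbol_counts(path, chunk_size=1 << 20):
    """Count how often each byte value occurs in the file at `path`.

    The file is read in chunks so that a large sequence file does not
    have to fit in memory at once.
    """
    counts = Counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Iterating over bytes yields integers 0..255.
            counts.update(chunk)
    # Map byte values back to one-character strings for readability.
    return {chr(b): n for b, n in counts.items()}
```

Calling `symbol_counts("chr22.dna")` would return a dictionary mapping each character to its count. Any key other than A, C, G and T (for example N, or a newline) is a symbol the suffix array code has to handle.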

Where is a tRNA file I can use?

Thank you.

Weng

On Thursday, August 7, 2014 6:26:19 PM UTC-7, wtx...@gmail.com wrote:
>
> I am developing a new algorithm for constructing a suffix array that is not 
> based on the KA, SA-IS or Skew algorithms. Its performance depends on 
> Max(LCP), the largest value in the longest-common-prefix array of the suffix 
> array. It will work for 8-bit character strings without any code change, but 
> it needs some refinement to deal with genome data. 
>
> I would like some background on genome DNA test data. I know nothing about 
> DNA sequences or biology.
>  
> 1. Which are the best books on genome DNA sequence processing for someone 
> who is developing a new suffix array construction algorithm and wants it to 
> work well for DNA analysis? 
> 2. Is there any existing suffix array construction algorithm whose 
> performance depends on Max(LCP)?
> 3. Does a genome DNA test file contain only 4 characters: A, C, G and T? I 
> found another character, U, in RNA. Does the file still contain only 4 
> characters? 
> 4. If the number of characters in a file is limited to 4 and all repeating 
> patterns are known, I can design specific refinements to improve my 
> algorithm's performance. In addition to repetitions with a period of 1, 2, 3 
> and 4 characters, are repetitions with a period of 5 or more characters 
> common? If so, how many different characters does the longest common 
> repeating unit contain? 
> 1-character repetition: ...
> 2-character repetition: ACACACACACACACA... 
>
> Thank you.
>
> Weng
>
>
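
Max(LCP) has a concrete meaning: it is the length of the longest substring that occurs at least twice in the text, so long repeats in DNA translate directly into large LCP values. A minimal sketch for measuring it on a small sample follows. It is not the algorithm described above and it is not KA, SA-IS or Skew: the suffix array is built by naive sorting, which is far too slow for a whole chromosome, and the LCP array uses the linear-time method of Kasai et al. The function names are made up for this illustration.

```python
def suffix_array(s):
    """Naive construction: sort suffix start positions by suffix text."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """Kasai et al.: lcp[r] is the length of the common prefix of the
    suffixes at sa[r - 1] and sa[r]; lcp[0] is 0 by convention."""
    n = len(s)
    rank = [0] * n
    for r, i in enumerate(sa):
        rank[i] = r
    lcp = [0] * n
    h = 0  # current match length, carried over between iterations
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]  # suffix that precedes suffix i in sorted order
            while i + h < n and j + h < n and s[i + h] == s[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h - 1 characters
        else:
            h = 0
    return lcp


def longest_repeat(s):
    """Return (Max(LCP), one longest substring occurring at least twice)."""
    if not s:
        return 0, ""
    sa = suffix_array(s)
    lcp = lcp_array(s, sa)
    r = max(range(len(s)), key=lambda k: lcp[k])
    return lcp[r], s[sa[r]:sa[r] + lcp[r]]
```

For the sample pattern `ACACACACACACACA` (15 characters), `longest_repeat` reports a Max(LCP) of 13, because the suffix starting at position 2 is a prefix of the whole string. Periodic runs like this are what drive Max(LCP) up.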

-- 
You received this message because you are subscribed to the Google Groups 
"Algorithm Geeks" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to algogeeks+unsubscr...@googlegroups.com.


[algogeeks] Re: I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

2014-08-08 Thread wtx...@gmail.com
Odysseus,
Thank you very much!!! 

I can open it with Microsoft Notepad to view it.

This is really what I want!!!

Weng






Re: [algogeeks] Re: I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

2014-08-08 Thread Carl Barton
It's just a plain text file; use whatever text editor you like.
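
A whole-chromosome file can be large, and some editors open such files slowly. As an alternative, a small script can print just the beginning of the file. This is a sketch with a hypothetical helper name, `peek`; it assumes the file is ASCII text.

```python
def peek(path, n_bytes=200):
    """Return the first n_bytes of the file, decoded as ASCII text."""
    with open(path, "rb") as f:
        head = f.read(n_bytes)
    # Replace any non-ASCII byte so an unexpected binary header is visible
    # instead of raising a decoding error.
    return head.decode("ascii", errors="replace")
```

For example, `print(peek("chr22.dna"))` would show the first 200 characters without loading the rest of the file.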


On 8 August 2014 10:42, wtx...@gmail.com  wrote:

> I downloaded the file chr22.dna. I don't know what software to use to
> browse its contents and view its data pattern. This file is really what I
> need. Please tell me what software I should use to browse the contents.
>
> Weng
>


[algogeeks] Re: I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

2014-08-08 Thread wtx...@gmail.com
I downloaded the file chr22.dna. I don't know what software to use to browse 
its contents and view its data pattern. This file is really what I need. 
Please tell me what software I should use to browse the contents.

Weng



Re: [algogeeks] Re: I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

2014-08-08 Thread Carl Barton
Almost certainly yes, but that website also gives the links to the files
used in the benchmark. So you can just check yourself.


On 8 August 2014 10:23, wtx...@gmail.com  wrote:

> Here is a website with suffix array benchmark results, hosted on Google
> Sites:
>
> https://sites.google.com/site/yuta256/sais
>
> Does the test corpus contain DNA data?
>
> It has a file named chr22.dna. Is that chromosome 22 DNA?
>
> Weng
>


[algogeeks] Re: I am developing a new algorithm constructing Suffix Array and I want some knowledge on genome

2014-08-08 Thread wtx...@gmail.com
Here is a website with suffix array benchmark results, hosted on Google Sites:

https://sites.google.com/site/yuta256/sais

Does the test corpus contain DNA data?

It has a file named chr22.dna. Is that chromosome 22 DNA?

Weng
