----------------------------------------
> Date: Sun, 10 Oct 2010 15:27:11 +0200
> From: lorenzo.ise...@gmail.com
> To: dwinsem...@comcast.net
> CC: r-help@r-project.org
> Subject: Re: [R] Memory management in R
>
>
> > I already offered the Biostrings package. It provides more robust
> > methods for string matching than does grepl. Is there a reason that you
> > choose not to?
> >
>
> Indeed that is the way I should go for and I have installed the package
> after some struggling. Since biostring is a fairly complex package and I
> need only a way to check if a certain string A is a subset of string B,
> do you know the biostring functions to achieve this?
> I see a lot of methods for biological (DNA, RNA) sequences, and they may
> not apply to my series (which are definitely not from biology).

Generally the differences relate to the alphabet and to "things you may
want to know about them." Unless you are looking for reverse-complement
text strings, there will be a lot of stuff in there you don't need. Offhand,
I'd look at things like computational linguistics packages instead,
since you are trying to find patterns or predictability in human-readable
character sequences. Now, humans can probably write hairpin text (look
at what RNA can do, LOL), but this is probably not what you care about.

However, as I mentioned earlier, I had to write my own regex compiler
(coincidentally for bio apps) to get the required performance. Your
application and understanding may benefit from things like building
dictionaries, which aren't really part of regex and which can easily be
done in a few lines of C++ code using STL containers. To get statistically
meaningful samples, you will almost certainly need faster code.
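
As a rough sketch of the dictionary idea (this is not my actual code, and
the names count_kgrams and contains are made up for illustration), something
like the following counts every length-k substring of a series in an STL
hash map and also does the plain "is A inside B" check with std::string::find.
It assumes your series is already encoded as a std::string, one character
per symbol:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper: count every contiguous length-k substring (k-gram)
// of `text`. Returns an empty map if k is 0 or longer than the text.
std::unordered_map<std::string, std::size_t>
count_kgrams(const std::string& text, std::size_t k) {
    std::unordered_map<std::string, std::size_t> counts;
    if (k == 0 || text.size() < k) {
        return counts;
    }
    for (std::size_t i = 0; i + k <= text.size(); ++i) {
        ++counts[text.substr(i, k)];  // operator[] default-initializes to 0
    }
    return counts;
}

// Hypothetical helper: true if `needle` occurs as a contiguous substring
// of `haystack` (literal match, no regex involved).
bool contains(const std::string& haystack, const std::string& needle) {
    return haystack.find(needle) != std::string::npos;
}
```

For example, count_kgrams("abcabc", 3) gives a map with three keys, where
"abc" has count 2. Once the map is built, looking up how often a pattern
occurs is a single hash lookup rather than a rescan of the whole series.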




> Cheers
>
> Lorenzo
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
                                          
