On Mon, Mar 28, 2011 at 4:51 AM, Franco Nazareno <franco.nazar...@gmail.com> wrote:
> Good day everyone!
>
> First, I want to congratulate the group for this wonderful project. It did
> open up new ideas and solutions in computing and technology-wise. I'm
> excited to learn more about it and discover possibilities using Hadoop and
> its components.
>
> Well I just want to ask this with regards to my study. Currently I'm
> studying my PhD course in Bioinformatics, and my question is that can you
> give me a (rough) idea if it's possible to use Hadoop cluster in achieving a
> DNA sequence alignment? My basic idea for this goes something like a string
> search out of huge data files stored in HDFS, and the application uses
> MapReduce in searching and computing. As the Hadoop paradigm implies, it
> doesn't serve well in interactive applications, and I think this kind of
> searching is a "write-once, read-many" application.

Are you looking for something like a "distributed grep"? The Hadoop package
comes with some examples, and 'grep' is one of them. Please see:

http://wiki.apache.org/hadoop/Grep
http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html

Let us know if you are looking for something else.

-b

> I hope you don't mind my question. And it'll be great hearing your comments
> or suggestions about this.
>
> Thanks and more power!
>
> Franco
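
To make the "distributed grep" idea concrete, here is a minimal local sketch in Python of the map and reduce steps such a job performs: the map step emits (match, 1) for every regex hit in a line, and the reduce step sums the counts per match. It is not the code of the bundled Hadoop example. The function names (map_matches, reduce_counts, local_grep) and the sample reads are made up for illustration, and everything runs in a single process rather than on a cluster. Note that this is exact or regex matching only. It does not handle the mismatches and gaps that real sequence alignment needs, and it misses matches that span a line boundary.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def map_matches(line: str, pattern: re.Pattern) -> Iterator[Tuple[str, int]]:
    """Map step: emit (matched_text, 1) for each non-overlapping match in a line."""
    for match in pattern.finditer(line):
        yield match.group(0), 1


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the counts emitted for each distinct matched string."""
    totals: Counter = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)


def local_grep(lines: Iterable[str], regex: str) -> Dict[str, int]:
    """Run map then reduce in one process (stand-in for a real cluster job)."""
    pattern = re.compile(regex)
    pairs = (pair for line in lines for pair in map_matches(line, pattern))
    return reduce_counts(pairs)


if __name__ == "__main__":
    # Hypothetical input: one sequence fragment per line, as text files in HDFS might hold.
    reads = ["ACGTTGCA", "TTGACGTA", "GGGG"]
    print(local_grep(reads, "ACG[TA]"))
```

On a cluster, the same two functions would be wrapped as a mapper and a reducer (for example, as scripts reading stdin and writing tab-separated key/value lines), with the framework handling the grouping by key between them.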