Re: Hadoop also applicable in a web app environment?
Thanks. I hope you will keep us informed via this thread.

Kylie McCormick wrote:
> Hello:
>
> I am actually working on this myself in my project Multisearch. The Map() function uses clients to connect to services and collect responses, and the Reduce() function merges them together. I'm also working on putting this into a Servlet, so it can be used via Tomcat.
>
> I've worked with a number of different web services, including OGSA-DAI and Axis Web Services. My experience so far (I have not measured this rigorously yet) is that Hadoop is faster than using these other methods alone. Hopefully by the end of the summer I'll have some more research on this topic (about speed).
>
> The other links posted here are really helpful...
>
> Kylie
>
> On Tue, Aug 5, 2008 at 10:11 AM, Mork0075 <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I just discovered the Hadoop project and it looks really interesting to me. As far as I can see at the moment, Hadoop is really useful for data-intensive computations. Is there a Hadoop scenario for scaling web applications too? Normally web applications are not that computation heavy. The need to scale them arises from a growing number of users, each of whom performs simple operations in their own session, such as querying some data from the database.
> >
> > So, to distribute this scenario, a Hadoop job would "map" the requests to a certain server in the cluster and "reduce" the results. But this is what load balancers normally do; it doesn't solve the scalability problem.
> >
> > So my question: is there a Hadoop scenario for "non computation heavy but heavy load" web applications?
> >
> > Thanks a lot
Re: Hadoop also applicable in a web app environment?
Thanks. I've looked briefly at HBase and thought that it was designed for very large datasets only. But now I have the feeling that it is also suitable for distributed, scalable persistence of small datasets under a heavy request load. Is that right?

Leon Mergen wrote:
> Hello,
>
> On Tue, Aug 5, 2008 at 8:11 PM, Mork0075 <[EMAIL PROTECTED]> wrote:
> > So my question: is there a Hadoop scenario for "non computation heavy but heavy load" web applications?
>
> I suggest you look into HBase, a subproject of Hadoop: http://hadoop.apache.org/hbase/
>
> It is modeled after Google's Bigtable and works on top of Hadoop's DFS. It allows quick retrieval of small portions of data, in a distributed fashion.
>
> Regards,
>
> Leon Mergen
Re: Hadoop also applicable in a web app environment?
Hello:

I am actually working on this myself in my project Multisearch. The Map() function uses clients to connect to services and collect responses, and the Reduce() function merges them together. I'm also working on putting this into a Servlet, so it can be used via Tomcat.

I've worked with a number of different web services, including OGSA-DAI and Axis Web Services. My experience so far (I have not measured this rigorously yet) is that Hadoop is faster than using these other methods alone. Hopefully by the end of the summer I'll have some more research on this topic (about speed).

The other links posted here are really helpful...

Kylie

On Tue, Aug 5, 2008 at 10:11 AM, Mork0075 <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I just discovered the Hadoop project and it looks really interesting to me. As far as I can see at the moment, Hadoop is really useful for data-intensive computations. Is there a Hadoop scenario for scaling web applications too? Normally web applications are not that computation heavy. The need to scale them arises from a growing number of users, each of whom performs simple operations in their own session, such as querying some data from the database.
>
> So, to distribute this scenario, a Hadoop job would "map" the requests to a certain server in the cluster and "reduce" the results. But this is what load balancers normally do; it doesn't solve the scalability problem.
>
> So my question: is there a Hadoop scenario for "non computation heavy but heavy load" web applications?
>
> Thanks a lot
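To make the Map()/Reduce() split described above concrete, here is a minimal sketch in plain Python, without Hadoop. It is not Multisearch's actual code: the service names, the fake result lists, and the "keep the best score" merge rule are all assumptions made up for illustration. The map step queries one service and emits (document, score) pairs; the reduce step merges everything collected for one document.

```python
from collections import defaultdict

# Hypothetical stand-ins for remote search services. A real client would
# make a network call here instead of returning canned results.
SERVICES = {
    "service_a": lambda query: [("doc1", 0.9), ("doc2", 0.4)],
    "service_b": lambda query: [("doc2", 0.7), ("doc3", 0.5)],
}


def map_query(service_name, query):
    """Map step: ask one service and emit (doc_id, score) pairs."""
    client = SERVICES[service_name]
    for doc_id, score in client(query):
        yield doc_id, score


def reduce_scores(doc_id, scores):
    """Reduce step: merge all scores for one document (assumed rule: best wins)."""
    return doc_id, max(scores)


def multisearch(query):
    # Shuffle: group the mapped pairs by key, as a MapReduce framework would.
    grouped = defaultdict(list)
    for name in SERVICES:
        for doc_id, score in map_query(name, query):
            grouped[doc_id].append(score)
    merged = [reduce_scores(doc_id, scores) for doc_id, scores in grouped.items()]
    # Best-scoring documents first.
    return sorted(merged, key=lambda pair: pair[1], reverse=True)


results = multisearch("hadoop")
```

In a real job the framework would run the map calls in parallel across the cluster and do the grouping itself; the loop in `multisearch` only imitates that on one machine.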
Re: Hadoop also applicable in a web app environment?
Hello,

On Tue, Aug 5, 2008 at 8:11 PM, Mork0075 <[EMAIL PROTECTED]> wrote:
> So my question: is there a Hadoop scenario for "non computation heavy but heavy load" web applications?

I suggest you look into HBase, a subproject of Hadoop: http://hadoop.apache.org/hbase/

It is modeled after Google's Bigtable and works on top of Hadoop's DFS. It allows quick retrieval of small portions of data, in a distributed fashion.

Regards,

Leon Mergen
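For readers unfamiliar with the Bigtable model mentioned above: the core idea is a table whose rows are kept sorted by row key, so that both single-row lookups and short range scans are cheap. The toy class below is only a single-machine sketch of that idea, assuming an in-memory table. `TinySortedTable` and its methods are hypothetical names, not the HBase client API, and the sketch ignores distribution, persistence, and versioning entirely.

```python
import bisect


class TinySortedTable:
    """Toy in-memory table with rows kept sorted by key (not HBase itself)."""

    def __init__(self):
        self._keys = []  # row keys, always in sorted order
        self._rows = {}  # row key -> {column: value}

    def put(self, row_key, column, value):
        """Store one cell, inserting the row key in sorted position if new."""
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        """Point lookup of a single row; empty dict if the row is absent."""
        return self._rows.get(row_key, {})

    def scan(self, start, stop):
        """Return rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(key, self._rows[key]) for key in self._keys[lo:hi]]


table = TinySortedTable()
table.put("user:0002", "name", "bob")
table.put("user:0001", "name", "alice")
table.put("session:9", "user", "user:0001")
```

Calling `table.get("user:0001")` fetches one small row directly, and `table.scan("user:", "user;")` walks only the rows whose keys start with `user:` (`;` is the character right after `:` in ASCII), which is the "small portion of the data" access pattern described above.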
Re: Hadoop also applicable in a web app environment?
I am a newbie also, so my answer is not an expert user's by any means. That said:

This is not what MapReduce (MR) is designed for...

If, for example, you have a reporting tool that takes a database a very long time to answer (such a long time that you can't expect a user to hang around waiting for the HTTP response), you might use Hadoop to churn through the data and produce the report, and respond to the user with "your data is being processed, please check back at this_URL soon".

It is not designed to answer real-time synchronous requests (e.g. users clicking on links), nor to handle high traffic load. For that you need more servers and a load balancer, like you say, and you need to scale out your DB with multiple read-only copies.

Consider a search engine: Yahoo crawls web sites and uses MR to process the data and build indexes of the words on pages. But when you search on Yahoo as a user, it is not an MR job that runs to provide the answers.

Here you could say MR plays the role of generating the index "offline", which is then loaded into something that can answer the query immediately. You might consider Lucene or Solr or something similar for that (Solr especially, I would say).

You might find http://highscalability.com/ interesting...

Cheers,

Tim

On Tue, Aug 5, 2008 at 8:11 PM, Mork0075 <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I just discovered the Hadoop project and it looks really interesting to me. As far as I can see at the moment, Hadoop is really useful for data-intensive computations. Is there a Hadoop scenario for scaling web applications too? Normally web applications are not that computation heavy. The need to scale them arises from a growing number of users, each of whom performs simple operations in their own session, such as querying some data from the database.
>
> So, to distribute this scenario, a Hadoop job would "map" the requests to a certain server in the cluster and "reduce" the results. But this is what load balancers normally do; it doesn't solve the scalability problem.
>
> So my question: is there a Hadoop scenario for "non computation heavy but heavy load" web applications?
>
> Thanks a lot
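The offline/online split in the search engine example can be sketched in a few lines of plain Python. This is only an illustration under made-up assumptions: the three "pages" are invented, the batch step runs in-process rather than as a Hadoop job, and a dictionary stands in for a real serving layer such as Lucene or Solr. The point is that the expensive map/reduce-style pass happens once, ahead of time, and the per-request path is just a lookup.

```python
from collections import defaultdict

# Hypothetical crawled pages; a real crawl would hold far more data.
PAGES = {
    "page1": "hadoop runs batch jobs",
    "page2": "solr answers search queries",
    "page3": "hadoop builds the index solr serves",
}


def map_page(url, text):
    """Map step: emit (word, url) for every distinct word on a page."""
    for word in set(text.split()):
        yield word, url


def build_index(pages):
    """Offline batch job: group the URLs by word (the reduce step)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word, page_url in map_page(url, text):
            index[word].add(page_url)
    # Freeze into sorted lists so the serving side only does lookups.
    return {word: sorted(urls) for word, urls in index.items()}


INDEX = build_index(PAGES)  # runs ahead of time, not once per request


def search(word):
    """Online path: a plain dictionary lookup, no batch job involved."""
    return INDEX.get(word, [])
```

A web request would only ever call `search`, for example `search("hadoop")`; `build_index` would be rerun periodically as new pages are crawled, and the fresh index swapped in.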