Re: Hadoop also applicable in a web app environment?

2008-08-06 Thread Mork0075

Thanks. Hopefully you'll keep us informed via this thread.

Kylie McCormick wrote:

> Hello:
> I am actually working on this myself on my project Multisearch. The Map()
> function uses clients to connect to services and collect responses, and the
> Reduce() function merges them together. I'm working on putting this into a
> Servlet as well, so it can be used via Tomcat.
>
> I've worked with a number of different web services... OGSA-DAI and Axis Web
> Services. My experience with Hadoop (not yet backed by complete research)
> is that it is faster than using these other methods alone. Hopefully by the
> end of the summer I'll have some more research on this topic (about speed).
>
> The other links posted here are really helpful...
>
> Kylie
>
> On Tue, Aug 5, 2008 at 10:11 AM, Mork0075 <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> I just discovered the Hadoop project and it looks really interesting to me.
>> As far as I can see at the moment, Hadoop is really useful for data-intensive
>> computations. Is there a Hadoop scenario for scaling web applications too?
>> Normally web applications are not that computation heavy. The need to
>> scale them arises from a growing number of users, each performing (within
>> their own session) simple operations like querying some data from the database.
>>
>> So, to distribute this scenario, a Hadoop job would "map" the requests
>> to a certain server in the cluster and "reduce" them. But this is what load
>> balancers normally do, so it doesn't solve the scalability problem.
>>
>> So my question: is there a Hadoop scenario for "non computation heavy but
>> heavy load" web applications?
>>
>> Thanks a lot

Re: Hadoop also applicable in a web app environment?

2008-08-06 Thread Mork0075
Thanks. I've looked briefly at HBase and thought that it was designed
for very large datasets only. But now I've got the feeling that it's
also suitable for distributed, scalable persistence of small datasets
under a heavy request load. Is that right?


Leon Mergen wrote:

> Hello,
>
> On Tue, Aug 5, 2008 at 8:11 PM, Mork0075 <[EMAIL PROTECTED]> wrote:
>
>> So my question: is there a Hadoop scenario for "non computation heavy but
>> heavy load" web applications?
>
> I suggest you look into HBase, a subproject of Hadoop:
> http://hadoop.apache.org/hbase/ -- it is modeled on Google's Bigtable
> and works on top of Hadoop's DFS. It allows quick retrieval of small
> portions of data, in a distributed fashion.
>
> Regards,
>
> Leon Mergen

Re: Hadoop also applicable in a web app environment?

2008-08-05 Thread Kylie McCormick
Hello:
I am actually working on this myself on my project Multisearch. The Map()
function uses clients to connect to services and collect responses, and the
Reduce() function merges them together. I'm working on putting this into a
Servlet as well, so it can be used via Tomcat.
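
A minimal sketch of that fan-out-and-merge pattern in plain Python, without
Hadoop. The service names, the (doc, score) result shape, and the
keep-the-best-score merge rule are all assumptions made for illustration; this
is not Multisearch's actual code.

```python
from collections import defaultdict

# Hypothetical stand-ins for remote search services; each returns (doc, score) hits.
SERVICES = {
    "service_a": lambda query: [("doc1", 0.9), ("doc2", 0.4)],
    "service_b": lambda query: [("doc2", 0.7), ("doc3", 0.5)],
}


def map_query(service_name, query):
    """Map step: call one service and emit its (doc, score) pairs."""
    client = SERVICES[service_name]
    for doc, score in client(query):
        yield doc, score


def reduce_hits(pairs):
    """Reduce step: keep the best score per doc, then rank by score."""
    best = defaultdict(float)
    for doc, score in pairs:
        best[doc] = max(best[doc], score)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


def multisearch(query):
    """Run the map step against every service and merge the results."""
    pairs = [pair for name in SERVICES for pair in map_query(name, query)]
    return reduce_hits(pairs)
```

In a real deployment each map call would be a network request, which is where
running the calls in parallel could pay off.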

I've worked with a number of different web services... OGSA-DAI and Axis Web
Services. My experience with Hadoop (not yet backed by complete research)
is that it is faster than using these other methods alone. Hopefully by the
end of the summer I'll have some more research on this topic (about speed).

The other links posted here are really helpful...

Kylie


On Tue, Aug 5, 2008 at 10:11 AM, Mork0075 <[EMAIL PROTECTED]> wrote:

> Hello,
>
> I just discovered the Hadoop project and it looks really interesting to me.
> As far as I can see at the moment, Hadoop is really useful for data-intensive
> computations. Is there a Hadoop scenario for scaling web applications too?
> Normally web applications are not that computation heavy. The need to
> scale them arises from a growing number of users, each performing (within
> their own session) simple operations like querying some data from the database.
>
> So, to distribute this scenario, a Hadoop job would "map" the requests
> to a certain server in the cluster and "reduce" them. But this is what load
> balancers normally do, so it doesn't solve the scalability problem.
>
> So my question: is there a Hadoop scenario for "non computation heavy but
> heavy load" web applications?
>
> Thanks a lot
>




Re: Hadoop also applicable in a web app environment?

2008-08-05 Thread Leon Mergen
Hello,

On Tue, Aug 5, 2008 at 8:11 PM, Mork0075 <[EMAIL PROTECTED]> wrote:

> So my question: is there a Hadoop scenario for "non computation heavy but
> heavy load" web applications?


I suggest you look into HBase, a subproject of Hadoop:
http://hadoop.apache.org/hbase/ -- it is modeled on Google's Bigtable
and works on top of Hadoop's DFS. It allows quick retrieval of small
portions of data, in a distributed fashion.
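
To give a feel for why small lookups can stay quick in a Bigtable-style store,
here is a toy model in Python: rows are kept sorted by key and split into
contiguous key ranges, so a read only touches the one range that owns the key.
The class and method names are hypothetical and this is not the HBase client
API; in a real cluster each range would live on a different server.

```python
import bisect


class ToyTable:
    """Toy model: rows sorted by key and split into contiguous key ranges."""

    def __init__(self, split_keys):
        # split_keys are the first row keys of ranges 1..n; range 0 starts at "".
        self.split_keys = sorted(split_keys)
        self.regions = [dict() for _ in range(len(self.split_keys) + 1)]

    def region_for(self, row_key):
        # Binary search over the split points finds the owning range.
        return bisect.bisect_right(self.split_keys, row_key)

    def put(self, row_key, value):
        self.regions[self.region_for(row_key)][row_key] = value

    def get(self, row_key):
        # Only one range is consulted, however large the whole table is.
        return self.regions[self.region_for(row_key)].get(row_key)
```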

Regards,

Leon Mergen


Re: Hadoop also applicable in a web app environment?

2008-08-05 Thread tim robertson
I am also a newbie, so my answer is by no means an expert's.
That said:

This is not what MapReduce (MR) is designed for...

If, for example, you have a reporting tool whose queries take a database a
very long time to answer - so long that you can't expect a
user to hang around waiting for the HTTP response - you might use
Hadoop to churn through the data and produce the report, with a
response to the user like "your data is being processed, please check back
at this_URL soon".
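
The "check back soon" flow above might be sketched like this in Python. The
in-memory job table, the /reports/<id> URL scheme and the function names are
assumptions for illustration; run_pending() merely stands in for whatever
batch job (for example a Hadoop job) actually produces the report.

```python
import itertools

_jobs = {}
_ids = itertools.count(1)


def submit_report(params):
    """Request handler: register the job and return at once with a URL to poll."""
    job_id = next(_ids)
    _jobs[job_id] = {"params": params, "status": "processing", "result": None}
    return f"/reports/{job_id}"  # hypothetical URL scheme


def run_pending(build_report):
    """Stand-in for the batch side finishing all pending work."""
    for job in _jobs.values():
        if job["status"] == "processing":
            job["result"] = build_report(job["params"])
            job["status"] = "done"


def check_report(url):
    """Polling handler: return the status, plus the result once it exists."""
    job = _jobs[int(url.rsplit("/", 1)[1])]
    return job["status"], job["result"]
```

The point of the split is that the HTTP request never waits on the long
computation; it only records the job or reads its current state.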

It is not designed as the thing that answers real-time synchronous
requests though (e.g. users clicking on links), nor to handle high
traffic load - for that you need more servers and a load balancer,
as you say, plus scaling out your DB to have multiple read-only
copies.
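
As a rough picture of the read-only-copies idea, the Python sketch below sends
writes to a primary and rotates reads across the copies. The class is
hypothetical, and replication is done synchronously only to keep the example
short; real setups typically replicate asynchronously and so can serve
slightly stale reads.

```python
import itertools


class ReplicatedDB:
    """Writes go to the primary; reads rotate across read-only copies."""

    def __init__(self, n_replicas):
        self.primary = {}
        self.replicas = [dict() for _ in range(n_replicas)]
        self._next = itertools.cycle(range(n_replicas))

    def write(self, key, value):
        self.primary[key] = value
        # Simplification: copy to every replica immediately.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Round-robin choice, much as a load balancer might make it.
        idx = next(self._next)
        return idx, self.replicas[idx].get(key)
```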

Consider a search engine - Yahoo is crawling all the web sites, and
using MR to process the data to create indexes of the words on pages.
But when you search on Yahoo as a user, it is not an MR job that is
running to provide the answers.  Here you could say MR is playing the
role of generating the index "offline", which is then loaded into
something that can answer the query immediately.  You might consider
Lucene or Solr or something for that... (Solr especially, I would say)
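
The offline/online split can be sketched in a few lines of Python: a
map/reduce-style batch step builds an inverted index, and the user-facing
search is then just a lookup. The page data and function names are made up
for illustration, and the whitespace tokenizer is a deliberate
simplification; a real system would hand the serving side to something like
Lucene or Solr.

```python
from collections import defaultdict


def map_page(url, text):
    """Map: emit (word, url) for every distinct word on a page."""
    for word in set(text.lower().split()):
        yield word, url


def reduce_postings(pairs):
    """Reduce: collect the urls for each word into a sorted posting list."""
    index = defaultdict(set)
    for word, url in pairs:
        index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}


def build_index(pages):
    """Offline batch step: run map over all pages, then reduce."""
    pairs = [pair for url, text in pages.items() for pair in map_page(url, text)]
    return reduce_postings(pairs)


def search(index, word):
    """Online step: a dictionary lookup, with no batch job involved."""
    return index.get(word.lower(), [])
```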

You might find http://highscalability.com/ interesting...

Cheers,

Tim


On Tue, Aug 5, 2008 at 8:11 PM, Mork0075 <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I just discovered the Hadoop project and it looks really interesting to me.
> As far as I can see at the moment, Hadoop is really useful for data-intensive
> computations. Is there a Hadoop scenario for scaling web applications too?
> Normally web applications are not that computation heavy. The need to
> scale them arises from a growing number of users, each performing (within
> their own session) simple operations like querying some data from the database.
>
> So, to distribute this scenario, a Hadoop job would "map" the requests
> to a certain server in the cluster and "reduce" them. But this is what load
> balancers normally do, so it doesn't solve the scalability problem.
>
> So my question: is there a Hadoop scenario for "non computation heavy but
> heavy load" web applications?
>
> Thanks a lot
>