Can it be used, for example, in a web hosting application to process site 
requests, e.g. for load balancing?

> On 07 Feb 2015, at 09:45, Matt Wallis <[email protected]> wrote:
> 
> Hi Jonathan,
> 
>> On 7 Feb 2015, at 6:20 pm, Jonathan Aquilina <[email protected]> wrote:
>> 
>> Can someone explain to me what exactly the purpose of hadoop is and what we 
>> mean when we say big data? Is this for data storage and retrieval? Number 
>> crunching?
> 
> Hadoop can be thought of as HTC, High Throughput Computing, over a 
> collection of simple servers. Where in HPC you might have hundreds of nodes 
> with a shared file system working on the same copy of the data, Hadoop 
> distributes the data to local storage in each node of the cluster using the 
> Hadoop Distributed File System (HDFS), and then collects the output at the 
> end. I believe it has built-in redundancy, allowing you to distribute the 
> same job to 2 or 3 nodes for fault tolerance. It means your "cluster" can be 
> very simple: no complex parallel filesystems, no specialised networks, no 
> redundancy at the hardware level.
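A minimal sketch of the replication idea described above, assuming a simple round-robin layout. The function names, node names, and the default replication factor of 3 are illustrative assumptions; this is not how HDFS actually chooses replica locations, only a toy model of "keep each block on several nodes so losing a node loses no data".

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Hypothetical helper for illustration only; real HDFS placement
    follows its own policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block in range(num_blocks):
        # Consecutive offsets modulo len(nodes) are distinct because
        # replication <= len(nodes).
        placement[block] = [
            nodes[(block + offset) % len(nodes)] for offset in range(replication)
        ]
    return placement


def surviving_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one live replica."""
    return {
        block
        for block, replicas in placement.items()
        if any(node not in failed_nodes for node in replicas)
    }


if __name__ == "__main__":
    layout = place_blocks(6, ["n1", "n2", "n3", "n4"])
    print(layout)
    print(surviving_blocks(layout, failed_nodes={"n1", "n2"}))
```

With three copies of every block, any two nodes in this toy layout can fail and every block is still readable from some remaining node.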
> 
> Hadoop was originally built with MapReduce as its core application, but a 
> number of other applications can be found on the Apache website. 
> 
> As for big data, this is basically about taking things like 10 billion 
> tweets, breaking them up into chunks of 500,000 or so, and doing analytics on 
> them. Things like that break up very easily for distribution, as there is 
> usually very little linkage between each tweet. 
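The chunk-and-analyse pattern described above can be sketched in plain Python, assuming the "analytics" is a hashtag count. This runs in a single process and does not use Hadoop; the function names and the hashtag task are illustrative assumptions, and the 500,000 default only mirrors the chunk size mentioned in the message.

```python
from collections import Counter
from itertools import islice


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def map_chunk(tweets):
    """Map step: count hashtags within one chunk, independently of the others."""
    counts = Counter()
    for tweet in tweets:
        for word in tweet.split():
            if word.startswith("#"):
                counts[word.lower()] += 1
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into one total."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def count_hashtags(tweets, chunk_size=500_000):
    return reduce_counts(map_chunk(chunk) for chunk in chunks(tweets, chunk_size))


if __name__ == "__main__":
    sample = ["#hadoop rocks", "big data #Hadoop #hpc", "no tags here"]
    print(count_hashtags(sample, chunk_size=2))
```

Because each chunk is processed without reference to any other, the map step could run on a different node for every chunk, with only the small per-chunk counts sent back for the final merge.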
> 
> Hadoop came out of the need for places like Google, Yahoo, PayPal and eBay to 
> process terabytes of transaction logs an hour. They already had the servers, 
> but they were in data centres all over the world. Rather than hook them all 
> up to some common file server, just build a system to package up the data and 
> the application and send it to wherever can process it the quickest. Send it 3 
> times to make sure it gets done, then pull back the results at the end.
> 
> Matt.
> _______________________________________________
> Beowulf mailing list, [email protected] sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf