Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "PoweredBy" page has been changed by LarsFrancke:
http://wiki.apache.org/hadoop/PoweredBy?action=diff&rev1=394&rev2=395

Comment:
Fix formatting and remove what looks like Spam

  This page documents an alphabetical list of institutions that are using Hadoop for educational or production purposes. Companies that offer services on or based around Hadoop are listed in [[Distributions and Commercial Support]]. Please include details about your cluster hardware and size. Entries without this may be mistaken for spam references and deleted.''
  
- To add entries you need write permission to the wiki, which you can get by 
subscribing to the core-...@hadoop.apache.org mailing list and asking for the 
wiki account you have just created to get this permission. If you are using 
Hadoop in production you ought to consider getting involved in the development 
process anyway, by filing bugs, testing beta releases, reviewing the code and 
turning your notes into shared documentation. Your participation in this 
process will ensure your needs get met.
+ To add entries you need write permission to the wiki, which you can get by 
subscribing to the common-...@hadoop.apache.org mailing list and asking for the 
wiki account you have just created to get this permission. If you are using 
Hadoop in production you ought to consider getting involved in the development 
process anyway, by filing bugs, testing beta releases, reviewing the code and 
turning your notes into shared documentation. Your participation in this 
process will ensure your needs get met.
  
  {{{
  }}}
@@ -176, +176 @@

   * ''532-node cluster (8 * 532 cores, 5.3PB). ''
   * ''Heavy usage of Java MapReduce, Pig, Hive, HBase ''
   * ''Using it for search optimization and research. ''
+ 
-  * ''[[http://ecircle.com|eCircle]] ''
+  * ''[[http://ecircle.com|eCircle]]''
-   * ''two 60 nodes cluster each >1000 cores, total 5T Ram, 1PB
+   * ''two 60-node clusters, each with >1000 cores; 5TB RAM, 1PB in total''
-   * mostly HBase, some M/R
+   * ''mostly HBase, some M/R''
-   * marketing data handling
+   * ''marketing data handling''
+ 
   * ''[[http://www.enet.gr|Enet]], 'Eleftherotypia' newspaper, Greece ''
    * ''Experimental installation - storage for logs and digital assets ''
   * ''Currently a 5-node cluster ''
@@ -211, +213 @@

  = F =
   * ''[[http://www.facebook.com/|Facebook]] ''
    * ''We use Hadoop to store copies of internal log and dimension data 
sources and use it as a source for reporting/analytics and machine learning. ''
+   * ''Currently we have 2 major clusters:''
-   * ''Currently we have 2 major clusters:    * A 1100-machine cluster with 
8800 cores and about 12 PB raw storage. ''
+    * ''A 1100-machine cluster with 8800 cores and about 12 PB raw storage. ''
     * ''A 300-machine cluster with 2400 cores and about 3 PB raw storage. ''
     * ''Each (commodity) node has 8 cores and 12 TB of storage. ''
     * ''We are heavy users of both streaming and the Java APIs. We have built a higher-level data warehousing framework called Hive using these features (see http://hadoop.apache.org/hive/). We have also developed a FUSE implementation over HDFS. ''
@@ -372, +375 @@

   * ''Used for user profile analysis, statistical analysis, cookie-level reporting tools. ''
   * ''Some Hive, but mainly automated Java MapReduce jobs that process ~150MM new events/day. ''
  
+  * ''[[https://lbg.unc.edu|Lineberger Comprehensive Cancer Center - 
Bioinformatics Group]]''
-  * ''[[https://lbg.unc.edu|Lineberger Comprehensive Cancer Center - 
Bioinformatics Group]] This is the cancer center at UNC Chapel Hill. We are 
using Hadoop/HBase for databasing and analyzing Next Generation Sequencing 
(NGS) data produced for the [[http://cancergenome.nih.gov/|Cancer Genome 
Atlas]] (TCGA) project and other groups. This development is based on the 
[[http://seqware.sf.net|SeqWare]] open source project which includes SeqWare 
Query Engine, a database and web service built on top of HBase that stores 
sequence data types. Our prototype cluster includes: ''
+   * ''This is the cancer center at UNC Chapel Hill. We are using Hadoop/HBase to store and analyze Next Generation Sequencing (NGS) data produced for the [[http://cancergenome.nih.gov/|Cancer Genome Atlas]] (TCGA) project and other groups. This development is based on the [[http://seqware.sf.net|SeqWare]] open source project, which includes SeqWare Query Engine, a database and web service built on top of HBase that stores sequence data types. Our prototype cluster includes: ''
-   * ''8 dual quad core nodes running CentOS ''
+    * ''8 dual quad core nodes running CentOS ''
-   * ''total of 48TB of HDFS storage ''
+    * ''total of 48TB of HDFS storage ''
-   * ''HBase & Hadoop version 0.20 ''
+    * ''HBase & Hadoop version 0.20 ''
  
   * ''[[http://www.legolas-media.com|Legolas Media]] ''
  
@@ -391, +395 @@

      * ''Pig 0.9 heavily customized ''
      * ''Azkaban for scheduling ''
      * ''Hive, Avro, Kafka, and other bits and pieces... ''
- 
-  * ''We use these things for discovering People You May Know and 
[[http://www.linkedin.com/careerexplorer/dashboard|other]] 
[[http://inmaps.linkedinlabs.com/|fun]] 
[[http://www.linkedin.com/skills/|facts]]. ''
+   * ''We use these things for discovering People You May Know and 
[[http://www.linkedin.com/careerexplorer/dashboard|other]] 
[[http://inmaps.linkedinlabs.com/|fun]] 
[[http://www.linkedin.com/skills/|facts]]. ''
  
   * ''[[http://www.lookery.com|Lookery]] ''
   * ''We use Hadoop to process clickstream and demographic data to create web analytics reports. ''
@@ -524, +527 @@

   * ''Also used as a proof-of-concept cluster for a cloud-based ERP system. ''
  
   * ''[[http://www.psgtech.edu/|PSG Tech, Coimbatore, India]] ''
-   * ''[[http://www.kraloyun.gen.tr/yeni-oyunlar/|Yeni Oyunlar]] ''
-   * ''[[http://www.ben10oyun.net/|Ben 10 Oyunları]] ''
-   * ''[[http://www.giysilerigiydirmeoyunlari.com/|Giysi Giydirme]]
   * ''Multiple alignment of protein sequences helps to determine evolutionary linkages and to predict molecular structures. The dynamic nature of the algorithm, coupled with the data and compute parallelism of Hadoop data grids, improves the accuracy and speed of sequence alignment. Parallelism at the sequence and block level reduces the time complexity of MSA problems (see the sketch below). The scalable nature of Hadoop makes it apt for large-scale alignment problems. ''
   * ''Our cluster size varies from 5 to 10 nodes. Cluster nodes range from 2950 quad-core rack servers with 2x6MB cache and 4 x 500GB SATA hard drives to E7200/E7400 processors with 4GB RAM and 160GB HDD. ''
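   . ''A minimal sketch of that block-level parallelism (illustrative only; the tab-separated pair-per-line input format and the toy scoring function are assumptions, standing in for a real alignment kernel): ''
  {{{
  // Minimal sketch: each map task scores its block of candidate sequence
  // pairs independently -- the block-level parallelism described above.
  // Assumed input lines: "id1<TAB>seq1<TAB>id2<TAB>seq2".
  import java.io.IOException;
  import org.apache.hadoop.io.IntWritable;
  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public class PairScoreMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Toy match/mismatch score; a real job would run a dynamic-programming
    // alignment (e.g. Needleman-Wunsch) here instead.
    private static int score(String a, String b) {
      int n = Math.min(a.length(), b.length()), s = 0;
      for (int i = 0; i < n; i++) s += (a.charAt(i) == b.charAt(i)) ? 1 : -1;
      return s;
    }

    @Override
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] f = line.toString().split("\t");
      if (f.length == 4) {
        ctx.write(new Text(f[0] + ":" + f[2]),
                  new IntWritable(score(f[1], f[3])));
      }
    }
  }
  }}}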
  
@@ -694, +694 @@

    . ''We currently run one medium-sized Hadoop cluster (1.6PB) to store and 
serve up physics data for the computing portion of the Compact Muon Solenoid 
(CMS) experiment. This requires a filesystem which can download data at 
multiple Gbps and process data at an even higher rate locally. Additionally, 
several of our students are involved in research projects on Hadoop. ''
  
   * ''[[http://db.cs.utwente.nl|University of Twente, Database Group]] ''
-   . ''We run a 16 node cluster (dual core Xeon E3110 64 bit processors with 
6MB cache, 8GB main memory, 1TB disk) as of December 2008. We teach MapReduce 
and use Hadoop in our computer science master's program, and for information 
retrieval research. For more information, see: http://mirex.sourceforge.net/
+   . ''We run a 16 node cluster (dual core Xeon E3110 64 bit processors with 
6MB cache, 8GB main memory, 1TB disk) as of December 2008. We teach MapReduce 
and use Hadoop in our computer science master's program, and for information 
retrieval research. For more information, see: http://mirex.sourceforge.net/''
  
  = V =
   * ''[[http://www.veoh.com|Veoh]] ''
@@ -703, +703 @@

   * ''[[http://www.vibyggerhus.se/|Bygga hus]] ''
   * ''We use a Hadoop cluster for search and indexing in our projects. ''
  
+  * ''[[http://www.visiblemeasures.com|Visible Measures Corporation]]''
-  * ''[[http://www.visiblemeasures.com|Visible Measures Corporation]] uses 
Hadoop as a component in our Scalable Data Pipeline, which ultimately powers 
!VisibleSuite and other products. We use Hadoop to aggregate, store, and 
analyze data related to in-stream viewing behavior of Internet video audiences. 
Our current grid contains more than 128 CPU cores and in excess of 100 
terabytes of storage, and we plan to grow that substantially during 2008. ''
+   . ''Uses Hadoop as a component in our Scalable Data Pipeline, which ultimately powers !VisibleSuite and other products. We use Hadoop to aggregate, store, and analyze data related to the in-stream viewing behavior of Internet video audiences. Our current grid contains more than 128 CPU cores and in excess of 100 terabytes of storage, and we plan to grow that substantially during 2008. ''
  
   * ''[[http://www.vksolutions.com/|VK Solutions]] ''
   * ''We use a small Hadoop cluster in the scope of our general research activities at [[http://www.vklabs.com|VK Labs]] to get faster data access from web applications. ''
