Here are a few test results from an Oracle BDA (40 Gb/s InfiniBand network). It seems slower than our PageRank example.

P.S. I ran into some errors, so I couldn't test at large scale
(java.lang.ClassCastException: hadoop.mrql.MR_int cannot be cast to hadoop.mrql.Inv
and java.lang.Error: Cannot clear a non-materialized sequence ..., etc.)

== 100K nodes and 1M edges ==

*** Using 10 BSP tasks (out of a max 10). Each task will handle about 2383611 bytes of input data.
Run time: 30.384 secs

*** Using 20 BSP tasks (out of a max 20). Each task will handle about 1191805 bytes of input data.
Run time: 24.412 secs

On Fri, Aug 24, 2012 at 9:36 AM, Edward J. Yoon <[email protected]> wrote:
> Wow, very interesting. I'm going to install and test on my large cluster.
>
> On Fri, Aug 24, 2012 at 4:41 AM, Leonidas Fegaras <[email protected]> wrote:
>> Dear Hama users,
>> I am pleased to announce that the MRQL query processing system can now
>> evaluate SQL-like queries on a Hama cluster. MRQL is available at:
>>
>> http://lambda.uta.edu/mrql/
>>
>> MRQL (the Map-Reduce Query Language) is an SQL-like query language for
>> large-scale, distributed data analysis. MRQL is powerful enough to
>> express most common data analysis tasks over many different kinds of
>> raw data, including hierarchical data and nested collections, such as
>> XML data. MRQL can run in two modes: in MR (Map-Reduce) mode using
>> Apache Hadoop and in BSP (Bulk Synchronous Parallel) mode using Apache
>> Hama. Both modes use Apache's HDFS to read and write their data.
>>
>> Note that, the BSP mode is currently experimental (not fine-tuned yet)
>> and lacks any fault-tolerance (if an error occurs, the entire job must
>> be restarted). Due to our limited resources, MRQL has only been tested
>> on a small cluster (7-nodes/28-cores). We compared the BSP mode with
>> the MR mode by evaluating a pagerank query over a small graph (100K
>> nodes, 1M edges) and found that BSP mode is about 4.5 times faster
>> than the MR mode. Please let me know if you'd like to contribute to
>> this project by testing MRQL on a larger cluster.
>> Best regards,
>> Leonidas Fegaras
>> University of Texas at Arlington
>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

--
Best Regards, Edward J. Yoon
@eddieyoon
