[ 
https://issues.apache.org/jira/browse/HAMA-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Behroz Sikander updated HAMA-990:
---------------------------------
    Attachment: Benchmark_script.sh

Hi, 
I have created an initial script to run the PageRank algorithm using MRQL on
Hama. You can review the script in the attachment.

The script assumes a few things:
1- The user needs to set the MRQL installation directory in the variable 
MRQL_INSTALL_FOLDER. Can we replace this with a default path?
2- Hadoop is installed at the path given by the $HADOOP_PREFIX environment 
variable. On my PC it is configured as $HADOOP_PREFIX. What should I change it 
to?
3- Hama is installed at the path given by the $HAMA_HOME environment variable.
4- The HDFS address is set to localhost:54310 by default. Should I read this 
address directly from Hadoop's core-site.xml?
5- Java is configured through the $JAVA_HOME environment variable.
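
For illustration, the assumptions above could be checked up front along the
following lines. This is only a sketch and is not taken from the attached
script: the helper names require_env and hdfs_address, the default
$HOME/mrql, and the core-site.xml location are assumptions. fs.defaultFS
(and its older name fs.default.name) is the Hadoop property that holds the
file system address.

```shell
# Hypothetical default for the MRQL directory; the user can still override it.
MRQL_INSTALL_FOLDER="${MRQL_INSTALL_FOLDER:-$HOME/mrql}"

# Succeed only if the variable named in $1 is set and non-empty.
require_env() {
    eval "_value=\${$1:-}"
    if [ -z "$_value" ]; then
        echo "error: \$$1 is not set" >&2
        return 1
    fi
}

# Print the value of fs.defaultFS (or fs.default.name) found in the
# core-site.xml file $1. Print the fallback $2 if the file is unreadable
# or the property is missing. The value is printed as written in the file,
# so it may include an hdfs:// prefix.
hdfs_address() {
    _conf="$1"
    _fallback="$2"
    _addr=""
    if [ -r "$_conf" ]; then
        _addr=$(awk '
            /<name>(fs\.defaultFS|fs\.default\.name)<\/name>/ { found = 1 }
            found && /<value>/ {
                sub(/.*<value>/, "")
                sub(/<\/value>.*/, "")
                print
                exit
            }
        ' "$_conf")
    fi
    echo "${_addr:-$_fallback}"
}

# Example use (the configuration path is an assumption about the layout):
#   for v in HADOOP_PREFIX HAMA_HOME JAVA_HOME; do require_env "$v" || exit 1; done
#   HDFS=$(hdfs_address "$HADOOP_PREFIX/etc/hadoop/core-site.xml" "localhost:54310")
```

Falling back to localhost:54310 keeps the current behaviour when no
configuration file can be read.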

The script currently does the following:
1- Downloads the MRQL tarball to a local directory specified by the user in 
the script
2- Downloads the jars required by MRQL
3- Updates the MRQL configuration file with the Hadoop and Hama settings
4- Executes the PageRank algorithm
5- Parses the output of the algorithm to report the total runtime in seconds.
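
Step 5 could be sketched roughly as below. The log line format
"Run time: <number> secs" is an assumption made for this example, as is the
function name total_runtime; the pattern would need to match whatever the
run actually prints.

```shell
# Sum the numbers on every "Run time: <number> secs" line read from
# standard input and print the total in seconds with three decimals.
# The line format is an assumption; adjust the pattern to the real output.
total_runtime() {
    LC_ALL=C awk '
        /Run time:/ {
            for (i = 1; i < NF; i++)
                if ($i == "time:") total += $(i + 1)
        }
        END { printf "%.3f\n", total }
    '
}

# Example use, with pagerank.log as a hypothetical saved log file:
#   total_runtime < pagerank.log
```

Summing over all matching lines means a run that reports several timed
steps still yields a single total.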

It would be great if you could test it on an AWS/Google cluster so that I can 
fix any bugs. In parallel, I am finishing the Spark/Flink script and will send 
you an update soon. Once Spark/Flink is also done, I will make the script 
parameterized as you mentioned in your last comment.

> GSoC'16: Apache Hama benchmark against Spark and Flink
> ------------------------------------------------------
>
>                 Key: HAMA-990
>                 URL: https://issues.apache.org/jira/browse/HAMA-990
>             Project: Hama
>          Issue Type: Documentation
>            Reporter: Behroz Sikander
>            Priority: Minor
>         Attachments: Benchmark_script.sh
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
