Hi, I'm having the same problem. I found that it depends on the Docker version: the benchmark runs fine under Docker 1.9, but not under Docker 1.12. With the newer version it seems to hang after starting the first job(s). There is no error, and I could not find anything in the log files.
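For reference, here is a quick sketch to check which Docker version the daemon is running before launching the benchmark (assumes a POSIX shell; "1.9" as the known-good release is taken from my observation above, not from any official compatibility note):

```shell
#!/bin/sh
# Print the Docker daemon version and flag anything newer than 1.9,
# where the hang starts to appear. Falls back to "unknown" if the
# docker CLI or daemon is not available.
ver="$(docker version --format '{{.Server.Version}}' 2>/dev/null || echo unknown)"
known_good="1.9"
if [ "$ver" = "unknown" ]; then
  echo "docker daemon not reachable"
elif [ "$(printf '%s\n' "$known_good" "$ver" | sort -V | head -n1)" = "$known_good" ] \
     && [ "$ver" != "$known_good" ]; then
  echo "Docker $ver is newer than $known_good -- the hang may reproduce here"
else
  echo "Docker $ver"
fi
```

The `sort -V` (version sort, GNU coreutils) comparison avoids treating "1.12" as older than "1.9", which a plain string compare would get wrong.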
Output:

...
Creating sequence files from wikiXML
Running on hadoop, using /opt/new_analytic/hadoop-2.7.1/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /opt/new_analytic/apache-mahout-distribution-0.11.0/examples/target/mahout-examples-0.11.0-job.jar
16/08/31 09:09:52 INFO WikipediaToSequenceFile: Input: /opt/new_analytic/apache-mahout-distribution-0.11.0/examples/temp/mahout-work-wiki/wikixml/enwiki-latest-pages-articles.xml Out: /opt/new_analytic/apache-mahout-distribution-0.11.0/examples/temp/mahout-work-wiki/wikipediainput Categories: /opt/new_analytic/apache-mahout-distribution-0.11.0/examples/temp/categories.txt All Files: false
16/08/31 09:09:53 INFO RMProxy: Connecting to ResourceManager at master.cloudsuite.com/172.17.0.3:8040
16/08/31 09:09:53 WARN JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
16/08/31 09:09:53 INFO FileInputFormat: Total input paths to process : 1
16/08/31 09:09:53 INFO JobSubmitter: number of splits:5
16/08/31 09:09:53 INFO JobSubmitter: Submitting tokens for job: job_1472634440875_0001
16/08/31 09:09:53 INFO YarnClientImpl: Submitted application application_1472634440875_0001
16/08/31 09:09:54 INFO Job: The url to track the job: http://master.cloudsuite.com:8088/proxy/application_1472634440875_0001/
16/08/31 09:09:54 INFO Job: Running job: job_1472634440875_0001

Thanks,
Stijn

-----Original Message-----
From: chandrap [mailto:[email protected]]
Sent: Friday, September 2, 2016 2:17 PM
To: [email protected]
Subject: [cloudsuite] Problem in data analytics

Hi,

I was trying to use the CloudSuite Data Analytics benchmark but am not able to execute the workload. I have pulled the Docker image and downloaded all the data sets (small, medium, and big, each in a separate container). I am using only the master node (0 slave nodes). After starting the workload (by executing the script run.sh), it stops after reaching the job_running status.
Can you please help me execute the workload?

Regards,
Chandra Prakash
Intel Corporation
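For anyone debugging a hang at the job_running stage, it can help to ask YARN directly what state the application is in. A minimal sketch (the application ID is taken from the log output earlier in this thread; the `yarn` commands themselves need the Hadoop CLI on the PATH inside the master container, so this sketch only prints them):

```shell
#!/bin/sh
# Pull the application ID out of the captured driver output, then build the
# YARN status/log queries for it. Running the queries requires the Hadoop
# CLI, so this sketch only prints the commands to run inside the master.
log_line='16/08/31 09:09:53 INFO YarnClientImpl: Submitted application application_1472634440875_0001'
app_id="$(printf '%s\n' "$log_line" | grep -o 'application_[0-9]*_[0-9]*')"
echo "yarn application -status $app_id"
echo "yarn logs -applicationId $app_id"
```

`yarn application -status` reports the application state and any diagnostics from the ResourceManager, which should show whether the job is stuck waiting for container resources rather than actually running.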
