The most likely cause is the following commit, which turned off Yarn by default. The
commit message will tell you how to turn it on ("testdata/cluster/admin -y
start_cluster"). I'm open to changing the defaults the other way around if
we think this is an important use case.

-- Philip

commit 6dc13d933b5ea9a41e584d83e95db72b9e8e19b3
Author: Philip Zeyliger <[email protected]>
Date:   Tue Apr 10 09:21:20 2018 -0700

    Remove Yarn from minicluster by default. (2nd try)

On Wed, Apr 18, 2018 at 6:25 PM, Tianyi Wang <[email protected]> wrote:

> I was trying to run tests/comparison/data_generator.py, which used to work
> before switching to hadoop 3. Now MR claims that it's wrongly configured to
> connect to 0.0.0.0:8032, but I cannot find text "8032" in our minicluster
> configs. Does anybody happen to know this error?
>
>
> Traceback (most recent call last):
>   File "./data_generator.py", line 339, in <module>
>     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
>   File "./data_generator.py", line 134, in populate_db
>     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
>   File "./data_generator.py", line 244, in _run_data_generator_mr_job
>     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
>   File "/home/twang/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
>     stderr=subprocess.STDOUT, env=env)
>   File "/home/twang/projects/impala/tests/util/shell_util.py", line 113, in shell
>     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
> Exception: Command returned non-zero exit code: 5
> cmd: set -euo pipefail
> hadoop jar /home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar -D mapred.reduce.tasks=34 \
> -D stream.num.map.output.key.fields=2 \
> -files ./common.py,./db_types.py,./data_generator_mapred_common.py,./data_generator_mapper.py,./data_generator_reducer.py,./random_val_generator.py \
> -input /tmp/data_gen_randomness_mr_input_1524095906 \
> -output /tmp/data_gen_randomness_mr_output_1524095906 \
> -mapper data_generator_mapper.py \
> -reducer data_generator_reducer.py
> stdout: packageJobJar: [] [/home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob6950277591392799099.jar tmpDir=null
> 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> 18/04/18 16:58:32 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 18/04/18 16:58:33 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
>
> ..........................
>
> 18/04/18 16:58:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> 18/04/18 16:58:51 INFO retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts. Trying to failover after sleeping for 16129ms.
>
> --
> Tianyi Wang
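
For anyone hitting the same trace: 0.0.0.0:8032 is the stock Hadoop default for yarn.resourcemanager.address when nothing overrides it, which would explain why "8032" does not show up in the minicluster configs. The retries simply mean nothing is listening there. Below is a minimal sketch of a pre-flight check one could run before data_generator.py. It is not part of the Impala tree; the function name is made up for illustration, and it assumes Python 3 and that the ResourceManager, once started, listens on localhost at that default port.

```python
import socket


def resource_manager_reachable(host="localhost", port=8032, timeout=2.0):
    """Return True if something accepts TCP connections at host:port.

    Hypothetical helper: port 8032 is the Hadoop default ResourceManager
    port; the host is an assumption about a local minicluster.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if resource_manager_reachable():
        print("ResourceManager port is open; MR jobs should be able to connect.")
    else:
        print("Nothing is listening on port 8032; try "
              "'testdata/cluster/admin -y start_cluster' to start Yarn.")
```

This only confirms that a TCP listener exists on the port, not that it is actually a healthy ResourceManager, so treat a positive result as a hint rather than proof.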
