Thanks Philip. It works! I'll update the README file.
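
A note for the README: "8032" does not appear in the minicluster configs because it is Hadoop's built-in default (yarn.resourcemanager.address defaults to 0.0.0.0:8032), so the client presumably falls back to it when no ResourceManager is configured. Below is a minimal pre-flight sketch that checks whether anything is listening before the MR job is submitted. It assumes the ResourceManager listens on localhost:8032 once the minicluster is started with -y; the helper name resource_manager_reachable is hypothetical and not part of the Impala test code.

```python
import socket


def resource_manager_reachable(host="localhost", port=8032, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: host and port are assumptions (8032 is the
    Hadoop default for yarn.resourcemanager.address).
    """
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if not resource_manager_reachable():
        print("Yarn ResourceManager is not reachable on localhost:8032; "
              "try: testdata/cluster/admin -y start_cluster")
```

This only confirms that a TCP listener exists on the port, not that the listener is a healthy ResourceManager.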
On Wed, Apr 18, 2018 at 6:49 PM Philip Zeyliger <[email protected]> wrote:
> Most likely it is the following commit, which turned off Yarn by default.
> The commit message tells you how to turn it back on
> ("testdata/cluster/admin -y start_cluster"). I'm open to changing the
> defaults the other way around if we think this is an important use case.
>
> -- Philip
>
> commit 6dc13d933b5ea9a41e584d83e95db72b9e8e19b3
> Author: Philip Zeyliger <[email protected]>
> Date: Tue Apr 10 09:21:20 2018 -0700
>
> Remove Yarn from minicluster by default. (2nd try)
>
> On Wed, Apr 18, 2018 at 6:25 PM, Tianyi Wang <[email protected]> wrote:
>
> > I was trying to run tests/comparison/data_generator.py, which used to
> > work before switching to Hadoop 3. Now MR claims that it's wrongly
> > configured to connect to 0.0.0.0:8032, but I cannot find the text "8032"
> > in our minicluster configs. Does anybody recognize this error?
> >
> >
> > Traceback (most recent call last):
> >   File "./data_generator.py", line 339, in <module>
> >     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
> >   File "./data_generator.py", line 134, in populate_db
> >     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
> >   File "./data_generator.py", line 244, in _run_data_generator_mr_job
> >     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
> >   File "/home/twang/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
> >     stderr=subprocess.STDOUT, env=env)
> >   File "/home/twang/projects/impala/tests/util/shell_util.py", line 113, in shell
> >     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
> > Exception: Command returned non-zero exit code: 5
> > cmd: set -euo pipefail
> > hadoop jar /home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar
> > -D mapred.reduce.tasks=34 \
> > -D stream.num.map.output.key.fields=2 \
> > -files ./common.py,./db_types.py,./data_generator_mapred_common.py,./data_generator_mapper.py,./data_generator_reducer.py,./random_val_generator.py \
> > -input /tmp/data_gen_randomness_mr_input_1524095906 \
> > -output /tmp/data_gen_randomness_mr_output_1524095906 \
> > -mapper data_generator_mapper.py \
> > -reducer data_generator_reducer.py
> > stdout: packageJobJar: [] [/home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob6950277591392799099.jar tmpDir=null
> > 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> > 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> > 18/04/18 16:58:32 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> > 18/04/18 16:58:33 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> >
> > ..........................
> >
> > 18/04/18 16:58:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> > 18/04/18 16:58:51 INFO retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts. Trying to failover after sleeping for 16129ms.
> >
> > --
> > Tianyi Wang
> >
>
--
Tianyi Wang