I haven't made it work. I plan to try the 2.x branch.

On Tue, Apr 24, 2018 at 10:57 AM Philip Zeyliger <[email protected]> wrote:
> That looks like starting the Yarn application failed. I can't tell from
> what you've posted just why. Are you still struggling with it?
>
> -- Philip
>
> On Mon, Apr 23, 2018 at 4:31 PM, Tianyi Wang <[email protected]> wrote:
>
> > My previous message is misleading. After I started Yarn it doesn't fail
> > immediately, but it still doesn't work:
> >
> > 2018-04-23 23:15:46,065 INFO:db_connection[752]:Dropping database randomness
> > 2018-04-23 23:15:46,095 INFO:db_connection[234]:Creating database randomness
> > 2018-04-23 23:15:52,390 INFO:data_generator[235]:Starting MR job to generate data for randomness
> > Traceback (most recent call last):
> >   File "tests/comparison/data_generator.py", line 339, in <module>
> >     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
> >   File "tests/comparison/data_generator.py", line 134, in populate_db
> >     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
> >   File "tests/comparison/data_generator.py", line 244, in _run_data_generator_mr_job
> >     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
> >   File "/home/impdev/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
> >     stderr=subprocess.STDOUT, env=env)
> >   File "/home/impdev/projects/impala/tests/util/shell_util.py", line 113, in shell
> >     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
> > Exception: Command returned non-zero exit code: 1
> > cmd: set -euo pipefail
> > hadoop jar /home/impdev/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar
> > -D mapred.reduce.tasks=36 \
> > -D stream.num.map.output.key.fields=2 \
> > -files tests/comparison/common.py,tests/comparison/db_types.py,tests/comparison/data_generator_mapred_common.py,tests/comparison/data_generator_mapper.py,tests/comparison/data_generator_reducer.py,tests/comparison/random_val_generator.py \
> > -input /tmp/data_gen_randomness_mr_input_1524525348 \
> > -output /tmp/data_gen_randomness_mr_output_1524525348 \
> > -mapper data_generator_mapper.py \
> > -reducer data_generator_reducer.py
> > stdout: packageJobJar: [] [/home/impdev/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob2990195923122538287.jar tmpDir=null
> > 18/04/23 23:15:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> > 18/04/23 23:15:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> > 18/04/23 23:15:54 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/impdev/.staging/job_1524519161700_0002
> > 18/04/23 23:15:54 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> > 18/04/23 23:15:54 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 2b3bd7731ff3ef5d8585a004b90696630e5cea96]
> > 18/04/23 23:15:54 INFO mapred.FileInputFormat: Total input files to process : 1
> > 18/04/23 23:15:54 INFO mapreduce.JobSubmitter: number of splits:2
> > 18/04/23 23:15:54 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
> > 18/04/23 23:15:54 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
> > 18/04/23 23:15:54 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524519161700_0002
> > 18/04/23 23:15:54 INFO mapreduce.JobSubmitter: Executing with tokens: []
> > 18/04/23 23:15:54 INFO conf.Configuration: resource-types.xml not found
> > 18/04/23 23:15:54 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
> > 18/04/23 23:15:54 INFO impl.YarnClientImpl: Submitted application application_1524519161700_0002
> > 18/04/23 23:15:54 INFO mapreduce.Job: The url to track the job: http://c37e0835e988:8088/proxy/application_1524519161700_0002/
> > 18/04/23 23:15:54 INFO mapreduce.Job: Running job: job_1524519161700_0002
> > 18/04/23 23:16:00 INFO mapreduce.Job: Job job_1524519161700_0002 running in uber mode : false
> > 18/04/23 23:16:00 INFO mapreduce.Job:  map 0% reduce 0%
> > 18/04/23 23:16:06 INFO mapreduce.Job: Job job_1524519161700_0002 failed with state FAILED due to: Application application_1524519161700_0002 failed 2 times due to AM Container for appattempt_1524519161700_0002_000002 exited with exitCode: 255
> > Failing this attempt.Diagnostics: [2018-04-23 23:16:06.473]Exception from container-launch.
> > Container id: container_1524519161700_0002_02_000001
> > Exit code: 255
> >
> > [2018-04-23 23:16:06.475]Container exited with a non-zero exit code 255.
> > Error file: prelaunch.err.
> > Last 4096 bytes of prelaunch.err :
> > Last 4096 bytes of stderr :
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> > INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver as a provider class
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> > INFO: Registering org.apache.hadoop.yarn.webapp.GenericExceptionHandler as a provider class
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory register
> > INFO: Registering org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices as a root resource class
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.server.impl.application.WebApplicationImpl _initiate
> > INFO: Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> > INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.JAXBContextResolver to GuiceManagedComponentProvider with the scope "Singleton"
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> > INFO: Binding org.apache.hadoop.yarn.webapp.GenericExceptionHandler to GuiceManagedComponentProvider with the scope "Singleton"
> > Apr 23, 2018 11:16:03 PM com.sun.jersey.guice.spi.container.GuiceComponentProviderFactory getComponentProvider
> > INFO: Binding org.apache.hadoop.mapreduce.v2.app.webapp.AMWebServices to GuiceManagedComponentProvider with the scope "PerRequest"
> > log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapreduce.v2.app.MRAppMaster).
> > log4j:WARN Please initialize the log4j system properly.
> > log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> >
> > [2018-04-23 23:16:06.476]Container exited with a non-zero exit code 255.
> > Error file: prelaunch.err.
> > Last 4096 bytes of prelaunch.err :
> > Last 4096 bytes of stderr :
> > [identical Jersey registration and log4j output repeated verbatim for the second attempt]
> >
> > For more detailed output, check the application tracking page: http://localhost:8088/cluster/app/application_1524519161700_0002 Then click on links to logs of each attempt.
> > . Failing the application.
> > 18/04/23 23:16:06 INFO mapreduce.Job: Counters: 0
> > 18/04/23 23:16:06 ERROR streaming.StreamJob: Job not successful!
> > Streaming Command Failed!
> >
> > On Thu, Apr 19, 2018 at 1:59 PM Tianyi Wang <[email protected]> wrote:
> >
> > > Thanks Philip. It works! I'll update the README file.
> > >
> > > On Wed, Apr 18, 2018 at 6:49 PM Philip Zeyliger <[email protected]> wrote:
> > >
> > >> Most likely it's the following commit, which turned off Yarn by default. The commit message will tell you how to turn it on ("testdata/cluster/admin -y start_cluster"). I'm open to changing the defaults the other way around if we think this is an important use case.
> > >>
> > >> -- Philip
> > >>
> > >> commit 6dc13d933b5ea9a41e584d83e95db72b9e8e19b3
> > >> Author: Philip Zeyliger <[email protected]>
> > >> Date:   Tue Apr 10 09:21:20 2018 -0700
> > >>
> > >>     Remove Yarn from minicluster by default. (2nd try)
> > >>
> > >> On Wed, Apr 18, 2018 at 6:25 PM, Tianyi Wang <[email protected]> wrote:
> > >>
> > >> > I was trying to run tests/comparison/data_generator.py, which used to work before switching to Hadoop 3. Now MR claims that it's wrongly configured to connect to 0.0.0.0:8032, but I cannot find the text "8032" in our minicluster configs. Does anybody happen to know about this error?
> > >> >
> > >> > Traceback (most recent call last):
> > >> >   File "./data_generator.py", line 339, in <module>
> > >> >     populator.populate_db(args.table_count, postgresql_conn=postgresql_conn)
> > >> >   File "./data_generator.py", line 134, in populate_db
> > >> >     self._run_data_generator_mr_job([g for _, g in table_and_generators], self.db_name)
> > >> >   File "./data_generator.py", line 244, in _run_data_generator_mr_job
> > >> >     % (reducer_count, ','.join(files), mapper_input_file, hdfs_output_dir))
> > >> >   File "/home/twang/projects/impala/tests/comparison/cluster.py", line 476, in run_mr_job
> > >> >     stderr=subprocess.STDOUT, env=env)
> > >> >   File "/home/twang/projects/impala/tests/util/shell_util.py", line 113, in shell
> > >> >     "\ncmd: %s\nstdout: %s\nstderr: %s") % (retcode, cmd, output, err))
> > >> > Exception: Command returned non-zero exit code: 5
> > >> > cmd: set -euo pipefail
> > >> > hadoop jar /home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar
> > >> > -D mapred.reduce.tasks=34 \
> > >> > -D stream.num.map.output.key.fields=2 \
> > >> > -files ./common.py,./db_types.py,./data_generator_mapred_common.py,./data_generator_mapper.py,./data_generator_reducer.py,./random_val_generator.py \
> > >> > -input /tmp/data_gen_randomness_mr_input_1524095906 \
> > >> > -output /tmp/data_gen_randomness_mr_output_1524095906 \
> > >> > -mapper data_generator_mapper.py \
> > >> > -reducer data_generator_reducer.py
> > >> > stdout: packageJobJar: [] [/home/twang/projects/impala/toolchain/cdh_components/hadoop-3.0.0-cdh6.x-SNAPSHOT/share/hadoop/tools/lib/hadoop-streaming-3.0.0-cdh6.x-SNAPSHOT.jar] /tmp/streamjob6950277591392799099.jar tmpDir=null
> > >> > 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> > >> > 18/04/18 16:58:30 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
> > >> > 18/04/18 16:58:32 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> > >> > 18/04/18 16:58:33 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> > >> >
> > >> > ..........................
> > >> >
> > >> > 18/04/18 16:58:51 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> > >> > 18/04/18 16:58:51 INFO retry.RetryInvocationHandler: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort, while invoking ApplicationClientProtocolPBClientImpl.getNewApplication over null after 1 failover attempts. Trying to failover after sleeping for 16129ms.
> > >> >
> > >> > --
> > >> > Tianyi Wang
> > >>
> > >
> > > --
> > > Tianyi Wang
> >
> > --
> > Tianyi Wang

--
Tianyi Wang
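[Editor's note] For anyone hitting the same "Connecting to ResourceManager at /0.0.0.0:8032" retry loop, a quick sanity check is whether anything is actually listening on the ResourceManager port before launching the MR job; as the thread concludes, the minicluster no longer starts Yarn by default, so the fix was `testdata/cluster/admin -y start_cluster`. A minimal sketch of such a check (the helper name is hypothetical and not part of the Impala test code; 8032 is the RM port from the logs above):

```python
import socket

def rm_is_listening(host="localhost", port=8032, timeout=2.0):
    """Return True if a TCP listener (e.g. the YARN ResourceManager)
    accepts connections on host:port, False otherwise."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    if not rm_is_listening():
        # The minicluster does not start Yarn by default; start it first:
        #   testdata/cluster/admin -y start_cluster
        print("ResourceManager not reachable on port 8032; "
              "start the Yarn minicluster first.")
```

Had data_generator.py run a check like this up front, the failure mode would have been an immediate, explicit error instead of ten one-second connection retries per attempt.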
