Hi Xintong,

Thanks for the reply. I've attached the lines below, around the Application
Master start command:


2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory
                - Crypto codec
org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory
                - Using crypto codec
org.apache.hadoop.crypto.JceAesCtrCryptoCodec.
2020-05-11 21:16:16,636 DEBUG org.apache.hadoop.hdfs.DataStreamer
                 - DataStreamer block
BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false
lastByteOffsetInBlock: 1697
2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
                 - DFSClient seqno: 0 reply: SUCCESS
downstreamAckTimeNanos: 0 flag: 0
2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
                 - DataStreamer block
BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
packet seqno: 1 offsetInBlock: 1697 lastPacketInBlock: true
lastByteOffsetInBlock: 1697
2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
                 - DFSClient seqno: 1 reply: SUCCESS
downstreamAckTimeNanos: 0 flag: 0
2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
                 - Closing old block
BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315
2020-05-11 21:16:16,641 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1954985045) connection to
ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #70
org.apache.hadoop.hdfs.protocol.ClientProtocol.complete
2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1954985045) connection to
ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #70
2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
                 - Call: complete took 2ms
2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1954985045) connection to
ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #71
org.apache.hadoop.hdfs.protocol.ClientProtocol.setTimes
2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1954985045) connection to
ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #71
2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
                 - Call: setTimes took 2ms
2020-05-11 21:16:16,647 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1954985045) connection to
ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #72
org.apache.hadoop.hdfs.protocol.ClientProtocol.setPermission
2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.Client
                - IPC Client (1954985045) connection to
ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #72
2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
                 - Call: setPermission took 2ms
2020-05-11 21:16:16,654 DEBUG
org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Application
Master start command: $JAVA_HOME/bin/java -Xmx424m
"-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation
-XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
-Dlog.file="<LOG_DIR>/jobmanager.log"
-Dlog4j.configuration=file:log4j.properties
org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint  1>
<LOG_DIR>/jobmanager.out 2> <LOG_DIR>/jobmanager.err
2020-05-11 21:16:16,654 DEBUG org.apache.hadoop.ipc.Client
                - stopping client from cache:
org.apache.hadoop.ipc.Client@28194a50
2020-05-11 21:16:16,656 DEBUG
org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
- org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
method setApplicationTags.
2020-05-11 21:16:16,656 DEBUG
org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
- org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
method setAttemptFailuresValidityInterval.
2020-05-11 21:16:16,656 DEBUG
org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
- org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
method setKeepContainersAcrossApplicationAttempts.
2020-05-11 21:16:16,656 DEBUG
org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
- org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
method setNodeLabelExpression.
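One thing I notice in the start command above (just my guess, not a confirmed diagnosis): the whole block of -XX options is wrapped in a single pair of double quotes. When YARN's launch script runs that command through the shell, the quotes would make all of those flags arrive at the JVM as one argv element, which the launcher cannot parse — consistent with the "Usage: java [-options] class [args...]" output in the container diagnostics quoted further down. A minimal Python sketch of the word splitting (command strings shortened for illustration):

```python
# shlex.split mimics POSIX shell word splitting, which is what the shell does
# to the container launch command before exec'ing java. These strings are
# shortened, hypothetical versions of the AM start command logged above.
import shlex

quoted = 'java -Xmx424m "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading" Entrypoint'
plain  = 'java -Xmx424m -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading Entrypoint'

# With the quotes, both -XX flags collapse into ONE argv element; the JVM
# rejects that single malformed "option" and prints its usage text.
print(shlex.split(quoted))
# Without the quotes, each flag is its own argv element and the JVM accepts it.
print(shlex.split(plain))
```

If that is indeed the cause, dropping the surrounding quotes from env.java.opts in flink-conf.yaml (so each -XX flag becomes its own token) might be worth a try — again, an assumption on my part, not something I've verified on EMR.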

Xintong Song <tonysong...@gmail.com> wrote on Mon, May 11, 2020 at 10:11 PM:

> Hi Jacky,
>
> Could you search for "Application Master start command:" in the debug log
> and post the result, along with a few lines before and after it? It is not
> included in the clip of the attached log file.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Tue, May 12, 2020 at 5:33 AM Jacky D <jacky.du0...@gmail.com> wrote:
>
>> Hi Robert,
>>
>> Thanks so much for the quick reply. I changed the log level to DEBUG and
>> attached the log file.
>>
>> Thanks
>> Jacky
>>
>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 4:14 PM:
>>
>>> Thanks a lot for posting the full output.
>>>
>>> It seems that Flink is passing an invalid list of arguments to the JVM.
>>> Can you
>>> - set the root log level in conf/log4j-yarn-session.properties to DEBUG
>>> - then launch the YARN session
>>> - share the log file of the yarn session on the mailing list?
>>>
>>> I'm particularly interested in the line printed here, as it shows the
>>> JVM invocation.
>>>
>>> https://github.com/apache/flink/blob/release-1.6/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L1630
>>>
>>>
>>> On Mon, May 11, 2020 at 9:56 PM Jacky D <jacky.du0...@gmail.com> wrote:
>>>
>>>> Hi Robert,
>>>>
>>>> Yes, I tried to retrieve more log info from the YARN UI; the full logs
>>>> are shown below. This happens when I try to create a Flink YARN session
>>>> on EMR after setting up the JITWatch configuration.
>>>>
>>>> 2020-05-11 19:06:09,552 ERROR
>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error while
>>>> running the Flink Yarn session.
>>>> java.lang.reflect.UndeclaredThrowableException
>>>> at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
>>>> at
>>>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>> at
>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:813)
>>>> Caused by:
>>>> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't
>>>> deploy Yarn session cluster
>>>> at
>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:429)
>>>> at
>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:610)
>>>> at
>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:813)
>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>> at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>>>> ... 2 more
>>>> Caused by:
>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException:
>>>> The YARN application unexpectedly switched to state FAILED during
>>>> deployment.
>>>> Diagnostics from YARN: Application application_1584459865196_0165
>>>> failed 1 times (global limit =2; local limit is =1) due to AM Container for
>>>> appattempt_1584459865196_0165_000001 exited with  exitCode: 1
>>>> Failing this attempt.Diagnostics: Exception from container-launch.
>>>> Container id: container_1584459865196_0165_01_000001
>>>> Exit code: 1
>>>> Exception message: Usage: java [-options] class [args...]
>>>>            (to execute a class)
>>>>    or  java [-options] -jar jarfile [args...]
>>>>            (to execute a jar file)
>>>> where options include:
>>>>     -d32   use a 32-bit data model if available
>>>>     -d64   use a 64-bit data model if available
>>>>     -server   to select the "server" VM
>>>>                   The default VM is server,
>>>>                   because you are running on a server-class machine.
>>>>
>>>>
>>>>     -cp <class search path of directories and zip/jar files>
>>>>     -classpath <class search path of directories and zip/jar files>
>>>>                   A : separated list of directories, JAR archives,
>>>>                   and ZIP archives to search for class files.
>>>>     -D<name>=<value>
>>>>                   set a system property
>>>>     -verbose:[class|gc|jni]
>>>>                   enable verbose output
>>>>     -version      print product version and exit
>>>>     -version:<value>
>>>>                   Warning: this feature is deprecated and will be
>>>> removed
>>>>                   in a future release.
>>>>                   require the specified version to run
>>>>     -showversion  print product version and continue
>>>>     -jre-restrict-search | -no-jre-restrict-search
>>>>                   Warning: this feature is deprecated and will be
>>>> removed
>>>>                   in a future release.
>>>>                   include/exclude user private JREs in the version
>>>> search
>>>>     -? -help      print this help message
>>>>     -X            print help on non-standard options
>>>>     -ea[:<packagename>...|:<classname>]
>>>>     -enableassertions[:<packagename>...|:<classname>]
>>>>                   enable assertions with specified granularity
>>>>     -da[:<packagename>...|:<classname>]
>>>>     -disableassertions[:<packagename>...|:<classname>]
>>>>                   disable assertions with specified granularity
>>>>     -esa | -enablesystemassertions
>>>>                   enable system assertions
>>>>     -dsa | -disablesystemassertions
>>>>                   disable system assertions
>>>>     -agentlib:<libname>[=<options>]
>>>>                   load native agent library <libname>, e.g.
>>>> -agentlib:hprof
>>>>                   see also, -agentlib:jdwp=help and -agentlib:hprof=help
>>>>     -agentpath:<pathname>[=<options>]
>>>>                   load native agent library by full pathname
>>>>     -javaagent:<jarpath>[=<options>]
>>>>                   load Java programming language agent, see
>>>> java.lang.instrument
>>>>     -splash:<imagepath>
>>>>                   show splash screen with specified image
>>>> See
>>>> http://www.oracle.com/technetwork/java/javase/documentation/index.html
>>>> for more details.
>>>>
>>>> Thanks
>>>> Jacky
>>>>
>>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 3:42 PM:
>>>>
>>>>> Hey Jacky,
>>>>>
>>>>> The error says "The YARN application unexpectedly switched to state
>>>>> FAILED during deployment.".
>>>>> Have you tried retrieving the YARN application logs?
>>>>> Does the YARN UI / resource manager logs reveal anything on the reason
>>>>> for the deployment to fail?
>>>>>
>>>>> Best,
>>>>> Robert
>>>>>
>>>>>
>>>>> On Mon, May 11, 2020 at 9:34 PM Jacky D <jacky.du0...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> ---------- Forwarded message ---------
>>>>>> From: Jacky D <jacky.du0...@gmail.com>
>>>>>> Date: Mon, May 11, 2020 at 3:12 PM
>>>>>> Subject: Re: Flink Memory analyze on AWS EMR
>>>>>> To: Khachatryan Roman <khachatryan.ro...@gmail.com>
>>>>>>
>>>>>>
>>>>>> Hi Roman,
>>>>>>
>>>>>> Thanks for the quick response. I tried without the LogFile option but
>>>>>> it failed with the same error. I'm currently using Flink 1.6 (
>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/application_profiling.html),
>>>>>> so I can only use JITWatch or JMC. I guess those tools are only
>>>>>> available on a standalone cluster, as the documentation mentions:
>>>>>> "Each standalone JobManager, TaskManager, HistoryServer, and ZooKeeper
>>>>>> daemon redirects stdout and stderr to a file with a .out filename
>>>>>> suffix and writes internal logging to a file with a .log suffix. Java
>>>>>> options configured by the user in env.java.opts"?
>>>>>>
>>>>>> Thanks
>>>>>> Jacky
>>>>>>
>>>>>
