Hi Jacky, I suspect that the quotes are the actual issue. Could you try to remove them? See also [1].
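To make the suspicion concrete: the debug log further down shows the Application Master start command with a double-quoted group of JVM flags. Here is a small sketch (using Python's shlex, which follows the same word-splitting rules as a POSIX shell; the command line is copied from that log) of why the JVM then prints its usage text and exits:

```python
import shlex

# The quoted group below is copied from the "Application Master start command"
# in the debug log. Because of the double quotes, the shell delivers all five
# -XX flags to java as a SINGLE argv element.
cmd = (
    'java -Xmx424m '
    '"-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading '
    '-XX:+LogCompilation -XX:LogFile=${FLINK_LOG_PREFIX}.jit '
    '-XX:+PrintAssembly" '
    'org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint'
)
argv = shlex.split(cmd)  # shlex emulates the shell's quoting rules

print(len(argv))  # 4 -- java, -Xmx424m, one merged flag string, the class name
print(argv[2])    # all five -XX flags glued into one unrecognized "option"
```

The JVM sees one giant unknown option instead of five separate flags, so it prints its usage message and exits with code 1, which matches the container diagnostics below. If those flags come from env.java.opts in flink-conf.yaml, writing the value without the surrounding double quotes should let them reach java as separate arguments.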
[1] http://blogs.perl.org/users/tinita/2018/03/strings-in-yaml---to-quote-or-not-to-quote.html

On Tue, May 12, 2020 at 4:03 PM Jacky D <jacky.du0...@gmail.com> wrote:

> Hi, Xintong
>
> Thanks for the reply. I attached the lines around the application master
> start command below:
>
> 2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory - Crypto codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
> 2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory - Using crypto codec org.apache.hadoop.crypto.JceAesCtrCryptoCodec.
> 2020-05-11 21:16:16,636 DEBUG org.apache.hadoop.hdfs.DataStreamer - DataStreamer block BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false lastByteOffsetInBlock: 1697
> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer - DFSClient seqno: 0 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer - DataStreamer block BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet packet seqno: 1 offsetInBlock: 1697 lastPacketInBlock: true lastByteOffsetInBlock: 1697
> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer - DFSClient seqno: 1 reply: SUCCESS downstreamAckTimeNanos: 0 flag: 0
> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer - Closing old block BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315
> 2020-05-11 21:16:16,641 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #70 org.apache.hadoop.hdfs.protocol.ClientProtocol.complete
> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #70
> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: complete took 2ms
> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #71 org.apache.hadoop.hdfs.protocol.ClientProtocol.setTimes
> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #71
> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: setTimes took 2ms
> 2020-05-11 21:16:16,647 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #72 org.apache.hadoop.hdfs.protocol.ClientProtocol.setPermission
> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.Client - IPC Client (1954985045) connection to ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #72
> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: setPermission took 2ms
> 2020-05-11 21:16:16,654 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor - Application Master start command: $JAVA_HOME/bin/java -Xmx424m "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly" -Dlog.file="<LOG_DIR>/jobmanager.log" -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint 1> <LOG_DIR>/jobmanager.out 2> <LOG_DIR>/jobmanager.err
> 2020-05-11 21:16:16,654 DEBUG org.apache.hadoop.ipc.Client - stopping client from cache: org.apache.hadoop.ipc.Client@28194a50
> 2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setApplicationTags.
> 2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setAttemptFailuresValidityInterval.
> 2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setKeepContainersAcrossApplicationAttempts.
> 2020-05-11 21:16:16,656 DEBUG org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports method setNodeLabelExpression.
>
> Xintong Song <tonysong...@gmail.com> wrote on Mon, May 11, 2020 at 10:11 PM:
>
>> Hi Jacky,
>>
>> Could you search for "Application Master start command:" in the debug log
>> and post the result and a few lines before & after it? It is not included
>> in the clip of the attached log file.
>>
>> Thank you~
>>
>> Xintong Song
>>
>> On Tue, May 12, 2020 at 5:33 AM Jacky D <jacky.du0...@gmail.com> wrote:
>>
>>> Hi, Robert
>>>
>>> Thanks so much for the quick reply. I changed the log level to debug and
>>> attached the log file.
>>>
>>> Thanks
>>> Jacky
>>>
>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 4:14 PM:
>>>
>>>> Thanks a lot for posting the full output.
>>>>
>>>> It seems that Flink is passing an invalid list of arguments to the JVM.
>>>> Can you
>>>> - set the root log level in conf/log4j-yarn-session.properties to DEBUG
>>>> - then launch the YARN session
>>>> - share the log file of the YARN session on the mailing list?
>>>>
>>>> I'm particularly interested in the line printed here, as it shows the
>>>> JVM invocation.
>>>> https://github.com/apache/flink/blob/release-1.6/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L1630
>>>>
>>>> On Mon, May 11, 2020 at 9:56 PM Jacky D <jacky.du0...@gmail.com> wrote:
>>>>
>>>>> Hi, Robert
>>>>>
>>>>> Yes, I tried to retrieve more log info from the YARN UI; the full logs
>>>>> are shown below. This happens when I try to create a Flink YARN session
>>>>> on EMR with the JITWatch configuration set up.
>>>>>
>>>>> 2020-05-11 19:06:09,552 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli - Error while running the Flink Yarn session.
>>>>> java.lang.reflect.UndeclaredThrowableException
>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
>>>>>     at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>>>     at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:813)
>>>>> Caused by: org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
>>>>>     at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:429)
>>>>>     at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:610)
>>>>>     at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:813)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>>>>>     ... 2 more
>>>>> Caused by: org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
>>>>> Diagnostics from YARN: Application application_1584459865196_0165 failed 1 times (global limit =2; local limit is =1) due to AM Container for appattempt_1584459865196_0165_000001 exited with exitCode: 1
>>>>> Failing this attempt.Diagnostics: Exception from container-launch.
>>>>> Container id: container_1584459865196_0165_01_000001
>>>>> Exit code: 1
>>>>> Exception message: Usage: java [-options] class [args...]
>>>>>            (to execute a class)
>>>>>    or  java [-options] -jar jarfile [args...]
>>>>>            (to execute a jar file)
>>>>> where options include:
>>>>>     -d32          use a 32-bit data model if available
>>>>>     -d64          use a 64-bit data model if available
>>>>>     -server       to select the "server" VM
>>>>>                   The default VM is server,
>>>>>                   because you are running on a server-class machine.
>>>>>
>>>>>     -cp <class search path of directories and zip/jar files>
>>>>>     -classpath <class search path of directories and zip/jar files>
>>>>>                   A : separated list of directories, JAR archives,
>>>>>                   and ZIP archives to search for class files.
>>>>>     -D<name>=<value>
>>>>>                   set a system property
>>>>>     -verbose:[class|gc|jni]
>>>>>                   enable verbose output
>>>>>     -version      print product version and exit
>>>>>     -version:<value>
>>>>>                   Warning: this feature is deprecated and will be removed
>>>>>                   in a future release.
>>>>>                   require the specified version to run
>>>>>     -showversion  print product version and continue
>>>>>     -jre-restrict-search | -no-jre-restrict-search
>>>>>                   Warning: this feature is deprecated and will be removed
>>>>>                   in a future release.
>>>>>                   include/exclude user private JREs in the version search
>>>>>     -? -help      print this help message
>>>>>     -X            print help on non-standard options
>>>>>     -ea[:<packagename>...|:<classname>]
>>>>>     -enableassertions[:<packagename>...|:<classname>]
>>>>>                   enable assertions with specified granularity
>>>>>     -da[:<packagename>...|:<classname>]
>>>>>     -disableassertions[:<packagename>...|:<classname>]
>>>>>                   disable assertions with specified granularity
>>>>>     -esa | -enablesystemassertions
>>>>>                   enable system assertions
>>>>>     -dsa | -disablesystemassertions
>>>>>                   disable system assertions
>>>>>     -agentlib:<libname>[=<options>]
>>>>>                   load native agent library <libname>, e.g. -agentlib:hprof
>>>>>                   see also, -agentlib:jdwp=help and -agentlib:hprof=help
>>>>>     -agentpath:<pathname>[=<options>]
>>>>>                   load native agent library by full pathname
>>>>>     -javaagent:<jarpath>[=<options>]
>>>>>                   load Java programming language agent, see java.lang.instrument
>>>>>     -splash:<imagepath>
>>>>>                   show splash screen with specified image
>>>>> See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.
>>>>>
>>>>> Thanks
>>>>> Jacky
>>>>>
>>>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 3:42 PM:
>>>>>
>>>>>> Hey Jacky,
>>>>>>
>>>>>> The error says "The YARN application unexpectedly switched to state
>>>>>> FAILED during deployment."
>>>>>> Have you tried retrieving the YARN application logs?
>>>>>> Do the YARN UI / resource manager logs reveal anything on the reason
>>>>>> for the deployment to fail?
>>>>>>
>>>>>> Best,
>>>>>> Robert
>>>>>>
>>>>>> On Mon, May 11, 2020 at 9:34 PM Jacky D <jacky.du0...@gmail.com> wrote:
>>>>>>
>>>>>>> ---------- Forwarded message ---------
>>>>>>> From: Jacky D <jacky.du0...@gmail.com>
>>>>>>> Date: Mon, May 11, 2020, 3:12 PM
>>>>>>> Subject: Re: Flink Memory analyze on AWS EMR
>>>>>>> To: Khachatryan Roman <khachatryan.ro...@gmail.com>
>>>>>>>
>>>>>>> Hi, Roman
>>>>>>>
>>>>>>> Thanks for the quick response. I tried without the logFile option but
>>>>>>> it failed with the same error. I'm currently using Flink 1.6
>>>>>>> (https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/application_profiling.html),
>>>>>>> so I can only use JITWatch or JMC. I guess those tools are only
>>>>>>> available on a standalone cluster, as the document mentions "Each
>>>>>>> standalone JobManager, TaskManager, HistoryServer, and ZooKeeper
>>>>>>> daemon redirects stdout and stderr to a file with a .out filename
>>>>>>> suffix and writes internal logging to a file with a .log suffix.
>>>>>>> Java options configured by the user in env.java.opts"?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Jacky

--

Arvid Heise | Senior Java Developer

<https://www.ververica.com/>

Follow us @VervericaData

--

Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference

Stream Processing | Event Driven | Real Time

--

Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany

--

Ververica GmbH
Registered at Amtsgericht Charlottenburg: HRB 158244 B
Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji (Toni) Cheng