Hi Jacky,

I don't think ${FLINK_LOG_PREFIX} is available for Flink Yarn deployments.
My guess is that the actual file name becomes ".jit". You can
try to verify that by looking for the hidden file.
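To illustrate the guess (this is a sketch, assuming the variable is simply
unset in the container environment): the shell expands an unset
${FLINK_LOG_PREFIX} to an empty string, so the JVM option collapses to a
hidden ".jit" file.

```shell
# If FLINK_LOG_PREFIX is not set in the container environment, the shell
# expands it to an empty string, so the option becomes -XX:LogFile=.jit
# (a hidden file in the container's working directory).
unset FLINK_LOG_PREFIX
echo "-XX:LogFile=${FLINK_LOG_PREFIX}.jit"
# prints: -XX:LogFile=.jit
```

You could then look for the resulting hidden file with something like
`ls -la` in the container's working directory.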

If it is indeed this problem, you can try to replace "${FLINK_LOG_PREFIX}"
with "<LOG_DIR>/your-file-name.jit". The token "<LOG_DIR>" should be
replaced with the proper log directory path by Yarn automatically.
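As a sketch (not tested on your setup, and "my-profile" is just a
placeholder file name), the flink-conf.yaml entry could look like this.
Note there are no quotes around the value, per the earlier finding in this
thread:

```yaml
# flink-conf.yaml -- sketch only; "my-profile" is a placeholder name.
# YARN substitutes <LOG_DIR> with the container's log directory, so the
# .jit file should end up next to jobmanager.log / taskmanager.log.
env.java.opts: -XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=<LOG_DIR>/my-profile.jit -XX:+PrintAssembly
```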

I noticed that the usage of ${FLINK_LOG_PREFIX} is recommended by Flink's
documentation [1]. This is IMO a bit misleading. I'll try to file an issue
to improve the docs.

Thank you~

Xintong Song


[1]
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/application_profiling.html#profiling-with-jitwatch

On Wed, May 13, 2020 at 2:45 AM Jacky D <jacky.du0...@gmail.com> wrote:

> hi, Arvid
>
> thanks for the advice, I removed the quotes and it did create a yarn
> session on EMR, but I didn't find any JIT log file generated.
>
> The config with quotes works on a standalone cluster. I also tried to
> pass the property dynamically within the yarn session command:
>
> flink-yarn-session -n 1 -d -nm testSession -yD 
> env.java.opts="-XX:+UnlockDiagnosticVMOptions
> -XX:+TraceClassLoading -XX:+LogCompilation
> -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
>
>
> but got the same result: the session is created, but I cannot find any JIT
> log file under the container logs.
>
>
> Thanks
>
> Jacky
>
> Arvid Heise <ar...@ververica.com> wrote on Tue, May 12, 2020 at 12:57 PM:
>
>> Hi Jacky,
>>
>> I suspect that the quotes are the actual issue. Could you try to remove
>> them? See also [1].
>>
>> [1]
>> http://blogs.perl.org/users/tinita/2018/03/strings-in-yaml---to-quote-or-not-to-quote.html
>>
>> On Tue, May 12, 2020 at 4:03 PM Jacky D <jacky.du0...@gmail.com> wrote:
>>
>>> hi, Xintong
>>>
>>> Thanks for the reply. I attached the lines below for the application
>>> master start command:
>>>
>>>
>>> 2020-05-11 21:16:16,635 DEBUG
>>> org.apache.hadoop.util.PerformanceAdvisory                    - Crypto
>>> codec org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
>>> 2020-05-11 21:16:16,635 DEBUG
>>> org.apache.hadoop.util.PerformanceAdvisory                    - Using
>>> crypto codec org.apache.hadoop.crypto.JceAesCtrCryptoCodec.
>>> 2020-05-11 21:16:16,636 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>>                    - DataStreamer block
>>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
>>> packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false
>>> lastByteOffsetInBlock: 1697
>>> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>>                    - DFSClient seqno: 0 reply: SUCCESS
>>> downstreamAckTimeNanos: 0 flag: 0
>>> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>>                    - DataStreamer block
>>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
>>> packet seqno: 1 offsetInBlock: 1697 lastPacketInBlock: true
>>> lastByteOffsetInBlock: 1697
>>> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>>                    - DFSClient seqno: 1 reply: SUCCESS
>>> downstreamAckTimeNanos: 0 flag: 0
>>> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
>>>                    - Closing old block
>>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315
>>> 2020-05-11 21:16:16,641 DEBUG org.apache.hadoop.ipc.Client
>>>                     - IPC Client (1954985045) connection to
>>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #70
>>> org.apache.hadoop.hdfs.protocol.ClientProtocol.complete
>>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
>>>                     - IPC Client (1954985045) connection to
>>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #70
>>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>>>                    - Call: complete took 2ms
>>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
>>>                     - IPC Client (1954985045) connection to
>>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #71
>>> org.apache.hadoop.hdfs.protocol.ClientProtocol.setTimes
>>> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.Client
>>>                     - IPC Client (1954985045) connection to
>>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #71
>>> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>>>                    - Call: setTimes took 2ms
>>> 2020-05-11 21:16:16,647 DEBUG org.apache.hadoop.ipc.Client
>>>                     - IPC Client (1954985045) connection to
>>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #72
>>> org.apache.hadoop.hdfs.protocol.ClientProtocol.setPermission
>>> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.Client
>>>                     - IPC Client (1954985045) connection to
>>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #72
>>> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>>>                    - Call: setPermission took 2ms
>>> 2020-05-11 21:16:16,654 DEBUG
>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor           - Application
>>> Master start command: $JAVA_HOME/bin/java -Xmx424m
>>> "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation
>>> -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
>>> -Dlog.file="<LOG_DIR>/jobmanager.log"
>>> -Dlog4j.configuration=file:log4j.properties
>>> org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint  1>
>>> <LOG_DIR>/jobmanager.out 2> <LOG_DIR>/jobmanager.err
>>> 2020-05-11 21:16:16,654 DEBUG org.apache.hadoop.ipc.Client
>>>                     - stopping client from cache:
>>> org.apache.hadoop.ipc.Client@28194a50
>>> 2020-05-11 21:16:16,656 DEBUG
>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>>> method setApplicationTags.
>>> 2020-05-11 21:16:16,656 DEBUG
>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>>> method setAttemptFailuresValidityInterval.
>>> 2020-05-11 21:16:16,656 DEBUG
>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>>> method setKeepContainersAcrossApplicationAttempts.
>>> 2020-05-11 21:16:16,656 DEBUG
>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>>> method setNodeLabelExpression.
>>>
>>> Xintong Song <tonysong...@gmail.com> wrote on Mon, May 11, 2020 at 10:11 PM:
>>>
>>>> Hi Jacky,
>>>>
>>>> Could you search for "Application Master start command:" in the debug
>>>> log and post the result and a few lines before & after that? This is not
>>>> included in the clip of the attached log file.
>>>>
>>>> Thank you~
>>>>
>>>> Xintong Song
>>>>
>>>>
>>>>
>>>> On Tue, May 12, 2020 at 5:33 AM Jacky D <jacky.du0...@gmail.com> wrote:
>>>>
>>>>> hi, Robert
>>>>>
>>>>> Thanks so much for the quick reply. I changed the log level to debug
>>>>> and attached the log file.
>>>>>
>>>>> Thanks
>>>>> Jacky
>>>>>
>>>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 4:14 PM:
>>>>>
>>>>>> Thanks a lot for posting the full output.
>>>>>>
>>>>>> It seems that Flink is passing an invalid list of arguments to the
>>>>>> JVM.
>>>>>> Can you
>>>>>> - set the root log level in conf/log4j-yarn-session.properties to
>>>>>> DEBUG
>>>>>> - then launch the YARN session
>>>>>> - share the log file of the yarn session on the mailing list?
>>>>>>
>>>>>> I'm particularly interested in the line printed here, as it shows the
>>>>>> JVM invocation.
>>>>>>
>>>>>> https://github.com/apache/flink/blob/release-1.6/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L1630
>>>>>>
>>>>>>
>>>>>> On Mon, May 11, 2020 at 9:56 PM Jacky D <jacky.du0...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,Robert
>>>>>>>
>>>>>>> Yes, I tried to retrieve more log info from the YARN UI; the full logs
>>>>>>> are shown below. This happens when I try to create a Flink yarn session
>>>>>>> on EMR with the JITWatch configuration set up.
>>>>>>>
>>>>>>> 2020-05-11 19:06:09,552 ERROR
>>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - Error 
>>>>>>> while
>>>>>>> running the Flink Yarn session.
>>>>>>> java.lang.reflect.UndeclaredThrowableException
>>>>>>> at
>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
>>>>>>> at
>>>>>>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>>>>> at
>>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:813)
>>>>>>> Caused by:
>>>>>>> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't
>>>>>>> deploy Yarn session cluster
>>>>>>> at
>>>>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:429)
>>>>>>> at
>>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:610)
>>>>>>> at
>>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:813)
>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>>> at
>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>>>>>>> ... 2 more
>>>>>>> Caused by:
>>>>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException:
>>>>>>> The YARN application unexpectedly switched to state FAILED during
>>>>>>> deployment.
>>>>>>> Diagnostics from YARN: Application application_1584459865196_0165
>>>>>>> failed 1 times (global limit =2; local limit is =1) due to AM Container 
>>>>>>> for
>>>>>>> appattempt_1584459865196_0165_000001 exited with  exitCode: 1
>>>>>>> Failing this attempt.Diagnostics: Exception from container-launch.
>>>>>>> Container id: container_1584459865196_0165_01_000001
>>>>>>> Exit code: 1
>>>>>>> Exception message: Usage: java [-options] class [args...]
>>>>>>>            (to execute a class)
>>>>>>>    or  java [-options] -jar jarfile [args...]
>>>>>>>            (to execute a jar file)
>>>>>>> where options include:
>>>>>>>     -d32   use a 32-bit data model if available
>>>>>>>     -d64   use a 64-bit data model if available
>>>>>>>     -server   to select the "server" VM
>>>>>>>                   The default VM is server,
>>>>>>>                   because you are running on a server-class machine.
>>>>>>>
>>>>>>>
>>>>>>>     -cp <class search path of directories and zip/jar files>
>>>>>>>     -classpath <class search path of directories and zip/jar files>
>>>>>>>                   A : separated list of directories, JAR archives,
>>>>>>>                   and ZIP archives to search for class files.
>>>>>>>     -D<name>=<value>
>>>>>>>                   set a system property
>>>>>>>     -verbose:[class|gc|jni]
>>>>>>>                   enable verbose output
>>>>>>>     -version      print product version and exit
>>>>>>>     -version:<value>
>>>>>>>                   Warning: this feature is deprecated and will be
>>>>>>> removed
>>>>>>>                   in a future release.
>>>>>>>                   require the specified version to run
>>>>>>>     -showversion  print product version and continue
>>>>>>>     -jre-restrict-search | -no-jre-restrict-search
>>>>>>>                   Warning: this feature is deprecated and will be
>>>>>>> removed
>>>>>>>                   in a future release.
>>>>>>>                   include/exclude user private JREs in the version
>>>>>>> search
>>>>>>>     -? -help      print this help message
>>>>>>>     -X            print help on non-standard options
>>>>>>>     -ea[:<packagename>...|:<classname>]
>>>>>>>     -enableassertions[:<packagename>...|:<classname>]
>>>>>>>                   enable assertions with specified granularity
>>>>>>>     -da[:<packagename>...|:<classname>]
>>>>>>>     -disableassertions[:<packagename>...|:<classname>]
>>>>>>>                   disable assertions with specified granularity
>>>>>>>     -esa | -enablesystemassertions
>>>>>>>                   enable system assertions
>>>>>>>     -dsa | -disablesystemassertions
>>>>>>>                   disable system assertions
>>>>>>>     -agentlib:<libname>[=<options>]
>>>>>>>                   load native agent library <libname>, e.g.
>>>>>>> -agentlib:hprof
>>>>>>>                   see also, -agentlib:jdwp=help and
>>>>>>> -agentlib:hprof=help
>>>>>>>     -agentpath:<pathname>[=<options>]
>>>>>>>                   load native agent library by full pathname
>>>>>>>     -javaagent:<jarpath>[=<options>]
>>>>>>>                   load Java programming language agent, see
>>>>>>> java.lang.instrument
>>>>>>>     -splash:<imagepath>
>>>>>>>                   show splash screen with specified image
>>>>>>> See
>>>>>>> http://www.oracle.com/technetwork/java/javase/documentation/index.html
>>>>>>> for more details.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Jacky
>>>>>>>
>>>>>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 3:42 PM:
>>>>>>>
>>>>>>>> Hey Jacky,
>>>>>>>>
>>>>>>>> The error says "The YARN application unexpectedly switched to state
>>>>>>>> FAILED during deployment.".
>>>>>>>> Have you tried retrieving the YARN application logs?
>>>>>>>> Does the YARN UI / resource manager logs reveal anything on the
>>>>>>>> reason for the deployment to fail?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Robert
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 11, 2020 at 9:34 PM Jacky D <jacky.du0...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------- Forwarded message ---------
>>>>>>>>> From: Jacky D <jacky.du0...@gmail.com>
>>>>>>>>> Date: Mon, May 11, 2020 at 3:12 PM
>>>>>>>>> Subject: Re: Flink Memory analyze on AWS EMR
>>>>>>>>> To: Khachatryan Roman <khachatryan.ro...@gmail.com>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi, Roman
>>>>>>>>>
>>>>>>>>> Thanks for the quick response. I tried without the LogFile option but
>>>>>>>>> failed with the same error. I'm currently using Flink 1.6
>>>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/application_profiling.html,
>>>>>>>>> so I can only use JITWatch or JMC. I guess those tools are only
>>>>>>>>> available on a standalone cluster, as the document mentions: "Each
>>>>>>>>> standalone JobManager, TaskManager, HistoryServer, and ZooKeeper
>>>>>>>>> daemon redirects stdout and stderr to a file with a .out filename
>>>>>>>>> suffix and writes internal logging to a file with a .log suffix.
>>>>>>>>> Java options configured by the user in env.java.opts"?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Jacky
>>>>>>>>>
>>>>>>>>
>>
>> --
>>
>> Arvid Heise | Senior Java Developer
>>
>> <https://www.ververica.com/>
>>
>> Follow us @VervericaData
>>
>> --
>>
>> Join Flink Forward <https://flink-forward.org/> - The Apache Flink
>> Conference
>>
>> Stream Processing | Event Driven | Real Time
>>
>> --
>>
>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
>>
>> --
>> Ververica GmbH
>> Registered at Amtsgericht Charlottenburg: HRB 158244 B
>> Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
>> (Toni) Cheng
>>
>
