The standalone-job process fails because no task executors are around to request slots from. It _should_ wait for a bit to give task executors time to start up, controlled via resourcemanager.standalone.start-up-time or, if unset, slot.request.timeout.
Does the standalone-job process fail immediately?
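
A minimal sketch of how that waiting period might be set through FLINK_PROPERTIES. The option name is the one mentioned above; the value 120000 (read as milliseconds) and the "jobmanager" host name are assumptions for illustration only, not settings taken from this thread.

```shell
#!/usr/bin/env bash
# Assumed value: give task executors up to 2 minutes (120000 ms) to register.
STARTUP_TIME_MS=120000

# One "key: value" pair per line, as in flink-conf.yaml.
# "jobmanager" as the RPC address is an assumption, not taken from the thread.
FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager
resourcemanager.standalone.start-up-time: ${STARTUP_TIME_MS}"
export FLINK_PROPERTIES

# Show what would be handed to the container via
# --env FLINK_PROPERTIES="${FLINK_PROPERTIES}".
printf '%s\n' "${FLINK_PROPERTIES}"
```

Keeping each option on its own line matters, since the value is appended to the Flink configuration as-is.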

On 1/15/2021 2:28 PM, Manas Kale wrote:
You mean taskmanager? I tried using this command:

docker run --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" flink_pipeline taskmanager

after running above script but got:

2021-01-15 13:03:05,069 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils   [] - Trying to select the network interface and address to use by connecting to the leading JobManager.
2021-01-15 13:03:05,069 INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils   [] - TaskManager will try to connect for PT10S before falling back to heuristics
2021-01-15 13:03:05,484 INFO  org.apache.flink.runtime.net.ConnectionUtils   [] - Trying to connect to address jobmanager:6123
2021-01-15 13:03:05,486 INFO  org.apache.flink.runtime.net.ConnectionUtils   [] - Failed to connect from address '608ecee74cff/172.17.0.3': jobmanager

Here's what I understand is supposed to happen:
1. Start a jobmanager in a docker container.
2. Start a taskmanager in another docker container and tell it where to find the jobmanager.
3. Using the taskmanager, submit a new job.

I thought since step (1) is failing, adding the next step (starting taskmanager) would be of no use.

Please correct me if my understanding is wrong.
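
For reference, a hedged sketch of the two-container setup described in the steps above, with both containers on one user-defined Docker network so that the name "jobmanager" resolves from the taskmanager container. The network name "flink-network", the container names, and the dry-run wrapper are assumptions for illustration; the image name and job class are the ones used in this thread. With standalone-job, the job is started together with the JobManager process, so there is no separate submission step in this sketch.

```shell
#!/usr/bin/env bash
# Sketch only: network and container names are assumptions, not from the thread.
NETWORK=flink-network
IMAGE=flink_pipeline
FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

start_cluster() {
  # Shared network so containers can reach each other by name.
  run docker network create "$NETWORK"
  # JobManager that also runs the job; -d detaches so the script can continue.
  run docker run -d --rm --name jobmanager --network "$NETWORK" -p 8081:8081 \
    --env FLINK_PROPERTIES="$FLINK_PROPERTIES" \
    "$IMAGE" standalone-job --job-classname flink_POC.StreamingJob
  # TaskManager that registers with the JobManager and offers slots.
  run docker run -d --rm --name taskmanager --network "$NETWORK" \
    --env FLINK_PROPERTIES="$FLINK_PROPERTIES" \
    "$IMAGE" taskmanager
}

start_cluster
```

By default the script only prints the three docker commands; running it with DRY_RUN=0 would execute them.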




On Fri, Jan 15, 2021 at 4:37 PM Chesnay Schepler <[email protected]> wrote:

    Where are you starting the task executor?

    On 1/15/2021 11:57 AM, Manas Kale wrote:
    Hi all,
    I've got a job that I am trying to run using docker as per [1].
    Here's the dockerfile:
    # Start from base Flink image.
    FROM flink:1.11.0

    # Add fat JAR and logger properties file to image.
    ADD ./target/flink_POC-0.1.jar /opt/flink/usrlib/flink_POC-0.1.jar
    ADD ./target/classes/log4j.properties /opt/flink/usrlib/log4j.properties

    # Add pipeline.properties and its location.
    ADD target/classes/pipeline.properties /opt/flink/usrlib/pipeline.properties
    ENV FLINK_CONFIG_LOCATION=/opt/flink/usrlib/pipeline.properties

    EXPOSE 8081
    And the script I use to launch it:
    #!/usr/bin/env bash
    echo "Building docker image..."
    docker build --tag flink_pipeline .

    echo "Configuring Flink runtime..."
    export FLINK_PROPERTIES="jobmanager.rpc.address: host
    taskmanager.memory.process.size: 4000
    jobmanager.memory.process.size: 4000
    "
    echo "Starting docker image..."
    docker run --rm -p 8081:8081 --env FLINK_PROPERTIES=FLINK_PROPERTIES \
    flink_pipeline standalone-job --job-classname flink_POC.StreamingJob

    When I run the script, I see my job stuck in "CREATED" state and
    after some time I get the error:

    2021-01-15 10:44:29,563 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl     [] - Requesting new slot [SlotRequestId{1c25a61e6179f66b112b1944740f1a11}] and profile ResourceProfile{UNKNOWN} from resource manager.
    2021-01-15 10:44:29,565 INFO  org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Request slot with profile ResourceProfile{UNKNOWN} for job b854f75d6029e1725e822721c30095d7 with allocation id edc1e29d229aceb82f75b7c5835eca3c.
    2021-01-15 10:46:39,604 INFO  org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl     [] - Failing pending slot request [SlotRequestId{1c25a61e6179f66b112b1944740f1a11}]: Could not fulfill slot request edc1e29d229aceb82f75b7c5835eca3c. Requested resource profile (ResourceProfile{UNKNOWN}) is unfulfillable.
    2021-01-15 10:46:39,667 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: advanced features  kafak consumer (1/1) (49ea271f6b9881d82c49b2826e8584d9) switched from SCHEDULED to FAILED on not deployed.
    java.util.concurrent.CompletionException: org.apache.flink.runtime.resourcemanager.exceptions.UnfulfillableSlotRequestException: Could not fulfill slot request edc1e29d229aceb82f75b7c5835eca3c. Requested resource profile (ResourceProfile{UNKNOWN}) is unfulfillable.
          ...
    I understand that the resourcemanager fails to provide resources
    for my job(?), but other than that the error is quite cryptic for
    me. Could anyone help me understand what is going wrong?


    Regards,
    Manas

    [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/standalone/docker.html#introduction


