The standalone-job process fails because no task executors are around to
request slots from.
It _should_ wait for a bit to give task executors time to start up,
controlled via resourcemanager.standalone.start-up-time or, if unset,
slot.request.timeout.
Does the standalone-job process fail immediately?
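To make the two options above concrete, here is a minimal sketch of passing them through FLINK_PROPERTIES. Both keys take milliseconds; the 60000 value is an arbitrary assumption for illustration, not a recommendation, and the "jobmanager" hostname assumes the job manager container is reachable under that name.

```shell
#!/usr/bin/env bash
# Sketch: give task executors time to register before slot requests are failed.
# Values are illustrative assumptions; both options are in milliseconds.
FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager
resourcemanager.standalone.start-up-time: 60000
slot.request.timeout: 300000"
export FLINK_PROPERTIES

# Show what would be handed to the container via --env FLINK_PROPERTIES=...
printf '%s\n' "${FLINK_PROPERTIES}"
```

The variable would then be passed to the container with --env FLINK_PROPERTIES="${FLINK_PROPERTIES}", as in the commands quoted below.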
On 1/15/2021 2:28 PM, Manas Kale wrote:
You mean the taskmanager? I tried using this command:
docker run --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" flink_pipeline taskmanager
after running the above script, but got:
2021-01-15 13:03:05,069 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils [] - Trying to select the network interface and address to use by connecting to the leading JobManager.
2021-01-15 13:03:05,069 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils [] - TaskManager will try to connect for PT10S before falling back to heuristics
2021-01-15 13:03:05,484 INFO org.apache.flink.runtime.net.ConnectionUtils [] - Trying to connect to address jobmanager:6123
2021-01-15 13:03:05,486 INFO org.apache.flink.runtime.net.ConnectionUtils [] - Failed to connect from address '608ecee74cff/172.17.0.3': jobmanager
Here's what I understand is supposed to happen:
1. Start a jobmanager in a docker container.
2. Start a taskmanager in another docker container and tell it where
to find the jobmanager.
3. Using the taskmanager, submit a new job.
I thought since step (1) is failing, adding the next step (starting
taskmanager) would be of no use.
Please correct me if my understanding is wrong.
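For reference, the start-up sequence in the Flink Docker documentation [1] can be sketched roughly as below. In application mode the job is bundled with the job manager container (standalone-job), so nothing is submitted through the taskmanager. The network name flink-network, the container names, and the DRY_RUN/run helper are assumptions made for this sketch; flink_pipeline and the job class name are taken from this thread. By default the script only prints the docker commands.

```shell
#!/usr/bin/env bash
# Sketch only: network name, container names and the run helper are assumptions.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the docker commands, anything else = run them

# Print a command, and execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN}" != "1" ]; then "$@"; fi
}

# The hostname must match the --name given to the job manager container.
FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager"

# A user-defined network lets containers resolve each other by name.
run docker network create flink-network

# Job manager in application mode, detached so the script can continue.
run docker run --rm -d --name=jobmanager --network flink-network \
  -p 8081:8081 --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
  flink_pipeline standalone-job --job-classname flink_POC.StreamingJob

# Task manager on the same network, so "jobmanager:6123" is resolvable.
run docker run --rm -d --name=taskmanager --network flink-network \
  --env FLINK_PROPERTIES="${FLINK_PROPERTIES}" \
  flink_pipeline taskmanager
```

Running it with DRY_RUN=0 would execute the commands instead of printing them.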
On Fri, Jan 15, 2021 at 4:37 PM Chesnay Schepler <[email protected]> wrote:
Where are you starting the task executor?
On 1/15/2021 11:57 AM, Manas Kale wrote:
Hi all,
I've got a job that I am trying to run using docker as per [1].
Here's the dockerfile:
# Start from base Flink image.
FROM flink:1.11.0

# Add fat JAR and logger properties file to image.
ADD ./target/flink_POC-0.1.jar /opt/flink/usrlib/flink_POC-0.1.jar
ADD ./target/classes/log4j.properties /opt/flink/usrlib/log4j.properties

# Add pipeline.properties and its location.
ADD target/classes/pipeline.properties /opt/flink/usrlib/pipeline.properties
ENV FLINK_CONFIG_LOCATION=/opt/flink/usrlib/pipeline.properties

EXPOSE 8081
And the script I use to launch it:
#!/usr/bin/env bash

echo "Building docker image..."
docker build --tag flink_pipeline .

echo "Configuring Flink runtime..."
export FLINK_PROPERTIES="jobmanager.rpc.address: host
taskmanager.memory.process.size: 4000
jobmanager.memory.process.size: 4000
"

echo "Starting docker image..."
docker run --rm -p 8081:8081 --env FLINK_PROPERTIES=FLINK_PROPERTIES \
    flink_pipeline standalone-job --job-classname flink_POC.StreamingJob
When I run the script, I see my job stuck in "CREATED" state and
after some time I get the error:
2021-01-15 10:44:29,563 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Requesting new slot [SlotRequestId{1c25a61e6179f66b112b1944740f1a11}] and profile ResourceProfile{UNKNOWN} from resource manager.
2021-01-15 10:44:29,565 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Request slot with profile ResourceProfile{UNKNOWN} for job b854f75d6029e1725e822721c30095d7 with allocation id edc1e29d229aceb82f75b7c5835eca3c.
2021-01-15 10:46:39,604 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl [] - Failing pending slot request [SlotRequestId{1c25a61e6179f66b112b1944740f1a11}]: Could not fulfill slot request edc1e29d229aceb82f75b7c5835eca3c. Requested resource profile (ResourceProfile{UNKNOWN}) is unfulfillable.
2021-01-15 10:46:39,667 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: advanced features kafak consumer (1/1) (49ea271f6b9881d82c49b2826e8584d9) switched from SCHEDULED to FAILED on not deployed.
java.util.concurrent.CompletionException: org.apache.flink.runtime.resourcemanager.exceptions.UnfulfillableSlotRequestException: Could not fulfill slot request edc1e29d229aceb82f75b7c5835eca3c. Requested resource profile (ResourceProfile{UNKNOWN}) is unfulfillable.
...
I understand that the resourcemanager fails to provide resources
for my job(?), but other than that the error is quite cryptic to
me. Could anyone help me understand what is going wrong?
Regards,
Manas
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/standalone/docker.html#introduction