baeminbo opened a new issue, #33636:
URL: https://github.com/apache/beam/issues/33636
### What happened?
Creating a Dataflow template printed `Template successfully created`, but no
template file was written to GCS.
With debug logging enabled, the root cause turned out to be the GCS error
`The specified bucket does not exist.`. The template creation command
specified a non-existent bucket for `templateLocation` and a different,
existing bucket for `stagingLocation`. The runner could upload the pipeline
graph and JAR artifacts to the staging location, but it couldn't write the
template file. See [1] for the full log output.
```
--runner=DataflowRunner \
--project=$PROJECT \
--region=$REGION \
--stagingLocation=gs://$GOOD_BUCKET/staging \
--templateLocation=gs://$NOT_EXISTING_BUCKET/template.json
```
The problem is that `DataflowRunner` prints `Template successfully created`
and doesn't report the error. As shown in [1], the error messages appear only
in debug-level logs (CONFIG, FINE, FINER and FINEST in JUL).
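For anyone reproducing this, the debug levels mentioned above are hidden by the default JUL configuration, which stops at INFO. The following is a minimal sketch of one way to surface them programmatically, using only `java.util.logging`. The class name `DebugLogging` and the method `enableFinest` are made up for illustration, and this is not necessarily how the log in [1] was captured.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DebugLogging {
  /**
   * Lowers the root logger and each of its handlers to FINEST so that
   * CONFIG, FINE, FINER and FINEST records are published.
   */
  public static Logger enableFinest() {
    Logger root = Logger.getLogger("");
    root.setLevel(Level.FINEST);
    // Handlers filter records independently of the logger level,
    // so their level has to be lowered too.
    for (Handler handler : root.getHandlers()) {
      handler.setLevel(Level.FINEST);
    }
    return root;
  }

  public static void main(String[] args) {
    enableFinest();
    Logger.getLogger("example").finer("visible only with debug logging enabled");
  }
}
```

Calling `enableFinest()` before `Pipeline.run()` should make the GCS client and HTTP transport records visible. The same effect can be had with a `logging.properties` file instead of code.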
[1]
```
... skipped ...
Jan 16, 2025 9:47:03 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Dataflow SDK version: 2.59.0
Jan 16, 2025 9:47:03 PM
org.apache.beam.sdk.util.construction.Environments$JavaVersion forSpecification
WARNING: Unsupported Java version: 22, falling back to: 21
... skipped ...
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl
create
FINER: create(gs://<REDACTED>/template.json)
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl
getWriteGeneration
FINER: getWriteGeneration(gs://<REDACTED>/template.json, true)
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl
getItemInfo
FINER: getItemInfo(gs://<REDACTED>/template.json)
Jan 16, 2025 9:47:05 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl
getObject
FINER: getObject(gs://<REDACTED>/template.json)
Jan 16, 2025 9:47:05 PM com.google.api.client.http.HttpRequest execute
CONFIG: -------------- REQUEST --------------
GET
https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
Accept-Encoding: gzip
Authorization: <Not Logged>
User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam)
Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)
x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed
x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2
x-cloud-trace-context:
8073c91c355dbed3a8dffbe1a9db8dcc/9140331792875765399;o=0
Jan 16, 2025 9:47:05 PM com.google.api.client.http.HttpRequest execute
CONFIG: curl -v --compressed -H 'Accept-Encoding: gzip' -H 'Authorization:
<Not Logged>' -H 'User-Agent: MultimapTimerPipeline apache-beam/2.59.0
(GPN:Beam) Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)'
-H 'x-goog-custom-audit-job:
multimaptimerpipeline-baeminbo-0117054655-6e49d6ed' -H 'x-goog-api-client:
gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2' -H 'x-cloud-trace-context:
8073c91c355dbed3a8dffbe1a9db8dcc/9140331792875765399;o=0' --
'https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata'
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.http.HttpURLConnection
plainConnect0
FINEST: ProxySelector Request for
https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.https.HttpsClient New
FINEST: Looking for HttpClient for URL
https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
and proxy value of DIRECT
Jan 16, 2025 9:47:05 PM sun.net.www.http.KeepAliveCache$ClientVector get
FINEST: cached HttpClient was idle for 3716
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.https.HttpsClient New
FINEST: KeepAlive stream retrieved from the cache,
sun.net.www.protocol.https.HttpsClient(https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/staging%2Fpipeline-ueMS_MzNWJnRhXG05hdnWlMpUI5kQfCcy81r-I7R380.pb)
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.http.HttpURLConnection
plainConnect0
FINEST: Proxy used: DIRECT
Jan 16, 2025 9:47:05 PM sun.net.www.protocol.http.HttpURLConnection
writeRequests
FINE: sun.net.www.MessageHeader@a0e33db 10 pairs: {GET
/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
HTTP/1.1: null}{Accept-Encoding: gzip}{Authorization: Bearer
<REDACTED>>}{User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam)
Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1
(gzip)}{x-goog-custom-audit-job:
multimaptimerpipeline-baeminbo-0117054655-6e49d6ed}{x-goog-api-client:
gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2}{x-cloud-trace-context:
8073c91c355dbed3a8dffbe1a9db8dcc/9140331792875765399;o=0}{Host:
storage.googleapis.com}{Accept: */*}{Connection: keep-alive}
Jan 16, 2025 9:47:06 PM sun.net.www.http.HttpClient logFinest
FINEST: KeepAlive stream used:
https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
Jan 16, 2025 9:47:06 PM sun.net.www.protocol.http.HttpURLConnection
getInputStream0
FINE: sun.net.www.MessageHeader@3ef46749 12 pairs: {null: HTTP/1.1 404 Not
Found}{X-GUploader-UploadID:
AFIdbgR2rWfCdgSNLific-SJEW1Du32LTPwRGWp77LCdOunj6ZiguVvs_01wGi6WORYBsUJtnCrI50Y}{Content-Type:
application/json; charset=UTF-8}{Date: Fri, 17 Jan 2025 05:47:06 GMT}{Vary:
Origin}{Vary: X-Origin}{Cache-Control: no-cache, no-store, max-age=0,
must-revalidate}{Expires: Mon, 01 Jan 1990 00:00:00 GMT}{Pragma:
no-cache}{Content-Length: 247}{Server: UploadServer}{Alt-Svc: h3=":443";
ma=2592000,h3-29=":443"; ma=2592000}
Jan 16, 2025 9:47:06 PM com.google.api.client.http.HttpResponse <init>
CONFIG: -------------- RESPONSE --------------
HTTP/1.1 404 Not Found
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Server: UploadServer
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
X-GUploader-UploadID:
AFIdbgR2rWfCdgSNLific-SJEW1Du32LTPwRGWp77LCdOunj6ZiguVvs_01wGi6WORYBsUJtnCrI50Y
Vary: Origin
Vary: X-Origin
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Content-Length: 247
Date: Fri, 17 Jan 2025 05:47:06 GMT
Content-Type: application/json; charset=UTF-8
Jan 16, 2025 9:47:06 PM
org.apache.beam.sdk.extensions.gcp.util.RetryHttpRequestInitializer$LoggingHttpBackOffHandler
handleResponse
FINE: Request failed with code 404, performed 0 retries due to IOExceptions,
performed 0 retries due to unsuccessful status codes, HTTP framework says
request can be retried, (caller responsible for retrying):
https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata.
Jan 16, 2025 9:47:06 PM
com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: Total: 247 bytes
Jan 16, 2025 9:47:06 PM
com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: {
"error": {
"code": 404,
"message": "The specified bucket does not exist.",
"errors": [
{
"message": "The specified bucket does not exist.",
"domain": "global",
"reason": "notFound"
}
]
}
}
Jan 16, 2025 9:47:07 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl
getObject
FINER: getObject(gs://<REDACTED>/template.json): not found
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not
Found
GET
https://storage.googleapis.com/storage/v1/b/<REDACTED>/o/template.json?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata
{
"code": 404,
"errors": [
{
"domain": "global",
"message": "The specified bucket does not exist.",
"reason": "notFound"
}
],
"message": "The specified bucket does not exist."
}
at
com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
at
com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
at
com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
at
com.google.api.client.googleapis.services.AbstractGoogleClientRequest$3.interceptResponse(AbstractGoogleClientRequest.java:466)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
at
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:552)
at
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:493)
at
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:603)
at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:2229)
at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:2122)
at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getWriteGeneration(GoogleCloudStorageImpl.java:2197)
at
com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.create(GoogleCloudStorageImpl.java:568)
at
org.apache.beam.sdk.extensions.gcp.util.GcsUtil.create(GcsUtil.java:714)
at
org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystem.create(GcsFileSystem.java:155)
at
org.apache.beam.sdk.extensions.gcp.storage.GcsFileSystem.create(GcsFileSystem.java:72)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:246)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:233)
at
org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:1438)
at
org.apache.beam.runners.dataflow.DataflowRunner.run(DataflowRunner.java:203)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:325)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:310)
at baeminbo.MultimapTimerPipeline.main(MultimapTimerPipeline.java:97)
Jan 16, 2025 9:47:07 PM com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl
getItemInfo
FINER: getItemInfo: gs://<REDACTED>/template.json: exists: no
Jan 16, 2025 9:47:13 PM
com.google.cloud.hadoop.util.LoggingMediaHttpUploaderProgressListener
progressChanged
FINE: Uploading: template.json
Jan 16, 2025 9:47:14 PM com.google.api.client.http.HttpRequest execute
CONFIG: -------------- REQUEST --------------
POST
https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
Accept-Encoding: gzip
Authorization: <Not Logged>
User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam)
Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1 (gzip)
x-goog-custom-audit-job: multimaptimerpipeline-baeminbo-0117054655-6e49d6ed
x-goog-api-client: gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2
x-upload-content-type: text/plain
x-cloud-trace-context:
3042899c41252b26f51144930ff4cb1e/3008447248009734846;o=0
Content-Type: application/json; charset=UTF-8
Content-Length: 38
Jan 16, 2025 9:47:14 PM com.google.api.client.http.HttpRequest execute
CONFIG: curl -v --compressed -X POST -H 'Accept-Encoding: gzip' -H
'Authorization: <Not Logged>' -H 'User-Agent: MultimapTimerPipeline
apache-beam/2.59.0 (GPN:Beam) Google-API-Java-Client/2.4.0
Google-HTTP-Java-Client/1.44.1 (gzip)' -H 'x-goog-custom-audit-job:
multimaptimerpipeline-baeminbo-0117054655-6e49d6ed' -H 'x-goog-api-client:
gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2' -H 'x-upload-content-type: text/plain'
-H 'x-cloud-trace-context:
3042899c41252b26f51144930ff4cb1e/3008447248009734846;o=0' -H 'Content-Type:
application/json; charset=UTF-8' -d '@-' --
'https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable'
<< $$$
Jan 16, 2025 9:47:14 PM sun.net.www.protocol.http.HttpURLConnection
plainConnect0
FINEST: ProxySelector Request for
https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
Jan 16, 2025 9:47:14 PM sun.net.www.protocol.https.HttpsClient New
FINEST: Looking for HttpClient for URL
https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
and proxy value of DIRECT
Jan 16, 2025 9:47:14 PM sun.net.www.protocol.https.HttpsClient <init>
FINEST: Creating new HttpsClient with
url:https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
and proxy:DIRECT with connect timeout:20000
Jan 16, 2025 9:47:15 PM sun.net.www.protocol.http.HttpURLConnection
plainConnect0
FINEST: Proxy used: DIRECT
Jan 16, 2025 9:47:16 PM jdk.internal.event.EventHelper logTLSHandshakeEvent
FINE: TLSHandshake: storage.googleapis.com:443, TLSv1.3,
TLS_AES_256_GCM_SHA384, 2230027675
Jan 16, 2025 9:47:16 PM sun.net.www.protocol.http.HttpURLConnection
writeRequests
FINE: sun.net.www.MessageHeader@67030140 13 pairs: {POST
/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
HTTP/1.1: null}{Accept-Encoding: gzip}{Authorization: Bearer
<REDACTED>>}{User-Agent: MultimapTimerPipeline apache-beam/2.59.0 (GPN:Beam)
Google-API-Java-Client/2.4.0 Google-HTTP-Java-Client/1.44.1
(gzip)}{x-goog-custom-audit-job:
multimaptimerpipeline-baeminbo-0117054655-6e49d6ed}{x-goog-api-client:
gl-java/22.0.2 gdcl/2.4.0 mac-os-x/15.2}{x-upload-content-type:
text/plain}{x-cloud-trace-context:
3042899c41252b26f51144930ff4cb1e/3008447248009734846;o=0}{Content-Type:
application/json; charset=UTF-8}{Host: storage.googleapis.com}{Accept:
*/*}{Connection: keep-alive}{Content-Length: 38}
Jan 16, 2025 9:47:25 PM
com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: Total: 38 bytes
Jan 16, 2025 9:47:25 PM
com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: {"metadata":{},"name":"template.json"}
Jan 16, 2025 9:47:26 PM sun.net.www.http.HttpClient logFinest
FINEST: KeepAlive stream used:
https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
Jan 16, 2025 9:47:26 PM sun.net.www.protocol.http.HttpURLConnection
getInputStream0
FINE: sun.net.www.MessageHeader@2fb6cf1d 12 pairs: {null: HTTP/1.1 404 Not
Found}{X-GUploader-UploadID:
AFIdbgToxMGUfI8spHUMSMKI_alHRDQsCZVLVKM3g7-yhU36YonczlEbFyGzT9_qWkovP7YXteS9tjr4HrxGxLJipI4wFzjdvzEgDfL44wkvaQ}{Date:
Fri, 17 Jan 2025 05:47:25 GMT}{Vary: Origin}{Vary: X-Origin}{Cache-Control:
no-cache, no-store, max-age=0, must-revalidate}{Expires: Mon, 01 Jan 1990
00:00:00 GMT}{Pragma: no-cache}{Content-Length: 247}{Server:
UploadServer}{Content-Type: text/html; charset=UTF-8}{Alt-Svc: h3=":443";
ma=2592000,h3-29=":443"; ma=2592000}
Jan 16, 2025 9:47:26 PM com.google.api.client.http.HttpResponse <init>
CONFIG: -------------- RESPONSE --------------
HTTP/1.1 404 Not Found
Cache-Control: no-cache, no-store, max-age=0, must-revalidate
Server: UploadServer
Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
X-GUploader-UploadID:
AFIdbgToxMGUfI8spHUMSMKI_alHRDQsCZVLVKM3g7-yhU36YonczlEbFyGzT9_qWkovP7YXteS9tjr4HrxGxLJipI4wFzjdvzEgDfL44wkvaQ
Vary: Origin
Vary: X-Origin
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Pragma: no-cache
Content-Length: 247
Date: Fri, 17 Jan 2025 05:47:25 GMT
Content-Type: text/html; charset=UTF-8
Jan 16, 2025 9:47:26 PM
org.apache.beam.sdk.extensions.gcp.util.RetryHttpRequestInitializer$LoggingHttpBackOffHandler
handleResponse
FINE: Request failed with code 404, performed 0 retries due to IOExceptions,
performed 0 retries due to unsuccessful status codes, HTTP framework says
request can be retried, (caller responsible for retrying):
https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable.
Jan 16, 2025 9:47:26 PM
org.apache.beam.sdk.extensions.gcp.util.UploadIdResponseInterceptor
interceptResponse
FINE: Upload ID for url
https://storage.googleapis.com/upload/storage/v1/b/<REDACTED>/o?ifGenerationMatch=0&name=template.json&uploadType=resumable
on worker null is
AFIdbgToxMGUfI8spHUMSMKI_alHRDQsCZVLVKM3g7-yhU36YonczlEbFyGzT9_qWkovP7YXteS9tjr4HrxGxLJipI4wFzjdvzEgDfL44wkvaQ
Jan 16, 2025 9:47:27 PM
com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: Total: 247 bytes
Jan 16, 2025 9:47:27 PM
com.google.api.client.util.LoggingByteArrayOutputStream close
CONFIG: {
"error": {
"code": 404,
"message": "The specified bucket does not exist.",
"errors": [
{
"message": "The specified bucket does not exist.",
"domain": "global",
"reason": "notFound"
}
]
}
}
Jan 16, 2025 9:47:27 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Printed job specification to gs://<REDACTED>/template.json
Jan 16, 2025 9:57:03 PM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Template successfully created.
```
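The log above shows a 404 on the upload followed by `Printed job specification` and `Template successfully created`, with no exception reaching the caller. One common way a Java write failure goes unreported like this is a writer that suppresses `IOException`: `java.io.PrintWriter` never throws from `print` and only exposes failures through `checkError()`. Whether `DataflowRunner` writes the template this way is an assumption here and is not confirmed by the log. The sketch below only demonstrates the general pattern, with a hypothetical `FailingStream` standing in for the failed GCS upload.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

public class SilentWriteFailure {
  /** Hypothetical stand-in for an upload stream whose destination does not exist. */
  static class FailingStream extends OutputStream {
    @Override
    public void write(int b) throws IOException {
      throw new IOException("The specified bucket does not exist.");
    }
  }

  /**
   * Writes text through a PrintWriter and returns true if the writer
   * recorded an I/O error while doing so.
   */
  public static boolean writeAndCheck(OutputStream out, String text) {
    PrintWriter writer = new PrintWriter(out);
    // print() does not throw, even when the underlying stream fails.
    writer.print(text);
    // checkError() flushes the writer and reports any suppressed IOException.
    return writer.checkError();
  }

  public static void main(String[] args) {
    boolean failed = writeAndCheck(new FailingStream(), "{\"name\":\"template.json\"}");
    System.out.println(failed ? "write failed" : "write succeeded");
  }
}
```

If the runner's write path has this shape, checking the error state after writing and raising an exception before logging success would turn the silent failure into a visible one.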
### Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
### Issue Components
- [ ] Component: Python SDK
- [x] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [ ] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [x] Component: Google Cloud Dataflow Runner