[jira] [Updated] (BEAM-13225) Dataflow Prime job fails when providing resource hints on a transform

2021-11-11 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-13225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden updated BEAM-13225:

Description: 
I have a classic Dataflow template written using the Apache Beam Java SDK 
v2.32.0.  The template simply consumes messages from a Pub/Sub subscription and 
writes them to Google Cloud Storage.

The template successfully runs jobs with [Dataflow 
Prime|https://cloud.google.com/dataflow/docs/guides/enable-dataflow-prime] 
experimental features enabled through {{\-\-additional-experiments 
enable_prime}} and a pipeline-level resource hint supplied through 
{{\-\-parameters=resourceHints=min_ram=8GiB}}:
{code}
gcloud dataflow jobs run my-job-name \
  --additional-experiments enable_prime \
  --disable-public-ips \
  --gcs-location gs://bucket/path/to/template \
  --num-workers 1  \
  --max-workers 16 \
  --parameters=resourceHints=min_ram=8GiB,other_pipeline_options=true \
  --project my-project \
  --region us-central1 \
  --service-account-email my-service-acco...@my-project.iam.gserviceaccount.com \
  --staging-location gs://bucket/path/to/staging \
  --subnetwork https://www.googleapis.com/compute/v1/projects/my-project/regions/us-central1/subnetworks/my-subnet
{code}
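
The value passed to --parameters nests one key=value pair inside another (resourceHints=min_ram=8GiB). The following is a minimal sketch of how such a string could decompose, assuming entries are comma-separated and each entry is split at its first '='. That splitting rule is an assumption about the command line, and ParametersSketch is a hypothetical helper, not part of gcloud or Beam.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits a --parameters value into key/value pairs. */
final class ParametersSketch {

  /** Splits on commas, then on the first '=' of each entry (assumed rule). */
  static Map<String, String> parse(String parameters) {
    Map<String, String> result = new LinkedHashMap<>();
    for (String entry : parameters.split(",")) {
      int eq = entry.indexOf('=');
      if (eq < 0) {
        throw new IllegalArgumentException("Expected key=value but got: " + entry);
      }
      // Only the first '=' separates key from value, so "min_ram=8GiB" stays intact.
      result.put(entry.substring(0, eq), entry.substring(eq + 1));
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> parsed =
        parse("resourceHints=min_ram=8GiB,other_pipeline_options=true");
    parsed.forEach((key, value) -> System.out.println(key + " -> " + value));
  }
}
```

Under that assumption, the resourceHints key carries the whole hint expression min_ram=8GiB as its value, and other_pipeline_options carries true.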

In an attempt to use Dataflow Prime's [Right 
Fitting|https://cloud.google.com/dataflow/docs/guides/right-fitting] 
capability, I changed the pipeline code to include a resource hint on the 
FileIO transform:
{code}
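// NOTE: MyMessage is a placeholder. The generic type arguments were lost from
// this copy of the issue, so the element type is an assumption.
class WriteGcsFileTransform
    extends PTransform<PCollection<MyMessage>, WriteFilesResult<Destination>> {

  private static final long serialVersionUID = 1L;

  @Override
  public WriteFilesResult<Destination> expand(PCollection<MyMessage> input) {

    return input.apply(
        FileIO.<Destination, MyMessage>writeDynamic()
            .by(myDynamicDestinationFunction)
            .withDestinationCoder(Destination.coder())
            .withNumShards(8)
            .withNaming(myDestinationFileNamingFunction)
            .withTempDirectory("gs://bucket/path/to/temp")
            .withCompression(Compression.GZIP)
            // Step-level resource hint for Right Fitting.
            .setResourceHints(ResourceHints.create().withMinRam("32GiB")));
  }
}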
{code}
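
The step-level hint above requests 32GiB, while the pipeline-level hint requests 8GiB. To compare such values numerically, the sketch below converts a size string to bytes, assuming binary (IEC) suffixes where 1 GiB = 2^30 bytes. RamSizeSketch is a hypothetical helper that handles only the suffixes listed in it; it is not the parser that Beam or Dataflow uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: converts sizes such as "8GiB" to a byte count. */
final class RamSizeSketch {

  // A whole number followed by one of the binary suffixes this sketch supports.
  private static final Pattern SIZE = Pattern.compile("(\\d+)(KiB|MiB|GiB|TiB)");

  static long toBytes(String size) {
    Matcher m = SIZE.matcher(size);
    if (!m.matches()) {
      throw new IllegalArgumentException("Unsupported size: " + size);
    }
    long value = Long.parseLong(m.group(1));
    switch (m.group(2)) {
      case "KiB":
        return value << 10;
      case "MiB":
        return value << 20;
      case "GiB":
        return value << 30;
      default: // "TiB" is the only remaining suffix the pattern accepts
        return value << 40;
    }
  }

  public static void main(String[] args) {
    System.out.println("8GiB  = " + toBytes("8GiB") + " bytes");
    System.out.println("32GiB = " + toBytes("32GiB") + " bytes");
  }
}
```

With these units, 8GiB is 8589934592 bytes and 32GiB is 34359738368 bytes, so the step-level request is four times the pipeline-level one.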

Attempting to run jobs from a template based on the new code results in a 
continuous crash loop with the job never successfully running.  The lone 
repeated error log entry is:
{code}
{
  "insertId": "s=97e1ecd30e0243609d555685318325b4;i=4e1;b=6c7f5d65f3994eada5f20672dab1daf1;m=912f16c;t=5d024689cb030;x=b36751718b3d80c1",
  "jsonPayload": {
    "line": "pod_workers.go:191",
    "message": "Error syncing pod 4cf7cbf98df4b5e2d054abce7da1262b (\"df-df-hvm-my-job-name-11061310-qn51-harness-jb9f_default(4cf7c6bf982df4b5eb2d054abce7da12)\"), skipping: failed to \"StartContainer\" for \"artifact\" with CrashLoopBackOff: \"back-off 40s restarting failed container=artifact pod=df-df-hvm-my-job-name-11061310-qn51-harness-jb9f_default(4cf7c6bf982df4b5eb2d054abce7da12)\"",
    "thread": "807"
  },
  "resource": {
    "type": "dataflow_step",
    "labels": {
      "project_id": "my-project",
      "region": "us-central1",
      "step_id": "",
      "job_id": "2021-11-06_12_10_27-510057810808146686",
      "job_name": "my-job-name"
    }
  },
  "timestamp": "2021-11-06T20:14:36.052491Z",
  "severity": "ERROR",
  "labels": {
    "compute.googleapis.com/resource_type": "instance",
    "dataflow.googleapis.com/log_type": "system",
    "compute.googleapis.com/resource_id": "4695846446965678007",
    "dataflow.googleapis.com/job_name": "my-job-name",
    "dataflow.googleapis.com/job_id": "2021-11-06_12_10_27-510057810808146686",
    "dataflow.googleapis.com/region": "us-central1",
    "dataflow.googleapis.com/service_option": "prime",
    "compute.googleapis.com/resource_name": "df-hvm-my-job-name-11061310-qn51-harness-jb9f"
  },
  "logName": "projects/my-project/logs/dataflow.googleapis.com%2Fkubelet",
  "receiveTimestamp": "2021-11-06T20:14:46.471285909Z"
}
{code}
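
When many of these entries repeat, it can help to pull out just the failing container, the reason, and the back-off interval. The sketch below does that with a regular expression, assuming the message keeps the shape shown in the log entry above. KubeletMessageSketch is a hypothetical helper, and the sample string in main is an abbreviated form of the jsonPayload.message value after JSON unescaping.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: summarizes a kubelet CrashLoopBackOff message. */
final class KubeletMessageSketch {

  // Matches: failed to "<operation>" for "<container>" with <reason>: "back-off <N>s
  private static final Pattern FAILURE =
      Pattern.compile(
          "failed to \"(\\w+)\" for \"([^\"]+)\" with (\\w+): \"back-off (\\d+)s");

  /** Returns a one-line summary, or null when the message has a different shape. */
  static String summarize(String message) {
    Matcher m = FAILURE.matcher(message);
    if (!m.find()) {
      return null;
    }
    return "operation=" + m.group(1)
        + " container=" + m.group(2)
        + " reason=" + m.group(3)
        + " backOffSeconds=" + m.group(4);
  }

  public static void main(String[] args) {
    String message =
        "Error syncing pod 4cf7cbf98df4b5e2d054abce7da1262b, skipping: failed to "
            + "\"StartContainer\" for \"artifact\" with CrashLoopBackOff: "
            + "\"back-off 40s restarting failed container=artifact\"";
    System.out.println(summarize(message));
  }
}
```

For the entry above, the summary names the artifact container, the StartContainer operation, the CrashLoopBackOff reason, and a 40 second back-off.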

If the pipeline-level resource hint and the step-level resource hint are both 
set to 8GiB, the pipeline fails with the same repeated error.


[jira] [Created] (BEAM-13225) Dataflow Prime job fails when providing resource hints on a transform

2021-11-11 Thread Brent Worden (Jira)
Brent Worden created BEAM-13225:
---

             Summary: Dataflow Prime job fails when providing resource hints on a transform
                 Key: BEAM-13225
                 URL: https://issues.apache.org/jira/browse/BEAM-13225
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow, sdk-java-core
    Affects Versions: 2.32.0
            Reporter: Brent Worden


I have a classic Dataflow template written using the Apache Beam Java SDK 
v2.32.0.  The template simply consumes messages from a Pub/Sub subscription and 
writes them to Google Cloud Storage.

The template successfully runs jobs with [Dataflow Prime][1] experimental 
features enabled through `--additional-experiments enable_prime` and a 
pipeline-level resource hint supplied through 
`--parameters=resourceHints=min_ram=8GiB`:
```lang-sh
gcloud dataflow jobs run my-job-name \
  --additional-experiments enable_prime \
  --disable-public-ips \
  --gcs-location gs://bucket/path/to/template \
  --num-workers 1  \
  --max-workers 16 \
  --parameters=resourceHints=min_ram=8GiB,other_pipeline_options=true \
  --project my-project \
  --region us-central1 \
  --service-account-email my-service-acco...@my-project.iam.gserviceaccount.com \
  --staging-location gs://bucket/path/to/staging \
  --subnetwork https://www.googleapis.com/compute/v1/projects/my-project/regions/us-central1/subnetworks/my-subnet
```

In an attempt to use Dataflow Prime's [Right Fitting][2] capability, I changed 
the pipeline code to include a resource hint on the FileIO transform:
```lang-java
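// NOTE: MyMessage is a placeholder. The generic type arguments were lost from
// this copy of the issue, so the element type is an assumption.
class WriteGcsFileTransform
    extends PTransform<PCollection<MyMessage>, WriteFilesResult<Destination>> {

  private static final long serialVersionUID = 1L;

  @Override
  public WriteFilesResult<Destination> expand(PCollection<MyMessage> input) {

    return input.apply(
        FileIO.<Destination, MyMessage>writeDynamic()
            .by(myDynamicDestinationFunction)
            .withDestinationCoder(Destination.coder())
            .withNumShards(8)
            .withNaming(myDestinationFileNamingFunction)
            .withTempDirectory("gs://bucket/path/to/temp")
            .withCompression(Compression.GZIP)
            // Step-level resource hint for Right Fitting.
            .setResourceHints(ResourceHints.create().withMinRam("32GiB")));
  }
}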
```

Attempting to run jobs from a template based on the new code results in a 
continuous crash loop with the job never successfully running.  The lone 
repeated error log entry is:
```lang-json
{
  "insertId": "s=97e1ecd30e0243609d555685318325b4;i=4e1;b=6c7f5d65f3994eada5f20672dab1daf1;m=912f16c;t=5d024689cb030;x=b36751718b3d80c1",
  "jsonPayload": {
    "line": "pod_workers.go:191",
    "message": "Error syncing pod 4cf7cbf98df4b5e2d054abce7da1262b (\"df-df-hvm-my-job-name-11061310-qn51-harness-jb9f_default(4cf7c6bf982df4b5eb2d054abce7da12)\"), skipping: failed to \"StartContainer\" for \"artifact\" with CrashLoopBackOff: \"back-off 40s restarting failed container=artifact pod=df-df-hvm-my-job-name-11061310-qn51-harness-jb9f_default(4cf7c6bf982df4b5eb2d054abce7da12)\"",
    "thread": "807"
  },
  "resource": {
    "type": "dataflow_step",
    "labels": {
      "project_id": "my-project",
      "region": "us-central1",
      "step_id": "",
      "job_id": "2021-11-06_12_10_27-510057810808146686",
      "job_name": "my-job-name"
    }
  },
  "timestamp": "2021-11-06T20:14:36.052491Z",
  "severity": "ERROR",
  "labels": {
    "compute.googleapis.com/resource_type": "instance",
    "dataflow.googleapis.com/log_type": "system",
    "compute.googleapis.com/resource_id": "4695846446965678007",
    "dataflow.googleapis.com/job_name": "my-job-name",
    "dataflow.googleapis.com/job_id": "2021-11-06_12_10_27-510057810808146686",
    "dataflow.googleapis.com/region": "us-central1",
    "dataflow.googleapis.com/service_option": "prime",
    "compute.googleapis.com/resource_name": "df-hvm-my-job-name-11061310-qn51-harness-jb9f"
  },
  "logName": "projects/my-project/logs/dataflow.googleapis.com%2Fkubelet",
  "receiveTimestamp": "2021-11-06T20:14:46.471285909Z"
}
```

Am I using resource hints on the transform incorrectly?


  [1]: https://cloud.google.com/dataflow/docs/guides/enable-dataflow-prime
  [2]: https://cloud.google.com/dataflow/docs/guides/right-fitting



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Closed] (BEAM-8323) Testing Guideline using deprecated DoFnTester

2020-05-15 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden closed BEAM-8323.
--
Fix Version/s: Not applicable
   Resolution: Duplicate

> Testing Guideline using deprecated DoFnTester
> -
>
>                 Key: BEAM-8323
>                 URL: https://issues.apache.org/jira/browse/BEAM-8323
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Reza ardeshir rokni
>            Priority: Major
>             Fix For: Not applicable
>
>
> [https://beam.apache.org/documentation/pipelines/test-your-pipeline/]
> Uses deprecated DoFnTester 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (BEAM-8604) Remove deprecated DoFnTester docs

2020-05-15 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden closed BEAM-8604.
--
Fix Version/s: Not applicable
   Resolution: Duplicate

> Remove deprecated DoFnTester docs
> -
>
>                 Key: BEAM-8604
>                 URL: https://issues.apache.org/jira/browse/BEAM-8604
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Reza ardeshir rokni
>            Assignee: Rose Nguyen
>            Priority: Major
>             Fix For: Not applicable
>
>
> Remove references to deprecated DoFnTester
> [https://beam.apache.org/documentation/pipelines/test-your-pipeline/]





[jira] [Closed] (BEAM-9817) beam_PreCommit_Website_Stage_GCS_Commit failed

2020-05-15 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden closed BEAM-9817.
--
Fix Version/s: Not applicable
   Resolution: Cannot Reproduce

> beam_PreCommit_Website_Stage_GCS_Commit failed
> --
>
>                 Key: BEAM-9817
>                 URL: https://issues.apache.org/jira/browse/BEAM-9817
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Kyle Weaver
>            Priority: Minor
>             Fix For: Not applicable
>
>
> Task :website:buildGcsWebsite FAILED
> Liquid Exception: 500 Internal Server Error in 
> documentation/transforms/python/elementwise/flatmap.md





[jira] [Updated] (BEAM-4513) Slack self-service join link is outdated on "Contact us" page

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden updated BEAM-4513:
---
Status: Triage Needed  (was: Open)

> Slack self-service join link is outdated on "Contact us" page
> -
>
>                 Key: BEAM-4513
>                 URL: https://issues.apache.org/jira/browse/BEAM-4513
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Mikhail Gryzykhin
>            Assignee: Brent Worden
>            Priority: Major
>              Labels: newbie, starter, usability
>
> On the [https://beam.apache.org/community/contact-us/] page there is a 
> section for Slack that contains a "Join" link.
> That link becomes outdated every month.
> We want either an automated way to keep this link valid, or to specify an 
> alternative way of getting on Slack.
>  





[jira] [Updated] (BEAM-9956) Bad formatting on environments page

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden updated BEAM-9956:
---
Status: Open  (was: Triage Needed)

> Bad formatting on environments page
> ---
>
>                 Key: BEAM-9956
>                 URL: https://issues.apache.org/jira/browse/BEAM-9956
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Kyle Weaver
>            Assignee: Brent Worden
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/runtime/environments/
> At the bottom of the page, it looks like there are text instructions that are 
> formatted as code and vice-versa.





[jira] [Work started] (BEAM-8541) Beam directRunner documentation java tab has python information

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-8541 started by Brent Worden.
--
> Beam directRunner documentation java tab has python information
> ---
>
>                 Key: BEAM-8541
>                 URL: https://issues.apache.org/jira/browse/BEAM-8541
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Daniel Collins
>            Assignee: Brent Worden
>            Priority: Major
>
> [https://beam.apache.org/documentation/runners/direct/]
>  
> The "Java SDK" tab still shows Python examples.





[jira] [Work started] (BEAM-9956) Bad formatting on environments page

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on BEAM-9956 started by Brent Worden.
--
> Bad formatting on environments page
> ---
>
>                 Key: BEAM-9956
>                 URL: https://issues.apache.org/jira/browse/BEAM-9956
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Kyle Weaver
>            Assignee: Brent Worden
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/runtime/environments/
> At the bottom of the page, it looks like there are text instructions that are 
> formatted as code and vice-versa.





[jira] [Assigned] (BEAM-8541) Beam directRunner documentation java tab has python information

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-8541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden reassigned BEAM-8541:
--

Assignee: Brent Worden

> Beam directRunner documentation java tab has python information
> ---
>
>                 Key: BEAM-8541
>                 URL: https://issues.apache.org/jira/browse/BEAM-8541
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Daniel Collins
>            Assignee: Brent Worden
>            Priority: Major
>
> [https://beam.apache.org/documentation/runners/direct/]
>  
> The "Java SDK" tab still shows Python examples.





[jira] [Commented] (BEAM-9926) Certain code examples for programming guide are not showing

2020-05-13 Thread Brent Worden (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106818#comment-17106818
 ] 

Brent Worden commented on BEAM-9926:


[~epicfaace], what browser are you using?  All the code examples appear for me 
using Chrome.  Can you provide additional information so I can try to reproduce?

> Certain code examples for programming guide are not showing
> ---
>
>                 Key: BEAM-9926
>                 URL: https://issues.apache.org/jira/browse/BEAM-9926
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Ashwin Ramaswami
>            Priority: Major
>         Attachments: screenshot-1.png
>
>
> Seems like the code examples for the entire State section are missing. See 
> [https://beam.apache.org/documentation/programming-guide/#state-and-timers]
>  





[jira] [Commented] (BEAM-9991) Mention start/finishBundle in ParDo documentation

2020-05-13 Thread Brent Worden (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106798#comment-17106798
 ] 

Brent Worden commented on BEAM-9991:


[~rtnguyen], just to be clear, you are referring to the 
[https://beam.apache.org/documentation/transforms/java/elementwise/pardo/] page?

> Mention start/finishBundle in ParDo documentation
> -
>
>                 Key: BEAM-9991
>                 URL: https://issues.apache.org/jira/browse/BEAM-9991
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: Brent Worden
>            Priority: Minor
>






[jira] [Assigned] (BEAM-9956) Bad formatting on environments page

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden reassigned BEAM-9956:
--

Assignee: Brent Worden

> Bad formatting on environments page
> ---
>
>                 Key: BEAM-9956
>                 URL: https://issues.apache.org/jira/browse/BEAM-9956
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Kyle Weaver
>            Assignee: Brent Worden
>            Priority: Minor
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> https://beam.apache.org/documentation/runtime/environments/
> At the bottom of the page, it looks like there are text instructions that are 
> formatted as code and vice-versa.





[jira] [Assigned] (BEAM-9991) Mention start/finishBundle in ParDo documentation

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-9991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden reassigned BEAM-9991:
--

Assignee: Brent Worden

> Mention start/finishBundle in ParDo documentation
> -
>
>                 Key: BEAM-9991
>                 URL: https://issues.apache.org/jira/browse/BEAM-9991
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Rose Nguyen
>            Assignee: Brent Worden
>            Priority: Minor
>






[jira] [Assigned] (BEAM-4513) Slack self-service join link is outdated on "Contact us" page

2020-05-13 Thread Brent Worden (Jira)


 [ 
https://issues.apache.org/jira/browse/BEAM-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brent Worden reassigned BEAM-4513:
--

Assignee: Brent Worden

> Slack self-service join link is outdated on "Contact us" page
> -
>
>                 Key: BEAM-4513
>                 URL: https://issues.apache.org/jira/browse/BEAM-4513
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Mikhail Gryzykhin
>            Assignee: Brent Worden
>            Priority: Major
>              Labels: newbie, starter, usability
>
> On the [https://beam.apache.org/community/contact-us/] page there is a 
> section for Slack that contains a "Join" link.
> That link becomes outdated every month.
> We want either an automated way to keep this link valid, or to specify an 
> alternative way of getting on Slack.
>  





[jira] [Commented] (BEAM-4513) Slack self-service join link is outdated on "Contact us" page

2020-05-13 Thread Brent Worden (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106773#comment-17106773
 ] 

Brent Worden commented on BEAM-4513:


This seems to work as expected now.  Clicking either *Slack* or *join the #beam 
channel* leads to the Beam Slack channel.  Am I missing something?

> Slack self-service join link is outdated on "Contact us" page
> -
>
>                 Key: BEAM-4513
>                 URL: https://issues.apache.org/jira/browse/BEAM-4513
>             Project: Beam
>          Issue Type: Bug
>          Components: website
>            Reporter: Mikhail Gryzykhin
>            Priority: Major
>              Labels: newbie, starter, usability
>
> On the [https://beam.apache.org/community/contact-us/] page there is a 
> section for Slack that contains a "Join" link.
> That link becomes outdated every month.
> We want either an automated way to keep this link valid, or to specify an 
> alternative way of getting on Slack.
>  





[jira] [Commented] (BEAM-4596) Release Candidates' Maven Repository

2020-05-13 Thread Brent Worden (Jira)


[ 
https://issues.apache.org/jira/browse/BEAM-4596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106768#comment-17106768
 ] 

Brent Worden commented on BEAM-4596:


[~cemo], do you still need this issue resolved?  It was filed almost two years 
and several versions ago.

> Release Candidates' Maven Repository
> 
>
>                 Key: BEAM-4596
>                 URL: https://issues.apache.org/jira/browse/BEAM-4596
>             Project: Beam
>          Issue Type: Improvement
>          Components: build-system, website
>    Affects Versions: 2.4.0
>            Reporter: Cemalettin Koç
>            Priority: Minor
>
> I would like to try the 2.5.0-RC2 release candidate, but I could not find 
> these files anywhere. Please provide the necessary repositories.


