Hi,
 
To run a Beam job on a Spark cluster with some number of nodes:
 
1. Is it recommended to set pipeline parameters such as --num_workers, --max_num_workers, --autoscaling_algorithm, --worker_machine_type, etc., or will Beam (Spark) figure that out on its own?
 
2. If it is recommended to set those parameters, what values are recommended given the machines and resources in the cluster? (A sketch of the kind of submission I have in mind follows below.)
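
For context, here is a minimal sketch of the kind of submission I have in mind, using the Python SDK against the portable Spark job server. The endpoint, environment type, and pipeline contents are illustrative placeholders, not my actual setup:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Illustrative options: assumes a Beam Spark job server is already
# running and reachable at localhost:8099.
options = PipelineOptions([
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",
    "--environment_type=LOOPBACK",  # SDK workers run in this process (dev/test only)
])

# Trivial placeholder pipeline, just to show where the options plug in.
with beam.Pipeline(options=options) as pipeline:
    (pipeline
     | "Create" >> beam.Create([1, 2, 3])
     | "Double" >> beam.Map(lambda x: x * 2)
     | "Print" >> beam.Map(print))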
 
Thanks
