I assume from the previous messages that GCP Dataflow is being used as the
pipeline runner. Even without Flex Templates, the v2 runner can use Docker
containers to install all dependencies from various sources[1]. I have
used Docker containers to solve the same problem you mention: installing a
p
Hi,
I'm the guy who gave the Movie Magic talk. Since it's possible to write
stateful transforms with Beam, it is capable of some very sophisticated
flow control. I've not seen a Python framework that combines this with
streaming data nearly as well. That said, there aren't a lot of great
workin
Oops, not sure I replied to all but I'm using ParquetIO:
PCollection<GenericRecord> records = pipeline.apply(
    "Read parquet file in as Generic Records",
    ParquetIO.read(finalSchema).from(beamReadPath).withConfiguration(configuration));
The variable beamReadPath starts with the s3 prefix, and I set the
initial cre
Yes, I'm using ParquetIO as below:
PCollection<GenericRecord> records = pipeline.apply(
    "Read parquet file in as Generic Records",
    ParquetIO.read(finalSchema).from(beamReadPath).withConfiguration(configuration));
On Fri, Dec 22, 2023 at 10:39 AM XQ Hu via user wrote:
> Can you share some code snippets about h
Can you share some code snippets showing how you read from S3? Do you use the
built-in TextIO?
On Fri, Dec 22, 2023 at 11:28 AM Ramya Prasad via user wrote:
> Hello,
>
> I am a developer trying to use Apache Beam, and I have a nuanced problem I
> need help with. I have a pipeline which has to read i
Searching the Beam code base turns up plenty of places that use Wait.on.
You could check that code for some insights.
If this doesn't work, it would be better to create a small test case that
reproduces the problem and open a GitHub issue.
Sorry, I cannot help much more with this.
On Fri, Dec
Hello,
I am a developer trying to use Apache Beam, and I am running into an issue
where my WaitOn step is not working as expected. I want my pipeline to read
all the data from an S3 bucket using ParquetIO before moving on to the rest
of the steps in my pipeline. However, I see in my DAG that even
Hello,
I am a developer trying to use Apache Beam, and I have a nuanced problem I
need help with. I have a pipeline which has to read in 40 million records
from multiple Parquet files from AWS S3. The only way I can get the
credentials I need for this particular bucket is to call an API, which I d
You can use the same docker image for both template launcher and Dataflow
job. Here is one example:
https://github.com/google/dataflow-ml-starter/blob/main/tensorflow_gpu.flex.Dockerfile#L60
On Fri, Dec 22, 2023 at 8:04 AM Sumit Desai wrote:
> Yes, I will have to try it out.
>
> Regards
> Sumit
Yes, I will have to try it out.
Regards
Sumit Desai
On Fri, Dec 22, 2023 at 3:53 PM Sofia’s World wrote:
> I guess so. I am not an expert on using env variables in Dataflow
> pipelines, since I pass any config dependencies I need as job input
> params.
>
> But perhaps you can configure variab
I guess so. I am not an expert on using env variables in Dataflow pipelines,
since I pass any config dependencies I need as job input params.
But perhaps you can configure variables in your Dockerfile (I am not an
expert in this either), as Flex Templates use Docker?
https://cloud.google.com
We are using an external, non-public package which reads its configuration
from environment variables only. If the environment variables are not found,
it throws an error. We can't change the source of this package.
Does this mean we will face the same problem with Flex Templates as well?
On Fri, 22 Dec 2023, 3:39 pm Sofia’s
The Flex Template will allow you to pass input params with dynamic values
to your Dataflow job, so you could replace the env variable with that
input? That is, unless you have to have env vars, but from your snippets it
appears you are just using them to configure one of your components?
Hth
On Fr
Hi Sofia and XQ,
The application is failing because I have loggers defined in every file, and
the method that creates a logger tries to create an object of
UplightTelemetry. If I use Flex Templates, will the environment variables
I supply be loaded before the application gets loaded? If not, it woul
14 matches