It's surprising since we have a successful test running with "Get File
Names" on the Beam direct runner.

https://ci-builds.apache.org/job/Hop/job/Hop-integration-tests/lastCompletedBuild/testReport/(root)/beam_directrunner/0010_get_file_names/

I think that the main thing is to have permissions on the gs:// location
you want to get files from.

Cheers,

Matt


On Wed, 6 Sep 2023 at 09:05, Fabian Peters <p...@mercadu.de> wrote:

> Good morning all!
>
> Not having worked with Hop for a couple of months, I downloaded the 2.5.0
> version and found that an existing pipeline failed to work as expected.
> This is due to the "Get file names" transform returning only a single row
> for each row passed to "Get filename from field". I ran into the same issue
> <https://issues.apache.org/jira/projects/HOP/issues/HOP-4191?filter=allissues>
> last year, but the fix <https://github.com/apache/hop/pull/1674/files> I
> provided turned out to sometimes cause a stack overflow
> <https://issues.apache.org/jira/projects/HOP/issues/HOP-4528?filter=allissues>
> and was reverted. (No hard feelings…)
>
> Is there another way to make this work on Beam/Dataflow? Or is there an
> alternative approach I can use to get all files in a GCS path, short of
> using their HTTP API?
>
> Besides this: Great work on the Dataflow template handling – works like a
> charm now!
>
> cheers
>
> Fabian
>
