Hi,

Maybe you could share some code, so we could have a better picture of what is going on.

Last time I had to read from HDFS (normally in our pipelines HDFS is just a sink), we used FileIO:
https://beam.apache.org/documentation/sdks/javadoc/2.3.0/index.html?org/apache/beam/sdk/io/FileIO.html

It gives you ReadableFile(s), which we read as regular files; in our case, we converted each line into the object expected by the next transform. A rough sketch of that is below.
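
Something along these lines (just a sketch, not our exact code: the hdfs:// path, the class name, and the UTF-8 assumption are placeholders, and you also need the Hadoop file system module on the classpath so Beam can resolve hdfs:// paths):

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.FileIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.FlatMapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptors;

import java.io.BufferedReader;
import java.io.Reader;
import java.nio.channels.Channels;
import java.util.ArrayList;
import java.util.List;

public class HdfsFileIoExample {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Match files on HDFS and turn each match into a ReadableFile.
    PCollection<FileIO.ReadableFile> files =
        p.apply(FileIO.match().filepattern("hdfs://namenode:8020/data/input/*.txt"))
         .apply(FileIO.readMatches());

    // Read each ReadableFile line by line, like a regular file.
    PCollection<String> lines =
        files.apply(
            FlatMapElements.into(TypeDescriptors.strings())
                .via((FileIO.ReadableFile f) -> {
                  List<String> result = new ArrayList<>();
                  try (Reader reader = Channels.newReader(f.open(), "UTF-8");
                       BufferedReader br = new BufferedReader(reader)) {
                    String line;
                    while ((line = br.readLine()) != null) {
                      result.add(line);
                    }
                  } catch (Exception e) {
                    throw new RuntimeException(
                        "Failed to read " + f.getMetadata().resourceId(), e);
                  }
                  return result;
                }));

    // ... convert each line to your domain object in a following transform ...
    p.run().waitUntilFinish();
  }
}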

Cheers,
Leonardo Campos

On 02.09.2018 00:05, Mahesh Vangala wrote:
Hello all -

I have installed a pseudo-distributed yarn and spark.
My Beam pipeline reads from a local file with TextIO, and it runs fine when I
launch the pipeline using --master spark://master.
However, I am having difficulties getting it to run with --master yarn. I am pretty sure reading a local file with TextIO under yarn is causing the issues.
I did look into the Beam API (beam.sdk.io.hadoop) and Spark, but had no luck
finding the right info.
If you could nudge me in the right direction, that'd be great!
Thank you for your help.

Regards,
Mahesh

--MAHESH VANGALA

(PH) 443-326-1957
(WEB) MVANGALA.COM [1]

Links:
------
[1] http://mvangala.com
