Hello!
I'm working with some Parquet files stored on Amazon S3 and loading
them into a DataFrame with

Dataset<Row> df = spark.read().parquet(parketFileLocation);

However, after some time I get a "Timeout waiting for connection from
pool" exception. I hope I'm not mistaken, but I believe there's a limit
on how long an s3a connection can stay open, while I have enough local
memory to simply load the whole file and close the connection.

Is it possible to specify some option when reading the Parquet files so
that the data is stored locally and the connection is released? Or do
you have any other ideas on how to solve the problem?
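For context, this is roughly what I have in mind; a sketch, not tested code. The pool-size key `fs.s3a.connection.maximum` comes from the hadoop-aws s3a documentation, and the path variable is just my placeholder:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.storage.StorageLevel;

public class ReadParquetFromS3 {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("read-parquet")
                // Enlarge the s3a HTTP connection pool (hadoop-aws property);
                // the default is small, which can cause "Timeout waiting for
                // connection from pool" when many tasks read in parallel.
                .config("spark.hadoop.fs.s3a.connection.maximum", "100")
                .getOrCreate();

        String parketFileLocation = args[0]; // placeholder for the S3 path

        Dataset<Row> df = spark.read().parquet(parketFileLocation);

        // Materialize the data in local memory (spilling to disk) so that
        // later actions do not have to go back to S3.
        df.persist(StorageLevel.MEMORY_AND_DISK());
        df.count(); // force evaluation while the connections are still open
    }
}
```

I'm not sure whether persisting like this actually releases the s3a connections afterwards, which is really the core of my question.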

Thank you very much,
have a nice day!
Karin
