Re: [s3a] Spark is not reading s3 object content

Mich Talebzadeh Fri, 31 May 2024 02:27:31 -0700

Tell Spark to read from a single file by passing the full object path:

data = spark.read.text("s3a://test-bucket/testfile.csv")

This makes it explicit to Spark that you are dealing with a single file
and should avoid any bucket-like interpretation of the path.
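
As a rough sketch of how that URL breaks down, assuming the path from this
thread: in s3a://test-bucket/testfile.csv the bucket is test-bucket and
testfile.csv is the object key. The helper names split_s3a_uri and
read_single_object below are hypothetical (they are not part of Spark), and
the spark argument is assumed to be an existing SparkSession already
configured for s3a.

```python
from urllib.parse import urlparse


def split_s3a_uri(uri):
    """Split an s3a:// URI into (bucket, key). Hypothetical helper, not a Spark API."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3a" or not parsed.netloc:
        raise ValueError(f"not an s3a URI: {uri!r}")
    # The netloc part is the bucket; everything after the first "/" is the object key.
    return parsed.netloc, parsed.path.lstrip("/")


def read_single_object(spark, uri):
    """Check that uri names one object, then read it as lines of text."""
    bucket, key = split_s3a_uri(uri)
    if not key or key.endswith("/"):
        raise ValueError(f"{uri!r} looks like a bucket or prefix, not a single object")
    # Same call as in this reply: spark.read.text on the full object path.
    return spark.read.text(uri)
```

With a live session this would be called as
data = read_single_object(spark, "s3a://test-bucket/testfile.csv").
Splitting s3a://test-bucket/testfile.csv/_spark_metadata the same way gives
the key testfile.csv/_spark_metadata in the same bucket, so that lookup is
for a key under the file's name rather than for a different bucket.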

HTH

Mich Talebzadeh,
Technologist | Architect | Data Engineer  | Generative AI | FinCrime
PhD <https://en.wikipedia.org/wiki/Doctor_of_Philosophy> Imperial College
London <https://en.wikipedia.org/wiki/Imperial_College_London>
London, United Kingdom

   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

 https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun
<https://en.wikipedia.org/wiki/Wernher_von_Braun>).

On Fri, 31 May 2024 at 09:53, Amin Mosayyebzadeh <mosayyebza...@gmail.com>
wrote:

> I will work on the first two possible causes.
> For the third one, which I guess is the real problem, Spark treats the
> testfile.csv object with the url s3a://test-bucket/testfile.csv as a bucket
> to access _spark_metadata with url
> s3a://test-bucket/testfile.csv/_spark_metadata
> testfile.csv is an object and should not be treated as a bucket. But I am
> not sure how to prevent Spark from doing that.
>
