Drew created SPARK-40286:
----------------------------

             Summary: Load Data from S3 deletes data source file
                 Key: SPARK-40286
                 URL: https://issues.apache.org/jira/browse/SPARK-40286
             Project: Spark
          Issue Type: Question
          Components: Documentation
    Affects Versions: 3.2.1
            Reporter: Drew


Hello, 

I'm using Spark to load data into a Hive table through PySpark, and when I load 
data from a path in Amazon S3, the original file gets wiped from the 
directory. The file is found, and the table is populated with its data. I also 
tried adding the `LOCAL` clause, but that throws an error when looking for the 
file. The documentation doesn't explicitly state that 
this is the intended behavior.

Thanks in advance!
{code:java}
spark.sql("CREATE TABLE src (key INT, value STRING) STORED AS TEXTFILE")
spark.sql("LOAD DATA INPATH 's3://bucket/kv1.txt' OVERWRITE INTO TABLE src")
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
