hamfors opened a new issue, #26145:
URL: https://github.com/apache/superset/issues/26145

   When importing a dataset with api/v1/dataset/import, the data in the 
referenced CSV file (e.g. data: http://example/file.csv) is not overwritten the 
second time you import the same dataset (with updated data in the CSV file), 
even though overwrite=true is set in the request.
   
   The relevant code is in datasets/commands/importers/v1/utils.py -> 
import_dataset:
   ----
   if data_uri and (not table_exists or force_data):
       load_data(data_uri, dataset, dataset.database, session)
   ----
   In the second import REST call, data_uri is set but "not table_exists" is 
false because the table already exists, so the data is loaded only if 
force_data is true.
   I believe that force_data is not set to true when overwrite=true is passed 
in the import REST call.
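
The condition quoted above can be isolated into a small standalone sketch to show why the second import skips the download. The function name `should_load_data` is hypothetical and only mirrors the quoted `if` statement; it is not part of Superset. Note that `overwrite` does not appear in the condition at all, which matches the suspicion that it never reaches `force_data`.

```python
def should_load_data(data_uri, table_exists, force_data):
    """Mirror of the quoted condition: load only when a data URI is given
    and either the table is missing or a reload is explicitly forced."""
    return bool(data_uri and (not table_exists or force_data))


uri = "http://example/file.csv"

# First import: the physical table does not exist yet, so data is loaded.
first_import = should_load_data(uri, table_exists=False, force_data=False)

# Second import: the table exists and force_data is still False,
# so the updated CSV is never downloaded.
second_import = should_load_data(uri, table_exists=True, force_data=False)

# Only an explicit force_data=True would trigger a reload of an existing table.
forced_import = should_load_data(uri, table_exists=True, force_data=True)

print(first_import, second_import, forced_import)
```

If this reading is right, the reload on the second import depends entirely on whatever sets force_data, not on the overwrite flag of the request.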
   
   ### How to reproduce the bug
   
   1. Upload a file.csv to a configured database (in my case MySQL)
   2. Export the dataset that was created
   3. Add some extra lines to file.csv
   4. Modify the exported dataset zip file and add a line "data: 
http://example/file.csv" referencing the updated file.csv
   5. Import the zip file with the Swagger UI and set overwrite=true
   
   or
   do steps 1 and 2, then:
   3. Modify the dataset zip with a new "uuid", "data" and "table_name"
   4. Import the zip file with the Swagger UI and set overwrite=false to 
create a new dataset
   5. Add some extra lines to file.csv
   6. Import the zip file again with the Swagger UI and set overwrite=true
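
The "add a data line to the exported zip" step can be scripted. This is a minimal sketch using only the Python standard library; the helper name `add_data_uri` is hypothetical, and it assumes the dataset definitions in the export are `.yaml` files under a `datasets/` folder, which should be checked against the actual export. It appends the line as plain text, so it should not be used on a file that already contains a top-level `data:` key.

```python
import io
import zipfile


def add_data_uri(export_zip: bytes, data_uri: str) -> bytes:
    """Return a copy of an exported dataset zip in which every dataset YAML
    file (assumed to be under a `datasets/` folder) gets a `data:` line."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(export_zip)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            content = src.read(info.filename)
            name = "/" + info.filename
            if "/datasets/" in name and name.endswith(".yaml"):
                text = content.decode("utf-8")
                # Make sure the new key starts on its own line.
                if not text.endswith("\n"):
                    text += "\n"
                text += f"data: {data_uri}\n"
                content = text.encode("utf-8")
            # All other files (metadata, database definitions) are copied as-is.
            dst.writestr(info.filename, content)
    return out.getvalue()
```

Usage would be along the lines of reading the exported zip into bytes, calling `add_data_uri(zip_bytes, "http://example/file.csv")`, and writing the result to a new file that is then imported through the Swagger UI.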
   
   ### Expected results
   I would expect the physical table to contain the content of the updated 
file.csv, not the content of the original file.csv.
   I can also see that Superset does not log the following the second time:
    logger.info("Downloading data from %s", data_uri)
   
   ### Actual results
   I see in the logs that the dataset is updated with the same information 
(columns, metrics), but the content of the updated file is not downloaded and 
inserted into the physical table.
   
   ### Environment
   
   repository: apache/superset
   tag: latest-dev
   
   root@superset-865d68b7f6-b7hkl:/app/superset# superset --version
   Loaded your LOCAL configuration at [/app/pythonpath/superset_config.py]
   Python 3.9.18
   Flask 2.2.5
   Werkzeug 2.3.3
   
   FEATURE_FLAGS = {
       "EMBEDDED_SUPERSET": True,
       "DASHBOARD_RBAC": True,
       "THUMBNAILS": True,
       "HORIZONTAL_FILTER_BAR": True
   }
   ### Checklist
   
   Make sure to follow these steps before submitting your issue - thank you!
   
   - [x] I have checked the superset logs for python stacktraces and included 
it here as text if there are any.
   - [x] I have reproduced the issue with at least the latest released version 
of superset.
   - [x] I have checked the issue tracker for the same issue and I haven't 
found one similar.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

