[
https://issues.apache.org/jira/browse/PHOENIX-2432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013076#comment-15013076
]
Gabriel Reid commented on PHOENIX-2432:
---------------------------------------
FWIW, I'm not a big fan of implicit recursive deletes of directories.
Let's say that this is the first time that I've run the tool, and I just set
the output to my user directory, or even the root directory. This change will
ensure that all files in my user directory or root directory will be wiped out,
without warning, which would be a pretty unpleasant surprise.
Running {{hdfs dfs -rm -r /path/to/my/dir}} manually (or as part of a
driver shell script) seems like an acceptable trade-off here, as well as being
the least surprising option.
Two other notes on this patch:
* If no output path is supplied, the tool will still attempt to delete the
non-existent path, which I assume will result in it printing an error every time
* Having an empty catch clause on the delete operation is probably a bad idea
-- if something is wrong in interacting with the FileSystem, we probably want
to know about it (i.e. at the very least log it).
> CsvBulkLoad output dir was not cleaned up for interrupted job
> -------------------------------------------------------------
>
> Key: PHOENIX-2432
> URL: https://issues.apache.org/jira/browse/PHOENIX-2432
> Project: Phoenix
> Issue Type: Bug
> Reporter: Alicia Ying Shu
> Assignee: Alicia Ying Shu
> Attachments: PHOENIX-2432.patch
>
>
> When we use the --output parameter to trigger a CsvBulkLoad job, and the job
> is interrupted before it finishes, we must manually delete the output folder
> before we can trigger another job over the same table