[ 
https://issues.apache.org/jira/browse/HDDS-2114?focusedWorklogId=313338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-313338
 ]

ASF GitHub Bot logged work on HDDS-2114:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Sep/19 22:37
            Start Date: 16/Sep/19 22:37
    Worklog Time Spent: 10m 
      Work Description: anuengineer commented on issue #1440: HDDS-2114: Rename 
does not preserve non-explicitly created interim directories
URL: https://github.com/apache/hadoop/pull/1440#issuecomment-531984790
 
 
   I am going to +1 this, since we want to make sure Hive works.
   
   I just want to understand this more clearly. The issue is really that in a 
real file system there is no such thing as an implicit path. Since we are an 
object store, there is a notion of an implicitly created path (in this case, 
the intermediate directories). I am guessing that S3AFS has the same problem, 
and either Hive has a workaround for it or S3A is doing something clever. Do 
we know how Hive works on S3?
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 313338)
    Time Spent: 1h 10m  (was: 1h)

> Rename does not preserve non-explicitly created interim directories
> -------------------------------------------------------------------
>
>                 Key: HDDS-2114
>                 URL: https://issues.apache.org/jira/browse/HDDS-2114
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Istvan Fajth
>            Assignee: Lokesh Jain
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: demonstrative_test.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I am attaching a patch that adds a test that demonstrates the problem.
> The scenario comes from the way Hive implements ACID transactions with the 
> ORC table format, but the test is reduced to the simplest possible code that 
> reproduces the issue.
> The scenario:
>  * Given a 3-level directory structure, where the top-level directory was 
> explicitly created and the interim directory was implicitly created (for 
> example either by creating a file with create("/top/interim/file") or by 
> creating a directory with mkdirs("top/interim/dir"))
>  * When the leaf is moved out of the implicitly created directory, leaving 
> that directory empty
>  * Then a FileNotFoundException is thrown when getFileStatus or listStatus is 
> called on the interim directory.
> The expected behaviour:
> after the directory becomes empty, it should still be part of the file 
> system: listStatus should return an empty FileStatus array when called on 
> it, and getFileStatus should return a valid FileStatus object when called 
> on it.
>  
>  
> As this issue affects Hive, and as this is how a FileSystem is expected to 
> work, this seems to me to be at least a critical issue; please feel free to 
> change the priority if needed.
> Also please note that, if the interim directory is explicitly created with 
> mkdirs("top/interim") before creating the leaf, then the issue does not 
> appear.
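
The scenario in the quoted description can be illustrated with a toy model. The
sketch below is not the Ozone implementation and not the fix from the pull
request. It assumes a flat, sorted key namespace in which a directory exists
only if it has an explicit marker key or some other key has it as a prefix. The
class FlatKeyStore and all of its methods are hypothetical names introduced for
illustration, and renamePreservingParent shows just one possible way to keep
the emptied directory visible.

```java
import java.util.TreeSet;

// Toy model only (hypothetical, not Ozone code): a flat, sorted key namespace
// standing in for an object store that has no real directory entries.
class FlatKeyStore {
    // Explicit directory markers end with "/", file keys do not.
    private final TreeSet<String> keys = new TreeSet<>();

    // Writes an explicit marker for this one directory (parents are not created).
    void mkdir(String dir) {
        keys.add(dir + "/");
    }

    // Writes a file key; any intermediate directories exist only implicitly.
    void createFile(String path) {
        keys.add(path);
    }

    // A path exists if it is a file key, or if some key starts with "path/"
    // (an explicit marker "path/" also satisfies the prefix check).
    boolean exists(String path) {
        if (keys.contains(path)) {
            return true;
        }
        String prefix = path + "/";
        String next = keys.ceiling(prefix);
        return next != null && next.startsWith(prefix);
    }

    // Naive rename of a single file key: the source parent is not preserved.
    void rename(String src, String dst) {
        if (!keys.remove(src)) {
            throw new IllegalArgumentException("no such key: " + src);
        }
        keys.add(dst);
    }

    // Assumed mitigation: after moving the key, write an explicit marker for
    // the source parent so that it survives as an empty directory.
    void renamePreservingParent(String src, String dst) {
        rename(src, dst);
        int slash = src.lastIndexOf('/');
        if (slash > 0) {
            keys.add(src.substring(0, slash) + "/");
        }
    }

    public static void main(String[] args) {
        FlatKeyStore store = new FlatKeyStore();
        store.mkdir("/top");                     // explicitly created
        store.createFile("/top/interim/file");   // "/top/interim" is implicit
        System.out.println("before rename: " + store.exists("/top/interim"));
        store.rename("/top/interim/file", "/top/file");
        System.out.println("after rename:  " + store.exists("/top/interim"));
    }
}
```

In this model the interim directory disappears after the naive rename because
no remaining key carries its prefix. Calling mkdir("/top/interim") before
creating the leaf, or using renamePreservingParent, leaves a marker behind, so
the directory still exists once it is empty. That matches the observation in
the description that explicitly creating the interim directory avoids the
problem.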



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
