mchades commented on code in PR #7009:
URL: https://github.com/apache/gravitino/pull/7009#discussion_r2051909850


##########
docs/manage-fileset-metadata-using-gravitino.md:
##########
@@ -315,16 +315,52 @@ Currently, Gravitino supports two **types** of filesets:
    specified as `EXTERNAL`, the files of the fileset will **not** be deleted when
    the fileset is dropped.
 
-**storageLocation**
+:::note
+If the locations of the manged fileset do not exist, Gravitino will create/delete the locations when the fileset is created/deleted.
+Unless the catalog property `disable-filesystem-ops` is set to true or the location contains a [placeholder](./manage-fileset-metadata-using-gravitino.md#placeholder).
+:::
+
+#### storageLocation
 
 The `storageLocation` is the physical location of the fileset. Users can specify this location
 when creating a fileset, or follow the rules of the catalog/schema location if not specified.
 
+The value of `storageLocation` depends on the configuration settings of the catalog:
+- If this is a local fileset catalog, the `storageLocation` should be in the format of `file:///path/to/fileset`.
+- If this is a HDFS fileset catalog, the `storageLocation` should be in the format of `hdfs://namenode:port/path/to/fileset`.
+
+For a `MANAGED` fileset, the storage location is:
+
+1. The one specified by the user during the fileset creation, and the [placeholder](#placeholder) will be replaced by the
+   corresponding fileset property value.
+2. When the catalog property `location` is specified but the schema property `location` isn't specified, the storage location is:
+   1. `catalog location/schema name/fileset name` if `catalog location` does not contain any placeholder. 
+   2. `catalog location` - placeholders in the catalog location will be replaced by the corresponding fileset property value.
+
+3. When the catalog property `location` isn't specified but the schema property `location` is specified,
+   the storage location is:
+   1. `schema location/fileset name` if `schema location` does not contain any placeholder.
+   2. `schema location` - placeholders in the schema location will be replaced by the corresponding fileset property value.
+   
+4. When both the catalog property `location` and the schema property `location` are specified, the storage
+   location is:
+   1. `schema location/fileset name` if `schema location` does not contain any placeholder.
+   2. `schema location` - placeholders in the schema location will be replaced by the corresponding fileset property value.
+
+5. When both the catalog property `location` and schema property `location` isn't specified, the user
+   should specify the `storageLocation` in the fileset creation.

Review Comment:
   Thank you for pointing this out. How about changing it to:
   ```markdown
   For a `MANAGED` fileset, the storage location is determined in the following priority order:
   
   1. If the user specifies `storageLocation` during fileset creation:
      - This location is used, with any [placeholders](#placeholder) replaced by the corresponding fileset property values.
   
   2. If the user doesn't specify `storageLocation`:
      - If the schema property `location` is specified:
        - Use `<schema location>/<fileset name>` if the schema location has no placeholders.
        - Otherwise, use `<schema location>` with its placeholders replaced by the corresponding fileset property values.
      
      - Otherwise, if the catalog property `location` is specified:
        - Use `<catalog location>/<schema name>/<fileset name>` if the catalog location has no placeholders.
        - Otherwise, use `<catalog location>` with its placeholders replaced by the corresponding fileset property values.
      
      - If neither the schema nor the catalog property `location` is specified:
        - The user must provide `storageLocation` during fileset creation.
   
   For an `EXTERNAL` fileset, the user must always specify `storageLocation` during fileset creation. If the provided location contains placeholders, they will be replaced by the corresponding fileset property values.
   ```
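
To double-check that the priority order proposed above is unambiguous, here is a minimal Python sketch of it. This is only an illustration, not Gravitino's implementation: the function names are hypothetical, and the `{{name}}` placeholder syntax with a direct lookup in the fileset properties is an assumption made for the example (the real syntax and property keys are whatever the placeholder section of the doc defines).

```python
import re
from typing import Mapping, Optional

# Assumption: placeholders are written as {{name}}. The real syntax is
# defined in the #placeholder section of the doc, not here.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_placeholders(location: str, props: Mapping[str, str]) -> str:
    """Replace every {{name}} with props[name]; a missing key raises KeyError."""
    return PLACEHOLDER.sub(lambda m: props[m.group(1)], location)


def resolve_managed_location(
    storage_location: Optional[str],
    schema_location: Optional[str],
    catalog_location: Optional[str],
    schema_name: str,
    fileset_name: str,
    props: Mapping[str, str],
) -> str:
    """Hypothetical helper that mirrors the proposed priority order."""
    # 1. An explicit storageLocation always wins.
    if storage_location:
        return fill_placeholders(storage_location, props)

    # 2a. The schema location takes precedence over the catalog location.
    if schema_location:
        if PLACEHOLDER.search(schema_location):
            return fill_placeholders(schema_location, props)
        return f"{schema_location.rstrip('/')}/{fileset_name}"

    # 2b. Fall back to the catalog location.
    if catalog_location:
        if PLACEHOLDER.search(catalog_location):
            return fill_placeholders(catalog_location, props)
        return f"{catalog_location.rstrip('/')}/{schema_name}/{fileset_name}"

    # 2c. Nothing to derive the location from.
    raise ValueError("storageLocation must be provided during fileset creation")
```

For example, `resolve_managed_location(None, None, "file:///data", "sales", "orders", {})` yields `file:///data/sales/orders`, while the same call with a schema location of `file:///warehouse` yields `file:///warehouse/orders`, which matches the schema-over-catalog precedence in the wording above.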



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.