anjiahao1 opened a new pull request, #17882:
URL: https://github.com/apache/nuttx/pull/17882

   # ArchiveFS
   
   ArchiveFS is a read-only filesystem driver for the NuttX RTOS that mounts
   archive files (ZIP, 7ZIP, TAR, etc.) as virtual filesystems. It uses the
   libarchive library to parse the archive formats and provides transparent
   access to the files inside them.
   
   ## Table of Contents
   
   - [Features](#features)
   - [Benefits](#benefits)
   - [Configuration](#configuration)
   - [Dependencies](#dependencies)
   - [Usage](#usage)
   - [Supported Operations](#supported-operations)
   - [Technical Implementation](#technical-implementation)
   - [Example Use Cases](#example-use-cases)
   - [Troubleshooting](#troubleshooting)
   - [References](#references)
   
   ## Features
   
   - **Read-only access**: Read files directly from archives without
   extracting them to disk
   - **Multiple format support**: Configurable support for various archive 
formats including:
     - ZIP
     - 7ZIP
     - TAR
     - CPIO
     - AR
     - CAB
     - ISO9660
     - LHA
     - MTREE
     - RAR (versions 4 and 5)
     - RAW
     - WARC
     - XAR
     - EMPTY
   - **Directory traversal**: List and navigate archive contents as directories
   - **File operations**: Open, read, seek, and stat files within archives
   - **Memory efficient**: Uses streaming access without full extraction
   - **Thread safe**: Uses a mutex to protect concurrent access
   - **Seek optimization**: Supports forward seeking with efficient skipping
   
   ## Benefits
   
   1. **Space efficient**: No need to extract archives to disk
   2. **Memory efficient**: Streaming access reduces RAM usage
   3. **Flexible**: Supports 15 individually selectable archive formats
   4. **Easy to use**: Standard POSIX-like file operations
   5. **Configurable**: Enable only needed formats to save binary size
   
   ## Configuration
   
   The ArchiveFS filesystem is configured via Kconfig options:
   
   ### Core Configuration
   
   ```bash
   CONFIG_FS_ARCHIVEFS=y
   ```
   
   This enables the ArchiveFS filesystem support.
   
   ### Buffer Size
   
   ```bash
   CONFIG_FS_ARCHIVEFS_BUFFER_SIZE=32768
   ```
   
   Configures the buffer size for reading archive data. The default is 32768
   bytes (32 KB). libarchive uses this buffer when reading data from the
   archive file.
   
   ### Format Support
   
   #### Enable All Formats
   
   To enable support for all archive formats (note: this will increase binary 
size):
   
   ```bash
   CONFIG_FS_ARCHIVEFS_FORMAT_ALL=y
   ```
   
   #### Enable Individual Formats
   
   When `CONFIG_FS_ARCHIVEFS_FORMAT_ALL` is not set, you can enable specific 
formats individually. ZIP format is enabled by default.
   
   ```bash
   # ZIP (default enabled)
   CONFIG_FS_ARCHIVEFS_FORMAT_ZIP=y
   
   # Other formats (disabled by default)
   CONFIG_FS_ARCHIVEFS_FORMAT_7ZIP=n
   CONFIG_FS_ARCHIVEFS_FORMAT_AR=n
   CONFIG_FS_ARCHIVEFS_FORMAT_CAB=n
   CONFIG_FS_ARCHIVEFS_FORMAT_CPIO=n
   CONFIG_FS_ARCHIVEFS_FORMAT_EMPTY=n
   CONFIG_FS_ARCHIVEFS_FORMAT_ISO9660=n
   CONFIG_FS_ARCHIVEFS_FORMAT_LHA=n
   CONFIG_FS_ARCHIVEFS_FORMAT_MTREE=n
   CONFIG_FS_ARCHIVEFS_FORMAT_RAR=n
   CONFIG_FS_ARCHIVEFS_FORMAT_RAR_V5=n
   CONFIG_FS_ARCHIVEFS_FORMAT_RAW=n
   CONFIG_FS_ARCHIVEFS_FORMAT_TAR=n
   CONFIG_FS_ARCHIVEFS_FORMAT_WARC=n
   CONFIG_FS_ARCHIVEFS_FORMAT_XAR=n
   ```
   
   **Note**: Enabling `CONFIG_FS_ARCHIVEFS_FORMAT_ALL` will include all format 
support and significantly increase the binary size. For production builds, it's 
recommended to enable only the formats you actually need.
   
   ## Dependencies
   
   ArchiveFS requires the following dependencies:
   
   ### Required
   
   - **libarchive**: Must be enabled via `CONFIG_UTILS_LIBARCHIVE=y`
   
   ### Optional Compression Libraries
   
   Depending on the archive formats used, you may need additional compression 
libraries:
   
   - **XZ compression**: `CONFIG_UTILS_XZ=y` (for archives using LZMA/XZ 
compression)
   - **Zlib**: `CONFIG_LIB_ZLIB=y` (for archives using DEFLATE compression)
   
   ### Example Configuration
   
   A typical configuration for ZIP archive support with compression:
   
   ```bash
   CONFIG_FS_ARCHIVEFS=y
   CONFIG_FS_ARCHIVEFS_FORMAT_ZIP=y
   CONFIG_FS_ARCHIVEFS_BUFFER_SIZE=32768
   CONFIG_UTILS_LIBARCHIVE=y
   CONFIG_LIB_ZLIB=y
   ```
   
   ## Usage
   
   ### Basic Mounting
   
   Archive files can be mounted at any mount point in the NuttX filesystem. The 
archive file path is specified using the `-o` option:
   
   ```bash
   nsh> mount -t archivefs -o /path/to/archive.zip /mnt
   ```
   
   Where:
   - `-t archivefs`: Specifies the filesystem type as archivefs
   - `-o /path/to/archive.zip`: Path to the archive file to mount
   - `/mnt`: Mount point directory where the archive will be accessible
   
   ### Demo: Running ArchiveFS on QEMU
   
   This example demonstrates using ArchiveFS on the MPS3-AN547 board running on 
QEMU.
   
   #### 1. Prepare the Archive File
   
   On the host machine, create a test file and compress it into a ZIP archive:
   
   ```bash
   # Create a test file with random data
   dd if=/dev/urandom of=testfile.bin bs=1K count=10
   
   # Create a ZIP archive with LZMA compression
   7z a -tzip -mm=LZMA archive.zip testfile.bin
   ```
   
   #### 2. Configure and Build NuttX
   
   Configure the build for the ArchiveFS demo configuration:
   
   ```bash
   # Configure the build with CMake
   cmake -B build -DBOARD_CONFIG=boards/arm/mps/mps3-an547/configs/archivefs
   
   # Build the firmware
   cmake --build build
   ```
   
   #### 3. Start QEMU with Archive Loaded
   
   Start QEMU and load the archive file at a specific memory address. The 
MPS3-AN547 board has access to RAM starting at 0x60000000:
   
   ```bash
   # Start QEMU and load the archive file at memory address 0x60000000
   qemu-system-arm -M mps3-an547 -m 2G -nographic \
     -kernel build/nuttx.bin \
     -gdb tcp::1127 \
     -device loader,file=archive.zip,addr=0x60000000
   ```
   
   #### 4. Use ArchiveFS in NuttX Shell
   
   Once QEMU is running and NuttX has booted, you can mount and use the archive:
   
   ```bash
   # Mount the archive as a filesystem
   nsh> mount -t archivefs -o /dev/ram1 /archivefs
   
   # List contents of the archive
   nsh> ls -l /archivefs/
   -rw-r--r--        0     0     10240 archivefs  testfile.bin
   
   # Read a file from the archive
   nsh> cat /archivefs/testfile.bin
   
   # Copy a file from archive to another filesystem
   nsh> mount -t tmpfs /tmp
   nsh> cp /archivefs/testfile.bin /tmp/testfile.bin
   
   # Verify the copied file
   nsh> ls -l /tmp/testfile.bin
   -rw-r--r--        0     0     10240 tmp        testfile.bin
   
   # Unmount the archive when done
   nsh> umount /archivefs
   ```
   
   ### Working with Multiple Archives
   
   You can mount multiple archives at different mount points:
   
   ```bash
   nsh> mount -t archivefs -o /path/to/archives/data.zip /data
   nsh> mount -t archivefs -o /path/to/archives/config.zip /config
   nsh> mount -t archivefs -o /path/to/archives/resources.zip /resources
   
   # List all mounts
   nsh> mount
   /data  archivefs  /path/to/archives/data.zip
   /config archivefs  /path/to/archives/config.zip
   /resources archivefs /path/to/archives/resources.zip
   ```
   
   ### Working with Directory Structures
   
   Archives containing directories can be navigated just like normal 
filesystems:
   
   ```bash
   nsh> ls -R /archivefs
   /archivefs:
   dir1/
   dir2/
   file1.txt
   
   /archivefs/dir1:
   subdir1/
   file2.txt
   
   /archivefs/dir1/subdir1:
   file3.txt
   
   /archivefs/dir2:
   file4.txt
   
   # Access files in subdirectories
   nsh> cat /archivefs/dir1/subdir1/file3.txt
   ```
   
   ## Supported Operations
   
   ArchiveFS implements the following VFS operations:
   
   ### File Operations
   
   #### `open()`
   Opens a file within the archive.
   
   ```c
   int fd = open("/archivefs/path/to/file.txt", O_RDONLY);
   ```
   
   - **Parameters**:
     - `O_RDONLY`: Read-only access (required, ArchiveFS is read-only)
     - Other flags are ignored
   - **Returns**: File descriptor on success; `-1` with `errno` set on failure
   
   #### `read()`
   Reads data from a file in the archive.
   
   ```c
   ssize_t bytes_read = read(fd, buffer, sizeof(buffer));
   ```
   
   - **Behavior**:
     - Reads decompressed data directly from the archive
     - Supports streaming access without full file extraction
     - Thread-safe with mutex protection
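
Because `read()` may return fewer bytes than requested, callers usually loop until the buffer is full or end-of-file is reached. The following is a minimal sketch of that pattern using only POSIX calls; the helper name `read_full` is hypothetical and not part of ArchiveFS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes, looping over short reads.
 * Returns the number of bytes read (less than len only at end-of-file),
 * or -1 with errno set if read() fails. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t total = 0;

    assert(buf != NULL);

    while (total < len) {
        ssize_t n = read(fd, p + total, len - total);
        if (n < 0) {
            if (errno == EINTR) {
                continue;           /* interrupted by a signal: retry */
            }
            return -1;
        }
        if (n == 0) {
            break;                  /* end of file */
        }
        total += (size_t)n;
    }

    return (ssize_t)total;
}
```

A caller would pass a descriptor obtained from `open()` on a path under the mount point, for example `read_full(fd, buffer, sizeof(buffer))`.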
   
   #### `seek()`
   Seeks within a file in the archive. Applications reach this operation
   through `lseek()`.
   
   ```c
   off_t offset = lseek(fd, 0, SEEK_SET);   // Seek to beginning
   offset = lseek(fd, 100, SEEK_CUR);      // Seek 100 bytes forward
   offset = lseek(fd, 0, SEEK_END);        // Seek to end
   ```
   
   - **Optimization**:
     - Forward seeking uses efficient skipping with a seek buffer
     - Backward seeking requires reopening the archive entry (more expensive)
   - **Thread-safe**: All seek operations are protected by mutex
   
   #### `stat()` / `fstat()`
   Gets file metadata (size, permissions, timestamps).
   
   ```c
   struct stat st;
   int ret = stat("/archivefs/file.txt", &st);
   ```
   
   - **Returns**:
     - File size (`st_size`)
     - File permissions (`st_mode`)
     - Modification time (`st_mtime`)
     - Other standard stat fields from the archive entry
   
   #### `dup()`
   Duplicates a file descriptor.
   
   ```c
   int new_fd = dup(fd);
   ```
   
   - **Behavior**: Creates a new handle to the same archive entry
   
   #### `close()`
   Closes an open file and frees resources.
   
   ```c
   close(fd);
   ```
   
   ### Directory Operations
   
   #### `opendir()` / `readdir()` / `closedir()`
   Lists directory contents.
   
   ```c
   DIR *dir = opendir("/archivefs");
   if (dir != NULL) {
       struct dirent *entry;
       while ((entry = readdir(dir)) != NULL) {
           printf("File: %s\n", entry->d_name);
       }
       closedir(dir);
   }
   ```
   
   - **Behavior**:
     - Lists all files and directories in the archive
     - Full paths are returned (not just filenames)
     - Directory traversal works recursively
   
   #### `rewinddir()`
   Rewinds directory listing to the beginning.
   
   ```c
   rewinddir(dir);
   ```
   
   ### Filesystem Operations
   
   #### `statfs()`
   Gets filesystem statistics.
   
   ```c
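   struct statfs fs;
   if (statfs("/archivefs", &fs) == 0) {
       /* Cast so the format specifier matches regardless of f_type's width */
       printf("Filesystem type: 0x%lx\n", (unsigned long)fs.f_type);
   }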
   ```
   
   - **Returns**:
     - `f_type`: ArchiveFS magic number
     - `f_namelen`: Maximum filename length
   
   #### `mount()` / `umount()`
   Mounts and unmounts archive filesystems.
   
   ```bash
   nsh> mount -t archivefs -o /path/to/archive.zip /mnt
   nsh> umount /mnt
   ```
   
   ## Technical Implementation
   
   ### Architecture
   
   ArchiveFS is built as a NuttX filesystem driver that integrates with the VFS 
(Virtual File System) layer. It uses the libarchive library to parse archive 
formats and provide access to archived files.
   
   #### Key Components
   
   1. **VFS Interface** (`g_archivefs_operations`)
      - Implements all required VFS operations
      - Provides POSIX-compatible file access
   
   2. **Private Data Structures**
      - `archivefs_priv_s`: Per-file private data
        - libarchive handle (`struct archive`)
        - Archive entry (`struct archive_entry`)
        - File handle to archive file
        - Read buffer
        - Seek buffer (allocated on-demand)
        - Mutex for thread safety
   
   3. **Callback Functions**
      - `archivefs_read_cb`: Reads data from the archive file
      - `archivefs_seek_cb`: Seeks within the archive file
      - `archivefs_close_cb`: Closes the archive file
   
   4. **Thread Safety**
      - All operations protected by mutex (`nxmutex_lock`/`nxmutex_unlock`)
      - Allows concurrent access from multiple threads
   
   #### Memory Management
   
   - **Heap Usage**: Uses `fs_heap` for allocations (filesystem-specific heap)
   - **Buffer Size**: Configurable via `CONFIG_FS_ARCHIVEFS_BUFFER_SIZE`
   - **Seek Buffer**: Allocated on-demand when seek operations are performed
   - **Efficient Cleanup**: All resources freed in `archivefs_free()`
   
   #### Seek Optimization
   
   ArchiveFS implements optimized seeking:
   
   1. **Forward Seek**: Efficiently skips data using a seek buffer
      - Reads data in chunks into the seek buffer
      - Discards data until reaching the target offset
      - No need to reopen the archive
   
   2. **Backward Seek**: Requires reopening the archive entry
      - Creates a new archive handle
      - Seeks to the target file
      - More expensive than forward seeking
   
   3. **Position Caching**: File position tracked in `filep->f_pos`
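
The forward-seek strategy described above can be sketched as a read-and-discard loop. This is an illustration only, not the driver's code: a plain file descriptor and `read()` stand in for the decompressed stream that the driver obtains through libarchive, and the names `skip_forward` and `SKIP_CHUNK` are hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define SKIP_CHUNK 512  /* illustrative scratch-buffer size */

/* Advance a sequential stream by count bytes by reading into a scratch
 * buffer and discarding the data. Returns the number of bytes actually
 * skipped (short only if the stream ends first), or -1 on a read error. */
off_t skip_forward(int fd, off_t count)
{
    unsigned char scratch[SKIP_CHUNK];
    off_t skipped = 0;

    while (skipped < count) {
        off_t remaining = count - skipped;
        size_t want = remaining < (off_t)sizeof(scratch)
                          ? (size_t)remaining
                          : sizeof(scratch);
        ssize_t n = read(fd, scratch, want);

        if (n < 0) {
            return -1;
        }
        if (n == 0) {
            break;                  /* stream ended before the target */
        }
        skipped += n;
    }

    return skipped;
}
```

The cost of this approach grows with the distance skipped, since every skipped byte is still produced by the stream, but it avoids tearing down and reopening the archive handle.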
   
   ### Format Support
   
   ArchiveFS supports the following archive formats through libarchive:
   
   | Format | Kconfig Option | Default | Description |
   |--------|---------------|---------|-------------|
   | ZIP | `CONFIG_FS_ARCHIVEFS_FORMAT_ZIP` | y | ZIP archives (most common) |
   | 7ZIP | `CONFIG_FS_ARCHIVEFS_FORMAT_7ZIP` | n | 7-Zip archives |
   | TAR | `CONFIG_FS_ARCHIVEFS_FORMAT_TAR` | n | TAR archives (uncompressed) |
   | CPIO | `CONFIG_FS_ARCHIVEFS_FORMAT_CPIO` | n | CPIO archives |
   | AR | `CONFIG_FS_ARCHIVEFS_FORMAT_AR` | n | Unix AR archives |
   | CAB | `CONFIG_FS_ARCHIVEFS_FORMAT_CAB` | n | Microsoft CAB archives |
   | ISO9660 | `CONFIG_FS_ARCHIVEFS_FORMAT_ISO9660` | n | ISO 9660 CD-ROM images |
   | LHA | `CONFIG_FS_ARCHIVEFS_FORMAT_LHA` | n | LHA/LZH archives |
   | MTREE | `CONFIG_FS_ARCHIVEFS_FORMAT_MTREE` | n | BSD mtree format |
   | RAR | `CONFIG_FS_ARCHIVEFS_FORMAT_RAR` | n | RAR archives (version 4) |
   | RAR_V5 | `CONFIG_FS_ARCHIVEFS_FORMAT_RAR_V5` | n | RAR archives (version 5) |
   | RAW | `CONFIG_FS_ARCHIVEFS_FORMAT_RAW` | n | Raw file data |
   | WARC | `CONFIG_FS_ARCHIVEFS_FORMAT_WARC` | n | Web ARChive format |
   | XAR | `CONFIG_FS_ARCHIVEFS_FORMAT_XAR` | n | Extensible Archive Format |
   | EMPTY | `CONFIG_FS_ARCHIVEFS_FORMAT_EMPTY` | n | Empty archives |
   
   **Note**: Each format adds to the binary size. Only enable formats you 
actually need.
   
   ### Error Handling
   
   ArchiveFS converts libarchive error codes to standard errno values:
   
   | libarchive Code | errno | Description |
   |-----------------|-------|-------------|
   | `ARCHIVE_RETRY` | `EAGAIN` | Operation should be retried |
   | `ARCHIVE_WARN` | `ENOEXEC` | Warning (non-fatal) |
   | `ARCHIVE_FAILED` | `EINVAL` | Operation failed |
   | `ARCHIVE_FATAL` | `EPERM` | Fatal error |
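
The mapping in the table can be sketched as a small translation function. The status values below mirror those in libarchive's `archive.h` and are repeated only so the sketch compiles on its own. The function name, the negated-errno return convention, and the handling of unlisted codes are assumptions, not taken from the driver.

```c
#include <errno.h>

/* Status codes as defined in libarchive's archive.h, repeated here so the
 * sketch builds without the library headers. */
#define ARCHIVE_OK        0
#define ARCHIVE_RETRY   (-10)
#define ARCHIVE_WARN    (-20)
#define ARCHIVE_FAILED  (-25)
#define ARCHIVE_FATAL   (-30)

/* Hypothetical helper: translate a libarchive status into a negated errno
 * value following the table above. */
int archive_status_to_errno(int status)
{
    switch (status) {
    case ARCHIVE_OK:
        return 0;
    case ARCHIVE_RETRY:
        return -EAGAIN;
    case ARCHIVE_WARN:
        return -ENOEXEC;
    case ARCHIVE_FAILED:
        return -EINVAL;
    case ARCHIVE_FATAL:
        return -EPERM;
    default:
        return -EINVAL;             /* assumption: treat unknown codes as invalid */
    }
}
```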
   
   ## Example Use Cases
   
   ### OTA (Over-The-Air) Updates
   
   Deliver OTA updates as compressed archives:
   
   ```bash
   # Download OTA update package
   nsh> wget -o /tmp/ota_update.zip https://example.com/updates/latest.zip
   
   # Mount and verify
   nsh> mount -t archivefs -o /tmp/ota_update.zip /ota
   
   # Check version compatibility
   nsh> cat /ota/version.txt
   2.1.0
   
   # Apply update if compatible
   nsh> cp /ota/firmware.bin /dev/flash0
   ```
   
   **Benefits**:
   - Smaller download size (compression)
   - Bandwidth savings
   - No temporary extraction storage needed
   - Atomic update verification
   
   ## Troubleshooting
   
   ### Common Issues
   
   #### Mount Fails with "No such device"
   
   **Cause**: Archive file not found or inaccessible
   
   **Solution**:
   ```bash
   # Check if archive file exists
   nsh> ls -l /path/to/archive.zip
   
   # Verify file is readable
   nsh> cat /path/to/archive.zip > /dev/null
   ```
   
   #### Cannot Open Files in Archive
   
   **Cause**: File format not supported
   
   **Solution**:
   - Verify archive format is enabled in configuration
   - Check `CONFIG_FS_ARCHIVEFS_FORMAT_*` options
   - Try enabling `CONFIG_FS_ARCHIVEFS_FORMAT_ALL` for testing
   
   #### Out of Memory Errors
   
   **Cause**: Buffer size too large or insufficient RAM
   
   **Solution**:
   ```bash
   # Reduce the buffer size in the configuration (16 KB instead of 32 KB)
   CONFIG_FS_ARCHIVEFS_BUFFER_SIZE=16384
   ```
   
   #### Slow Performance
   
   **Cause**: Many backward seeks or small buffer size
   
   **Solution**:
   - Increase buffer size if RAM allows
   - Minimize backward seeks in application code
   - Consider caching frequently accessed files
   
   ### Debugging
   
   Enable debug options for troubleshooting:
   
   ```bash
   CONFIG_DEBUG_FS=y
   CONFIG_DEBUG_FS_ERROR=y
   CONFIG_DEBUG_FEATURES=y
   ```
   
   Debug messages will be logged to the console showing:
   - Archive open/close operations
   - File read operations
   - Seek operations
   - Error conditions
   
   ## References
   
   - **libarchive**: https://libarchive.org/
   - **NuttX Filesystems**: 
https://nuttx.apache.org/docs/latest/components/filesystem/index.html
   - **VFS Interface**: 
https://nuttx.apache.org/docs/latest/components/filesystem/index.html#vfs-interface

