ppkarwasz opened a new pull request, #428:
URL: https://github.com/apache/commons-codec/pull/428

   This PR adds a `GitIdentifiers` utility class to support computing SWHID 
identifiers for a wider range of sources.
   
   The `DigestUtils.gitTree` method introduced in #427 was limited to 
directories on the local filesystem. `GitIdentifiers`
   replaces and extends it to also handle virtual directory structures such as 
archive contents.
   
   ## New API
   
   **`blobId`**: computes a Git/SWHID blob identifier. Four overloads are 
provided:
   
   | Overload                                   | Notes                                          |
   |--------------------------------------------|------------------------------------------------|
   | `blobId(MessageDigest, byte[])`            | Content already in memory                      |
   | `blobId(MessageDigest, InputStream)`       | Content buffered from stream                   |
   | `blobId(MessageDigest, long, InputStream)` | Size known; streams directly without buffering |
   | `blobId(MessageDigest, Path)`              | Regular file or symlink                        |
   
   **`treeId(MessageDigest, Path)`**: computes a Git/SWHID tree identifier for 
a directory on the filesystem.
   
   **`treeIdBuilder(MessageDigest)`**: returns a `TreeIdBuilder` for 
constructing a tree identifier from any source. The
   builder accumulates entries via:
   
   - `addFile(FileMode, String, …)`: three overloads matching those of `blobId` 
above; paths containing `/` automatically
     create intermediate subdirectories.
   - `addDirectory(String)`: creates a subdirectory node and returns its 
builder; accepts multi-level paths.
   - `build()`: computes and returns the tree identifier.
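
To make the builder semantics above concrete, the following JDK-only sketch shows one way such a builder could work. `TreeSketch` and its method signatures are hypothetical stand-ins, not the PR's `TreeIdBuilder`; in particular, this version takes precomputed blob identifiers and Git mode strings instead of a `FileMode` and content. It assumes the standard Git tree format: entries of the form `<mode> <name>\0<raw id>`, sorted by name bytes with directory names compared as if they ended in `/`, all prefixed by `tree <size>\0`. `Arrays.compareUnsigned` requires Java 9 or later.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only.
class TreeSketch {
    private final Map<String, TreeSketch> directories = new HashMap<>();
    private final Map<String, String> fileModes = new HashMap<>();
    private final Map<String, byte[]> fileIds = new HashMap<>();

    /** Returns the node for a possibly multi-level path, creating nodes as needed. */
    TreeSketch addDirectory(String path) {
        TreeSketch node = this;
        for (String segment : path.split("/")) {
            if (!segment.isEmpty()) {
                node = node.directories.computeIfAbsent(segment, k -> new TreeSketch());
            }
        }
        return node;
    }

    /** Registers a file by Git mode string (e.g. "100644") and precomputed raw blob id. */
    void addFile(String mode, String path, byte[] blobId) {
        int slash = path.lastIndexOf('/');
        TreeSketch parent = slash < 0 ? this : addDirectory(path.substring(0, slash));
        String name = path.substring(slash + 1);
        parent.fileModes.put(name, mode);
        parent.fileIds.put(name, blobId);
    }

    /** Computes the raw tree id, building subdirectories recursively first. */
    byte[] build(MessageDigest digest) {
        // Each row is {sort key, "<mode> <name>" prefix, raw object id}.
        List<byte[][]> rows = new ArrayList<>();
        for (Map.Entry<String, TreeSketch> d : directories.entrySet()) {
            rows.add(row(d.getKey() + "/", "40000 " + d.getKey(), d.getValue().build(digest)));
        }
        for (Map.Entry<String, byte[]> f : fileIds.entrySet()) {
            rows.add(row(f.getKey(), fileModes.get(f.getKey()) + " " + f.getKey(), f.getValue()));
        }
        rows.sort((a, b) -> Arrays.compareUnsigned(a[0], b[0]));
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (byte[][] r : rows) {
            body.write(r[1], 0, r[1].length);
            body.write(0); // NUL separator between the name and the raw id
            body.write(r[2], 0, r[2].length);
        }
        digest.reset();
        digest.update(("tree " + body.size() + "\0").getBytes(StandardCharsets.US_ASCII));
        return digest.digest(body.toByteArray());
    }

    private static byte[][] row(String sortKey, String prefix, byte[] id) {
        return new byte[][] {
            sortKey.getBytes(StandardCharsets.UTF_8), prefix.getBytes(StandardCharsets.UTF_8), id
        };
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        // Prints 4b825dc642cb6eb9a060e54bf8d69288fbee4904, Git's well-known empty tree id.
        System.out.println(hex(new TreeSketch().build(sha1)));
    }
}
```

In this sketch, `addFile("100644", "src/a.txt", id)` and `addDirectory("src").addFile("100644", "a.txt", id)` produce the same identifier, which mirrors the automatic creation of intermediate subdirectories described above.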
   
   **`FileMode`**: enum with values `REGULAR`, `EXECUTABLE`, `SYMBOLIC_LINK`, 
and `DIRECTORY`.
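
As a rough guide to what these four values stand for, the sketch below pairs each one with the mode string Git writes into tree entries. The enum `FileModeSketch` and its `gitMode()` and `ofRegularFile(int)` members are hypothetical; the PR's `FileMode` may expose this information differently.

```java
// Hypothetical illustration; the accessor names are assumptions, not the PR's API.
enum FileModeSketch {
    REGULAR("100644"),
    EXECUTABLE("100755"),
    SYMBOLIC_LINK("120000"),
    DIRECTORY("40000");

    private final String gitMode;

    FileModeSketch(String gitMode) {
        this.gitMode = gitMode;
    }

    /** Mode string as it appears in a Git tree entry. */
    String gitMode() {
        return gitMode;
    }

    /** Picks EXECUTABLE when any execute bit is set in the POSIX permission bits. */
    static FileModeSketch ofRegularFile(int posixMode) {
        return (posixMode & 0111) != 0 ? EXECUTABLE : REGULAR;
    }

    public static void main(String[] args) {
        for (FileModeSketch mode : values()) {
            System.out.println(mode + " -> " + mode.gitMode());
        }
    }
}
```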
   
   ### Example
   
   ```java
   TreeIdBuilder builder = GitIdentifiers.treeIdBuilder(DigestUtils.getSha1Digest());
   try (TarArchiveInputStream tar = new TarArchiveInputStream(new GzipCompressorInputStream(input))) {
       TarArchiveEntry entry;
       while ((entry = tar.getNextTarEntry()) != null) {
           String name = entry.getName();
           if (name.isEmpty()) {
               continue; // root directory entry
           }
           if (entry.isDirectory()) {
               builder.addDirectory(name);
           } else if (entry.isSymbolicLink()) {
               builder.addFile(
                       FileMode.SYMBOLIC_LINK, name, entry.getLinkName().getBytes(StandardCharsets.UTF_8));
           } else {
               FileMode mode = (entry.getMode() & 0111) != 0 ? FileMode.EXECUTABLE : FileMode.REGULAR;
               // Pass the size so the content is streamed directly without buffering.
               builder.addFile(mode, name, entry.getSize(), tar);
           }
       }
   }
   ```
   

