[jira] [Commented] (HADOOP-1593) FsShell should work with paths in non-default FileSystem

ASF GitHub Bot (Jira) Fri, 10 Apr 2026 21:43:29 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18072787#comment-18072787
 ]


ASF GitHub Bot commented on HADOOP-1593:
----------------------------------------

manika137 commented on code in PR #8400:
URL: https://github.com/apache/hadoop/pull/8400#discussion_r3067601972


##########
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/ReadBufferManagerV1.java:
##########
@@ -146,6 +168,124 @@ public void queueReadAhead(final AbfsInputStream stream, final long requestedOff
     }
   }
 
+  /**
+   * Queue a vectored read for a buffer-sized physical read unit.
+   *
+   * <p>The method first attempts to attach the logical unit to an already
+   * in-progress physical read for the same file and offset. If that is not
+   * possible, a free read buffer is acquired and a new backend read is
+   * queued.</p>
+   *
+   * @param stream         input stream for the file being read
+   * @param unit           buffer-sized combined file range to be read
+   * @param tracingContext tracing context used for the backend read request
+   * @param allocator      allocator used to create buffers for vectored fan-out
+   * @return {@code true} if the read was queued or attached to an existing
+   *         in-progress buffer; {@code false} if no buffer was available
+   */
+  boolean queueVectoredRead(AbfsInputStream stream,
+      CombinedFileRange unit,
+      TracingContext tracingContext,
+      IntFunction<ByteBuffer> allocator) {
+    /* Create a child tracing context for vectored read-ahead requests */
+    TracingContext readAheadTracingContext =
+        new TracingContext(tracingContext);
+    readAheadTracingContext.setPrimaryRequestID();
+    readAheadTracingContext.setReadType(ReadType.VECTORED_READ);
+
+    synchronized (this) {
+      if (isAlreadyQueued(stream, unit.getOffset())) {
+        ReadBuffer existing = findQueuedBuffer(stream, unit.getOffset());
+        if (existing != null && existing.getStream().getETag() != null
+            && stream.getETag().equals(existing.getStream().getETag())) {
+          /*
+           * For AVAILABLE buffers use actual bytes read (getLength()) for
+           * coverage check. For READING_IN_PROGRESS buffers use
+           * requestedLength as an estimate — the short-read guard will be
+           * applied later in doneReading before dispatching completion.
+           */
+          long end = existing.getOffset() + (
+              existing.getStatus() == ReadBufferStatus.AVAILABLE
+                  ? existing.getLength()
+                  : existing.getRequestedLength());
+          if (end >= unit.getOffset() + unit.getLength()) {
+            existing.setBufferType(BufferType.VECTORED);
+            existing.addVectoredUnit(unit);
+            existing.setAllocator(allocator);
+            if (existing.getStatus() == ReadBufferStatus.AVAILABLE) {
+              /*
+               * Buffer is already AVAILABLE. Trigger completion immediately.
+               * Use getLength() (actual bytes) for coverage — redundant here
+               * since the outer check already used getLength() for AVAILABLE,
+               * but kept explicit for clarity.
+               */
+              LOGGER.debug("Hitchhiking onto AVAILABLE buffer {}, length {}",
+                  existing, existing.getLength());
+              handleVectoredCompletion(existing,
+                  existing.getStatus(),
+                  existing.getLength());
+            }
+            /*
+             * For READING_IN_PROGRESS: unit is attached and will be
+             * completed in doneReading once actual bytes are known.
+             * Short-read safety is enforced there via per-unit coverage check.
+             */
+            return true;

Review Comment:
   This is not only the READING_IN_PROGRESS case, right? If the buffer is 
UNAVAILABLE (queued but not yet picked up by any thread), it would also land 
here.
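
   To make the point concrete, here is a minimal standalone sketch of the
branch in question. It is not the PR's code: the `Status` enum and both helper
methods are hypothetical stand-ins, and the constant names follow the wording
of this comment rather than the actual `ReadBufferStatus` enum. It only shows
that an `if (status == AVAILABLE)` guard leaves every other status on the
deferred path.

```java
import java.util.EnumSet;
import java.util.Set;

final class HitchhikeBranchSketch {
  /** Hypothetical stand-in for the buffer status enum; names are assumptions. */
  enum Status { UNAVAILABLE, READING_IN_PROGRESS, AVAILABLE }

  /** Mirrors the guard in the diff: only AVAILABLE triggers completion now. */
  static boolean completesImmediately(Status status) {
    return status == Status.AVAILABLE;
  }

  /** Every status that falls through the guard and relies on later completion. */
  static Set<Status> deferredStatuses() {
    EnumSet<Status> deferred = EnumSet.noneOf(Status.class);
    for (Status s : Status.values()) {
      if (!completesImmediately(s)) {
        deferred.add(s);
      }
    }
    return deferred;
  }

  public static void main(String[] args) {
    // Lists both the queued-but-not-started and the in-progress status.
    System.out.println(deferredStatuses());
  }
}
```

   If that reading is right, the code comment could name both statuses instead
of READING_IN_PROGRESS alone.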





> FsShell should work with paths in non-default FileSystem
> --------------------------------------------------------
>
>                 Key: HADOOP-1593
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1593
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Doug Cutting
>            Assignee: Mahadev Konar
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.17.0
>
>         Attachments: patch_1593.patch, patch_1593_1.patch
>
>
> If the default filesystem is, e.g., hdfs://foo:8888/, one should still be 
> able to do 'bin/hadoop fs -ls hdfs://bar:9999/' or 'bin/hadoop fs -ls 
> s3://cutting/foo'.  Currently these generate a filesystem mismatch exception. 
>  This is because FsShell assumes that all paths are in the default 
> FileSystem.  Rather, the default filesystem should only be used for paths 
> that do not specify a FileSystem.  This would easily be accomplished by using 
> Path#getFileSystem().
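
The rule described in the issue (use the default filesystem only when a path
names none) can be sketched with plain java.net.URI. This is an illustration
under stated assumptions, not FsShell's code and not the attached patch: the
class and the `fileSystemFor` helper are hypothetical, and in Hadoop itself the
resolution would go through Path#getFileSystem() as the description suggests.

```java
import java.net.URI;

final class DefaultFsSketch {
  /**
   * Hypothetical helper: returns the filesystem URI a path belongs to.
   * A path with a scheme selects its own filesystem; a path without
   * one falls back to the configured default.
   */
  static URI fileSystemFor(String path, URI defaultFs) {
    URI uri = URI.create(path);
    if (uri.getScheme() == null) {
      return defaultFs;
    }
    String authority = uri.getAuthority() == null ? "" : uri.getAuthority();
    return URI.create(uri.getScheme() + "://" + authority + "/");
  }

  public static void main(String[] args) {
    URI defaultFs = URI.create("hdfs://foo:8888/");
    // Qualified paths keep their own filesystem.
    System.out.println(fileSystemFor("hdfs://bar:9999/", defaultFs));
    System.out.println(fileSystemFor("s3://cutting/foo", defaultFs));
    // An unqualified path resolves against the default.
    System.out.println(fileSystemFor("/user/doug", defaultFs));
  }
}
```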



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
