dhruv-pratap commented on PR #13903:
URL: https://github.com/apache/iceberg/pull/13903#issuecomment-3224905636

   > I think this is a great idea, but this is a relatively heavy approach to
   > implementation. I think we really don't need to do much more than swapping
   > the PartitionsTable implementation with the "ManifestEntries" table
   > implementation (+ a group and distinct?)
   > 
   > I also think this complexity is probably not needed
   > 
   > ```
   > New Planning Modes: Added LOCAL, DISTRIBUTED, and AUTO modes via METADATA_PLANNING_MODE table property
   > Auto-switching: Automatically uses distributed scanning when manifest count exceeds configurable threshold (default: 10)
   > Enhanced PartitionsTable: Implements DistributedPartitionsScan for parallel manifest processing
   > Comprehensive Testing: Added tests for core functionality and Spark integration (v3.5, v4.0)
   > Backward Compatibility: Existing behavior preserved with AUTO mode as default
   > ```
   > 
   > We can probably just swap out the implementation and be done with it.
   > Almost everyone I know who relies on the "partitions" table actually does a
   > "SELECT * FROM FILES/ENTRIES Group by Partition" anyway
   
   @RussellSpitzer I agree; that’s the same workaround we recommend to our users 
internally at Netflix. I initially went with a more flexible approach because 
I wasn’t sure how the community would feel about completely replacing the 
original implementation, and I wanted to preserve backward compatibility just 
in case.
   
   But if there’s consensus, I’m more than happy to fully swap out the 
implementation. That would definitely simplify the solution and likely 
eliminate the need for the parameterized test coverage introduced by the new 
planning modes.
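
   As a rough illustration of the "group entries by partition" workaround 
discussed above, here is a minimal sketch in plain Python. It is not code from 
this PR and does not use the Iceberg API: the `ENTRIES` rows, their field 
names, and the `partitions_from_entries` helper are all hypothetical stand-ins 
for rows of the entries metadata table. The only detail taken from Iceberg 
itself is the manifest entry status convention (0 = EXISTING, 1 = ADDED, 
2 = DELETED).

```python
from collections import defaultdict

# Hypothetical, simplified rows standing in for the "entries" metadata table.
# status follows the Iceberg manifest entry convention:
# 0 = EXISTING, 1 = ADDED, 2 = DELETED.
ENTRIES = [
    {"status": 1, "partition": ("2024-01-01",), "record_count": 100},
    {"status": 0, "partition": ("2024-01-01",), "record_count": 50},
    {"status": 2, "partition": ("2024-01-01",), "record_count": 70},
    {"status": 1, "partition": ("2024-01-02",), "record_count": 30},
]

DELETED = 2


def partitions_from_entries(entries):
    """Group live manifest entries by partition and aggregate per-partition stats.

    This mirrors the shape of a 'GROUP BY partition' query over entries:
    deleted entries are skipped, the rest are counted and summed.
    """
    summary = defaultdict(lambda: {"record_count": 0, "file_count": 0})
    for entry in entries:
        if entry["status"] == DELETED:
            continue  # deleted entries no longer contribute to the partition
        stats = summary[entry["partition"]]
        stats["record_count"] += entry["record_count"]
        stats["file_count"] += 1
    return dict(summary)


if __name__ == "__main__":
    for partition, stats in sorted(partitions_from_entries(ENTRIES).items()):
        print(partition, stats)
```

   In an actual swap, the grouping would run over the existing entries scan 
rather than an in-memory list, so the aggregation step is the only new logic.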


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

