andygrove opened a new issue, #43:
URL: https://github.com/apache/datafusion-java/issues/43

   ### Is your feature request related to a problem or challenge?
   
   DataFusion's DataFrame API offers eight set-operation methods — union,
   intersect, except, and their `*_by_name` / `*_distinct` variants — and
   none of them are reachable from Java today.
   
   ### Describe the solution you'd like
   
   Expose the following on `DataFrame`, each taking another `DataFrame`:
   
   - `union(DataFrame other)` — by-position, keeps duplicates
   - `unionDistinct(DataFrame other)` — by-position, deduplicated
   - `unionByName(DataFrame other)` — by-name, keeps duplicates
   - `unionByNameDistinct(DataFrame other)` — by-name, deduplicated
   - `intersect(DataFrame other)` — `INTERSECT ALL`
   - `intersectDistinct(DataFrame other)` — `INTERSECT`
   - `except(DataFrame other)` — `EXCEPT ALL`
   - `exceptDistinct(DataFrame other)` — `EXCEPT`
   
   Lifecycle question worth deciding up front: do these consume the
   right-hand DataFrame? DataFusion's Rust API takes `dataframe: DataFrame`
   (owned), so the Java side will need to either consume `other`'s native
   handle (and forbid further use, like `collect`) or clone the underlying
   `LogicalPlan` on the native side. Suggest cloning — simpler caller
   contract, and `LogicalPlan` clone is cheap.
   
   Tests in `DataFrameTransformationsTest` covering each variant against
   small fixtures.
   
   ### Describe alternatives you've considered
   
   `UNION` / `INTERSECT` / `EXCEPT` via SQL. Works but requires
   registering both sides as tables.
   
   ### Additional context
   
   All eight can share one JNI entry point per operation kind (union,
   intersect, except), with boolean flags for distinct and, for union,
   by-name. Could plausibly land as one PR.
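   
   A rough sketch of how the operation kind and flags could map onto the
   Rust method names listed at the top of this issue. `SetOpDispatch`,
   `Kind`, and `rustMethod` are hypothetical names used only for
   illustration; the real JNI signatures are still to be decided.
   
   ```java
   import java.util.Locale;
   
   /** Hypothetical dispatch helper; not part of datafusion-java. */
   final class SetOpDispatch {
   
       enum Kind { UNION, INTERSECT, EXCEPT }
   
       /** Returns the name of the Rust DataFrame method a given kind/flag combination selects. */
       static String rustMethod(Kind kind, boolean byName, boolean distinct) {
           // Only the union family has by-name variants in the list above.
           if (byName && kind != Kind.UNION) {
               throw new IllegalArgumentException("by-name is only defined for union");
           }
           String base = kind.name().toLowerCase(Locale.ROOT);
           return base + (byName ? "_by_name" : "") + (distinct ? "_distinct" : "");
       }
   }
   ```
   
   Under this assumption the eight public Java methods become one-line
   wrappers, for example `unionByNameDistinct(other)` would pass
   `(Kind.UNION, true, true)`.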


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

