nkronenfeld opened a new pull request, #36613:
URL: https://github.com/apache/spark/pull/36613

   ### What changes were proposed in this pull request?
   
   This PR simply adds typed select methods to Dataset up to the max Tuple size 
of 22.
   
   This has been bugging me for years, so I finally decided to get off my 
backside and do something about it :-).
   
   As noted in the JIRA issue, this is technically a breaking change. Indeed, 
I had to remove an old test that specifically checked that Spark didn't 
support typed select for tuples larger than 5.  However, code would break only 
if it explicitly relied on select returning a DataFrame instead of a Dataset 
when given more than 5 typed columns (the test I had to remove is one case 
where this happens).
   
   I've set the PR as WIP because I've been unable to run all tests so far - 
not due to the fix, but rather due to not having things set up correctly on my 
computer.  Still working on that.
   
   ### Why are the changes needed?
   Arbitrarily supporting only up to 5-tuples is weird and unpredictable.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. Calling select with more than 5 (and up to 22) columns, all of them 
typed, will now return a Dataset instead of a DataFrame.
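   To illustrate the change, here is a minimal sketch, not code from this PR. 
The object name `TypedSelectSketch`, the method `widen`, and the sample data 
are made up for the example. It relies only on the existing 
`select(cols: Column*)` overload and on `Dataset.as[U]`, so it should compile 
both with and without this change; only the static type of `six` differs.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical example object; names are illustrative, not part of the PR.
object TypedSelectSketch {

  def widen(spark: SparkSession): Seq[(Int, String, Int, String, Int, String)] = {
    import spark.implicits._

    val ds: Dataset[(Int, String)] = Seq((1, "a"), (2, "b")).toDS()

    // Typed columns over the tuple fields _1 and _2.
    val id = $"_1".as[Int]
    val name = $"_2".as[String]

    // Five typed columns: the typed overload already exists, so this is a Dataset.
    val five: Dataset[(Int, String, Int, String, Int)] =
      ds.select(id, name, id, name, id)

    // Six typed columns: without this PR the call resolves to select(cols: Column*)
    // and yields a DataFrame; with it, a Dataset of a 6-tuple.
    val six = ds.select(id, name, id, name, id, name)

    // .as[...] recovers the typed view in either case.
    six.as[(Int, String, Int, String, Int, String)].collect().toSeq
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("typed-select-sketch")
      .getOrCreate()
    try widen(spark).foreach(println)
    finally spark.stop()
  }
}
```

   The explicit `.as[...]` call is the workaround needed today for more than 
5 columns; with the new overloads it becomes redundant but remains harmless.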
   
   ### How was this patch tested?
   I've run all sql tests, and they all pass (though testing itself still fails 
on my machine, I think with a path-too-long error).
   I've added a test to make sure the typed select works on all sizes. Mostly 
this is a compile issue, not a run-time issue, but I checked values too, just 
to double-check that I didn't miss anything (which is a big potential problem 
with long tuples and copy-paste errors).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

