Github user vanzin commented on the issue:

    https://github.com/apache/spark/pull/17723
  
    > I'm saying avoid exposing Hadoop APIs. Wrap them around something if 
possible.
    
    If that's all you'd like to see, it's not hard. But at the same time, it 
doesn't solve a whole lot of problems; as I mentioned, both the implementation 
of the provider and the Spark code calling into the provider have to use Hadoop 
APIs, so they still have to agree at some point.
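
    To illustrate what that wrapping could look like (the names below are 
made up for the sake of argument, not what this patch actually defines): even 
with an opaque handle, any useful provider has to unwrap it back to Hadoop's 
`Credentials`, so both sides still share the Hadoop API underneath.

```scala
import org.apache.hadoop.security.Credentials

// Opaque handle so the public trait doesn't mention Hadoop types.
// Hypothetical sketch only; the interface in this patch passes Hadoop
// types directly.
final class SparkCredentials(val underlying: Credentials)

trait CredentialProvider {
  def serviceName: String

  // No Hadoop types in the signature, but an implementation still has
  // to reach into `underlying` to add delegation tokens.
  def obtainCredentials(creds: SparkCredentials): Unit
}
```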
    
    To give a more concrete example: let's say that we hide Hadoop types, 
and now some credential provider uses a UGI API that's only available in 
Hadoop 2.9, and someone tries to run it on a Spark built against Hadoop 2.6. 
Even though Spark hasn't exposed the Hadoop API, the dependency leaks, and 
things will still not work.
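
    To sketch how that failure shows up (again using the hypothetical types 
from above, and a made-up UGI method name, since I'm not pointing at a real 
2.9-only API here):

```scala
import org.apache.hadoop.security.UserGroupInformation

// Hypothetical provider compiled against Hadoop 2.9.
class MyServiceProvider extends CredentialProvider {
  override def serviceName: String = "my-service"

  override def obtainCredentials(creds: SparkCredentials): Unit = {
    val ugi = UserGroupInformation.getCurrentUser
    // Stand-in for a method that only exists in newer Hadoop releases
    // (not a real UGI method): it links when compiled against 2.9 but
    // throws NoSuchMethodError at runtime on Hadoop 2.6 jars.
    // ugi.someApiAddedInHadoop29()

    // The wrapper gets unwrapped anyway; delegation tokens would be
    // added to the underlying Hadoop Credentials object here.
    val hadoopCreds = creds.underlying
  }
}
```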
    
    > it looks like the best way to make progress is to generalize the 
interface. Are you OK with this?
    
    If masking the Hadoop types in the API makes people happy, sure, do 
that. I don't see much benefit, since the implementation will still be pretty 
coupled to Hadoop one way or another, but it's not a big departure from the 
current patch anyway.
    
    I just don't want this to turn into "let's solve the problem of how to 
support all the security frameworks out there", when we have no idea what 
other security frameworks we might even want to support.


