shashwatsai opened a new pull request, #7809:
URL: https://github.com/apache/seatunnel/pull/7809

   ### Purpose of this pull request
   The Hadoop source/sink fails with "Unable to find valid Kerberos Ticket". In 
HadoopFileSystemProxy, the UserGroupInformation object is tightly bound to the 
security context of the initiating thread. If a privileged action that requires 
the subject's security context runs without that context attached, it fails 
with the following error:
   
   ```
   Caused by: 
org.apache.hadoop.security.authentication.client.AuthenticationException: 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)
   ```
   
   We can attach the security context of the specified subject to the thread 
performing the action by using UserGroupInformation#doAs, which keeps the 
subject's security context attached for as long as the action is running. 
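
The thread-bound behaviour described above can be illustrated with a toy model. This is only a sketch: it does not use Hadoop and is not the code in this PR. `LoggedInUser`, `listFiles` and `runOnWorker` are hypothetical names, a plain `ThreadLocal` stands in for the real security context, and `LoggedInUser.doAs` plays the role that UserGroupInformation#doAs plays in Hadoop.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

public class ThreadBoundContextDemo {

    // Toy per-thread "security context"; an assumption for illustration, not Hadoop's implementation.
    private static final ThreadLocal<String> CURRENT_PRINCIPAL = new ThreadLocal<>();

    /** Hypothetical stand-in for a logged-in user (the role UserGroupInformation plays). */
    public static final class LoggedInUser {
        private final String principal;

        public LoggedInUser(String principal) {
            this.principal = principal;
        }

        /** Attaches this user's context to the calling thread only while the action runs. */
        public <T> T doAs(Supplier<T> action) {
            String previous = CURRENT_PRINCIPAL.get();
            CURRENT_PRINCIPAL.set(principal);
            try {
                return action.get();
            } finally {
                // Restore whatever the thread carried before the action started.
                if (previous == null) {
                    CURRENT_PRINCIPAL.remove();
                } else {
                    CURRENT_PRINCIPAL.set(previous);
                }
            }
        }
    }

    /** Simulated privileged call: succeeds only if the current thread carries a principal. */
    public static String listFiles() {
        String principal = CURRENT_PRINCIPAL.get();
        if (principal == null) {
            throw new IllegalStateException("No valid credentials provided");
        }
        return "listed as " + principal;
    }

    /** Runs the privileged call on a fresh worker thread, with or without the doAs wrapper. */
    public static String runOnWorker(LoggedInUser user, boolean wrapInDoAs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Supplier<String> task = () -> {
                try {
                    if (wrapInDoAs) {
                        return user.doAs(ThreadBoundContextDemo::listFiles);
                    }
                    return listFiles();
                } catch (IllegalStateException e) {
                    return "failed: " + e.getMessage();
                }
            };
            return CompletableFuture.supplyAsync(task, pool).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        LoggedInUser user = new LoggedInUser("service_user");

        // The calling thread holds the context, but the worker thread does not inherit it.
        System.out.println(user.doAs(() -> runOnWorker(user, false)));
        // prints "failed: No valid credentials provided"

        // Wrapping the action itself attaches the context on the thread that executes it.
        System.out.println(runOnWorker(user, true));
        // prints "listed as service_user"
    }
}
```

With Hadoop itself, the analogous step would be to wrap each file system call in `ugi.doAs(...)` with a `PrivilegedExceptionAction`, so that the context is attached on whichever thread ends up executing the call.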
   
   This issue can be reproduced as follows:
   - Set the system property **javax.security.auth.useSubjectCredsOnly** to _false_.
   - Run an HdfsFile source/sink job.
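
For the first step, the property can be set either with the JVM flag `-Djavax.security.auth.useSubjectCredsOnly=false` or programmatically before any Kerberos/GSS code runs. The snippet below is a minimal sketch of the programmatic route. The class and method names are hypothetical, and how the setting is passed to the SeaTunnel runtime depends on the deployment, which this PR does not describe.

```java
public class UseSubjectCredsOnlyDemo {

    // Standard JGSS system property; its default value is "true".
    static final String KEY = "javax.security.auth.useSubjectCredsOnly";

    /** Sets the property to "false" and returns the previous value (null if it was unset). */
    public static String disableSubjectCredsOnly() {
        return System.setProperty(KEY, "false");
    }

    public static void main(String[] args) {
        String previous = disableSubjectCredsOnly();
        System.out.println("previous=" + previous + ", current=" + System.getProperty(KEY));
    }
}
```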
   
   Example: 
   ```
   env {
   "job.mode"=BATCH
   "job.name"="SeaTunnel_Job"
   "savemode.execute.location"=CLUSTER
   }
   source {
   HdfsFile {
       path="/data/daily/aggregation/tblAssocProfileData_daily_180710.txt"
       "file_format_type"=CSV
       "field_delimiter"="|"
       parallelism=1,
       "use_kerberos"="true"
       "kerberos_principal"="[email protected]"
       "fs.defaultFS"="hdfs://clusterA"
       "hdfs_site_path"="/etc/hadoop/conf/hdfs-site.xml"
       "kerberos_keytab_path"="/home/user/service_user.keytab"
       "krb5_path"="/etc/krb5.conf"
       "core_site_path"="/etc/hadoop/conf/core-site.xml"
   }
   }
   sink {
   HdfsFile {
       path="/Projects/test/reports/sea_tunnel/single/"
       tmp_path="/Projects/test/reports/"
       "file_format_type"=CSV
       "row_delimiter"="\n"
       "field_delimiter"="|"
       "enable_header_write"="false"
       "use_kerberos"="true"
       "kerberos_principal"="[email protected]"
       "fs.defaultFS"="hdfs://clusterA"
       "hdfs_site_path"="/etc/hadoop/conf/hdfs-site.xml"
       "kerberos_keytab_path"="/home/root/service_user.keytab"
       "krb5_path"="/etc/krb5.conf"
       "core_site_path"="/etc/hadoop/conf/core-site.xml"
   }
   }
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   
   
   
   ### Check list
   
   * [ ] If your PR adds any new Jar binary package, please add a License 
Notice according to the
     [New License 
Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
   * [ ] If necessary, please update the documentation to describe the new 
feature. https://github.com/apache/seatunnel/tree/dev/docs
   * [ ] If you are contributing the connector code, please check that the 
following files are updated:
     1. Update 
[plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties)
 and add new connector information in it
     2. Update the pom file of 
[seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
     3. Add ci label in 
[label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
     4. Add e2e testcase in 
[seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
     5. Update connector 
[plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)
   * [ ] Update the 
[`release-note`](https://github.com/apache/seatunnel/blob/dev/release-note.md).

