shashwatsai opened a new pull request, #7809:
URL: https://github.com/apache/seatunnel/pull/7809
### Purpose of this pull request
The Hadoop source/sink fails because it is unable to find a valid Kerberos
ticket. In HadoopFileSystemProxy, the UserGroupInformation object is tightly
bound to the security context of the initiating thread. If we run privileged
actions that require the subject's security context outside that context, they
fail with the following error:
```
Caused by:
org.apache.hadoop.security.authentication.client.AuthenticationException:
GSSException: No valid credentials provided (Mechanism level: Failed to find
any Kerberos tgt)
```
We can attach the security context of the specified subject to the thread
performing the action by using UserGroupInformation#doAs, which keeps the
subject's security context attached for as long as the action is running.
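The idea can be modelled without Hadoop on the classpath. The following is a minimal, JDK-only sketch and not the code in this PR: a `ThreadLocal` stands in for the thread-bound security context, `privilegedCall` stands in for an HDFS operation that needs a Kerberos TGT, and the `doAs` helper is a hypothetical analogue of UserGroupInformation#doAs that attaches the context only while the action runs. All names in it are assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Simplified model only: Hadoop's real mechanism is based on JAAS subjects,
// not on this ThreadLocal.
final class DoAsSketch {
    // Hypothetical stand-in for the security context bound to a thread.
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Stand-in for a privileged operation (for example, an HDFS call).
    static String privilegedCall() {
        String user = CONTEXT.get();
        if (user == null) {
            throw new IllegalStateException("No valid credentials provided");
        }
        return "ok:" + user;
    }

    // Attach the context to the calling thread while the action runs,
    // then restore whatever was there before.
    static <T> T doAs(String user, Supplier<T> action) {
        String previous = CONTEXT.get();
        CONTEXT.set(user);
        try {
            return action.get();
        } finally {
            if (previous == null) {
                CONTEXT.remove();
            } else {
                CONTEXT.set(previous);
            }
        }
    }

    // Run the privileged call on a worker thread that has no context of its own.
    static String runOnWorker(String user, boolean useDoAs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = useDoAs
                    ? () -> doAs(user, DoAsSketch::privilegedCall)
                    : DoAsSketch::privilegedCall;
            return pool.submit(task).get();
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "failed: No valid credentials provided"
        System.out.println(runOnWorker("service_user", false));
        // prints "ok:service_user"
        System.out.println(runOnWorker("service_user", true));
    }
}
```

In the connector itself, the corresponding step would be to wrap the file system call in `doAs` on the logged-in `UserGroupInformation` instance rather than in a helper like this one.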
The issue can be reproduced as follows:
- Set the system property **javax.security.auth.useSubjectCredsOnly** to _false_.
- Run an HdfsFile source/sink job.
Example:
```
env {
"job.mode"=BATCH
"job.name"="SeaTunnel_Job"
"savemode.execute.location"=CLUSTER
}
source {
HdfsFile {
path="/data/daily/aggregation/tblAssocProfileData_daily_180710.txt"
"file_format_type"=CSV
"field_delimiter"="|"
parallelism=1,
"use_kerberos"="true"
"kerberos_principal"="[email protected]"
"fs.defaultFS"="hdfs://clusterA"
"hdfs_site_path"="/etc/hadoop/conf/hdfs-site.xml"
"kerberos_keytab_path"="/home/user/service_user.keytab"
"krb5_path"="/etc/krb5.conf"
"core_site_path"="/etc/hadoop/conf/core-site.xml"
}
}
sink {
HdfsFile {
path="/Projects/test/reports/sea_tunnel/single/"
tmp_path="/Projects/test/reports/"
"file_format_type"=CSV
"row_delimiter"="\n"
"field_delimiter"="|"
"enable_header_write"="false"
"use_kerberos"="true"
"kerberos_principal"="[email protected]"
"fs.defaultFS"="hdfs://clusterA"
"hdfs_site_path"="/etc/hadoop/conf/hdfs-site.xml"
"kerberos_keytab_path"="/home/root/service_user.keytab"
"krb5_path"="/etc/krb5.conf"
"core_site_path"="/etc/hadoop/conf/core-site.xml"
}
}
```
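For the first reproduction step, the property can be passed on the JVM command line as `-Djavax.security.auth.useSubjectCredsOnly=false`, or set programmatically in a JVM you control. The sketch below shows the programmatic form; the class and method names are hypothetical, and how the SeaTunnel launch scripts forward JVM options is not covered in this PR.

```java
// Hypothetical helper for the reproduction step; not part of SeaTunnel.
final class UseSubjectCredsOnlySketch {
    static final String KEY = "javax.security.auth.useSubjectCredsOnly";

    // Set the property for the current JVM and return the value now in effect.
    static String disableSubjectCredsOnly() {
        System.setProperty(KEY, "false");
        return System.getProperty(KEY);
    }

    public static void main(String[] args) {
        // prints "javax.security.auth.useSubjectCredsOnly=false"
        System.out.println(KEY + "=" + disableSubjectCredsOnly());
    }
}
```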
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
<!--
If tests were added, say they were added here. Please make sure to add some
test cases that check the changes thoroughly including negative and positive
cases if possible.
If it was tested in a way different from regular unit tests, please clarify
how you tested step by step, ideally copy and paste-able, so that other
reviewers can test and check, and descendants can verify in the future.
If tests were not added, please describe why they were not added and/or why
it was difficult to add.
If you are adding E2E test cases, maybe refer to
https://github.com/apache/seatunnel/blob/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/connector-cdc-mysql-e2e/src/test/resources/mysqlcdc_to_mysql.conf,
here is a good example.
-->
### Check list
* [ ] If any new Jar binary package is added in your PR, please add a License Notice according to the [New License Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
* [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
* [ ] If you are contributing the connector code, please check that the following files are updated:
  1. Update [plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties) and add new connector information in it
  2. Update the pom file of [seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
  3. Add ci label in [label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
  4. Add e2e testcase in [seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
  5. Update connector [plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)
* [ ] Update the [`release-note`](https://github.com/apache/seatunnel/blob/dev/release-note.md).