[
https://issues.apache.org/jira/browse/RANGER-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zbigniew Baranowski updated RANGER-5125:
----------------------------------------
Summary: Missing accessResult column value in ORC File Logging (was:
Missing the result column value in ORC File Logging)
> Missing accessResult column value in ORC File Logging
> -----------------------------------------------------
>
> Key: RANGER-5125
> URL: https://issues.apache.org/jira/browse/RANGER-5125
> Project: Ranger
> Issue Type: Bug
> Components: audit
> Affects Versions: 2.3.0, 2.4.0, 2.5.0
> Reporter: Zbigniew Baranowski
> Priority: Major
> Labels: easyfix
> Attachments: RANGER-5125.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> h4. {*}Description{*}:
> There is an issue in {{ORCFileUtil.log()}} when writing audit logs in ORC
> format. The _result_ field in the audit schema is of type {{short}} and is
> not properly handled when being cast to a string. This results in empty
> values in the corresponding _accessResult_ column in the ORC file.
> h4. {*}Affected Component{*}:
> * {{org.apache.ranger.audit.provider.ORCFileUtil}}
> * {{castStringObject(Object object)}} method
> h4. {*}Steps to Reproduce{*}:
> # Run the {{main()}} method of the {{ORCFileUtil}} class:
> [https://github.com/apache/ranger/blob/a90a77e1ce12a0f7193533e846c504caea293d21/agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java#L85]
> # This writes an ORC file to {{/tmp/test.orc}}.
> # Open the file with, for example, Spark and read its contents. The
> {{accessResult}} column has no value in any row, even if the
> corresponding event had it set.
> {code:java}
> scala> val df = spark.read.orc("/tmp/test.orc")
> df: org.apache.spark.sql.DataFrame = [repositoryType: int, repositoryName: string ... 24 more fields]
> scala> df.show(false)
> 25/01/29 19:28:12 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> |repositoryType|repositoryName|user|eventTime
> |accessType|resourcePath
> |resourceType|action|accessResult|agentId|policyId|resultReason|aclEnforcer|sessionId|clientType|clientIP
>
> |requestData|agentHostname|logType|eventId|seqNum|eventCount|eventDurationMS|additionalInfo|clusterName|zoneName|
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log001 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |0 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log111 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |1 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log221 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |2 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log331 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |3 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log441 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |4 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log551 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |5 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log661 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |6 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log771 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |7 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log881 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |8 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log991 |file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |9 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log10101|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |10 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log11111|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |11 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log12121|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |12 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log13131|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |13 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log14141|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |14 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log15151|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |15 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log16161|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |16 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log17171|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |17 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log18181|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |18 |0 |1 |0 | |
> | |
> |1 |hdfsdev | |2025-01-29 19:25:10|read
> |/tmp/test-audit.log19191|file | | | |0 |1
> |ranger-acl | | |127.0.0.1| |
> | |19 |0 |1 |0 | |
> | |
> +--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
> {code}
> h4. {*}Expected Behavior{*}:
> * {{short}} values (such as the _result_ field) are correctly converted to
> strings before being written to ORC.
> h4. {*}Root Cause{*}:
> * The {{castStringObject(Object object)}} method is missing a case for
> {{Short}}.
> * As a result, the method returns {{null}} when a {{short}} value is
> passed in, and the column is written to ORC without a value.
> h4. {*}Proposed Fix{*}:
> Modify {{castStringObject(Object object)}} in {{ORCFileUtil.java}} to
> properly handle {{Short}} values:
> {code:java}
> protected String castStringObject(Object object) {
>     String ret = null;
>     try {
>         if (object instanceof String) {
>             ret = (String) object;
>         } else if (object instanceof Date) {
>             ret = getDateString((Date) object);
>         } else if (object instanceof Short) { // Fix: added case for Short
>             ret = ((Short) object).toString();
>         }
>     } catch (Exception e) {
>         logger.error("Error while writing into ORC File:", e);
>     }
>     return ret;
> }
> {code}
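
The effect of the missing branch can be checked in isolation, without ORC or Spark. The following is a minimal sketch, not the Ranger code itself: the class name {{CastStringObjectSketch}}, the methods {{castBefore}} and {{castAfter}}, and the date pattern in the stand-in {{getDateString}} are assumptions made for illustration. Only the branching mirrors the description above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class CastStringObjectSketch {
    // Hypothetical stand-in for ORCFileUtil.getDateString(); the real format may differ.
    static String getDateString(Date date) {
        return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format(date);
    }

    // Branching as described under Root Cause: no case for Short.
    static String castBefore(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        }
        return ret; // stays null for a Short
    }

    // Same branching with the proposed Short case added.
    static String castAfter(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        } else if (object instanceof Short) {
            ret = ((Short) object).toString();
        }
        return ret;
    }

    public static void main(String[] args) {
        // A short audit result arrives boxed as a Short.
        Object accessResult = Short.valueOf((short) 1);
        System.out.println("before: " + castBefore(accessResult)); // prints "before: null"
        System.out.println("after:  " + castAfter(accessResult));  // prints "after:  1"
    }
}
```

With the {{Short}} branch in place, the value reaches the writer as the string "1" instead of {{null}}, which is consistent with the empty _accessResult_ column shown in the Spark output.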