Zbigniew Baranowski created RANGER-5125:
-------------------------------------------
Summary: Missing the result column value in ORC File Logging
Key: RANGER-5125
URL: https://issues.apache.org/jira/browse/RANGER-5125
Project: Ranger
Issue Type: Bug
Components: audit
Affects Versions: 2.5.0, 2.4.0, 2.3.0
Reporter: Zbigniew Baranowski
h4. {*}Description{*}:
{{ORCFileUtil.log()}} mishandles one field when writing audit logs in ORC
format. The _result_ field in the audit schema is of type {{short}} and is not
converted correctly when it is cast to a string, so the corresponding
_accessResult_ column in the ORC file is empty.
h4. {*}Affected Component{*}:
* {{org.apache.ranger.audit.utils.ORCFileUtil}}
* {{castStringObject(Object object)}} method
h4. {*}Steps to Reproduce{*}:
# Run {{main()}} from the {{ORCFileUtil}} class:
[https://github.com/apache/ranger/blob/a90a77e1ce12a0f7193533e846c504caea293d21/agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java#L85]
# This writes an ORC file to {{/tmp/test.orc}}.
# Open the file with, for example, Spark and read its contents. The
'accessResult' column has no value in any row, even when the corresponding
event had it set.
{code:java}
scala> val df = spark.read.orc("/tmp/test.orc")
df: org.apache.spark.sql.DataFrame = [repositoryType: int, repositoryName: string ... 24 more fields]
scala> df.show(false)
25/01/29 19:28:12 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
+--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
|repositoryType|repositoryName|user|eventTime          |accessType|resourcePath            |resourceType|action|accessResult|agentId|policyId|resultReason|aclEnforcer|sessionId|clientType|clientIP |requestData|agentHostname|logType|eventId|seqNum|eventCount|eventDurationMS|additionalInfo|clusterName|zoneName|
+--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log001  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |0      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log111  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |1      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log221  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |2      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log331  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |3      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log441  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |4      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log551  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |5      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log661  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |6      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log771  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |7      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log881  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |8      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log991  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |9      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log10101|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |10     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log11111|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |11     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log12121|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |12     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log13131|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |13     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log14141|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |14     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log15151|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |15     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log16161|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |16     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log17171|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |17     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log18181|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |18     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log19191|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |19     |0     |1         |0              |              |           |        |
+--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
{code}
h4. {*}Expected Behavior{*}:
* {{short}} values (the _result_ field) should be converted to strings
correctly before being written to ORC.
h4. {*}Root Cause{*}:
* The {{castStringObject(Object object)}} method has no case for
{{Short}}.
* A {{short}} value therefore matches no branch and the method returns
{{null}}, which leaves the column empty in the ORC file.
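The fall-through can be reproduced outside Ranger with a minimal sketch. The class and method names below are hypothetical stand-ins, and the date format only substitutes for {{getDateString()}}, whose implementation is not shown in this issue. The sketch assumes the pre-fix method handles only {{String}} and {{Date}}.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

public class CastStringObjectDemo {
    // Simplified stand-in for the pre-fix castStringObject():
    // only String and Date are handled, everything else falls through.
    public static String castStringObjectBeforeFix(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            // Assumed format; the real getDateString() may differ.
            ret = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format((Date) object);
        }
        return ret;
    }

    public static void main(String[] args) {
        short result = 1; // hypothetical audit result value
        // Autoboxing turns the short into a Short, which matches no branch.
        System.out.println(castStringObjectBeforeFix(result));    // prints "null"
        System.out.println(castStringObjectBeforeFix("allowed")); // prints "allowed"
    }
}
```

Because the argument type is {{Object}}, the primitive {{short}} is boxed to {{Short}} before the {{instanceof}} checks run, so neither branch applies and {{ret}} keeps its initial {{null}}.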
h4. {*}Proposed Fix{*}:
Modify {{castStringObject(Object object)}} in {{ORCFileUtil.java}} to properly
handle {{Short}} values:
{code:java}
protected String castStringObject(Object object) {
    String ret = null;
    try {
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        } else if (object instanceof Short) { // Fix: added case for Short
            ret = ((Short) object).toString();
        }
    } catch (Exception e) {
        logger.error("Error while writing into ORC File:", e);
    }
    return ret;
}
{code}
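A quick way to check the change in isolation is a standalone sketch like the one below. The class name is hypothetical, the logger and try/catch are omitted for brevity, and {{Date.toString()}} stands in for {{getDateString()}}, so this is an illustration of the added branch rather than the actual Ranger code.

```java
import java.util.Date;

public class CastStringObjectFixCheck {
    // Stand-in for the patched method with the extra Short branch.
    public static String castStringObject(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            // Placeholder for getDateString(), which is not shown in this issue.
            ret = object.toString();
        } else if (object instanceof Short) {
            ret = ((Short) object).toString();
        }
        return ret;
    }

    public static void main(String[] args) {
        short allowed = 1; // hypothetical result value
        System.out.println(castStringObject(allowed)); // prints "1"
        // Other boxed types are still unhandled in this sketch.
        System.out.println(castStringObject(42));      // prints "null"
    }
}
```

With the {{Short}} branch in place, the _result_ value is returned as its decimal string instead of {{null}}. Any other type passed to this method would still fall through.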