Zbigniew Baranowski created RANGER-5125:
-------------------------------------------

             Summary: Missing the result column value in ORC File Logging
                 Key: RANGER-5125
                 URL: https://issues.apache.org/jira/browse/RANGER-5125
             Project: Ranger
          Issue Type: Bug
          Components: audit
    Affects Versions: 2.5.0, 2.4.0, 2.3.0
            Reporter: Zbigniew Baranowski


h4. {*}Description{*}:

There is an issue in {{ORCFileUtil.log()}} when writing audit logs in ORC 
format. The _result_ field in the audit schema is of type {{short}} and is not 
handled when its value is cast to a string. As a result, the corresponding 
_accessResult_ column in the ORC file is left empty.
h4. {*}Affected Component{*}:
 * {{org.apache.ranger.audit.utils.ORCFileUtil}}
 * {{castStringObject(Object object)}} method

h4. {*}Steps to Reproduce{*}:
 # Run the {{main()}} method of the {{ORCFileUtil}} class:  
[https://github.com/apache/ranger/blob/a90a77e1ce12a0f7193533e846c504caea293d21/agents-audit/src/main/java/org/apache/ranger/audit/utils/ORCFileUtil.java#L85]
 # This writes an ORC file to {{/tmp/test.orc}}.
 # Open the file, for example with Spark, and read its content. The 
{{accessResult}} column is empty in every row, even when the corresponding 
event had a result set.

{code:java}
scala> val df = spark.read.orc("/tmp/test.orc")
df: org.apache.spark.sql.DataFrame = [repositoryType: int, repositoryName: string ... 24 more fields]

scala> df.show(false)
25/01/29 19:28:12 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
+--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
|repositoryType|repositoryName|user|eventTime          |accessType|resourcePath            |resourceType|action|accessResult|agentId|policyId|resultReason|aclEnforcer|sessionId|clientType|clientIP |requestData|agentHostname|logType|eventId|seqNum|eventCount|eventDurationMS|additionalInfo|clusterName|zoneName|
+--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log001  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |0      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log111  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |1      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log221  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |2      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log331  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |3      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log441  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |4      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log551  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |5      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log661  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |6      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log771  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |7      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log881  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |8      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log991  |file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |9      |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log10101|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |10     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log11111|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |11     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log12121|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |12     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log13131|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |13     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log14141|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |14     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log15151|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |15     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log16161|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |16     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log17171|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |17     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log18181|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |18     |0     |1         |0              |              |           |        |
|1             |hdfsdev       |    |2025-01-29 19:25:10|read      |/tmp/test-audit.log19191|file        |      |            |       |0       |1           |ranger-acl |         |          |127.0.0.1|           |             |       |19     |0     |1         |0              |              |           |        |
+--------------+--------------+----+-------------------+----------+------------------------+------------+------+------------+-------+--------+------------+-----------+---------+----------+---------+-----------+-------------+-------+-------+------+----------+---------------+--------------+-----------+--------+
{code}

h4. {*}Expected Behavior{*}:
 * {{short}} values (the _result_ field) should be converted to strings 
before being written to ORC, so that the _accessResult_ column is populated.

h4. {*}Root Cause{*}:
 * The {{castStringObject(Object object)}} method has no case for {{Short}}.
 * A {{short}} value therefore matches none of the branches, and the method 
returns {{null}} instead of the value's string form, which leaves the column 
empty.
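
The fall-through described above can be reproduced outside Ranger. The following is a minimal stand-alone sketch, not the actual Ranger class: the class name {{CastWithoutShortSketch}} and the date pattern are assumptions made for illustration, and only the String/Date branch structure is taken from this report.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-alone class; it only mirrors the branch structure described in this issue.
class CastWithoutShortSketch {

    // Pre-fix behavior: only String and Date are recognized.
    static String castStringObject(Object object) {
        String ret = null;
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            // Assumed date pattern; the real getDateString() may format differently.
            ret = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format((Date) object);
        }
        // Any other type, including java.lang.Short, leaves ret as null.
        return ret;
    }

    public static void main(String[] args) {
        short accessResult = 1;
        // A short passed as Object is autoboxed to java.lang.Short.
        System.out.println(castStringObject("read"));       // prints: read
        System.out.println(castStringObject(accessResult)); // prints: null
    }
}
```

The {{null}} returned for the boxed {{short}} matches the empty _accessResult_ column seen in the Spark output.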

h4. {*}Proposed Fix{*}:

Modify {{castStringObject(Object object)}} in {{ORCFileUtil.java}} to properly 
handle {{Short}} values:
{code:java}
protected String castStringObject(Object object) {
    String ret = null;
    try {
        if (object instanceof String) {
            ret = (String) object;
        } else if (object instanceof Date) {
            ret = getDateString((Date) object);
        } else if (object instanceof Short) {
            // Fix: added case for Short (the type of the audit "result" field)
            ret = ((Short) object).toString();
        }
    } catch (Exception e) {
        logger.error("Error while writing into ORC File:", e);
    }
    return ret;
}
{code}
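
To check the effect of the added branch in isolation, the proposed logic can be exercised in a small stand-alone program. This is only a sketch under stated assumptions: {{CastWithShortSketch}} is a hypothetical class name, the date pattern stands in for {{getDateString()}}, and the logger and try/catch are omitted to keep it self-contained.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical stand-alone class reproducing the proposed branch order.
class CastWithShortSketch {

    static String castStringObject(Object object) {
        if (object instanceof String) {
            return (String) object;
        }
        if (object instanceof Date) {
            // Assumed date pattern; the real getDateString() may format differently.
            return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").format((Date) object);
        }
        if (object instanceof Short) {
            // The branch added by the proposed fix.
            return object.toString();
        }
        // Types without a branch still yield null.
        return null;
    }

    public static void main(String[] args) {
        short accessResult = 1;
        System.out.println(castStringObject(accessResult)); // prints: 1
        System.out.println(castStringObject("read"));       // prints: read
        System.out.println(castStringObject(42));           // Integer is not handled, prints: null
    }
}
```

Note that the fix as proposed covers {{Short}} only; in this sketch an {{Integer}} still comes back as {{null}}. Whether other boxed types ever reach this method in Ranger is not established by this report.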


