Neeru Gupta created ATLAS-1379:
----------------------------------

             Summary: Avoid object query overhead when report query selects class type alias
                 Key: ATLAS-1379
                 URL: https://issues.apache.org/jira/browse/ATLAS-1379
             Project: Atlas
          Issue Type: Improvement
            Reporter: Neeru Gupta
            Assignee: Neeru Gupta
             Fix For: 0.8-incubating

When a DSL query selects the class alias, Atlas runs an object query. Atlas should detect this query construct and avoid the overhead of full entity retrieval, because the entity ends up being serialized as a structure that contains only the guid, type, and status.

For queries like 'from hive_db as h select h', the end result contains only the id object, yet Atlas initially loads the entire object with all its properties and then serializes only the id part. This issue is to avoid the overhead of loading the entire object when fields other than the id will be discarded anyway. The extra loading added a lot of overhead and was identified as a hotspot in our internal testing. It is especially relevant when a large number of objects are retrieved as search results.

Note that the fixes are backward compatible. The end result remains the same; only the unnecessary work of loading unused fields is avoided.

Sample Query: 'from hive_db as h select h'

Result:

"rows":[
   {
      "$typeName$":"__tempQueryResultStruct2",
      "id":{
         "id":"8159ee38-ec29-4d9a-845a-86fe17ab6bdb",
         "$typeName$":"hive_db",
         "version":0,
         "state":"ACTIVE"
      }
   },
   {
      "$typeName$":"__tempQueryResultStruct2",
      "id":{
         "id":"33016845-4b71-4d56-b131-b48ea38e5507",
         "$typeName$":"hive_db",
         "version":0,
         "state":"ACTIVE"
      }
   },
   {
      "$typeName$":"__tempQueryResultStruct2",
      "id":{
         "id":"b6fb9c53-680e-4be7-b143-f8d9355c5726",
         "$typeName$":"hive_db",
         "version":0,
         "state":"ACTIVE"
      }
   }
]
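
The idea can be illustrated with a small, self-contained Python sketch. It is not Atlas code: the in-memory STORE, the extra properties (name, owner), the regex-based alias check, and all function names are hypothetical and exist only to show the shape of the optimization. When the select clause is just the class alias, the id-only path produces the same rows as the full-load path while skipping the expensive load.

```python
import re

# Hypothetical in-memory stand-in for the graph store. The ids come from the
# sample result above; "name" and "owner" are made-up properties that the
# id-only result never uses.
STORE = {
    "hive_db": [
        {"id": "8159ee38-ec29-4d9a-845a-86fe17ab6bdb", "version": 0,
         "state": "ACTIVE", "name": "db1", "owner": "etl"},
        {"id": "33016845-4b71-4d56-b131-b48ea38e5507", "version": 0,
         "state": "ACTIVE", "name": "db2", "owner": "etl"},
    ]
}

STATS = {"full_loads": 0}  # counts how often the expensive path runs
ID_FIELDS = ("id", "$typeName$", "version", "state")

# Matches only the simple form 'from <type> as <alias> select <name>'.
ALIAS_SELECT = re.compile(r"^from\s+(\w+)\s+as\s+(\w+)\s+select\s+(\w+)$")


def selects_bare_alias(query):
    """Return the type name if the query selects exactly its class alias."""
    m = ALIAS_SELECT.match(query.strip())
    if m and m.group(2) == m.group(3):
        return m.group(1)
    return None


def load_full_entity(type_name, raw):
    """Expensive path: materialize every property of the entity."""
    STATS["full_loads"] += 1
    entity = dict(raw)
    entity["$typeName$"] = type_name
    return entity


def load_id_only(type_name, raw):
    """Cheap path: read only the fields that end up in the serialized id."""
    return {"id": raw["id"], "$typeName$": type_name,
            "version": raw["version"], "state": raw["state"]}


def run_query(query, optimize=True):
    type_name = selects_bare_alias(query)
    if type_name is None:
        raise ValueError("sketch only handles 'from T as a select a'")
    rows = []
    for raw in STORE.get(type_name, []):
        if optimize:
            ident = load_id_only(type_name, raw)
        else:
            full = load_full_entity(type_name, raw)
            # Everything except the id fields is discarded at this point.
            ident = {k: full[k] for k in ID_FIELDS}
        rows.append({"$typeName$": "__tempQueryResultStruct2", "id": ident})
    return rows
```

Calling run_query("from hive_db as h select h") with optimize=False and then with optimize=True returns equal row lists, and only the first call increments STATS["full_loads"]. This mirrors the backward-compatibility claim: the result is unchanged and only the unneeded loading is skipped.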