[ https://issues.apache.org/jira/browse/HIVE-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14050635#comment-14050635 ]
Gopal V commented on HIVE-7144:
-------------------------------

Re-run tests with trunk.

> GC pressure during ORC StringDictionary writes
> -----------------------------------------------
>
>                 Key: HIVE-7144
>                 URL: https://issues.apache.org/jira/browse/HIVE-7144
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.14.0
>         Environment: ORC Table ~ 12 string columns
>            Reporter: Gopal V
>            Assignee: Gopal V
>              Labels: ORC, Performance
>         Attachments: HIVE-7144.1.patch, HIVE-7144.2.patch, HIVE-7144.3.patch, orc-string-write.png
>
>
> When the ORC string dictionary writes data out, it suffers from bad GC
> performance due to a few allocations in-loop.
> !orc-string-write.png!
> The conversions are as follows:
> StringTreeWriter::getStringValue() causes 2 conversions
>     LazyString -> Text (LazyString::getWritableObject)
>     Text -> String (LazyStringObjectInspector::getPrimitiveJavaObject)
> Then StringRedBlackTree::add() does one conversion
>     String -> Text
> This causes some GC pressure with unnecessary String and byte[]
> allocations.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
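
For illustration only, the conversion chain in the issue description (bytes -> Text -> String -> Text) can be mimicked with plain JDK types. This is a rough sketch, not the attached patch and not Hive code: the class `ConversionChainSketch` and the methods `viaString` and `direct` are hypothetical stand-ins, and a `byte[]` slice plays the role of `LazyString`/`Text`. It only shows that the round trip through `String` allocates extra objects per value while producing the same bytes as a single copy, assuming the input is valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the write path described in the issue; no Hive classes are used.
final class ConversionChainSketch {

    // Mimics the three conversions: each step allocates a new object per value.
    static byte[] viaString(byte[] rowBytes, int start, int length) {
        // "LazyString -> Text": copy the field's bytes into a fresh buffer.
        byte[] text = Arrays.copyOfRange(rowBytes, start, start + length);
        // "Text -> String": decode UTF-8 into a new String.
        String s = new String(text, StandardCharsets.UTF_8);
        // "String -> Text": encode back into yet another byte[].
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Single copy of the same slice, with no intermediate String.
    static byte[] direct(byte[] rowBytes, int start, int length) {
        return Arrays.copyOfRange(rowBytes, start, start + length);
    }

    public static void main(String[] args) {
        byte[] row = "hello,world".getBytes(StandardCharsets.UTF_8);
        byte[] a = viaString(row, 6, 5); // the field "world"
        byte[] b = direct(row, 6, 5);
        // Both paths yield identical bytes; only the allocation count differs.
        System.out.println(Arrays.equals(a, b)); // prints "true"
    }
}
```

In a tight per-row loop over roughly 12 string columns, the two extra allocations per value in the first path are the kind of short-lived garbage the description refers to.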