[ https://issues.apache.org/jira/browse/HIVE-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573298#comment-13573298 ]
Gopal V commented on HIVE-3992:
-------------------------------

We can't fix it when the map-splits are properly distributed onto different map-tasks.

But as the profile shows, we have a CombineHiveRecordReader which is reading multiple splits in the same process using different RCFileRecordReaders.

I put in some prints to check for sync behaviour of readers.

{code}
ip-10-195-75-130: split = 0-67108864
ip-10-195-75-130: sync = 57
ip-10-195-75-130: Last seen sync = 70351814 (in 57-67108864)
ip-10-195-75-130: split = 67108864-134217728
ip-10-195-75-130: sync = 70351814
ip-10-195-75-130: Last seen sync = 136274939 (in 70351814-134217728)
ip-10-195-75-130: split = 134217728-157715536
ip-10-195-75-130: sync = 136274939
{code}

so every preceding RCFileRecordReader knows what was the last sync point, except the next one fails to use that information & does a fresh sync().

We need a sync cache within the same process for the same file-split.

I.e find me the last sync where sync.end > split.start && sync.start < split.start for the same path.

Holding that info in-memory should avoid sync passes after the first 57 byte sync-check.

> Hive RCFile::sync(long) does a sub-sequence linear search for sync blocks
> -------------------------------------------------------------------------
>
>                 Key: HIVE-3992
>                 URL: https://issues.apache.org/jira/browse/HIVE-3992
>             Project: Hive
>          Issue Type: Bug
>         Environment: Ubuntu x86_64/java-1.6/hadoop-2.0.3
>            Reporter: Gopal V
>         Attachments: select-join-limit.html
>
>
> The following function does some bad I/O
> {code}
> public synchronized void sync(long position) throws IOException {
>   ...
>   try {
>     seek(position + 4); // skip escape
>     in.readFully(syncCheck);
>     int syncLen = sync.length;
>     for (int i = 0; in.getPos() < end; i++) {
>       int j = 0;
>       for (; j < syncLen; j++) {
>         if (sync[j] != syncCheck[(i + j) % syncLen]) {
>           break;
>         }
>       }
>       if (j == syncLen) {
>         in.seek(in.getPos() - SYNC_SIZE); // position before sync
>         return;
>       }
>       syncCheck[i % syncLen] = in.readByte();
>     }
>   }
>   ...
> }
> {code}
> This causes a rather large number of readByte() calls which are passed onto a ByteBuffer via a single byte array.
> This results in a rather large amount of CPU being burnt in the linear search for the sync pattern in the input RCFile (up to 92% for a skewed example - a trivial map-join + limit 100).
> This behaviour should be avoided at best or at least replaced by a rolling hash for efficient comparison, since it has a known byte-width of 16 bytes.
> Attached the stack trace from a YourKit profile.
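
As a rough illustration of the rolling-hash idea mentioned in the issue description, the following is a minimal Java sketch that searches an in-memory chunk for the sync marker instead of issuing one readByte() per position. The class SyncScanner and the method indexOfSync are hypothetical names and not part of RCFile. The sketch assumes the caller has already filled the buffer with a single bulk read and deals with markers that straddle chunk boundaries itself (for example by carrying over the last marker.length - 1 bytes).

```java
// Hypothetical helper, not Hive code: Rabin-Karp style search for a sync marker.
final class SyncScanner {
    private static final int BASE = 31; // arbitrary odd multiplier (assumption)

    /**
     * Returns the offset of the first occurrence of marker in buf[0..len),
     * or -1 if it does not occur. Hash arithmetic relies on int wrap-around.
     */
    static int indexOfSync(byte[] buf, int len, byte[] marker) {
        int m = marker.length;
        if (m == 0 || len < m) {
            return -1;
        }
        int target = 0;   // hash of the marker
        int window = 0;   // hash of the current m-byte window of buf
        int highPow = 1;  // BASE^(m-1), weight of the byte leaving the window
        for (int i = 0; i < m; i++) {
            target = target * BASE + (marker[i] & 0xff);
            window = window * BASE + (buf[i] & 0xff);
            if (i < m - 1) {
                highPow *= BASE;
            }
        }
        for (int start = 0; ; start++) {
            // Compare bytes only when the hashes agree, to rule out collisions.
            if (window == target && matches(buf, start, marker)) {
                return start;
            }
            if (start + m >= len) {
                return -1; // no further full window fits in the buffer
            }
            // Slide the window: drop buf[start], append buf[start + m].
            window = (window - (buf[start] & 0xff) * highPow) * BASE
                    + (buf[start + m] & 0xff);
        }
    }

    private static boolean matches(byte[] buf, int start, byte[] marker) {
        for (int j = 0; j < marker.length; j++) {
            if (buf[start + j] != marker[j]) {
                return false;
            }
        }
        return true;
    }
}
```

With a 16-byte marker, each position costs one multiply-add update of the window hash, and the full byte comparison runs only on a hash match. A caller would translate the returned buffer offset back into a file position before seeking.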