[jira] Commented: (HIVE-1801) HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice

2010-11-23 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935079#action_12935079
 ] 

He Yongqiang commented on HIVE-1801:


+1. Running tests.

> HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
> 
>
> Key: HIVE-1801
> URL: https://issues.apache.org/jira/browse/HIVE-1801
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-1801.1.patch, HIVE-1801.2.patch
>
>
> HiveInputFormat or CombineHiveInputFormat calls RCFile.Reader.sync() twice: 
> once in getReader() and once in initIOContext(). We can avoid the latter call 
> by reading the sync() position found by the former.
> We also sync() twice for SequenceFile, but since SequenceFileReader is not 
> part of the Hive code, we should be careful about depending on its 
> implementation.
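
The idea in the description can be sketched as follows. This is only an illustration, not the attached patch: FakeReader and FakeRecordReader are hypothetical stand-ins for RCFile.Reader and RCFileRecordReader, and the fixed BLOCK_SIZE spacing of sync markers is an assumption made so the example can run on its own. The record reader syncs once and remembers the position, and the code playing the role of initIOContext() reads that stored value instead of syncing again.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SyncOnceSketch {
    /** Hypothetical stand-in for RCFile.Reader; counts how often sync() runs. */
    static class FakeReader {
        static final AtomicInteger SYNC_CALLS = new AtomicInteger();
        private static final long BLOCK_SIZE = 100; // assumed sync-marker spacing
        private long position;

        /** Moves to the first assumed block boundary at or after the offset. */
        void sync(long offset) {
            SYNC_CALLS.incrementAndGet();
            position = ((offset + BLOCK_SIZE - 1) / BLOCK_SIZE) * BLOCK_SIZE;
        }

        long getPosition() {
            return position;
        }
    }

    /** Hypothetical stand-in for RCFileRecordReader: syncs once, keeps the result. */
    static class FakeRecordReader {
        private final long start;

        FakeRecordReader(long splitStart) {
            FakeReader in = new FakeReader();
            in.sync(splitStart);
            start = in.getPosition();
        }

        long getStart() {
            return start;
        }
    }

    /** Plays the role of initIOContext(): reuses the stored position, no second sync. */
    static long blockStart(FakeRecordReader reader) {
        return reader.getStart();
    }

    public static void main(String[] args) {
        FakeRecordReader reader = new FakeRecordReader(250);
        System.out.println("blockStart=" + blockStart(reader)
                + " syncCalls=" + FakeReader.SYNC_CALLS.get());
    }
}
```

With a split starting at offset 250, the stand-in reader syncs a single time and the block start is taken from the record reader rather than from a second reader opened on the same file.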

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1801) HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice

2010-11-23 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12935048#action_12935048
 ] 

He Yongqiang commented on HIVE-1801:


Siying, the patch is not for this JIRA. 
Can you upload a new patch?

> HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
> 
>
> Key: HIVE-1801
> URL: https://issues.apache.org/jira/browse/HIVE-1801
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-1801.1.patch, HIVE-1802.1.patch




[jira] Commented: (HIVE-1801) HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice

2010-11-22 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934536#action_12934536
 ] 

He Yongqiang commented on HIVE-1801:


Can you put "if (recordReader instanceof RCFileRecordReader)" at the same level 
as "else if (inputFormatClass.getName().contains("RCFile")) {"?

 } else if (inputFormatClass.getName().contains("RCFile")) {
-  RCFile.Reader in = new RCFile.Reader(fs, path, job);
   blockPointer = true;
-  in.sync(fileSplit.getStart());
-  blockStart = in.getPosition();
-  in.close();
+
+  if (recordReader instanceof RCFileRecordReader) {
+    blockStart = ((RCFileRecordReader)recordReader).getStart();
+  } else {
+    RCFile.Reader in = new RCFile.Reader(fs, path, job);
+    in.sync(fileSplit.getStart());
+    blockStart = in.getPosition();
+    in.close();
+  }
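
One possible reading of this suggestion, as a self-contained sketch: the instanceof check becomes its own branch next to the RCFile name check instead of being nested inside it. Everything here is hypothetical scaffolding (Reader, RcReader, syncFromFile and resolveBlockStart are not Hive names), the order of the two branches is an assumption, and the surrounding branches of the real method are not modeled.

```java
public class BlockStartSketch {
    /** Hypothetical stand-in for any record reader. */
    interface Reader {}

    /** Hypothetical stand-in for RCFileRecordReader, which already knows its synced start. */
    static class RcReader implements Reader {
        private final long start;

        RcReader(long start) {
            this.start = start;
        }

        long getStart() {
            return start;
        }
    }

    /** Hypothetical stand-in for opening a new RCFile.Reader and calling sync(). */
    static long syncFromFile(long splitStart) {
        return splitStart + 7; // placeholder arithmetic, not real RCFile behavior
    }

    /** Both checks sit at the same level of one if / else-if chain. */
    static long resolveBlockStart(Reader recordReader, String inputFormatName, long splitStart) {
        if (recordReader instanceof RcReader) {
            // Reuse the position the record reader found when it synced.
            return ((RcReader) recordReader).getStart();
        } else if (inputFormatName.contains("RCFile")) {
            // Fallback: some other reader type over an RCFile input format.
            return syncFromFile(splitStart);
        }
        return -1; // placeholder for formats this sketch does not model
    }

    public static void main(String[] args) {
        System.out.println(resolveBlockStart(new RcReader(300), "RCFileInputFormat", 250));
        System.out.println(resolveBlockStart(new Reader() {}, "RCFileInputFormat", 250));
    }
}
```

Flattening keeps the fallback path intact while making the cheap case (the record reader already holds the position) a branch of its own.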


> HiveInputFormat or CombineHiveInputFormat always sync blocks of RCFile twice
> 
>
> Key: HIVE-1801
> URL: https://issues.apache.org/jira/browse/HIVE-1801
> Project: Hive
> Issue Type: Improvement
> Reporter: Siying Dong
> Assignee: Siying Dong
> Attachments: HIVE-1801.1.patch
