[ 
https://issues.apache.org/jira/browse/HBASE-7704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13622660#comment-13622660
 ] 

Himanshu Vashishtha commented on HBASE-7704:
--------------------------------------------

Thanks for the reviews, guys:
bq. Would suggest you add to the class comment how to run it.
Will do.
bq. Does "./bin/hbase org.apache.hadoop.hbase.util.HFileV1Detector --h" work?
Yes.
bq. Suggest you show us what '-h' output looks like here in this issue:
{code}
./bin/hbase org.apache.hadoop.hbase.util.HFileV1Detector -help
usage: HFileV1Detector [-h] [-n <arg>] [-p <arg>]
 -h,--help                    Help
 -n,--numberOfThreads <arg>   Number of threads
 -p,--path <arg>              Path to table
In case no option is provided, it process hbase.rootdir with 10 threads.
{code}
The help section is printed with any of -h, --h, -help, or --help.

bq. Suggest it return -1 if hfilev1 files found: + return 0;
The goal of this tool is to tell the user which regions have hfilev1 in them. 
It prints out those regions (currently, it prints the full path so that the 
table name is visible too).

bq. What will you ignore?
Yes, .logs, etc. Basically, all the non-table directories.
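
A minimal sketch of that filtering, not the patch's actual code. The class, method, and set names are hypothetical, and only ".logs" is taken from the answer above; the real list of non-table directories is longer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TableDirFilterSketch {
  // Assumption: only ".logs" is named in this thread; a real tool would
  // list every non-table directory that lives under hbase.rootdir.
  static final Set<String> NON_TABLE_DIRS = new HashSet<>(Arrays.asList(".logs"));

  /** Keeps only the children of the root dir that are treated as tables. */
  static List<String> tableDirs(List<String> childrenOfRootDir) {
    List<String> tables = new ArrayList<>();
    for (String name : childrenOfRootDir) {
      if (!NON_TABLE_DIRS.contains(name)) {
        tables.add(name);
      }
    }
    return tables;
  }
}
```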

bq. You keep saying you are going to print regions that have v1 files but you 
seem to be printing out full path
Yes, it's intentional, to let the user know which table the region belongs to. 
IMO, it is good to know.

bq. Suggest an option that will fast fail... fail as soon as it finds the first 
v1. This is probably not important so if it takes a while, just punt.
bq. It looks like you do fail fast – you stop scanning a family as soon as you 
find a v1 file?
Yes, I designed it with the following behavior in mind:
1) Scan one table at a time. This way we can give a table a clean chit if no 
hfilev1 is found inside it.
2) Scan regions in parallel. This is where the executor comes in. Basically, 
scanning a region is a task. If an hfile v1 is found in any of the CFs, then 
there is no need to scan the other families, as we would like the user to 
compact that region anyway.

bq. Why not include original exception here: + throw new IOException("Unknown 
version for hfile: " + storeFilePath);?
Will do what Sergey said.

The current output is:
{code}
Table hdfs://localhost:41020/hbase-0.94/-ROOT- has no HFileV1.
Found a v1 hfile, 
hdfs://localhost:41020/hbase-0.94/t/c6b79b9f1ca4a37921355ddbfb521761/f/2811264815153459761
Region has a hfile v1: 
hdfs://localhost:41020/hbase-0.94/t/c6b79b9f1ca4a37921355ddbfb521761
Table hdfs://localhost:41020/hbase-0.94/t has 1 number of HFileV1.

==================Regions to Major Compact==============

hdfs://localhost:41020/hbase-0.94/t/c6b79b9f1ca4a37921355ddbfb521761

===========End of Regions to Major Compact==============

Total  number of HFile V1 is: 1
{code}

I will add a section that prints out all the hfile v1 files, remove the extra 
messaging, fix the nits suggested by Sergey, and paste the new output.

                
> migration tool that checks presence of HFile V1 files
> -----------------------------------------------------
>
>                 Key: HBASE-7704
>                 URL: https://issues.apache.org/jira/browse/HBASE-7704
>             Project: HBase
>          Issue Type: Task
>            Reporter: Ted Yu
>            Assignee: Himanshu Vashishtha
>            Priority: Blocker
>             Fix For: 0.95.1
>
>         Attachments: HBase-7704-v1.patch
>
>
> Below was Stack's comment from HBASE-7660:
> Regards the migration 'tool', or 'tool' to check for presence of v1 files, I 
> imagine it as an addition to the hfile tool 
> http://hbase.apache.org/book.html#hfile_tool2 The hfile tool already takes a 
> bunch of args including printing out meta. We could add an option to print 
> out version only – or return 1 if version 1 or some such – and then do a bit 
> of code to just list all hfiles and run this script against each. Could MR it 
> if too many files.
