[ https://issues.apache.org/jira/browse/PIG-1205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899722#action_12899722 ]
Dmitriy V. Ryaboy commented on PIG-1205:
----------------------------------------

bq. 1. Is it possible to specify min_row_key and max_row_key in parameters

Even better than that -- you can specify lt, lte, gt, and gte. It's true that, as written, splits will be created for the whole table, but the filters will cause most of those splits to exit immediately.
Not creating the splits is on my todo list (I already do this in the elephantbird version for 0.6).

bq. 2. One small suggestion: move line 206 into the if block (setting it once is enough)

Good idea.

bq. 3. It's better to add a warning log in HBaseBinaryConverter when the bytes are cut off for type conversion

Will do.

bq. 4. The parameter "Per-region limit" is a bit confusing for me; I think users would like to set the limit on the whole table, not per region. What do you think?

Trouble is, you can't enforce a total limit without post-processing, because each region's scanner applies the limit independently. In practice, I use -limit when I am experimenting and want to get just a few rows from HBase. If I want a specific number of rows, I use both -limit (to speed up the tasks, since the scanners will exit early) and Pig's LIMIT operator (to get the exact number of rows I need).

> Enhance HBaseStorage-- Make it support loading row key and implement StoreFunc
> ------------------------------------------------------------------------------
>
>                 Key: PIG-1205
>                 URL: https://issues.apache.org/jira/browse/PIG-1205
>             Project: Pig
>          Issue Type: Sub-task
>    Affects Versions: 0.7.0
>            Reporter: Jeff Zhang
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.8.0
>
>         Attachments: PIG_1205.patch, PIG_1205_2.patch, PIG_1205_3.patch, PIG_1205_4.patch, PIG_1205_5.path
>
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
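
To make the per-region limit behavior discussed in the comment above concrete, here is a small toy model in Python. It is only a sketch and does not use the HBase or Pig APIs: the function names `scan_region` and `load_table`, the `gte`/`lt`/`limit` parameters, and the sample row keys are all hypothetical stand-ins. It assumes one task per region, with each task applying the key filter and the limit on its own, which is why the combined result can exceed the limit until a final trimming step (standing in for Pig's LIMIT operator) is applied.

```python
from itertools import islice


def scan_region(rows, gte=None, lt=None, limit=None):
    """Toy per-region scanner: keep rows in [gte, lt), stop after `limit` rows."""
    matching = (
        r for r in rows
        if (gte is None or r >= gte) and (lt is None or r < lt)
    )
    # islice stops pulling rows once `limit` is reached (early exit);
    # limit=None means "no limit".
    return list(islice(matching, limit))


def load_table(regions, gte=None, lt=None, limit=None):
    """One scan per region; filter and limit are applied independently in each."""
    out = []
    for region in regions:
        out.extend(scan_region(region, gte=gte, lt=lt, limit=limit))
    return out


# Hypothetical table: three regions, each holding sorted row keys.
regions = [
    ["a1", "a2", "a3", "a4"],
    ["m1", "m2", "m3", "m4"],
    ["x1", "x2", "x3", "x4"],
]

# A per-region limit of 2 returns up to 2 rows from EACH region.
per_region = load_table(regions, limit=2)

# Post-processing step (the role Pig's LIMIT operator plays) gives an exact count.
exact = per_region[:3]

# Key-range filters: regions with no matching keys contribute nothing.
filtered = load_table(regions, gte="m2", lt="x2", limit=2)
```

In this model the first region still gets scanned when the filter excludes all of its keys, but it returns no rows, which mirrors the comment's point that splits are created for the whole table while the filters make most of them finish without producing output.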