[ 
https://issues.apache.org/jira/browse/PIG-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545642#comment-13545642
 ] 

Bill Graham commented on PIG-3108:
----------------------------------

Got it, thanks for the clarification. A few comments then:

1. Yes, we should rename {{addFiltersWithoutColumnPrefix}} to {{addScans}}, 
since the filter part is misleading and it seems we can always call that method 
as implemented.
2. Instead of calling {{addFiltersWithoutColumnPrefix}} from {{setLocation}}, 
let's just remove the addFamily/addColumn block from {{setLocation}}. Then in 
{{initScan()}} we can handle both scans and filters in one place with something 
like this (to replace the existing conditional):

{noformat}
addScans(columnInfo_);

if (!columnPrefixExists) {
  addFiltersWithoutColumnPrefix(columnInfo_);
}
{noformat}

3. Would you please update the javadocs in {{addFiltersWithoutColumnPrefix}} 
(or {{addScans}}) to describe the new logic as best you can? This section of the 
filter/scan code has been particularly nasty in the past, so we should be as 
clear as possible about what's happening here.
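
For the javadocs, it may help to spell out why call order matters. The sketch below is a simplified model of the behavior described in the issue quoted further down, not the real HBase {{Scan}} class: {{FakeScan}}, {{selects}} and {{ScanOrderDemo}} are hypothetical names, and the model assumes each family maps either to a set of qualifiers or to a "whole family" marker, so whichever of addFamily/addColumn is called last wins for that family.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ScanOrderDemo {

    /** Simplified stand-in for a scan's column selection; NOT the real HBase Scan. */
    static class FakeScan {
        // family -> requested qualifiers; a null value means "whole family"
        private final Map<String, Set<String>> familyMap = new HashMap<>();

        FakeScan addFamily(String family) {
            // Asking for the whole family discards any qualifiers collected so far.
            familyMap.put(family, null);
            return this;
        }

        FakeScan addColumn(String family, String qualifier) {
            Set<String> qualifiers = familyMap.get(family);
            if (qualifiers == null) {
                // Starts a fresh qualifier set, which also replaces an earlier
                // "whole family" marker for this family.
                qualifiers = new TreeSet<>();
                familyMap.put(family, qualifiers);
            }
            qualifiers.add(qualifier);
            return this;
        }

        /** True if this selection would return the given column. */
        boolean selects(String family, String qualifier) {
            if (!familyMap.containsKey(family)) {
                return false;
            }
            Set<String> qualifiers = familyMap.get(family);
            return qualifiers == null || qualifiers.contains(qualifier);
        }
    }

    public static void main(String[] args) {
        // Case A: 'pig:name pig:has*' -> addColumn first, then addFamily
        FakeScan a = new FakeScan().addColumn("pig", "name").addFamily("pig");
        // Case B: 'pig:has* pig:name' -> addFamily first, then addColumn
        FakeScan b = new FakeScan().addFamily("pig").addColumn("pig", "name");

        System.out.println("A selects pig:has_legs: " + a.selects("pig", "has_legs"));
        System.out.println("B selects pig:has_legs: " + b.selects("pig", "has_legs"));
    }
}
```

In this model, A still covers pig:has_legs while B is narrowed to pig:name only, which would match the empty map reported for B.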


                
> HBaseStorage returns empty maps when mixing wildcard- with other columns
> ------------------------------------------------------------------------
>
>                 Key: PIG-3108
>                 URL: https://issues.apache.org/jira/browse/PIG-3108
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.9.0, 0.9.1, 0.9.2, 0.10.0, 0.11, 0.10.1, 0.12
>            Reporter: Christoph Bauer
>             Fix For: 0.12
>
>         Attachments: PIG-3108.patch
>
>
> Consider the following:
> A and B should be the same (with different order, of course).
> {code}
> /*
> in hbase shell:
> create 'pigtest', 'pig'
> put 'pigtest' , '1', 'pig:name', 'A'
> put 'pigtest' , '1', 'pig:has_legs', 'true'
> put 'pigtest' , '1', 'pig:has_ribs', 'true'
> */
> A = LOAD 'hbase://pigtest' USING 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('pig:name pig:has*') AS 
> (name:chararray,parts);
> B = LOAD 'hbase://pigtest' USING 
> org.apache.pig.backend.hadoop.hbase.HBaseStorage('pig:has* pig:name') AS 
> (parts,name:chararray);
> dump A;
> dump B;
> {code}
> This is due to a bug in setLocation and initScan.
> For _A_ 
> # scan.addColumn(pig,name); // for 'pig:name'
> # scan.addFamily(pig); // for the 'pig:has*'
> So A happens to work, but only by accident.
> But for _B_
> # scan.addFamily(pig)
> # scan.addColumn(pig,name)
> will override the first call to addFamily, because you cannot mix them on the 
> same family.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
