Tom:
Can you pastebin the stack trace for the exception?

It would be nice if you could show a snippet of your code too.

Thanks

> On Jun 15, 2016, at 8:24 AM, Ellis, Tom (Financial Markets IT) 
> <tom.el...@lloydsbanking.com.INVALID> wrote:
> 
> So I have a working prototype using just bulk puts on a table and using 
> setCellVisibility as necessary. Now I'm trying to do it using HFile.
> 
> Sorry Ram, I don't quite follow why the user doing the writing of the HFile 
> has to be an admin/super user? Is that necessary to load HFiles?
> 
> The use case is to hopefully have an application user (non admin) performing 
> the writes to an hbase table via a bulk load of an hfile, setting visibility 
> labels on individual cells as necessary. Then business users who have been 
> given the auth to view that label can see those cells, and others cannot.
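
To make the read-side rule concrete, here is a minimal sketch in plain Java of how a visibility expression might be checked against a user's auths. It is a simplified model and not HBase's implementation: the class VisibilityModel and the method canSee are hypothetical names, and the operator precedence (! over & over |) is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Simplified model of label-based visibility; NOT HBase's implementation.
class VisibilityModel {
    private final String expr;
    private final Set<String> auths;
    private int pos = 0;

    private VisibilityModel(String expr, Set<String> auths) {
        this.expr = expr.replace(" ", "");
        this.auths = auths;
    }

    // True if a user holding the given auths satisfies the expression.
    static boolean canSee(String expression, String... auths) {
        VisibilityModel m =
            new VisibilityModel(expression, new HashSet<>(Arrays.asList(auths)));
        boolean result = m.parseOr();
        if (m.pos != m.expr.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + m.pos);
        }
        return result;
    }

    // or := and ('|' and)*
    private boolean parseOr() {
        boolean v = parseAnd();
        while (peek() == '|') {
            pos++;
            boolean rhs = parseAnd(); // always parse the right side to advance
            v = v || rhs;
        }
        return v;
    }

    // and := unary ('&' unary)*
    private boolean parseAnd() {
        boolean v = parseUnary();
        while (peek() == '&') {
            pos++;
            boolean rhs = parseUnary();
            v = v && rhs;
        }
        return v;
    }

    // unary := '!' unary | '(' or ')' | label
    private boolean parseUnary() {
        char c = peek();
        if (c == '!') {
            pos++;
            return !parseUnary();
        }
        if (c == '(') {
            pos++;
            boolean v = parseOr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Missing ')'");
            }
            pos++;
            return v;
        }
        int start = pos;
        while (pos < expr.length() && "&|!()".indexOf(expr.charAt(pos)) < 0) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a label at position " + pos);
        }
        // A bare label is satisfied only if the user holds that auth.
        return auths.contains(expr.substring(start, pos));
    }

    private char peek() {
        return pos < expr.length() ? expr.charAt(pos) : '\0';
    }
}
```

For instance, VisibilityModel.canSee("confidential|protected", "protected") is true in this model, while VisibilityModel.canSee("confidential", "protected") is false.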
> 
> I've seen that it's possible to do this with map reduce & setting the map 
> output to be a Put (and thus could setCellVisibility on the puts), but I'm 
> struggling to do this with Spark, as I keep getting the exception that I 
> can't cast a Put to a Cell.
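
One possible reading of the cast error, assuming the Spark job emits Put values directly: HFileOutputFormat2 writes cell-valued records, and in the MapReduce path it is a sort reducer (PutSortReducer) that expands each Put into its KeyValues before they reach the output format. A job that calls saveAsNewAPIHadoopFile has no such reducer, so it would need to do that expansion itself. The sketch below shows only the shape of that step in plain Java. SimplePut, SimpleCell and PutFlattener are hypothetical stand-ins, not HBase classes, and String ordering stands in for HBase's byte ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a cell; NOT an HBase class.
class SimpleCell {
    final String row, family, qualifier, value, visibility;

    SimpleCell(String row, String family, String qualifier,
               String value, String visibility) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.value = value;
        this.visibility = visibility;
    }
}

// Hypothetical stand-in for a put: one row, one visibility expression,
// any number of columns.
class SimplePut {
    final String row;
    final String visibility;
    final List<String[]> columns = new ArrayList<>(); // {family, qualifier, value}

    SimplePut(String row, String visibility) {
        this.row = row;
        this.visibility = visibility;
    }

    SimplePut add(String family, String qualifier, String value) {
        columns.add(new String[] {family, qualifier, value});
        return this;
    }
}

class PutFlattener {
    // Expand each put into one cell per column, copy the put's visibility
    // expression onto every cell, then sort by (row, family, qualifier),
    // since an HFile writer expects its input in sorted order.
    static List<SimpleCell> flatten(List<SimplePut> puts) {
        List<SimpleCell> cells = new ArrayList<>();
        for (SimplePut p : puts) {
            for (String[] c : p.columns) {
                cells.add(new SimpleCell(p.row, c[0], c[1], c[2], p.visibility));
            }
        }
        cells.sort(Comparator.comparing((SimpleCell c) -> c.row)
            .thenComparing(c -> c.family)
            .thenComparing(c -> c.qualifier));
        return cells;
    }
}
```

In a real job the cells would be KeyValue or Cell instances carrying visibility tags rather than a plain String, which is where the tag creation discussed further down the thread comes in.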
> 
> Cheers,
> 
> Tom Ellis
> Consultant Developer – Excelian
> Data Lake | Financial Markets IT
> LLOYDS BANK COMMERCIAL BANKING
> 
> 
> E: tom.el...@lloydsbanking.com
> Website: www.lloydsbankcommercial.com
> , , ,
> Reduce printing. Lloyds Banking Group is helping to build the low carbon 
> economy.
> Corporate Responsibility Report: www.lloydsbankinggroup-cr.com/downloads
> 
> 
> -----Original Message-----
> From: ramkrishna vasudevan [mailto:ramkrishna.s.vasude...@gmail.com]
> Sent: 15 June 2016 12:31
> To: user@hbase.apache.org
> Subject: Re: Writing visibility labels with HFileOutputFormat2
> 
> -- This email has reached the Bank via an external source --
> 
> 
>>> We could I guess create multiple puts for cells in the same row with
>>> different labels and use the setCellVisibility on each individual put/cell,
>>> but will this create additional overhead?
> 
> This can be done. If you want different cells in the same row to have 
> different labels, then it is better to create that many puts and call 
> setCellVisibility on each of them. What type of overhead do you see here? In 
> terms of the server processing them? If so, there should not be much overhead 
> here. Also, adding different cells to every column in turn means you need 
> every cell to be treated differently in terms of security, so it should be 
> fine IMHO.
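
The "one put per label" idea can be sketched as a grouping step. This is plain Java with hypothetical names (LabelledColumn, PutsByLabel), not HBase code. In a real client each group would presumably become one Put for the row, with setCellVisibility called once using that group's label.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one column value plus the label it should carry.
class LabelledColumn {
    final String qualifier, value, label;

    LabelledColumn(String qualifier, String value, String label) {
        this.qualifier = qualifier;
        this.value = value;
        this.label = label;
    }
}

class PutsByLabel {
    // Group one row's columns by visibility label. Each map entry stands
    // for one put: the key is the label, the value is the list of columns
    // written under that label. Insertion order of labels is preserved.
    static Map<String, List<LabelledColumn>> group(List<LabelledColumn> columns) {
        Map<String, List<LabelledColumn>> puts = new LinkedHashMap<>();
        for (LabelledColumn c : columns) {
            puts.computeIfAbsent(c.label, k -> new ArrayList<>()).add(c);
        }
        return puts;
    }
}
```

The number of puts per row then equals the number of distinct labels in that row, not the number of cells, which bounds the extra client-side work.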
> 
> Without calling put.setCellVisibility() there is no other way, I believe. One 
> question regarding your use case: in your mail you described a Spark job 
> that creates a bulk-loaded file. If that file is to hold the visibility 
> information for all the cells, then the user running this job should be an 
> admin or super user, right? Why would a normal client user read through all 
> the visibility cells, which may or may not be associated with that user?
> 
> Thank you very much for testing and using this feature. Let us know your 
> feedback and if you find any gaps here. Happy to help.
> 
> Regards
> Ram
> 
> 
>> On Wed, Jun 15, 2016 at 4:09 PM, Ellis, Tom (Financial Markets IT) < 
>> tom.el...@lloydsbanking.com.invalid> wrote:
>> 
>> Hmm, is there no other way to set labels on individual cells where we
>> don't have to give the client users system perms? For instance, client
>> users can set the cell visibility on the entire put without having
>> this (i.e. put.setCellVisibility("label")) and the
>> VisibilityController will check this.
>> 
>> We could I guess create multiple puts for cells in the same row with
>> different labels and use the setCellVisibility on each individual
>> put/cell, but will this create additional overhead?
>> 
>> Cheers,
>> 
>> Tom Ellis
>> 
>> 
>> -----Original Message-----
>> From: ramkrishna vasudevan [mailto:ramkrishna.s.vasude...@gmail.com]
>> Sent: 15 June 2016 11:24
>> To: user@hbase.apache.org
>> Subject: Re: Writing visibility labels with HFileOutputFormat2
>> 
>> 
>> The visibility expression resolver tries to scan the labels table, so
>> the user running the resolver should have SYSTEM privileges, since
>> the information being accessed is sensitive.
>> 
>> Suppose in your case above the client user is added as an admin; then
>> that user should be able to scan the labels table.
>> 
>> Regards
>> Ram
>> 
>> On Wed, Jun 15, 2016 at 3:09 PM, Ellis, Tom (Financial Markets IT) <
>> tom.el...@lloydsbanking.com.invalid> wrote:
>> 
>>> Yeah, thanks for this Ram. Although in my testing I have found that
>>> a client user attempting to use the visibility expression resolver
>>> doesn't seem to have the ability to scan the hbase:labels table for
>>> the full list of labels and thus can't get the ordinals/tags to add
>>> to the cell. Does the client user attempting to use the
>>> VisibilityExpressionResolver have to have some special permissions?
>>> 
>>> Scan of hbase:labels by client user:
>>> 
>>> hbase(main):003:0> scan 'hbase:labels'
>>> ROW                                         COLUMN+CELL
>>> \x00\x00\x00\x01                           column=f:\x00,
>>> timestamp=1465216652662, value=system
>>> 1 row(s) in 0.0650 seconds
>>> 
>>> Scan of hbase:labels by hbase user:
>>> 
>>> hbase(main):001:0> scan 'hbase:labels'
>>> ROW                                         COLUMN+CELL
>>> \x00\x00\x00\x01                           column=f:\x00,
>>> timestamp=1465216652662, value=system
>>> \x00\x00\x00\x02                           column=f:\x00,
>>> timestamp=1465216944935, value=protected
>>> \x00\x00\x00\x02                           column=f:hbase,
>>> timestamp=1465547138533, value=
>>> \x00\x00\x00\x02                           column=f:tom,
>>> timestamp=1465980236882, value=
>>> \x00\x00\x00\x03                           column=f:\x00,
>>> timestamp=1465500156667, value=testtesttest
>>> \x00\x00\x00\x03                           column=f:@hadoop,
>>> timestamp=1465980236967, value=
>>> \x00\x00\x00\x03                           column=f:hadoop,
>>> timestamp=1465547304610, value=
>>> \x00\x00\x00\x03                           column=f:hive,
>>> timestamp=1465501322616, value=
>>> \x00\x00\x00\x04                           column=f:\x00,
>>> timestamp=1465570719901, value=confidential
>>> \x00\x00\x00\x05                           column=f:\x00,
>>> timestamp=1465835047835, value=branch
>>> \x00\x00\x00\x05                           column=f:hdfs,
>>> timestamp=1465980237060, value=
>>> \x00\x00\x00\x06                           column=f:\x00,
>>> timestamp=1465980447307, value=group
>>> \x00\x00\x00\x06                           column=f:hdfs,
>>> timestamp=1465980454130, value=
>>> 6 row(s) in 0.7370 seconds
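
The two scans suggest why the resolver fails for the client user: it can only translate labels it can see, and the client's view of hbase:labels holds a single row. Below is a minimal model in plain Java, using the label names and ordinals from the scan output above. LabelOrdinals and its methods are hypothetical names for illustration, not part of HBase.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a label-to-ordinal lookup built from whatever
// rows of the labels table a given user is allowed to read.
class LabelOrdinals {
    private final Map<String, Integer> ordinals = new LinkedHashMap<>();

    // Register a label under the ordinal taken from its row key.
    LabelOrdinals with(String label, int ordinal) {
        ordinals.put(label, ordinal);
        return this;
    }

    // Return the ordinal for a label, or fail if this view lacks it.
    int resolve(String label) {
        Integer ordinal = ordinals.get(label);
        if (ordinal == null) {
            throw new IllegalArgumentException("Unknown label: " + label);
        }
        return ordinal;
    }

    // What the client user's scan returned: one row only.
    static LabelOrdinals clientView() {
        return new LabelOrdinals().with("system", 1);
    }

    // What the hbase user's scan returned: all six labels.
    static LabelOrdinals superuserView() {
        return new LabelOrdinals()
            .with("system", 1)
            .with("protected", 2)
            .with("testtesttest", 3)
            .with("confidential", 4)
            .with("branch", 5)
            .with("group", 6);
    }
}
```

In this model superuserView().resolve("confidential") returns 4, while clientView().resolve("confidential") throws, which mirrors the client being unable to get the ordinals it needs to build the tags.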
>>> 
>>> Cheers,
>>> 
>>> Tom Ellis
>>> 
>>> -----Original Message-----
>>> From: Anoop John [mailto:anoop.hb...@gmail.com]
>>> Sent: 08 June 2016 11:58
>>> To: user@hbase.apache.org
>>> Subject: Re: Writing visibility labels with HFileOutputFormat2
>>> 
>>> 
>>> Thanks Ram. Yes, that seems the best way, as CellCreator is a publicly
>>> exposed class. Maybe we should explain this in the HBase book under
>>> the visibility labels area. Good to know you have a use case based on
>>> visibility labels. Let us know in case of any trouble. Thanks.
>>> 
>>> -Anoop-
>>> 
>>> On Wed, Jun 8, 2016 at 1:43 PM, ramkrishna vasudevan <
>>> ramkrishna.s.vasude...@gmail.com> wrote:
>>>> Hi
>>>> 
>>>> It can be done. See the class CellCreator, which is a public-facing
>>>> interface. When your Spark job creates the Hadoop files that
>>>> produce the HFileOutputFormat2 data, you can use the CellCreator
>>>> to create your KeyValues and use
>>>> CellCreator.getVisibilityExpressionResolver() to map your String
>>>> visibility tags to the system-generated ordinals.
>>>> 
>>>> For example, you can see how TextSortReducer works. I think this
>>>> should help you solve your problem. Let us know if you need
>>>> further information.
>>>> 
>>>> Regards
>>>> Ram
>>>> 
>>>> On Tue, Jun 7, 2016 at 3:58 PM, Ellis, Tom (Financial Markets IT)
>>>> < tom.el...@lloydsbanking.com.invalid> wrote:
>>>> 
>>>>> Hi Ram,
>>>>> 
>>>>> We're attempting to do it programmatically so:
>>>>> 
>>>>> The HFile is created by a Spark job using saveAsNewAPIHadoopFile,
>>>>> using ImmutableBytesWritable as the key (rowkey) with
>>>>> KeyValue as the value, and using the HFileOutputFormat2 format.
>>>>> This HFile is then loaded using the HBase client's
>>>>> LoadIncrementalHFiles.doBulkLoad.
>>>>> 
>>>>> Is there a way to do this programmatically without using the
>>>>> ImportTsv tool? I was taking a look at
>>>>> VisibilityUtils.createVisibilityExpTags and maybe being able to
>>>>> just create the Tags myself that way (although it's obviously
>>>>> @InterfaceAudience.Private), but it seems that to be able to use
>>>>> that I'd need to know the label ordinals client side.
>>>>> 
>>>>> Thanks for your help,
>>>>> 
>>>>> Tom
>>>>> 
>>>>> -----Original Message-----
>>>>> From: ramkrishna vasudevan
>>>>> [mailto:ramkrishna.s.vasude...@gmail.com]
>>>>> Sent: 07 June 2016 11:19
>>>>> To: user@hbase.apache.org
>>>>> Subject: Re: Writing visibility labels with HFileOutputFormat2
>>>>> 
>>>>> 
>>>>> Hi Ellis
>>>>> 
>>>>> How are the HFileOutputFormat2 files created? Are you using the
>>>>> ImportTsv tool? If you are, then yes, there is a way to specify
>>>>> visibility tags while loading from the ImportTsv tool, and those
>>>>> visibility tags are also bulk loaded as HFiles.
>>>>> 
>>>>> There is an attribute, CELL_VISIBILITY_COLUMN_SPEC, that can be
>>>>> used to indicate that the data will have visibility tags, and the
>>>>> tool will automatically parse the specified field as a visibility tag.
>>>>> 
>>>>> In case you have access to the code, you can see the test case
>>>>> TestImportTSVWithVisibilityLabels to get an initial idea of how
>>>>> it is being done. If not, get back to us; happy to help.
>>>>> 
>>>>> Regards
>>>>> Ram
>>>>> 
>>>>> 
>>>>> 
>>>>> On Tue, Jun 7, 2016 at 3:36 PM, Ellis, Tom (Financial Markets IT)
>>>>> < tom.el...@lloydsbanking.com.invalid> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> 
>>>>>> I was wondering if it's possible/how to write Visibility Labels
>>>>>> to an HFileOutputFormat2? I believe Visibility Labels are just
>>>>>> implemented as Tags, but with the normal way of writing them
>>>>>> with Mutation#setCellVisibility these are formally written as
>>>>>> Tags to the cells during the VisibilityController coprocessor
>>>>>> as we need to assert the expression is valid for the labels configured.
>>>>>> 
>>>>>> How can we add visibility labels to cells if we have a job that
>>>>>> creates an HFile with HFileOutputFormat2 which is then
>>>>>> subsequently loaded using LoadIncrementalHFiles?
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> Tom Ellis
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> Lloyds Banking Group plc. Registered Office: The Mound, Edinburgh
>>>>>> EH1 1YZ. Registered in Scotland no. SC95000. Telephone: 0131 225 4555.
>>>>>> Lloyds Bank plc. Registered Office: 25 Gresham Street, London EC2V 7HN.
>>>>>> Registered in England and Wales no. 2065. Telephone 0207626 1500.
>>>>>> Bank of Scotland plc. Registered Office: The Mound, Edinburgh EH1 1YZ.
>>>>>> Registered in Scotland no. SC327000. Telephone: 03457 801 801.
>>>>>> Cheltenham & Gloucester plc. Registered Office: Barnett Way,
>>>>>> Gloucester GL4 3RL. Registered in England and Wales 2299428.
>>>>>> Telephone: 0345 603 1637
>>>>>> 
>>>>>> Lloyds Bank plc, Bank of Scotland plc are authorised by the
>>>>>> Prudential Regulation Authority and regulated by the Financial
>>>>>> Conduct Authority and Prudential Regulation Authority.
>>>>>> 
>>>>>> Cheltenham & Gloucester plc is authorised and regulated by the
>>>>>> Financial Conduct Authority.
>>>>>> 
>>>>>> Halifax is a division of Bank of Scotland plc. Cheltenham &
>>>>>> Gloucester Savings is a division of Lloyds Bank plc.
>>>>>> 
>>>>>> HBOS plc. Registered Office: The Mound, Edinburgh EH1 1YZ.
>>>>>> Registered in Scotland no. SC218813.
>>>>>> 
>>>>>> This e-mail (including any attachments) is private and
>>>>>> confidential and may contain privileged material. If you have
>>>>>> received this e-mail in error, please notify the sender and
>>>>>> delete it (including any
>>>>>> attachments) immediately. You must not copy, distribute,
>>>>>> disclose or use any of the information in it or any
>>>>>> attachments. Telephone calls may be monitored or recorded.
>>>>> 
>>>>> 
>>> 
>>> 
>> 
>> 
> 
> 
