Re: Filtering in HBase Audit Logs

2018-02-28 Thread Jerry He
Some people can probably share more light on this. I recall no HBase built-in construct to limit or filter the audit logs, and I agree they can be very verbose depending on the access operations. We can propose two approaches to assess and fix the issues. 1. Some low-hanging cleanup and simplification of

Re: [ANNOUNCE] New HBase committer Peter Somogyi

2018-02-22 Thread Jerry He
Congrats, Peter! On Thu, Feb 22, 2018 at 2:53 PM, Andrew Purtell wrote: > Congratulations and welcome, Peter! > > > On Thu, Feb 22, 2018 at 11:08 AM, Sean Busbey wrote: > > > On behalf of the Apache HBase PMC, I am pleased to announce that Peter > >

Re: Hbase Snapshot Export Data storage

2018-01-05 Thread Jerry He
Keeping the data files in /hbase/archive is how snapshots (and exported snapshots) work. There are reference links to them so that they are not actually cleaned up or deleted. A restored snapshot will answer client queries by following the reference links to these data files. It is not
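For reference, exporting a snapshot to another cluster and then restoring from it typically looks like the sketch below; the snapshot, table and cluster names are placeholders, not values from this thread.

    # From the HBase shell: take a snapshot of the table
    hbase> snapshot 'mytable', 'mytable-snap1'

    # From the command line: export the snapshot to the other cluster
    # (data files land under the destination's root dir, in /hbase/archive)
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
        -snapshot mytable-snap1 -copy-to hdfs://dest-cluster:8020/hbase -mappers 8

    # From the HBase shell on the destination: clone the snapshot into a live
    # table; it serves reads by following reference links into the archive
    hbase> clone_snapshot 'mytable-snap1', 'mytable_restored'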

[ANNOUNCE] Please welcome new HBase committer YI Liang

2017-12-20 Thread Jerry He
On behalf of the Apache HBase PMC, I am pleased to announce that Yi Liang has accepted the PMC's invitation to become a committer on the project. We appreciate all of Yi's great work thus far and look forward to his continued involvement. Please join me in congratulating Yi! -- Thanks, Jerry

Re: what is the role of ProtoBuffer while creating of instance of Table/Htable

2017-12-10 Thread Jerry He
As Stack said. You can try the hbase-shaded-client if you can not upgrade your protobuf version. Thanks. On Tue, Dec 5, 2017 at 10:00 AM Stack wrote: > On Mon, Dec 4, 2017 at 10:10 PM, Manjeet Singh > > wrote: > > > Thanks for your reply, I have
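If it helps, pulling in the shaded client from Maven is just a dependency swap; the version below is illustrative, not prescribed by this thread.

    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-shaded-client</artifactId>
      <version>1.2.6</version>
    </dependency>

The shaded artifact relocates HBase's own protobuf (and other third-party) classes, so it does not clash with the protobuf version your application uses.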

Re: Deleting and cleaning old snapshots exported to S3

2017-11-27 Thread Jerry He
Hi, Tim You seem to have a nice solution/tool for the problem. If you would like to contribute it to HBase open source, that will certainly be welcomed. Once it is inside HBase, we can open up access to the needed methods. Thanks. On Wed, Nov 22, 2017 at 2:03 PM, Ted Yu

Re: [ANNOUNCE] New HBase committer Zheng Hu

2017-10-23 Thread Jerry He
Congrats and welcome! On Mon, Oct 23, 2017 at 10:29 AM, Huaxiang Sun wrote: > Congratulations, Zheng! >

Re: [DISCUSS] Planning changes on RegionServer totalRequestCount metrics

2017-08-03 Thread Jerry He
I like the idea of cleaning it up and making it clear. I had the same confusion as well. Will look at the JIRA and comment there as well. Thanks. Jerry On Wed, Aug 2, 2017 at 11:48 PM, Yu Li wrote: > Dear all, > > Recently in HBASE-18469

Re: [ANNOUNCE] New HBase committer Abhishek Singh Chouhan

2017-07-31 Thread Jerry He
Congrats Abhishek! Jerry On Mon, Jul 31, 2017 at 1:41 AM, Anoop John wrote: > Congrats Abhishek > > On Mon, Jul 31, 2017 at 1:48 PM, Yu Li wrote: >> Congratulations, Abhishek! >> >> Best Regards, >> Yu >> >> On 31 July 2017 at 15:40, Jingcheng Du

Re: [ANNOUNCE] Devaraj Das joins the Apache HBase PMC

2017-07-05 Thread Jerry He
Congrats, Devaraj ! Thanks, Jerry On Wed, Jul 5, 2017 at 1:36 PM, Nick Dimiduk wrote: > Congratulations Devaraj! > > On Wed, Jul 5, 2017 at 9:27 AM, Josh Elser wrote: > >> I'm pleased to announce yet another PMC addition in the form of Devaraj >> Das.

Re: [ANNOUNCE] Chunhui Shen joins the Apache HBase PMC

2017-07-04 Thread Jerry He
Congrats, Chunhui! Thanks Jerry On Tue, Jul 4, 2017 at 8:37 PM Anoop John wrote: > Congrats Chunhui.. > > On Wed, Jul 5, 2017 at 6:55 AM, Pankaj kr wrote: > > Congratulations Chunhui..!! > > > > Regards, > > Pankaj > > > > > > -Original

Re: Thrift server kerberos ticket refresh

2017-06-25 Thread Jerry He
wal logic might be called indirectly > through some process/module that the thrift server is importing or > using, but after a thorough spelunking around the code-base, I was not > able to find any path to ticket renewal logic. Which is why I turned > to the list :) > > On Wed, Jun

Re: Thrift server kerberos ticket refresh

2017-06-20 Thread Jerry He
The right code can be hard to find and may not even be in the Thrift module. Did you encounter any problem, e.g. the Thrift server giving errors due to an expired Kerberos ticket? Thanks, Jerry On Tue, Jun 20, 2017 at 11:05 AM, Steen Manniche wrote: > Hi Ted, > > thanks

Re: Regarding Connection Pooling

2017-06-14 Thread Jerry He
At a high level, you have the Connection (which is an HConnection) you obtained from the ConnectionFactory API. As you mentioned, you create one such Connection. Internally, within that Connection, there are physical RPC/socket connections to the different region servers. By default, one physical
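A minimal sketch of that usage pattern, assuming the 1.x client API; the table name is a placeholder:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Table;

    public class ConnectionSharingSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // One heavyweight Connection per application; it owns the physical
        // RPC/socket connections to the region servers.
        try (Connection connection = ConnectionFactory.createConnection(conf)) {
          // Table instances are lightweight and share the Connection;
          // create and close them per request or per thread.
          try (Table table = connection.getTable(TableName.valueOf("mytable"))) {
            // ... table.get() / table.put() ...
          }
        }
      }
    }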

Re: Why do I get ServiceException?

2017-05-30 Thread Jerry He
ServiceException simply means your service call failed for some reason. For example, the service could not be found on the region server side because the coprocessor was not registered correctly. Or the protobuf message / parameter was missing some required field. Do you see more detailed root

Re: MultiTableInputFormat class

2017-05-24 Thread Jerry He
You would pass multiple Scans to MultiTableInputFormat; each Scan object corresponds to one table. The input to the mapper is the same as before: the scan result (row key and values for the row). Thanks, Jerry On Mon, May 22, 2017 at 3:33 AM, Rajeshkumar J wrote: > Hi,
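A rough sketch of the multi-scan setup, assuming the 1.x MapReduce API; the table names and the mapper body are placeholders:

    import java.io.IOException;
    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    public class MultiTableScanJob {
      static class MyMapper extends TableMapper<Text, Text> {
        @Override
        protected void map(ImmutableBytesWritable rowKey, Result value, Context context)
            throws IOException, InterruptedException {
          // same input as a single-table job: the row key plus the scan Result for that row
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Job job = Job.getInstance(conf, "multi-table-scan");
        job.setJarByClass(MultiTableScanJob.class);
        // One Scan per table; the table name rides along as a scan attribute.
        Scan scan1 = new Scan();
        scan1.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes("table1"));
        Scan scan2 = new Scan();
        scan2.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME, Bytes.toBytes("table2"));
        TableMapReduceUtil.initTableMapperJob(
            Arrays.asList(scan1, scan2), MyMapper.class, Text.class, Text.class, job);
        job.setNumReduceTasks(0);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }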

Re: Missing data in snapshot - possible flush timing issue?

2017-05-24 Thread Jerry He
Thanks for reporting it. Good find. On Wed, May 24, 2017 at 11:16 AM, Jerry He <jerry...@gmail.com> wrote: > In theory, the snapshot is a best effort Point in Time snapshot. No > guarantee of any clean cutoff. The data you missed in this snapshot will > appear

Re: Missing data in snapshot - possible flush timing issue?

2017-05-24 Thread Jerry He
In theory, the snapshot is a best-effort point-in-time snapshot. There is no guarantee of any clean cutoff. The data you missed in this snapshot will appear in the next one. But the problem you saw can and should be fixed. Jerry On Wed, May 24, 2017 at 9:04 AM, LeBlanc, Jacob

Re: Where should coprocessor dependencies go when using HDFS?

2017-05-21 Thread Jerry He
As Ted said. HBASE-14548 provides more options, but it is not in HBase 1.2.x. Thanks. Jerry On Sun, May 21, 2017 at 4:28 PM, Ted Yu wrote: > Looks like your code depends on > https://mvnrepository.com/artifact/org.apache.lucene/lucene-queryparser > > Consider packaging

Re: Cant load coprocessor

2017-05-20 Thread Jerry He
it only contains: > >> > >> * Manifest-Version: 1.0Created-By: 1.8.0_102 (Oracle > >>Corporation) * > >> > >> > >> Regards, > >> > >> Cheyenne O. Forbes > >> > >> > >>

Re: Cant load coprocessor

2017-05-17 Thread Jerry He
Try to see if you have the right jar in the right path and it contains your coprocessor class. The code says: ... } catch (ClassNotFoundException e) { throw new IOException("Class " + attr.getClassName() + " cannot be loaded", e); Thanks. Jerry On Tue, May 16, 2017 at 6:49 PM,
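For comparison, the table-attribute form of loading a coprocessor from HDFS looks like this in the shell; the path, class name and priority are placeholders:

    hbase> alter 'mytable', METHOD => 'table_att',
      'coprocessor' => 'hdfs:///user/me/my-coprocessor.jar|com.example.MyRegionObserver|1001|'

The ClassNotFoundException in the snippet above is what you get when the class named after the first '|' is not actually present in the jar at that path.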

Re: ANNOUNCE: Yu Li joins the Apache HBase PMC

2017-04-14 Thread Jerry He
Congratulations and welcome, Yu! On Fri, Apr 14, 2017 at 6:47 PM, Andrew Purtell wrote: > Congratulations and welcome! > > > On Fri, Apr 14, 2017 at 7:22 AM, Anoop John wrote: > > > On behalf of the Apache HBase PMC I"m pleased to announce that Yu Li

Re: [ANNOUNCE] - Welcome our new HBase committer Anastasia Braginsky

2017-03-27 Thread Jerry He
Congrats and welcome! Jerry

Re: Question in WALEdit

2017-03-23 Thread Jerry He
It is a good question you have on WALEdit and transaction atomicity. One WALEdit can have mutations from different rows, yet a WALEdit is indeed like a transaction in HBase. This seems contradictory, but it can be explained. An HBase region 'tries' to apply the incoming batch of mutations in batch

Re: HFILE creation to use a different committer

2017-03-16 Thread Jerry He
I think you are right. FileOutputFormat has a default, hard-coded FileOutputCommitter. If you want to use DirectOutputCommitter, check the third-party patched Hadoop package that provides this class for how to set it. Or you can extend HFileOutputFormat2 and provide a

Re: hbase dynamic configuration

2017-03-01 Thread Jerry He
"ERROR: wrong number of arguments (2 for 1)" > > On Wed, Mar 1, 2017 at 12:55 PM, Jerry He <jerry...@gmail.com> wrote: > > > These properties can be used on the client side and the server side. > > I assume you are asking about them on the server side. > >

Re: hbase dynamic configuration

2017-02-28 Thread Jerry He
These properties can be used on the client side and the server side. I assume you are asking about them on the server side. Unfortunately, these two are not supported yet for dynamic configuration. Thanks. Jerry On Tue, Feb 28, 2017 at 10:00 PM, Rajeshkumar J

Re: [ANNOUNCE] New HBase Committer Josh Elser

2016-12-11 Thread Jerry He
Congratulations, Josh! Good work on the PQS too. Jerry On Sun, Dec 11, 2016 at 12:14 PM, Josh Elser wrote: > Thanks, all. I'm looking forward to continuing to work with you all! > > > Nick Dimiduk wrote: > >> On behalf of the Apache HBase PMC, I am pleased to announce that

Re: HBase rest custom authentication

2016-11-02 Thread Jerry He
The reason your custom authentication does not work is probably HBASE-12231. It does not get invoked. The code snippet that Ted copied is inside "if kerberos security". The AuthFilter itself can handle a custom authentication class. But AuthFilter is not added in your case. Jerry On

Re: ETL HBase HFile+HLog to ORC(or Parquet) file?

2016-10-22 Thread Jerry He
Hi, Demai If you think something helpful can be done within HBase, feel free to propose on the JIRA. Jerry On Fri, Oct 21, 2016 at 2:41 PM, Mich Talebzadeh wrote: > Hi Demai, > > As I understand you want to use Hbase as the real time layer and Hive Data > Warehouse

Re: ImportTSV write to remote HDFS concurrently.

2016-10-22 Thread Jerry He
It is based on the number of live regions. Jerry On Fri, Oct 21, 2016 at 7:50 AM, Vadim Vararu wrote: > Hi guys, > > I'm trying to run the importTSV job and to write the result into a remote > HDFS. Isn't it supposed to write data concurrently? Asking cause i get the

Re: [ANNOUNCE] Stephen Yuan Jiang joins Apache HBase PMC

2016-10-15 Thread Jerry He
Congratulations, Stephen. Jerry On Fri, Oct 14, 2016 at 12:56 PM, Dima Spivak wrote: > Congrats, Stephen! > > -Dima > > On Fri, Oct 14, 2016 at 11:27 AM, Enis Söztutar wrote: > > > On behalf of the Apache HBase PMC, I am happy to announce that Stephen >

Re: CopyTable fails on copying between two secured clusters

2016-09-08 Thread Jerry He
Check the peer address you specified in the command line. It does not seem to match your remote cluster ZK parent node. Jerry On Thursday, September 8, 2016, Frank Luo wrote: > I don't think they are pointing to different locations. Both of them > should be /hbase-secure.

Re: hbase.server.scanner.max.result.size

2016-08-16 Thread Jerry He
>> - hbase.client.scanner.max.result.size: As documented, the client scanner tries with a max size limit per fetch. See a related: hbase.client.scanner.caching >> - hbase.server.scanner.max.result.size: (HBase

Re: distcp hbase-0.94.6(CDH4.5) hfiles to hbase1.2(CDH5.8)?

2016-08-14 Thread Jerry He
If you distcp the raw hfiles, you have a couple of options to restore the data on the second cluster. 1. If you copy the entire hbase root.dir, you can set the hbase root.dir to this directory and bootstrap the new cluster from there. Before you start the new cluster, run the 'hbase upgrade'

Re: Built in REST API only gave me one column qualifier

2016-06-11 Thread Jerry He
It should work as documented. Did you set the Accept header to xml, json or protobuf? Then you will get the complete row/family. Jerry On Sat, Jun 11, 2016 at 11:22 AM, Bin Wang wrote: > Hi there, > > I started the built-in REST API server for HBase and I am planning to
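For example, something along these lines (host, port, table and row are placeholders):

    # JSON
    curl -H "Accept: application/json" http://resthost:8080/mytable/myrow

    # XML
    curl -H "Accept: text/xml" http://resthost:8080/mytable/myrow

    # protobuf
    curl -H "Accept: application/x-protobuf" http://resthost:8080/mytable/myrow

Without an Accept header the gateway may fall back to a raw, single-value representation that only shows part of the row.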

Re: Zookeeper too many connections when using co-processor

2016-06-05 Thread Jerry He
Your aggregation coprocessor seems to be global -- meaning that it uses the Table scanner, which can go to other regions on other region servers. Do you have to do so? The coprocessor is hosted on a Region. You can just get a local region scanner to do a local scan. Won't this work for you? This will
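A rough sketch of the region-local scan inside a coprocessor, against the 1.x coprocessor API; the aggregation itself is a placeholder:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.RegionScanner;

    // Called from your endpoint/observer with the environment it was given.
    long aggregateLocally(RegionCoprocessorEnvironment env, Scan scan) throws IOException {
      long count = 0;
      // Scans only the region hosting this coprocessor: no client Table,
      // no extra ZooKeeper connection, no RPCs to other region servers.
      RegionScanner scanner = env.getRegion().getScanner(scan);
      try {
        List<Cell> cells = new ArrayList<Cell>();
        boolean more;
        do {
          cells.clear();
          more = scanner.next(cells);
          count += cells.size();   // replace with your real aggregation
        } while (more);
      } finally {
        scanner.close();
      }
      return count;
    }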

Re: hfile v2 and bloomfilter

2016-05-15 Thread Jerry He
Another good place to look at is the design doc attached to the HFile v2 JIRA: https://issues.apache.org/jira/browse/HBASE-3857 Jerry On Sun, May 15, 2016 at 5:12 PM, Stack wrote: > On Sun, May 15, 2016 at 5:05 AM, Shushant Arora > > wrote: > > >

Re: Hbase v1 vs 0.98x

2016-04-20 Thread Jerry He
The release announcements are probably the best source of 'What's New' for each release. The release managers did their best to give a good summary; HBase 1.0.0 release

Re: does hbase scan doubts

2016-03-22 Thread Jerry He
The HBase client scanner goes through the regions in the scan range one at a time, making call to the region server. See this pending JIRA for the relevant discussion. https://issues.apache.org/jira/browse/HBASE-1935 Thanks. Jerry On Sun, Mar 13, 2016 at 11:37 AM, Shushant Arora

Re: HFile vs Parquet for very wide table

2016-01-22 Thread Jerry He
at 10:04 AM, Krishna <research...@gmail.com> wrote: > Thanks Ted, Jerry. > > Computing pairwise similarity is the primary purpose of the matrix. This is > done by extracting all rows for a set of columns at each iteration. > > On Thursday, January 21, 2016, Jerry He &l

Re: HFile vs Parquet for very wide table

2016-01-21 Thread Jerry He
What do you want to do with your matrix data? How do you want to use it? Do you need random read/write or point query? Do you need to get the row/record or many many columns at a time? If yes, HBase is a good choice for you. Parquet is good as a storage format for large scans, aggregations, on

Re: doAs with HBase Java API and Apache Ranger

2015-12-22 Thread Jerry He
what I've done is correct, > is there anything else you can think of that might be wrong? Is this just a > matter for the Ranger team or whomever is responsible for the HBase Ranger > plugin? > > Best Regards, > Adam > > On Sun, 20 Dec 2015 at 03:40 Jerry He <jerry...@

Re: doAs with HBase Java API and Apache Ranger

2015-12-19 Thread Jerry He
To answer your HBase question, the user context is obtained when the 'Connection' object is created. 'Table' shares what the 'Connection' has. Also, as Ted mentioned, using ProxyUser needs additional config on the server side. Jerry On Fri, Dec 18, 2015 at 9:37 AM, Ted Yu wrote: >
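A minimal sketch of the proxy-user pattern, assuming a 1.x client and that the proxy-user settings are already in place on the server side; the user and table names are placeholders:

    import java.security.PrivilegedExceptionAction;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.security.UserGroupInformation;

    public class DoAsSketch {
      public static void main(String[] args) throws Exception {
        final Configuration conf = HBaseConfiguration.create();
        // The logged-in service user must be allowed to impersonate 'enduser'
        // via the hadoop.proxyuser.* settings on the cluster.
        UserGroupInformation proxied = UserGroupInformation.createProxyUser(
            "enduser", UserGroupInformation.getLoginUser());
        proxied.doAs(new PrivilegedExceptionAction<Void>() {
          @Override
          public Void run() throws Exception {
            // The user context is captured when the Connection is created,
            // so create it inside doAs(); Tables then share that context.
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("mytable"))) {
              // ... gets/puts here are authorized as 'enduser' ...
            }
            return null;
          }
        });
      }
    }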

Re: Slow response on HBase REST api using globbing option

2015-12-03 Thread Jerry He
From HBase 0.98, there have been changes going into the Rest gateway, mainly more scan support. There seems to be a change in the way the url table/rowkey* is executed on the Rest gateway. In pre-0.96, we set startKey = rowkey and endKey = rowkey + one byte of 255 on the Rest gateway in the

Re: Access cell tags from HBase shell

2015-09-15 Thread Jerry He
abels etc are also implemented by storing them > as > > > cell > > > > tags. Yes as others said, the tags is by default a server only > thing. > > > > Means you can not pass tags from/to client along with cells. There > is > > > some > >

Re: Access cell tags from HBase shell

2015-08-31 Thread Jerry He
Hi, Suresh In your Java client program, you can 'label' the cells in your Put. You can ask for which labeled cells should be returned in your Get and Scan, but the labels are not returned with the cells. Yes, "labels on cells are only interpreted server side" Jerry On Mon, Aug 31, 2015 at 1:27 PM,
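A small sketch of both sides of that, assuming the 1.x client API; the table, column and label names are placeholders, and 'table' is an already-open Table:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.security.visibility.Authorizations;
    import org.apache.hadoop.hbase.security.visibility.CellVisibility;
    import org.apache.hadoop.hbase.util.Bytes;

    void labelAndRead(Table table) throws IOException {
      // Label the cell when writing.
      Put put = new Put(Bytes.toBytes("row1"));
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
      put.setCellVisibility(new CellVisibility("secret & hr"));
      table.put(put);

      // Ask for cells matching these authorizations when reading.
      // Matching cells come back, but the labels themselves do not.
      Scan scan = new Scan();
      scan.setAuthorizations(new Authorizations("secret", "hr"));
      try (ResultScanner rs = table.getScanner(scan)) {
        for (Result r : rs) {
          // ... process r ...
        }
      }
    }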

Re: REST Impersonation?

2015-08-10 Thread Jerry He
The basic concept of the impersonation support is this: your HBase Rest gateway is running under a user id, say 'hbase'. The incoming Rest client user id is 'user1'. On the HBase server (master or region server), you want the authorization (ACL) to be done as 'user1'. You want the user id 'hbase'
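The wiring for that usually amounts to the properties below; this is a sketch from memory (the gateway user 'hbase' and the wildcard values are illustrative), so verify the exact names against your release's reference guide.

    <!-- On the cluster side: allow the gateway user to impersonate -->
    <property>
      <name>hadoop.proxyuser.hbase.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hbase.groups</name>
      <value>*</value>
    </property>

    <!-- On the REST gateway: authenticate clients and enable proxying -->
    <property>
      <name>hbase.rest.authentication.type</name>
      <value>kerberos</value>
    </property>
    <property>
      <name>hbase.rest.support.proxyuser</name>
      <value>true</value>
    </property>

(A Kerberos principal and keytab for the authentication filter are also needed; they are omitted here.)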

Re: Hbase vs Cassandra

2015-06-01 Thread Jerry He
Another point to add is the new HBase read high-availability feature using timeline-consistent region replicas, available from HBase 1.0 onward, which brings HBase closer to Cassandra in terms of read availability during node failures. You have a choice for read availability now.
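On the client side, opting into the possibly-stale reads is a one-liner; a sketch assuming the 1.x API, an already-open Table, and a table created with region replication enabled:

    import org.apache.hadoop.hbase.client.Consistency;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    Get get = new Get(Bytes.toBytes("row1"));
    get.setConsistency(Consistency.TIMELINE);   // allow a secondary replica to answer
    Result result = table.get(get);
    if (result.isStale()) {
      // served by a secondary replica; it may lag the primary
    }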

Re: Questions related to HBase general use

2015-05-10 Thread Jerry He
Hi, Yong You have a good understanding of the benefits of HBase already. Generally speaking, HBase is suitable for real-time read/write access to your big data set. Regarding the HBase performance evaluation tool, the 'read' test uses HBase 'get'. For 1m rows, the test would issue 1m 'get's (and RPCs) to

Re: New Blog Post: Scan Improvements in HBase 1.1.0

2015-05-02 Thread Jerry He
Jonathan, It will be good to include/add HBASE-1 to the blog as well? Jerry On Sat, May 2, 2015 at 5:11 PM, Jerry He jerry...@gmail.com wrote: Good article. Clear and Informative! Jerry On Fri, May 1, 2015 at 2:45 PM, Jonathan Lawlor jonathan.law...@cloudera.com wrote: Hey folks

Re: New Blog Post: Scan Improvements in HBase 1.1.0

2015-05-02 Thread Jerry He
Good article. Clear and Informative! Jerry On Fri, May 1, 2015 at 2:45 PM, Jonathan Lawlor jonathan.law...@cloudera.com wrote: Hey folks, A new blog post just went up on the Apache Blog ( https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1). This blog post focuses on recent

Re: HBase Filesystem Adapter

2015-04-30 Thread Jerry He
We've also made HBase running on IBM GPFS. http://en.wikipedia.org/wiki/IBM_General_Parallel_File_System We have a Hadoop FileSystem implementation that translates hadoop calls into GPFS native calls. Overall it has been running well on live clusters. Jerry

Re: Wrong request count of jmx

2015-04-21 Thread Jerry He
The readRequestCount actually counts the number of rows returned from the region server with totals of all the hosted regions. The writeRequestCount counts the number of mutations/appends/increments with totals of all the hosted regions. The totalRequestCount is a mixed bag as Elliott mentioned,

Re: Please welcome new HBase committer Jing Chen (Jerry) He

2015-04-02 Thread Jerry He
...@gmail.com] Sent: 02 April 2015 01:56 To: user@hbase.apache.org Cc: d...@hbase.apache.org Subject: Re: Please welcome new HBase committer Jing Chen (Jerry) He Congratulations, Jerry. On Wed, Apr 1, 2015 at 10:53 AM, Andrew Purtell apurt...@apache.org wrote

Re: Please welcome new HBase committer Srikanth Srungarapu

2015-04-02 Thread Jerry He
Congratulations to you, Srikanth! On Thu, Apr 2, 2015 at 2:44 PM, Srikanth Srungarapu srikanth...@gmail.com wrote: Thanks folks for your kind wishes! On Thu, Apr 2, 2015 at 3:00 AM, Rajeshbabu Chintaguntla chrajeshbab...@gmail.com wrote: Congratulations Srikanth!! Thanks,

Re: Recovering from corrupt blocks in HFile

2015-03-20 Thread Jerry He
it if necessary, do I simply need to delete the HFile to make HDFS happy or is there something I need to do at the HBase level to tell it that data will be going away? Thanks so much everyone for your help on this issue! -md On Wed, Mar 18, 2015 at 10:46 PM, Jerry He

Re: Recovering from corrupt blocks in HFile

2015-03-19 Thread Jerry He
! -md On Wed, Mar 18, 2015 at 10:46 PM, Jerry He jerry...@gmail.com wrote: From HBase perspective, since we don't have a ready tool, the general idea will need you to have access to HBase source code and write your own tool. On the high level, the tool will read/scan the KVs from the hfile

Re: Recovering from corrupt blocks in HFile

2015-03-18 Thread Jerry He
nothing is currently able to read past the first corruption point, but I'll just have to wash, rinse, and repeat to see how much good data is left is the file as a whole. -md On Wed, Mar 18, 2015 at 2:41 PM, Jerry He jerry...@gmail.com wrote: For a 'fix' and 'recover' hfile tool at HBase

Re: Recovering from corrupt blocks in HFile

2015-03-18 Thread Jerry He
For a 'fix' and 'recover' hfile tool at the HBase level, the relatively easy thing we can recover is probably the data (KVs) up to the point where we hit the first corruption-caused exception. After that, it will not be as easy. For example, if the current key length or value length is bad, there is
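A rough sketch of that salvage idea against the 0.98/1.x-era HFile reader API (the path is a placeholder, and the reader API differs in newer releases):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.io.hfile.CacheConfig;
    import org.apache.hadoop.hbase.io.hfile.HFile;
    import org.apache.hadoop.hbase.io.hfile.HFileScanner;

    long salvageReadableCells(Path hfilePath) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      FileSystem fs = FileSystem.get(conf);
      HFile.Reader reader = HFile.createReader(fs, hfilePath, new CacheConfig(conf), conf);
      HFileScanner scanner = reader.getScanner(false, false);
      long salvaged = 0;
      try {
        if (scanner.seekTo()) {
          do {
            Cell cell = scanner.getKeyValue();  // write the cell to a new hfile, or log it
            salvaged++;
          } while (scanner.next());
        }
      } catch (Exception e) {
        // first corruption-caused exception: everything counted so far was readable
      } finally {
        reader.close();
      }
      return salvaged;
    }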

Re: [ANNOUNCE] Apache HBase 1.0.0 is now available for download

2015-02-24 Thread Jerry He
Congratulations on the milestone!

Re: PerformanceEvaluation: filterScan

2015-02-16 Thread Jerry He
Hi, I was on 0.98 running PerformanceEvaluation. randomRead and sequentialRead both show good read counts on the table regions. filterScan shows exactly the same as you had. Looking at the code, it could be as expected. The filter has no matching returns, so the region server is in

Re: the Exception of VisibilityLabelsCache not yet instantiated happen when I set the class ExpAsStringVisibilityLabelServiceImpl as Implement of VisibilityLabelService.

2015-02-02 Thread Jerry He
Hi, All the default scan label generators use the VisibilityLabelsCache to get user labels somewhere. But the VisibilityLabelsCache implementation is mostly for the DefaultVisibilityLabelServiceImpl. Even if we add the method createAndGet() for the VisibilityLabelsCache to

Re: the Exception of VisibilityLabelsCache not yet instantiated happen when I set the class ExpAsStringVisibilityLabelServiceImpl as Implement of VisibilityLabelService.

2015-02-02 Thread Jerry He
Hi, There is a SimpleScanLabelGenerator in the package. This works with ExpAsStringVisibilityLabelServiceImpl, and it is used in the unit test for ExpAsStringVisibilityLabelServiceImpl. Jerry On Mon, Feb 2, 2015 at 10:54 AM, Jerry He jerry...@gmail.com wrote: Hi, All the default scan
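If it helps, the hbase-site.xml wiring for that combination is roughly the following; the property names are as I recall them, so double-check them against your release:

    <property>
      <name>hbase.regionserver.visibility.label.service.class</name>
      <value>org.apache.hadoop.hbase.security.visibility.ExpAsStringVisibilityLabelServiceImpl</value>
    </property>
    <property>
      <name>hbase.regionserver.scan.visibility.label.generator.class</name>
      <value>org.apache.hadoop.hbase.security.visibility.SimpleScanLabelGenerator</value>
    </property>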

Re: BulkLoad 200GB table with one region. Is it OK?

2014-10-02 Thread Jerry He
The reference files will be rewritten during compaction, which normally happens right after splits. You did not mention whether your 200GB of data is one hfile or many hfiles. Jerry On Oct 2, 2014 12:26 PM, Serega Sheypak serega.shey...@gmail.com wrote: Sorry, massive IO. This table is read-only. So

Re: Copying data from 94 to 98 ..

2014-09-16 Thread Jerry He
There are also the Export and Import table tools in HBase. They are slower than snapshots, though. Also, you probably need to come up with something to get around your hdfs protocol mismatch for this as well.
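For reference, the Export/Import round trip is roughly the following; table names and paths are placeholders, and the cross-cluster copy step is where the protocol mismatch has to be handled (e.g. distcp over a version-tolerant protocol such as hftp/webhdfs):

    # On the 0.94 cluster: dump the table to sequence files in HDFS
    hbase org.apache.hadoop.hbase.mapreduce.Export mytable /export/mytable

    # Copy /export/mytable to the other cluster

    # On the 0.98 cluster: create the target table, then load the dump
    hbase org.apache.hadoop.hbase.mapreduce.Import mytable /export/mytable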

Re: Copying data from 94 to 98 ..

2014-09-15 Thread Jerry He
While you continue on the snapshot approach, have you tried to Export the table in 0.94 to hdfs, and then Import the data from hdfs to 0.98? On Sep 15, 2014 10:19 PM, Matteo Bertozzi theo.berto...@gmail.com wrote: can you post the full exception and the file path ? maybe there is a bug in

HBase Reference Guide for previous releases

2014-09-10 Thread Jerry He
Hi, folks I am trying to find out if we keep the HBase Ref Guide and other docs from previous releases on the Apache HBase site. We have: http://hbase.apache.org/book/book.html which seems to be the latest from master. I found this one for 0.94: http://hbase.apache.org/0.94/book/book.html But I can

Re: HBase Reference Guide for previous releases

2014-09-10 Thread Jerry He
it is mentioned in the relevant section of the guide. On Wed, Sep 10, 2014 at 2:06 PM, Jerry He jerry...@gmail.com wrote: Hi, folk I am trying to find out if we keep HBase Ref Guide and other docs from previous releases on the Apache HBase site. We have: http://hbase.apache.org/book

HBase and HDFS HA failover with QJM

2014-06-07 Thread Jerry He
Hi, guys Does anybody have experience with HBase and HDFS HA failover with QJM? Any recommended settings for HBase to make region servers ride over the NN failover smoothly under load? Particularly, is there a need to set these in hbase-site.xml? dfs.client.retry.policy.enabled
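For context, an HDFS client property placed in hbase-site.xml is just a normal property stanza; the value below is only a placeholder for the setting being asked about here:

    <property>
      <name>dfs.client.retry.policy.enabled</name>
      <value>false</value>
    </property>

Region servers build their DFS client from the HBase configuration, so HDFS client settings placed in hbase-site.xml do take effect for HBase.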