Re: upgrade hadoop , keep hbase

2021-12-20 Thread Josh Elser

Hi Michael!

I have to give you the "default" HBase answer, but defer to the Hadoop 
community (or someone from there who is paying attention to this list) 
for the precise answer.


The question you are asking is: does the binary compatibility of Hadoop 
jars change between Hadoop 3.1.2 and 3.3.0? The short answer is "maybe".


In general, within one Hadoop major version, the likelihood of a change 
which breaks binary compatibility [1] is relatively low. However, "low" 
is not "none". If you're doing this on a local environment or a dev 
system, you're probably OK. I would definitely recommend you recompile 
HBase if you're using it for a production system. You wouldn't want to 
be chasing a fix for this if it manifests in a subtle/strange manner.
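
If you do recompile, the usual invocation looks something like the 
following (a sketch only; 3.3.0 is just the version from your question, 
adjust to whatever you actually run):

mvn clean package assembly:single -Dhadoop.profile=3.0 \
    -Dhadoop-three.version=3.3.0 -DskipTests
# the rebuilt tarball should end up under hbase-assembly/target/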


- Josh

[1] https://docs.oracle.com/javase/specs/jls/se7/html/jls-13.html

On 12/20/21 11:01 AM, Michael Wohlwend wrote:

Hello all,

I have hadoop 3.1.2 running, with hbase 2.1.4

If I want to update to hadoop 3.3.0,  is it sufficient to replace all the hadoop
3.1.2 jars in the hbase folder tree with the 3.3.0 version?
Or do I have to make a new hbase build?


Thanks for answering
  Michael





Re: Troubleshooting ipc.RpcServer: (responseTooSlow)

2021-10-28 Thread Josh Elser
This is not indicative of an outright problem and can just be a result 
of your data and the hardware which you are running HBase on.


Things to note from this data:

1. This RPC will return up to 1000 rows
2. The size of the data returned is not consistent (200KB for one, 18B 
for the other)
3. The queuetimems was 0 which means that the full RPC time was spent 
scanning data (not waiting to begin)


A general pattern of debugging is to identify one or a few Scans from 
your application(s) which are "slow", and diagnose if they are always 
slow or sometimes slow. Try to come up with a specific problem statement.


You would first have to decide whether they _should_ be slow (i.e. are 
they reading a significant amount of data and filtering most of it out?), 
or whether they are unexpectedly slow.


If they are unexpectedly slow, you would need to start collecting 
additional logging from HBase and/or HDFS, and use other JDK-based 
performance tracking tools (jvisualvm, flightrecorder, yourkit profiler, 
async-profiler) and try to identify if there is a software explanation 
for the slowness. Look at the obvious things and keep a list of things 
you rule out as "not a problem" vs. "potentially a problem".
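
As a concrete starting point, a couple of thread dumps or a short CPU 
profile of the busy RegionServer while a slow Scan is running will 
usually tell you where the time is going. A rough sketch (the pgrep 
pattern, output paths, and async-profiler install location are 
placeholders for your environment):

# a few thread dumps of the RegionServer during the slow scan
RS_PID=$(pgrep -f HRegionServer)
jcmd "$RS_PID" Thread.print > /tmp/rs-threads-$(date +%s).txt

# or a 60-second CPU flame graph with async-profiler
/opt/async-profiler/profiler.sh -e cpu -d 60 -f /tmp/rs-cpu.html "$RS_PID"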


Good luck.

On 10/28/21 11:16 AM, Hamado Dene wrote:

Sorry i forgot to Specify my hbase version:
HBase: 2.2.6, Hadoop: 2.8.5


  


On Thursday, 28 October 2021 at 16:55:40 CEST, Hamado Dene wrote:
  Hi hbase community,

Lately during our activities we constantly receive this warning:

2021-10-28 16:45:00,854 WARN  [RpcServer.default.FPBQ.Fifo.handler=46,queue=1,port=16020] ipc.RpcServer: (responseTooSlow): 
{"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)","starttimems":"1635432272849","responsesize":"221799","method":"Scan","param":"scanner_id:
 3011016724423115474 number_of_rows: 1000 close_scanner: false next_call_seq: 0 client_handles_partials: true client_handles_heartbeats: tr 
\u003cTRUNCATED\u003e","processingtimems":28005,"client":"10.200.86.173:60806","queuetimems":0,"class":"HRegionServer","scandetails":"table: mn1_7491_hinvio region: 
mn1_7491_hinvio.}

2021-10-28 16:45:02,527 WARN  [RpcServer.default.FPBQ.Fifo.handler=36,queue=1,port=16020] ipc.RpcServer: (responseTooSlow): 
{"call":"Scan(org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ScanRequest)","starttimems":"1635432272520","responsesize":"18","method":"Scan","param":"scanner_id:
 3011016724423115473 number_of_rows: 1000 close_scanner: false next_call_seq: 0 client_handles_partials: true client_handles_heartbeats: tr 
\u003cTRUNCATED\u003e","processingtimems":30006,"client":"10.200.86.130:53476","queuetimems":0,"class":"HRegionServer","scandetails":"table: mn1_7491_hinvio region: 
mn1_7491_hinvio ..}
We are still trying to understand which improvements to implement in order to 
be able to manage the problem. Has anyone ever had the same problem? And what are 
the configurations we can act on to get better performance?
Thanks,
Hamado Dene



Re: Issue with HBase when dfs.encrypt.data.transfer enabled in HDFS (HBASE-26007)

2021-10-19 Thread Josh Elser
Hadoop client libraries being slightly out of sync is one of those things 
which, within a major version, is not likely to cause you problems. However, 
if you are having a problem, syncing them up is the first thing I would do.


If you're in that territory, it's also a good idea to recompile HBase 
against that exact version of Hadoop (we operate under those same 
assumptions because we can't realistically monitor builds of HBase 
against every single Hadoop version). Something like:


`mvn package assembly:single -Dhadoop.profile=3.0 -Dhadoop-three.version=3.2.2 -DskipTests`


The specific error about HdfsFileStatus is a known breakage from Hadoop 2 
to Hadoop 3: HdfsFileStatus used to be a class in Hadoop 2 and was changed 
to an interface in Hadoop 3. That's why the recompilation is important 
(that's just how Java works).


On 10/18/21 11:42 AM, Damillious Jones wrote:

Thanks for the response. They are not. Hadoop 3.2.2 is running 3.2.2 libs,
while HBase is using the 2.10 Hadoop libs. Do these need to be in sync?

I did try syncing them up and adding 3.2.2 libs into HBase, replacing all
of the hadoop-* files and I got this error:

Unhandled: Found interface org.apache.hadoop.hdfs.protocol.HdfsFileStatus,
but class was expected

In searching for an answer to this, I read that if using Hadoop 3.1 or higher
you need to compile HBase with a special flag:
https://issues.apache.org/jira/browse/HBASE-22394
https://issues.apache.org/jira/browse/HBASE-24154

Do I have to compile HBase in order to get HBase to work with Hadoop 3.1 or
higher?

On Mon, Oct 18, 2021 at 7:13 AM Josh Elser  wrote:


Are the Hadoop JARs which you're using inside HBase the same as the
Hadoop version you're running? (e.g. in $HBASE_HOME/lib)

On 10/15/21 6:18 PM, Damillious Jones wrote:

Hi all, I am seeing a similar issue which is noted in HBASE-26007 where
HBase will not start if dfs.encrypt.data.transfer in HDFS is set to true.
When I start HBase I see the following error message on the master node:

java.io.IOException: Invalid token in javax.security.sasl.qop:
  at org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessage(DataTransferSaslUtil.java:220)


I am using Hadoop 3.2.2 and HBase 2.4.5 with Java 1.8. If I use Hadoop
3.1.1 it works fine. Has anyone else encountered this issue?

Any help would be appreciated, thanks.







Re: Issue with HBase when dfs.encrypt.data.transfer enabled in HDFS (HBASE-26007)

2021-10-18 Thread Josh Elser
Are the Hadoop JARs which you're using inside HBase the same as the 
Hadoop version you're running? (e.g. in $HBASE_HOME/lib)
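
A quick way to check (paths are the typical defaults and may differ in 
your install):

# Hadoop jars bundled under HBase's lib directory
ls $HBASE_HOME/lib/hadoop-*.jar
# Hadoop version the cluster itself is running
hadoop version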


On 10/15/21 6:18 PM, Damillious Jones wrote:

Hi all, I am seeing a similar issue which is noted in HBASE-26007 where
HBase will not start if dfs.encrypt.data.transfer in HDFS is set to true.
When I start HBase I see the following error message on the master node:

java.io.IOException: Invalid token in javax.security.sasl.qop:
 at
org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessage(DataTransferSaslUtil.java:220)

I am using Hadoop 3.2.2 and HBase 2.4.5 with Java 1.8. If I use Hadoop
3.1.1 it works fine. Has anyone else encountered this issue?

Any help would be appreciated, thanks.



Re: Major problem for us with Phoenix joins with certain aggregations

2021-10-11 Thread Josh Elser

No worries. Thanks for confirming!

On 10/10/21 1:43 PM, Simon Mottram wrote:

Hi

Thanks for the reply, I posted here by mistake and wasn't sure how to delete.  
It's indeed a problem with phoenix

Sorry to waste your time

Cheers

S




____
From: Josh Elser 
Sent: Saturday, 9 October 2021 3:25 am
To: user@hbase.apache.org
Subject: Re: Major problem for us with Phoenix joins with certain aggregations

That error sounds like a bug in Phoenix.

Maybe you could try with a newer version of Phoenix? Asking over on
user@phoenix might net a better result.

On 9/27/21 11:47 PM, Simon Mottram wrote:

Forgot to mention this is only an issue for LAST_VALUE (so far!)

This works fine

   SELECT
"BIOMATERIAL_NAME",
AVG("PLANT_FRUIT_COUNT")
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
   OFFSET 0;

From: Simon Mottram 
Sent: 28 September 2021 4:34 PM
To: user@hbase.apache.org 
Subject: Major problem for us with Phoenix joins with certain aggregations

Hi

Got my fingers crossed that there's a work around for this as this really is a 
big problem for us

We are using:

Amazon EMR

Release label:emr-6.1.0
Hadoop distribution:Amazon
Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0

Thin Client version:
phoenix-5.0.0-HBase-2.0-thin-client.jar

We get the following error when doing an aggregation where

1.  A JOIN is empty
2.  The column is INTEGER or DATETIME

Remote driver error: IllegalArgumentException: offset (25) + length (4) exceed 
the capacity of the array: 25

The query that breaks is:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
ON DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
   OFFSET 0;

I can refactor this using EXISTS but get the same error, presumably the driver knows 
to treat them the same:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
   OFFSET 0;

If we remove the external reference we get no error, regardless of whether 
there are any hits or not

-- these all work
There are no hits for this query

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
BIOMATERIAL_TYPE = 'aardvark'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
   OFFSET 0;

Lots of hits for this query:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
   OFFSET 0;









Re: Major problem for us with Phoenix joins with certain aggregations

2021-10-08 Thread Josh Elser

That error sounds like a bug in Phoenix.

Maybe you could try with a newer version of Phoenix? Asking over on 
user@phoenix might net a better result.


On 9/27/21 11:47 PM, Simon Mottram wrote:

Forgot to mention this is only an issue for LAST_VALUE (so far!)

This works fine

  SELECT
"BIOMATERIAL_NAME",
AVG("PLANT_FRUIT_COUNT")
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
  OFFSET 0;

From: Simon Mottram 
Sent: 28 September 2021 4:34 PM
To: user@hbase.apache.org 
Subject: Major problem for us with Phoenix joins with certain aggregations

Hi

Got my fingers crossed that there's a work around for this as this really is a 
big problem for us

We are using:

Amazon EMR

Release label:emr-6.1.0
Hadoop distribution:Amazon
Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0

Thin Client version:
phoenix-5.0.0-HBase-2.0-thin-client.jar

We get the following error when doing an aggregation where

   1.  A JOIN is empty
   2.  The column is INTEGER or DATETIME

Remote driver error: IllegalArgumentException: offset (25) + length (4) exceed 
the capacity of the array: 25

The query that breaks is:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
ON DOCID = OBSERVATION_VALUE_ID
   AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
  OFFSET 0;

I can refactor this using EXISTS but get the same error, presumably the driver knows 
to treat them the same:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
EXISTS (
SELECT
DOCID
FROM
VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
WHERE
DOCID = OBSERVATION_VALUE_ID
AND TAGNAME = 'TRIAL_ID'
AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
  OFFSET 0;

If we remove the external reference we get no error, regardless of whether 
there are any hits or not

-- these all work
There are no hits for this query

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
WHERE
BIOMATERIAL_TYPE = 'aardvark'
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
  OFFSET 0;

Lots of hits for this query:

SELECT
"BIOMATERIAL_NAME",
FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
FROM
VARIABLE_VALUES_QA.OBSERVATION
GROUP BY
"BIOMATERIAL_NAME"
LIMIT 10
  OFFSET 0;







Re: HBASE Queries & Slack Channel Support

2021-08-23 Thread Josh Elser
+1 for following up in Phoenix for Phoenix-specific questions, but I 
thought it was worth mentioning that there's no reason that you can't do 
"high throughput" access to HBase via Phoenix. Phoenix has parity for 
most high-throughput approaches that you would have access to in HBase.


There is no one answer to which method you should use, because the 
reality is "it depends". To set clear expectations, high latencies are 
often the tradeoff you have to make for high throughput (latency and 
throughput are often inversely proportional).


Usually, the first round of performance issues boil down to data 
modeling. It's a good thought exercise for you to think through what 
your requirements are and what the "average" latency for HBase is on 
your hardware (decoupled from your real-life data), and then compare 
that to your actual workload. This helps frame your current performance 
against a "potential" performance.


On 8/23/21 2:29 PM, Daniel Wong wrote:

Hi Atul, questions about Phoenix support are better directed to the Apache 
Phoenix mailing lists. Depending on your use patterns, Phoenix may or may not 
perform better than HBase alone. I'm happy to invite you to the Apache Phoenix 
Slack as well if you reach out to me, though you are more likely to reach a wider 
audience in user@phoenix.

Daniel Wong

On 2021/08/22 10:34:28, "Gupta, Atul"  wrote:

Hi HBASE PMC Members.

Greeting!

We are one of the active users of HBase and Phoenix. There are a number of HBase 
RT/batch use cases running at Lowe's. I own the platform team, so my team 
is responsible for maintaining and supporting the HBase business use cases.
In the last few months we have been facing a number of challenges due to read/write 
latencies. I have reached out directly to PMC members, and everyone has 
suggested I write to the user mailing list.

We are writing data into HBase using Phoenix, which I guess is not a recommended 
method if someone is looking for high throughput. Do we have any documentation 
regarding this? Also, what are Phoenix's limitations, if any?

Do we have a Slack channel which I can join to clarify some of my doubts quickly?

Looking for your quick response on this.

Thanks,
Atul Gupta
Sr. Director, Data Engineering
Lowes





Re: Hbase export is very slow - help needed

2021-08-19 Thread Josh Elser
Export is a MapReduce job, and HBase will only configure a maximum of 
one Mapper per Region in the table being scanned.


If you have multiple regions for your tsdb table, then it's possible 
that you need to tweak the concurrency on the YARN side such that you 
have multiple Mappers running in parallel?


Sounds like looking at the YARN Application log and UI is your next best 
bet.


On 8/18/21 4:52 AM, Nguyen, Tai Van (EXT - VN) wrote:

Hi HBase Team

Images can be seen here:

  * Export with single regionserver: https://imgur.com/86wSUMV

  * Export with two regionservers: https://imgur.com/a/XMovlZx


The log shows the times as:

root@solaltiplano-track4-master:~/hbase-exporting/latest# cat 
hbase_export_compress_default.log | grep export
Starting hbase export at Fri Jun 11 12:22:46 UTC 2021
tsdb table exported in  6279 seconds
tsdb-meta table exported in  6 seconds
tsdb-tree table exported in  7 seconds
tsdb-uid table exported in  90 seconds
Ending hbase export at Fri Jun 11 14:09:08 UTC 2021




Thanks,
Tai



*From:* Mathews, Jacob 1. (Nokia - IN/Bangalore) 
*Sent:* Monday, August 16, 2021 6:47 PM
*To:* Nguyen, Tai Van (EXT - VN) 
*Subject:* FW: Hbase export is very slow - help needed

*From:*Mathews, Jacob 1. (Nokia - IN/Bangalore)
*Sent:* Friday, August 6, 2021 12:38 PM
*To:* user@hbase.apache.org
*Subject:* Hbase export is very slow - help needed

Hi HBase team,

We are trying to use Hbase export mentioned here: 
http://hbase.apache.org/book.html#export 



But it is happening sequentially row by row as seen from the logs.

we tried many options of the Hbase export, but all were taking long time.

Backup folder contents size:

bash-4.2$ du -kh

16K ./tsdb-tree

16K ./tsdb-meta

60M   ./tsdb-uid

5.9G   ./tsdb

6.0G   .

It took around 104 minutes for 6 GB of compressed data.

Is there a way we can parallelise this and improve the export time.

Below are the charts from Hbase .

Export with single regionserver:

Export with two regionservers:

Scaling the HBase Region server also did not help, the export still 
happens sequentially.


Thanks

Jacob Mathews



Re: Hbase RegionServer Sizing

2021-08-11 Thread Josh Elser
Looks like you're running a third party company's distribution of HBase. 
I'd recommend you start by engaging them for support.


Architecturally, HBase does very little when there is no client load 
applied to the system. If you're experiencing OOME's when the system is 
idle, that sounds like you might have some kind of basic misconfiguration.


On 8/3/21 8:56 AM, Anshuman Singh wrote:

Hi,

I have a 7 node Hbase 2.0.0.3.1 cluster with 12 TB of data, 128MB memstore 
flush size and 20GB regions.
At a time writing happens only on 7 regions and I’m not performing any reads.

I’m seeing OOM errors with 32 GB heap even if I stop the ingestions.

I followed this blog post related to HBase RegionServer sizing: 
http://hadoop-hbase.blogspot.com/2013/01/hbase-region-server-memory-sizing.html.
According to the explanation there, only the active regions (on which new data 
is being written) should impact the heap/memstore, but the heap usage even 
without any ingestion is touching 32GB. Even the memstore size as shown on the 
HBase UI is 0 and the block cache is totally free.

I took heap dumps and it shows the majority of heap being used by byte[] 
objects of regionserverHStore and regionserverHRegionServer.

Can anyone point me in the right direction about what may be causing high heap 
usage even without reads and writes?

Regards,
Anshuman



Re: Regionserver reports RegionTooBusyException on import

2021-05-06 Thread Josh Elser

You were able to work around the durability concerns by skipping the WAL (never 
forget that this means your data in HBase is *not* guaranteed to be there).

We’re already doing this. This is actually not a problem for us, because we 
verify the data after the import (using our own restore-test mapreduce report).


Yes, I was summarizing what you had said to then make sure you 
understood the implications of what you had done. Good to hear you are 
verifying this.



Of course, you can also change your application (the Import m/r job) such that 
you can inject sleeps, but I assume you don't want to do that. We don't expose 
an option in that job (to my knowledge) that would inject slowdowns.


That’s funny - I was just talking about this with my colleague more in jest. 
But would it be possible that the MemStore realizes that the incoming write 
rate is higher than the flushing rate and slow down the write requests a little 
bit?
That means putting the „sleep“ into MemStore as a kind of an adaptive 
congestion control: MemStore could measure the incoming rate and the flushing 
rate and add some sleeps on demand...


HBase is essentially doing what you're asking. By throwing the 
RegionTooBusyException, the client is pushed into a retry loop. The 
client will pause before it retries, increase the amount of time it 
waits the next time (by some function, I forget exactly what), and then 
retry the same operation.


The problem you're facing is that the default configuration is 
insufficient for the load and/or hardware that you're throwing at HBase.


The other thing you should be asking yourself is if you have a hotspot 
in your table design which is causing the load to not be evenly spread 
across all RegionServers.


Re: Regionserver reports RegionTooBusyException on import

2021-05-06 Thread Josh Elser
Your analysis seems pretty accurate so far. Ultimately, it sounds like 
your SAN is the bottleneck here.


You were able to work around the durability concerns by skipping the WAL 
(never forget that this means your data in HBase is *not* guaranteed to 
be there).


It sounds like compactions are the next bottleneck for you. 
Specifically, your compactions can't complete fast enough to drive down 
the number of storefiles you have.


You have two straightforward approaches to try:
1. Increase the number of compaction threads inside your regionserver. 
hbase.regionserver.thread.compaction.small is likely the one you want to 
increase. Eventually, you may need to also increase 
hbase.regionserver.thread.compaction.large


2. Increase hbase.client.retries.number to a larger value and/or 
increase hbase.client.pause so that the client will retry more times 
before giving up, or waits longer in-between retry attempts (see the 
hbase-site.xml sketch below)
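
For illustration, a hedged hbase-site.xml sketch of those knobs (the 
values are placeholders to show the shape, not tuning advice; tune 
against your own hardware):

<!-- sketch only: placeholder values -->
<property>
  <name>hbase.regionserver.thread.compaction.small</name>
  <value>4</value>
</property>
<property>
  <name>hbase.client.retries.number</name>
  <value>20</value>
</property>
<property>
  <name>hbase.client.pause</name>
  <value>200</value>
</property>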


Of course, you can also change your application (the Import m/r job) 
such that you can inject sleeps, but I assume you don't want to do that. 
We don't expose an option in that job (to my knowledge) that would 
inject slowdowns.


On 4/28/21 7:56 AM, Udo Offermann wrote:

Hello everybody

We are migrating from HBase 1.0 to HBase 2.2.5 and observe a problem importing 
data into the new HBase 2 cluster. The HBase clusters are connected to a SAN.
For the import we are using the standard HBbase Import (i.e. no bulk import).

We tested the import several times at the HBase 1.0 cluster and never faced any 
problems.

The problem we observe is : org.apache.hadoop.hbase.RegionTooBusyException
In the log files of the region servers we found
  
regionserver.MemStoreFlusher: ... has too many store files


It seems that other people faced similar problems like described in this blog 
post: https://gbif.blogspot.com/2012/07/optimizing-writes-in-hbase.html
However the provided solution does not help in our case (especially increasing 
hbase.hstore.blockingStoreFiles).

In fact the overall problem seems to be that the Import mappers are too fast 
for the region servers so that they cannot flush and compact the HFiles in 
time, even if they stop accepting further writes when
the value of hbase.hstore.blockingStoreFiles is exceeded.

Increasing hbase.hstore.blockingStoreFiles means that the region server is 
allowed to keep more HFiles but as long as the write throughput of the mappers 
is that high, the region server will never be able to flush and compact the 
written data in time so that in the end the region servers are too busy and 
finally treated as crashed!

IMHO it comes simply to the point that the incoming rate (mapper write operations) 
> processing rate (writing to MemStore, Flushes and Compactions) which leads 
always into disaster - if I remember correctly my queues lecture at the university 
;-)

We also found in the logs lots of "Slow sync cost" messages, so we also turned off WAL 
files for the import:

yarn jar $HBASE_HOME/lib/hbase-mapreduce-2.2.5.jar import 
-Dimport.wal.durability=SKIP_WAL …
which eliminated the „Slow sync cost“ messages but it didn’t solve our overall 
problem.

So my question is: isn’t there a way to somehow slow down the import mapper so 
that the incoming rate < region server’s processing rate?
Are there other possibilities that we can try. One thing that might help (at 
least for the import scenario) is using bulk import but the question is whether 
other scenarios with a high write load will lead to similar problems!

Best regards
Udo









Re: HBase Operating System compatabiliity

2021-04-20 Thread Josh Elser
The Apache HBase community does not provide any compatibility matrix 
which includes operating systems. The compatibility matrix which HBase 
does provide includes Java version, Hadoop version, and some other 
expectations like SSH, DNS, and NTP.


https://hbase.apache.org/book.html#basic.prerequisites

On 4/20/21 1:32 AM, Debraj Manna wrote:

The HBase doc lists out a few Operating System specific issues.

The Hadoop documentation also states the following, so I am checking whether 
HBase has some similar requirements and what minimum OS version HBase 
officially supports.


- Operating Systems: The community SHOULD maintain the same minimum OS
requirements (OS kernel versions) within a minor release. Currently
GNU/Linux and Microsoft Windows are the OSes officially supported by the
community, while Apache Hadoop is known to work reasonably well on other
OSes such as Apple MacOSX, Solaris, etc. Support for any OS SHOULD NOT be
dropped without first being documented as deprecated for a full major
release and MUST NOT be dropped without first being deprecated for at least
a full minor release.

Is there any place I can check which OSes Apache HBase is tested on?

On Tue, Apr 20, 2021 at 10:34 AM Mallikarjun 
wrote:


Doesn't Java's philosophy of `*write once, run anywhere*` apply to
hbase/hadoop?

We have deployed our production setup on Debian 7/8/9 at different points
of time.

---
Mallikarjun


On Tue, Apr 20, 2021 at 10:05 AM Debraj Manna 
wrote:


Hi

Can someone let me know if there is a compatibility matrix for different
HBase and OS versions? Which OS versions and flavors is HBase tested
on?

I could not find anything in the hbase doc
<http://hbase.apache.org/book.html>.

Hadoop hardware/software requirements
<https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/Compatibility.html#:~:text=Currently%20GNU%2FLinux%20and%20Microsoft,Apple%20MacOSX%2C%20Solaris%2C%20etc.>
also do not say anything explicitly.

Thanks,







Re: Is HBase 2.X suitable for running on Hadoop 3.X?

2021-04-13 Thread Josh Elser
Looks like you don't have the pthread library available. Did you make 
sure you installed the necessary prerequisites for your operating system?


I'd suggest you take Hadoop compilation questions to the Hadoop user 
mailing list for some more prompt answers.


On 4/13/21 1:35 AM, Ascot Moss wrote:

Hi,

I tried to build Hadoop 3.2.2, but got error below, please help!



Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_c17b0/fast &&
/usr/bin/gmake  -f CMakeFiles/cmTC_c17b0.dir/build.make
CMakeFiles/cmTC_c17b0.dir/build

gmake[1]: Entering directory
`/hadoop-3.2.2-src/hadoop-common-project/hadoop-common/target/native/CMakeFiles/CMakeTmp'

Building C object CMakeFiles/cmTC_c17b0.dir/src.c.o

/usr/bin/cc -DCMAKE_HAVE_LIBC_PTHREAD   -o
CMakeFiles/cmTC_c17b0.dir/src.c.o -c
/hadoop-3.2.2-src/hadoop-common-project/hadoop-common/target/native/CMakeFiles/CMakeTmp/src.c

Linking C executable cmTC_c17b0

/usr/local/bin/cmake -E cmake_link_script
CMakeFiles/cmTC_c17b0.dir/link.txt --verbose=1

/usr/bin/cc -rdynamic CMakeFiles/cmTC_c17b0.dir/src.c.o -o cmTC_c17b0

CMakeFiles/cmTC_c17b0.dir/src.c.o: In function `main':

src.c:(.text+0x2d): undefined reference to `pthread_create'

src.c:(.text+0x39): undefined reference to `pthread_detach'

src.c:(.text+0x45): undefined reference to `pthread_cancel'

src.c:(.text+0x56): undefined reference to `pthread_join'

src.c:(.text+0x6a): undefined reference to `pthread_atfork'

collect2: error: ld returned 1 exit status

gmake[1]: *** [cmTC_c17b0] Error 1

gmake[1]: Leaving directory
`/hadoop-3.2.2-src/hadoop-common-project/hadoop-common/target/native/CMakeFiles/CMakeTmp'

gmake: *** [cmTC_c17b0/fast] Error 2




On Fri, Apr 2, 2021 at 7:01 PM Wei-Chiu Chuang 
wrote:


i think it's time to remove that statement. We have lots of production
users running HBase 2 on Hadoop 3 for several years now.

On Fri, Apr 2, 2021 at 6:32 PM 张铎(Duo Zhang) 
wrote:


According to the compatibility matrix, HBase 2.3.4 could work together with
Hadoop 3.2.2.

And if you just want to connect to HDFS 3.2.2, you could just use the pre-built
artifacts for HBase 2.3.4. If you want to use the Hadoop 3.2.2 client in HBase,
you need to build the artifacts on your own.

Thanks.

hossein ahmadzadeh wrote on Wednesday, March 31, 2021 at 3:30 AM:




The HBase documentation noted that:

Hadoop 2.x is faster and includes features, such as short-circuit reads
(See Leveraging local data), which will help improve your HBase random read
profile. Hadoop 2.x also includes important bug fixes that will improve
your overall HBase experience. HBase does not support running with earlier
versions of Hadoop. See the table below for requirements specific to
different HBase versions. *Hadoop 3.x is still in early access releases and
has not yet been sufficiently tested by the HBase community for production
use cases.*

But right next to this point, the compatibility table says that HBase
2.3.4 is fully functional with Hadoop 3.2.2. So I am confused about
whether we can use HBase 2.3.4 in production alongside Hadoop 3.2.2 or
not.


‌Best Regards.









Re: HBASE WALs

2021-04-07 Thread Josh Elser
Would recommend you reach out to Cloudera Support if you're already 
using CDH. They will be able to give you more hands-on help with steps to 
find the busted procWAL(s) and recover.

On 4/7/21 2:11 AM, Marc Hoppins wrote:

Unfortunately, we are currently stuck using CDH 6.3.2 with Hbase 2.1.0.  The 
company cannot really justify the cost of upgrading this particular offering at 
the incredibly expensive price per node, as we do not have any money-making on 
the data being stored to justify such spending for the size of the cluster.

-Original Message-
From: Stack 
Sent: Wednesday, April 7, 2021 12:55 AM
To: Hbase-User 
Subject: Re: HBASE WALs

EXTERNAL

On Tue, Mar 30, 2021 at 2:52 AM Marc Hoppins  wrote:


Dear HBASE gang,

...and, as I previously mentioned, we now have a grand bunch of OLD
WALs milling about.



WALs in the masterProcWALs dir?

MY thinking is that if nothing is going on with writing, then anything in

any masterProcWALs must be related to the bad table and we can just
wipe them and restart HBASE.

Questions I have:

Am I correct in my theory? (I am far from being a Java guy so am not
sure how to follow the process there)



If the old masterProcWALs are not clearing out, must be corruption in the older 
WALs that is preventing them 'completing' so they can be released (meantime new 
procs are added ahead of the old ones...so more WALs show up).



If another (quicker) choice was made and we stop DB operations,
disable all tables then delete masterProcWALs, WITHOUT waiting for
compactions to finish, would we have a real problem with where HBASE
thinks data is or where it should be going due to anything that was
pending in masterWALs for
(possibly) all tables?



Compactions are interruptible. Compactions have nothing to do w/ the 
masterProcStore (or with where data is located).




Is there any sane way to deal with the information in masterWALs?  Or
is that only a Java API thing?



Old WALs are corrupt. Could try and get hbase to quiescent state, stop it, and 
try removing an old WAL... restart, see if it all ok. Hard part is that 
procedures sometimes span WALs so removal may just move forward the corruption.

Upgrade is your best course to 2.3. The procedure store will be migrated. 
There'll likely be some mess to be cleaned up but at least there is tooling to 
do so in later hbases.

S




Thanks for all the help/info thus far.

-Original Message-
From: Marc Hoppins 
Sent: Friday, March 26, 2021 10:49 AM
To: user@hbase.apache.org
Subject: RE: HBASE WALs

EXTERNAL

I wonder if anyone can explain the following:

Before I tried my attempt to fix, HBASE master was retrying to deal
with that stuck region. The attempt counter was increasing - I think
at last count we were up to 3000 or something.  After my attempt, and
I restarted HBASE, it has not tried to fix the stuck region and
attempts are currently at zero.  All procs and locks still exist.

-Original Message-
From: Wellington Chevreuil 
Sent: Tuesday, March 23, 2021 6:16 PM
To: Hbase-User 
Subject: Re: HBASE WALs

EXTERNAL



I am still not certain what will happen.  masterProcWALs contain
info for all (running) tables, yes?


masterProcWALs only contain info for running procedures, not user
table data. User table data go on "normal" WALs, not "masterProcWALs".

  If all tables are disabled and I remove the master wals, how will
that

affect the other tables? When I disabled all tables, hundreds of
master WALs are now created. This means there is a bunch of pending
operations, yes?  Is it going to make some other things inconsistent?


Table disabling involves the unassignment of all these tables regions.
Each of these "unassign" operations comprise a set of sequential phases.
These internal operations are called "procedures". Information about
the progress of these operations as it progresses through its
different phases are stored in these masterProcWALs files. That's why
triggering the "disable"
command will create some data under masterProcWALs. If all the disable
commands finished successfully, and all your procedures are finished
(apart from that rogue one existing for while already), you would be
good to clean out masterProcWALs.

I did try to set the table state manually to see if the faulty table
would

fire up and I restarted hbase...state was the same a locked table
state due to pending disable and stuck region.


That's because of the rogue procedure. When you restarted master, it
went through masterProcWals and resumed the rogue procedure from the
unfinished state it was when you restarted hbase. If you had removed
masterProcWALs prior to restart, the rogue procedure would now be gone.

We may have the go-ahead to remove this table - I assume we cannot
clone it

while it is in a state of (DISABLED) flux but, once again, messing
with master WALs has me on edge.


 From what I understand, you already have the tables disabled, and no
unfinished procs apart from the rogue one, so just clean out

Re: problem building hbase 3.0.0

2021-01-29 Thread Josh Elser
`-DskipTests` is the standard Maven "ism" to skip tests. Your output 
appears to indicate that you were running tests, so perhaps your 
invocation was incorrect? You've not provided enough information for us 
to know why exactly your build failed.


You do not have to build HBase from source in order to build an 
application. Every official release which this project creates has 
artifacts which are published in Maven Central for you to consume directly.


One final note: you are building from the master branch (3.0.0-SNAPSHOT) 
which is still under development. I'd suggest that you add a dependency 
in your application to the Maven dependency 
org.apache.hbase:hbase-client for the version of HBase that you are 
actually running.
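
For example, something like the following in your pom.xml (the version 
shown is only a placeholder; match it to the HBase you actually run):

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <!-- placeholder: use the version of the cluster you connect to -->
  <version>2.3.4</version>
</dependency>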


If you do not yet have a HBase which is already running, you can 
download a pre-built release from 
https://hbase.apache.org/downloads.html. Be sure to grab a version of 
Hadoop which is compatible with the HBase release which you are running.


On 1/29/21 8:54 AM, richard t wrote:

hi,
my ultimate goal is to have a basic CRUD client to hbase using the java api.
in trying to get there, I downloaded the source for hbase 3.0.0 because of the 
client source and wanted to build this client and test it against my hbase 
instance that is running.
The hbase source is not building due to this error: BUILD FAILURE
[INFO] 
[INFO] Total time: 18:03 min
[INFO] Finished at: 2021-01-28T11:22:12-05:00
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:3.0.0-M4:test (default-test) on 
project hbase-http: There are test failures.
[ERROR]
[ERROR] Please refer to 
/home/rtkatch/Downloads/hbase-src/hbase-master/hbase-http/target/surefire-reports
 for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, 
[date]-jvmRun[N].dump and [date].dumpstream.
I have tried to run with -DskipTests but this error seems to be persistent. I am 
using Maven 3.6.3, compiling with OpenJDK 1.8.
I had found some information where one can set up configuration in the pom to 
skip tests, so I will try to set that up.
Any information on this would be much appreciated.
One other comment, I just don't understand why one has to build everything just 
to get a test client... there really should be a bundle that contains only the 
stuff needed to connect to the hbase. I noticed that there were different 
archetypes but am not sure how to use them other than copying them to my dev 
directory for use.
thanks!




Re: Region server idle

2021-01-24 Thread Josh Elser
Yes, each RegionServer has its own write-ahead log (which is named with 
that RS' hostname).


You'd want to look at the HBase master log, specifically for reasons as 
to why balancing is not naturally happening. It very well could be that 
you have other regions in transition (which may prevent balancing from 
happening). This is just one reason balancing may not be happening 
naturally, but you should be able to see this in the active master log 
(potentially with enabling DEBUG on org.apache.hadoop.hbase, first). 
Don't forget about the hbase shell command to request the balancer to 
immediately run (so you can look for that logging at a specific point in 
time).
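
For reference, the relevant shell commands look like this (a sketch; run 
them as a user with admin rights):

hbase(main):001:0> balancer_enabled     # check whether the balancer switch is on
hbase(main):002:0> balance_switch true  # turn it on if it is not
hbase(main):003:0> balancer             # request an immediate run, then watch the master log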


On 1/18/21 7:23 AM, Marc Hoppins wrote:

I have been checking for days and there are no outstanding RITs.  Region 
servers do not have their own WAL files, do they?

What gives me pause is that, although the affected servers (hbase19 & hbase20) 
have 11 and 3 regions respectively, there must be very little useable data as the 
requests per second are negligible for hbase19 and zero  for hbase20.

I would have expected SOME movement to distribute data and work onto these 
'vacant' systems after more than 2 weeks.

The circumstance behind hbase19 going offline is that a memory module had 
failed and dump data was constantly filling up tmp storage, so the on-call guy 
made the decision to shut the system down. Given that a lot of hbase work is 
done in memory is there any possible way something still lingers in memory 
somewhere that has not been flushed?

As for hbase20, an IT guy decommissioned the host in the Cloudera console and 
recommissioned it as a test to see if region balancing proceeded as normal. 
Obviously, it hasn't. For obvious reasons, a second test has not been performed.

-----Original Message-
From: Josh Elser 
Sent: Tuesday, January 12, 2021 4:56 PM
To: user@hbase.apache.org
Subject: Re: Region server idle

EXTERNAL

Yes, in general, HDFS rebalancing will cause a decrease in the performance of 
HBase as it removes the ability for HBase to short-circuit some read logic. It 
should not, however, cause any kind of errors or lack of availability.

You should feel free to investigate the RITs you have now, rather than wait for 
a major compaction to finish. As a reminder, you can also force one to happen 
now via the `major_compact` HBase shell command, for each table (or at least 
the tables which are most important). Persistent RITs will prevent balancing 
from happening, that may be your smoking gun.

It may also be helpful for you to reach out to your vendor for support if you 
have not done so already.

On 1/12/21 6:11 AM, Marc Hoppins wrote:

I read that HDFS balancing doesn't sit well with HBASE balancing.  A colleague 
rebalanced HDFS on Friday. If I look for rebalance, despite me wandering 
through HBASE (Cloudera manager) it redirects to HDFS balance.

I'd suggest I wait for major compaction to occur but who knows when that will 
be? Despite the default setting of 7 days in place, from what I read this will 
be dependent on no RITs being performed.  As this is not just a working cluster 
but one of the more important ones, I am not sure if we can finish up any RITs 
to make the database 'passive' enough to perform a major compaction.

Once again, experience in this area may be giving me misinformation.

-Original Message-
From: Josh Elser 
Sent: Monday, January 11, 2021 5:34 PM
To: user@hbase.apache.org
Subject: Re: Region server idle

EXTERNAL

The Master stacktrace you have there does read as a bug, but it shouldn't be 
affecting balancing.

That Chore is doing work to apply space quotas, but your quota here is only 
doing RPC (throttle) quotas. Might be something already fixed since the version 
you're on. I'll see if anything jumps out at me on Jira.

If the Master isn't giving you any good logging, you could set the
Log4j level to DEBUG for org.apache.hadoop.hbase (either via CM or the
HBase UI for the active master, assuming that feature isn't disabled
for security reasons in your org -- master.ui.readonly something
something config property in hbase-site.xml)

If DEBUG doesn't help, I'd set TRACE level for 
org.apache.hadoop.hbase.master.balancer. Granted, it might not be obvious to 
the untrained eye, but if you can share that DEBUG/TRACE after you manually 
invoke the balancer again via hbase shell, it should be enough for those 
watching here.

On 1/11/21 5:32 AM, Marc Hoppins wrote:

OK. So I tried again after running kinit and got the following:

Took 0.0010 seconds
hbase(main):001:0> list_quotas
OWNERQUOTAS
USER => robot_urlrs TYPE => THROTTLE, THROTTLE_TYPE 
=> REQUEST_NUMBER, LIMIT => 100req/sec, SCOPE => MACHINE
1 row(s)

Not sure what to make of it but it doesn't seem like it is enough to prevent 
balancing. 

Re: Region server idle

2021-01-12 Thread Josh Elser
Yes, in general, HDFS rebalancing will cause a decrease in the 
performance of HBase as it removes the ability for HBase to 
short-circuit some read logic. It should not, however, cause any kind of 
errors or lack of availability.


You should feel free to investigate the RITs you have now, rather than 
wait for a major compaction to finish. As a reminder, you can also force 
one to happen now via the `major_compact` HBase shell command, for each 
table (or at least the tables which are most important). Persistent RITs 
will prevent balancing from happening, that may be your smoking gun.
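
For example (the table name is a placeholder; compaction_state, if your 
shell version has it, lets you watch progress):

hbase(main):001:0> major_compact 'my_table'
hbase(main):002:0> compaction_state 'my_table'   # watch it go MAJOR -> NONE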


It may also be helpful for you to reach out to your vendor for support 
if you have not done so already.


On 1/12/21 6:11 AM, Marc Hoppins wrote:

I read that HDFS balancing doesn't sit well with HBASE balancing.  A colleague 
rebalanced HDFS on Friday. If I look for rebalance, despite me wandering 
through HBASE (Cloudera manager) it redirects to HDFS balance.

I'd suggest I wait for major compaction to occur but who knows when that will 
be? Despite the default setting of 7 days in place, from what I read this will 
be dependent on no RITs being performed.  As this is not just a working cluster 
but one of the more important ones, I am not sure if we can finish up any RITs 
to make the database 'passive' enough to perform a major compaction.

Once again, experience in this area may be giving me misinformation.

-Original Message-----
From: Josh Elser 
Sent: Monday, January 11, 2021 5:34 PM
To: user@hbase.apache.org
Subject: Re: Region server idle

EXTERNAL

The Master stacktrace you have there does read as a bug, but it shouldn't be 
affecting balancing.

That Chore is doing work to apply space quotas, but your quota here is only 
doing RPC (throttle) quotas. Might be something already fixed since the version 
you're on. I'll see if anything jumps out at me on Jira.

If the Master isn't giving you any good logging, you could set the Log4j level 
to DEBUG for org.apache.hadoop.hbase (either via CM or the HBase UI for the 
active master, assuming that feature isn't disabled for security reasons in 
your org -- master.ui.readonly something something config property in 
hbase-site.xml)

If DEBUG doesn't help, I'd set TRACE level for 
org.apache.hadoop.hbase.master.balancer. Granted, it might not be obvious to 
the untrained eye, but if you can share that DEBUG/TRACE after you manually 
invoke the balancer again via hbase shell, it should be enough for those 
watching here.

On 1/11/21 5:32 AM, Marc Hoppins wrote:

OK. So I tried again after running kinit and got the following:

Took 0.0010 seconds
hbase(main):001:0> list_quotas
OWNERQUOTAS
   USER => robot_urlrs TYPE => THROTTLE, THROTTLE_TYPE => 
REQUEST_NUMBER, LIMIT => 100req/sec, SCOPE => MACHINE
1 row(s)

Not sure what to make of it but it doesn't seem like it is enough to prevent 
balancing.  There are other tables and (probably) other users.

-Original Message-
From: Marc Hoppins 
Sent: Monday, January 11, 2021 9:52 AM
To: user@hbase.apache.org
Subject: RE: Region server idle

EXTERNAL

I tried. Appears to have failed reading data from hbase:meta. These are 
repeated errors for the whole run of list_quotas.

A balance task was run on Friday. It took 9+ hours. The affected host had 6 
regions - no procedures/locks or processes were running for those 6 regions. 
Today, that host has 8 regions.  No real work being performed on them.  The 
other server - which went idle as a result of removing hbase19 host from hbase 
and re-inserting to hbase - is still doing nothing and has no regions assigned.

I was su - hbase hbase shell to run it.



HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit:
http://hbase.apache.org/2.0/book.html#shell
Version 2.1.0-cdh6.3.2, rUnknown, Fri Nov  8 05:44:07 PST 2019 Took 0.0011 seconds 
hbase(main):001:0> list_quotas
OWNER  QUOTAS
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=8, exceptions:
Mon Jan 11 09:16:46 CET 2021, RpcRetryingCaller{globalStartTime=1610353006298, 
pause=100, maxAttempts=8}, javax.security.sasl.SaslException: Call to 
dr1-hbase18.jumb
 
o.hq.com/10.1.140.36:16020 failed on local exception: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provi  

   ded (Mechanism level: Failed to find any Kerberos tgt)] [Caused by

Re: Region server idle

2021-01-11 Thread Josh Elser
The Master stacktrace you have there does read as a bug, but it 
shouldn't be affecting balancing.


That Chore is doing work to apply space quotas, but your quota here is 
only doing RPC (throttle) quotas. Might be something already fixed since 
the version you're on. I'll see if anything jumps out at me on Jira.


If the Master isn't giving you any good logging, you could set the Log4j 
level to DEBUG for org.apache.hadoop.hbase (either via CM or the HBase 
UI for the active master, assuming that feature isn't disabled for 
security reasons in your org -- master.ui.readonly something something 
config property in hbase-site.xml)


If DEBUG doesn't help, I'd set TRACE level for 
org.apache.hadoop.hbase.master.balancer. Granted, it might not be 
obvious to the untrained eye, but if you can share that DEBUG/TRACE 
after you manually invoke the balancer again via hbase shell, it should 
be enough for those watching here.
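
Concretely, that amounts to something like this in the active master's 
log4j.properties (or the equivalent override in Cloudera Manager); a 
sketch only:

# DEBUG for HBase overall, TRACE just for the balancer
log4j.logger.org.apache.hadoop.hbase=DEBUG
log4j.logger.org.apache.hadoop.hbase.master.balancer=TRACE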


On 1/11/21 5:32 AM, Marc Hoppins wrote:

OK. So I tried again after running kinit and got the following:

Took 0.0010 seconds
hbase(main):001:0> list_quotas
OWNERQUOTAS
  USER => robot_urlrs TYPE => THROTTLE, THROTTLE_TYPE => 
REQUEST_NUMBER, LIMIT => 100req/sec, SCOPE => MACHINE
1 row(s)

Not sure what to make of it but it doesn't seem like it is enough to prevent 
balancing.  There are other tables and (probably) other users.

-Original Message-
From: Marc Hoppins 
Sent: Monday, January 11, 2021 9:52 AM
To: user@hbase.apache.org
Subject: RE: Region server idle

EXTERNAL

I tried. Appears to have failed reading data from hbase:meta. These are 
repeated errors for the whole run of list_quotas.

A balance task was run on Friday. It took 9+ hours. The affected host had 6 
regions - no procedures/locks or processes were running for those 6 regions. 
Today, that host has 8 regions.  No real work being performed on them.  The 
other server - which went idle as a result of removing hbase19 host from hbase 
and re-inserting to hbase - is still doing nothing and has no regions assigned.

I was su - hbase hbase shell to run it.



HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.1.0-cdh6.3.2, rUnknown, Fri Nov  8 05:44:07 PST 2019 Took 0.0011 seconds 
hbase(main):001:0> list_quotas
OWNER  QUOTAS
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after 
attempts=8, exceptions:
Mon Jan 11 09:16:46 CET 2021, RpcRetryingCaller{globalStartTime=1610353006298, 
pause=100, maxAttempts=8}, javax.security.sasl.SaslException: Call to 
dr1-hbase18.jumb
 
o.hq.com/10.1.140.36:16020 failed on local exception: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provi  

   ded (Mechanism level: Failed to find any Kerberos tgt)] [Caused by 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentia  
   
ls provided (Mechanism level: Failed to find any Kerberos tgt)]]
Mon Jan 11 09:16:46 CET 2021, RpcRetryingCaller{globalStartTime=1610353006298, 
pause=100, maxAttempts=8}, java.io.IOException: Call to 
dr1-hbase18.jumbo.hq.com/   

  10.1.140.36:16020 failed on local exception: java.io.IOException: Can not 
send request because relogin is in progress.
Mon Jan 11 09:16:46 CET 2021, RpcRetryingCaller{globalStartTime=1610353006298, 
pause=100, maxAttempts=8}, java.io.IOException: Call to 
dr1-hbase18.jumbo.hq.com/   

  10.1.140.36:16020 failed on local exception: java.io.IOException: Can not 
send request because relogin is in progress.
Mon Jan 11 09:16:47 CET 2021, RpcRetryingCaller{globalStartTime=1610353006298, 
pause=100, maxAttempts=8}, java.io.IOException: Call to 
dr1-hbase18.jumbo.hq.com/   

  10.1.140.36:16020 failed on local exception: java.io.IOException: Can not 
send request because relogin is in progress.
Mon Jan 11 09:16:47 CET 2021, RpcRetryingCaller{globalStartTime=1610353006298, 
pause=100, maxAttempts=8}, java.io.IOException: C

Re: [DISCUSS] Removing problematic terms from our project

2020-06-22 Thread Josh Elser

+1

On 6/22/20 4:03 PM, Sean Busbey wrote:

We should change our use of these terms. We can be equally or more clear in
what we are trying to convey where they are present.

That they have been used historically is only useful if the advantage we
gain from using them through that shared context outweighs the potential
friction they add. They make me personally less enthusiastic about
contributing. That's enough friction for me to advocate removing them.

AFAICT reworking our replication stuff in terms of "active" and "passive"
clusters did not result in a big spike of folks asking new questions about
where authority for state was.

On Mon, Jun 22, 2020, 13:39 Andrew Purtell  wrote:


In response to renewed attention at the Foundation toward addressing
culturally problematic language and terms often used in technical
documentation and discussion, several projects have begun discussions, or
made proposals, or started work along these lines.

The HBase PMC began its own discussion on private@ on June 9, 2020 with an
observation of this activity and this suggestion:

There is a renewed push back against classic technology industry terms that
have negative modern connotations.

In the case of HBase, the following substitutions might be proposed:

- Coordinator instead of master

- Worker instead of slave

Recommendations for these additional substitutions also come up in this
type of discussion:

- Accept list instead of white list

- Deny list instead of black list

Unfortunately we have Master all over our code base, baked into various
APIs and configuration variable names, so for us the necessary changes
amount to a new major release and deprecation cycle. It could well be worth
it in the long run. We exist only as long as we draw a willing and
sufficient contributor community. It also wouldn’t be great to have an
activist fork appear somewhere, even if unlikely to be successful.

Relevant JIRAs are:

- HBASE-12677 :
Update replication docs to clarify terminology
- HBASE-13852 :
Replace master-slave terminology in book, site, and javadoc with a more
modern vocabulary
- HBASE-24576 :
Changing "whitelist" and "blacklist" in our docs and project

In response to this proposal, a member of the PMC asked if the term
'master' used by itself would be fine, because we only have use of 'slave'
in replication documentation and that is easily addressed. In response to
this question, others on the PMC suggested that even if only 'master' is
used, in this context it is still a problem.

For folks who are surprised or lacking context on the details of this
discussion, one PMC member offered a link to this draft RFC as background:
https://tools.ietf.org/id/draft-knodel-terminology-00.html

There was general support for removing the term "master" / "hmaster" from
our code base and using the terms "coordinator" or "leader" instead. In the
context of replication, "worker" makes less sense and perhaps "destination"
or "follower" would be more appropriate terms.

One PMC member's thoughts on language and non-native English speakers is
worth including in its entirety:

While words like blacklist/whitelist/slave clearly have those negative
references, word master might not have the same impact for non native
English speakers like myself where the literal translation to my mother
tongue does not have this same bad connotation. Replacing all references
for word *master *on our docs/codebase is a huge effort, I guess such a
decision would be more suitable for native English speakers folks, and
maybe we should consider the opinion of contributors from that ethinic
minority as well?

These are good questions for public discussion.

We have a consensus in the PMC, at this time, that is supportive of making
the above discussed terminology changes. However, we also have concerns
about what it would take to accomplish meaningful changes. Several on the
PMC offered support in the form of cycles to review pull requests and
patches, and two PMC members offered personal bandwidth for creating and
releasing new code lines as needed to complete a deprecation cycle.

Unfortunately, the terms "master" and "hmaster" appear throughout our code
base in class names, user facing API subject to our project compatibility
guidelines, and configuration variable names, which are also implicated by
compatibility guidelines given the impact of changes to operators and
operations. The changes being discussed are not backwards compatible
changes and cannot be executed with swiftness while simultaneously
preserving compatibility. There must be a deprecation cycle. First, we must
tag all implicated public API and configuration variables as deprecated,
and release HBase 3 with these deprecations in place. Then, we must
undertake rename and removal as appropriate, and release the resul
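(For readers unfamiliar with what such a deprecation cycle looks like in code, here is a hypothetical Java sketch. The class, constant, and configuration key names below are invented for illustration and are not actual HBase API.)

import org.apache.hadoop.conf.Configuration;

public final class ExampleConstants {
  /**
   * @deprecated since 3.0.0; the coordinator role replaces the old name.
   *             Use {@link #COORDINATOR_PORT_KEY} instead.
   */
  @Deprecated
  public static final String MASTER_PORT_KEY = "hbase.master.port";

  /** New configuration key introduced alongside the deprecated one. */
  public static final String COORDINATOR_PORT_KEY = "hbase.coordinator.port";

  /** Read the new key first, falling back to the deprecated key for old configs. */
  public static int getCoordinatorPort(Configuration conf) {
    return conf.getInt(COORDINATOR_PORT_KEY, conf.getInt(MASTER_PORT_KEY, 16000));
  }

  private ExampleConstants() {}
}

Old configuration files keep working during the deprecation window, and the deprecated key can be removed in the following major release.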

Re: Hbase WAL Class

2020-06-19 Thread Josh Elser

https://hbase.apache.org/mail-lists.html

On 6/18/20 9:10 PM, Govindhan S wrote:

Hello Josh,

Great Day.

I don't see a subscribe option over there. Could you please educate me 
more on this.


~ Govins

On Friday, 19 June, 2020, 02:17:13 am IST, Josh Elser 
 wrote:



Please subscribe to the list so that you see when people reply to you.

https://lists.apache.org/thread.html/r68d91878bb6576850233bce83baa3479a19fedeeb32c76151c8c9abc%40%3Cuser.hbase.apache.org%3E

On 6/16/20 12:40 PM, Josh Elser wrote:
 > `hbase wal` requires you to provide options. You provided none, so the
 > command printed you the help message.
 >
 > Please read the help message and provide the necessary ""
 > argument(s).
 >
 > On 6/16/20 11:57 AM, Govindhan S wrote:
 >> Hello Hbase Users,
 >> I am a newbie to hbase. I do have a HDInsight HDP2.6.2.3 cluster
 >> running with hbase version 1.1.2
 >> When i try to use the WAL utility its failing with the below issues :
 >> $ hbase wal -p > hbase_WAL_output.txt
 >> $ cat hbase_WAL_output.txt
 >> usage: WAL  [-h] [-j] [-p] [-r ] [-s ] [-w ]
 >>  -h,--help        Output help message
 >>  -j,--json        Output JSON
 >>  -p,--printvals   Print values
 >>  -r,--region      Region to filter by. Pass encoded region name; e.g.
 >>                   '9192caead6a5a20acb4454ffbc79fa14'
 >>  -s,--sequence    Sequence to filter by. Pass sequence number.
 >>  -w,--row         Row to filter by. Pass row name.
 >>
 >> I thought the issue with the capital WAL needs to be defined, but when
 >> i tried that:
 >> $ hbase WAL -p > hbase_WAL_output.txt
 >> Error: Could not find or load main class WAL
 >> Warm Regards,Govindhan S
 >>


Re: Hbase WAL Class

2020-06-18 Thread Josh Elser

Please subscribe to the list so that you see when people reply to you.

https://lists.apache.org/thread.html/r68d91878bb6576850233bce83baa3479a19fedeeb32c76151c8c9abc%40%3Cuser.hbase.apache.org%3E

On 6/16/20 12:40 PM, Josh Elser wrote:
`hbase wal` requires you to provide options. You provided none, so the 
command printed you the help message.


Please read the help message and provide the necessary "" 
argument(s).


On 6/16/20 11:57 AM, Govindhan S wrote:

Hello Hbase Users,
I am a newbie to hbase. I do have a HDInsight HDP2.6.2.3 cluster 
running with hbase version 1.1.2

When i try to use the WAL utility its failing with the below issues :
$ hbase wal -p > hbase_WAL_output.txt
$ cat hbase_WAL_output.txt
usage: WAL  [-h] [-j] [-p] [-r ] [-s ] [-w ]
 -h,--help        Output help message
 -j,--json        Output JSON
 -p,--printvals   Print values
 -r,--region      Region to filter by. Pass encoded region name; e.g.
                  '9192caead6a5a20acb4454ffbc79fa14'
 -s,--sequence    Sequence to filter by. Pass sequence number.
 -w,--row         Row to filter by. Pass row name.


I thought the issue with the capital WAL needs to be defined, but when 
i tried that:
$ hbase WAL -p > hbase_WAL_output.txt
Error: Could not find or load main class WAL

Warm Regards,Govindhan S



Re: HBase Master initialized long time after upgrade from 1.x to 2.x.

2020-06-18 Thread Josh Elser
I recall a version of HBase 2 where MasterProcWALs didn't get cleaned 
up. Given the ID count in your pv2 wal file names is up to 200K's, I 
would venture a guess that the master is just spinning to process a 
bunch of old procedures.


You could try to move them to the side and be prepared to use HBCK2 to 
fix anything that is not assigned properly as a result of SCPs not running.
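
As an illustration only (not an official procedure; the directory paths are assumptions based on the log below), sidelining the old procedure WALs with the Hadoop FileSystem API while the Master is stopped could look like this:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SidelineProcWals {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);
    // Directory names are assumptions; check hbase.rootdir for your install.
    Path procWals = new Path("/apps/hbase/data/MasterProcWALs");
    Path sideline = new Path("/apps/hbase/data/MasterProcWALs.sidelined");
    fs.mkdirs(sideline);
    for (FileStatus st : fs.listStatus(procWals)) {
      // Move each pv2-*.log out of the way; the Master starts with a clean slate.
      fs.rename(st.getPath(), new Path(sideline, st.getPath().getName()));
    }
    fs.close();
  }
}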


Gentle reminder: please always share the full version details when 
asking questions.


On 6/18/20 10:03 AM, 郭文傑 (Rock) wrote:

  Hi,

My HBase cluster has about 50TB of data.
After upgrading to 2.x, the Master takes a long time to start up.
I checked the log; the master is trying to reload WALs.
It has already been initializing for more than 75 hours, and I'm afraid it will take more days.
How can I handle this issue?

The below is master log:
2020-06-18 21:59:14,690 INFO  [PEWorker-4] zookeeper.MetaTableLocator:
Setting hbase:meta (replicaId=0) location in ZooKeeper as
persp-16.persp.net,16020,1592295446983

2020-06-18 21:59:14,691 INFO  [PEWorker-4]
assignment.RegionTransitionProcedure: Dispatch pid=192, ppid=145,
state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; AssignProcedure
table=hbase:meta, region=1588230740
2020-06-18 21:59:14,897
INFO  [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=16000]
assignment.AssignProcedure: Retry=625296 of max=2147483647; pid=192,
ppid=145, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true;
AssignProcedure table=hbase:meta, region=1588230740; rit=OPENING, location=
persp-16.persp.net,16020,1592295446983

2020-06-18 21:59:14,924 WARN  [WALProcedureStoreSyncThread]
wal.WALProcedureStore: procedure WALs count=456 above the warning threshold
10. check running procedures to see if something is stuck.
2020-06-18 21:59:14,924 INFO  [WALProcedureStoreSyncThread]
wal.WALProcedureStore: Rolled new Procedure Store WAL, id=219931
2020-06-18 21:59:14,929 INFO  [PEWorker-6] assignment.AssignProcedure:
Starting pid=192, ppid=145, state=RUNNABLE:REGION_TRANSITION_QUEUE,
locked=true; AssignProcedure table=hbase:meta, region=1588230740;
rit=OFFLINE, location=null; forceNewPlan=true, retain=false target svr=null
2020-06-18 21:59:14,965 INFO  [WALProcedureStoreSyncThread]
wal.WALProcedureStore: Remove the oldest log
hdfs://ha:8020/apps/hbase/data/MasterProcWALs/pv2-00219476.log
2020-06-18 21:59:14,965 INFO  [WALProcedureStoreSyncThread]
wal.ProcedureWALFile: Archiving
hdfs://ha:8020/apps/hbase/data/MasterProcWALs/pv2-00219476.log
to hdfs://ha:8020/apps/hbase/data/oldWALs/pv2-00219476.log
2020-06-18 21:59:15,109 INFO  [PEWorker-1] zookeeper.MetaTableLocator:
Setting hbase:meta (replicaId=0) location in ZooKeeper as
persp-30.persp.net,16020,1592295446723

2020-06-18 21:59:15,110 INFO  [PEWorker-1]
assignment.RegionTransitionProcedure: Dispatch pid=192, ppid=145,
state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true; AssignProcedure
table=hbase:meta, region=1588230740
2020-06-18 21:59:15,313
INFO  [RpcServer.priority.FPBQ.Fifo.handler=19,queue=1,port=16000]
assignment.AssignProcedure: Retry=625297 of max=2147483647; pid=192,
ppid=145, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true;
AssignProcedure table=hbase:meta, region=1588230740; rit=OPENING, location=
persp-30.persp.net,16020,1592295446723

2020-06-18 21:59:15,313 INFO  [PEWorker-14] assignment.AssignProcedure:
Starting pid=192, ppid=145, state=RUNNABLE:REGION_TRANSITION_QUEUE,
locked=true; AssignProcedure table=hbase:meta, region=1588230740;
rit=OFFLINE, location=null; forceNewPlan=true, retain=false target svr=null




Re: Hbase WAL Class

2020-06-16 Thread Josh Elser
`hbase wal` requires you to provide options. You provided none, so the 
command printed you the help message.


Please read the help message and provide the necessary "" 
argument(s).


On 6/16/20 11:57 AM, Govindhan S wrote:

Hello Hbase Users,
I am a newbie to hbase. I do have a HDInsight HDP2.6.2.3 cluster running with 
hbase version 1.1.2
When i try to use the WAL utility its failing with the below issues :
$ hbase wal -p > hbase_WAL_output.txt
$ cat hbase_WAL_output.txt
usage: WAL  [-h] [-j] [-p] [-r ] [-s ] [-w ]
 -h,--help        Output help message
 -j,--json        Output JSON
 -p,--printvals   Print values
 -r,--region      Region to filter by. Pass encoded region name; e.g.
                  '9192caead6a5a20acb4454ffbc79fa14'
 -s,--sequence    Sequence to filter by. Pass sequence number.
 -w,--row         Row to filter by. Pass row name.

I thought the issue with the capital WAL needs to be defined, but when i tried 
that:
$ hbase WAL -p > hbase_WAL_output.txt
Error: Could not find or load main class WAL
Warm Regards,Govindhan S



Re: Too many connections from / - max is 60

2020-06-02 Thread Josh Elser
HBase (daemons) try to use a single connection for themselves. A RS also 
does not need to mutate state in ZK to handle things like gets and puts.


Phoenix is probably the thing you need to look at more closely 
(especially if you're using an old version of Phoenix that matches the 
old HBase 1.1 version). Internally, Phoenix acts like an HBase client 
which results in a new ZK connection. There have certainly been bugs 
like that in the past (speaking generally, not specifically).
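
As an aside, here is a minimal Java sketch of the client-side pattern that keeps the ZooKeeper connection count low: one shared Connection per application, with lightweight Table instances per request. The class and table names are placeholders.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SharedConnectionExample {
  // One heavyweight Connection (and thus one ZK session) for the whole app.
  private static final Connection CONN;
  static {
    try {
      CONN = ConnectionFactory.createConnection(HBaseConfiguration.create());
    } catch (Exception e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  public static Result lookup(String row) throws Exception {
    // Table instances are lightweight; create and close them per use.
    try (Table table = CONN.getTable(TableName.valueOf("my_table"))) {
      return table.get(new Get(Bytes.toBytes(row)));
    }
  }
}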


On 6/1/20 5:59 PM, anil gupta wrote:

Hi Folks,

We are running in HBase problems due to hitting the limit of ZK
connections. This cluster is running HBase 1.1.x and ZK 3.4.6.x on I3en ec2
instance type in AWS. Almost all our Region server are listed in zk logs
with "Too many connections from / - max is 60".
2020-06-01 21:42:08,375 - WARN  [NIOServerCxn.Factory:
0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@193] - Too many connections from
/ - max is 60

  On a average each RegionServer has ~250 regions. We are also running
Phoenix on this cluster. Most of the queries are short range scans but
sometimes we are doing full table scans too.

   It seems like one of the simple fix is to increase maxClientCnxns
property in zoo.cfg to 300, 500, 700, etc. I will probably do that. But, i
am just curious to know In what scenarios these connections are
created/used(Scans/Puts/Delete or during other RegionServer operations)?
Are these also created by hbase clients/apps(my guess is NO)? How can i
calculate optimal value of maxClientCnxns for my cluster/usage?



Re: DISCUSS: Move hbase-thrift and hbase-rest out of core to hbase-connectors project?

2020-04-27 Thread Josh Elser

+1 to the idea, -0 to the implied execution

I agree hbase-connectors is a better place for REST and thrift, long term.

My concern is that I read this thread as suggesting:

1. Remove rest/thrift from 2.3
1a. Proceed with 2.3.0 rc's
2. Add rest/thrift to hbase-connectors
...
n. Release hbase-connectors

I'm not a fan of removing anything which was previously there until 
there are new releases and documentation to tell me how to do it. I'm 
still trying to help dig out another project who did the "remove and 
then migrate" and left a pile of busted.


If that's not what you were suggesting, let me shirk back into the 
shadows ;)


On 4/25/20 7:44 PM, Stack wrote:

On Fri, Apr 24, 2020 at 10:06 PM Sean Busbey  wrote:


By "works with it" do you mean has documented steps to work with it or do
you mean that the convenience binary that ships for 2.3.0 will have the
same deployment model as prior 2.y releases where I can run those services
directly from the download?



Former. Not the latter. They would no-longer be part of the hbase-2.3.x
distribution.
S






Re: Weakly Configured XML External Entity for Java JAXBContext

2020-03-11 Thread Josh Elser
Per the guidance on the HBase book preface[1], I'll forward Barani's 
question to the HBase private list. I'd kindly request no further 
communication here until the question can be properly evaluated.


Thanks.

[1] https://hbase.apache.org/book.html#_preface

On 3/10/20 1:07 PM, Barani Bikshandi wrote:

I was notified of a security issue recently in the below package. Is there a 
plan to fix this vulnerability in near future?

Risk Name
Weakly Configured XML External Entity for Java JAXBContext

Vulnerability
An attacker can inject untrusted data into applications which may result in the 
disclosure of confidential data, denial of service, server side request 
forgeries or port scanning.

Code:
/hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/rest/client/RemoteAdmin.java

Mitigation:
We require that XML processors need to be configured properly to prevent XXE 
(XML External Entity) attack when an application handles data from untrusted 
source.



Re: The HBase issue urls are very slow to open recently

2020-02-15 Thread Josh Elser
Hi Junhong,

We don't run the Jira instance at issues.apache.org, we just use it. I
would suggest you contact the ASF infra team at us...@infra.apache.org.
They will have the ability to help debug this with you.

On Thu, Feb 13, 2020, 04:56 Junhong Xu  wrote:

> Hello, guys:
> I am in China, and it is very slow for me to browse the hbase issue
> urls. In most cases, they fail. I checked the urls and found that some
> javascripts and pngs are missing. How can I fix it? Thanks.
>


Re: HBase 2.1 client intermittent Kerberos failure in IBM JVM

2020-02-10 Thread Josh Elser
There have been multiple issues filed in Hadoop relating to the 
implementation differences of IBM Java compared to Oracle Java and 
OpenJDK [1]. Make sure that you're not running into any of them as a 
first step.


After that, you'd want to compare the differences of the Java platforms, 
with krb5 JVM level debugging *and* org.apache.hadoop.security debugging 
enabled, and understand what is fundamentally different in their 
runtimes. From there, hopefully it becomes obvious what the solution is. 
Sometimes it's just different JAAS options that will need to be added 
into UserGroupInformation.


Having done it before, it's not a super-fun experience. Your other 
solution is to just not use IBM Java. Good luck.


[1] https://www.google.com/search?q=hadoop+jaas+ibm+site%3Aissues.apache.org
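
For illustration, a minimal sketch of enabling that debugging from a standalone Java client follows; the properties below are the usual JDK and log4j 1.x knobs and should be double-checked against the exact JVMs being compared.

public class KerberosDebugExample {
  public static void main(String[] args) throws Exception {
    // JDK-level Kerberos/GSS tracing (Oracle/OpenJDK naming).
    System.setProperty("sun.security.krb5.debug", "true");
    System.setProperty("sun.security.jgss.debug", "true");
    // Hadoop UserGroupInformation tracing via log4j 1.x.
    org.apache.log4j.LogManager.getLogger("org.apache.hadoop.security")
        .setLevel(org.apache.log4j.Level.DEBUG);
    // ... then run the failing loginUserFromKeytab()/doAs() sequence on each JVM
    // and diff the output between the Oracle and IBM runs.
  }
}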

On 2/7/20 10:57 AM, kyip wrote:

Hi,

I have an application  that has been working with HBase 1.x servers using
Kerberos authentication for a while.

I upgraded the application to support HBase 2.1 servers recently. The
application is working fine in Oracle JVM but not in IBM JVM (both Java
1.8).

In IBM JVM, after the successful UserGroupInformation.loginUserFromKeytab(),
it always fails to find the javax.security.auth.Subject during the
PROCESS_TGS step and the TGS_REQ was never sent for the /hbase service. So,
in order to address this, I made use of
UserGroupInformation.getCurrentUser().doAs() where the wrapped action can be an HBase availability check, connection creation, getting
table names, a table scan, a put, a get, etc. This approach seems to work except I
am facing intermittent failures where the following error is logged:

[2/7/20 6:50:20:682 GMT] 014e SystemErr
R javax.security.sasl.SaslException: Call to
eng-bigbang-hadoop01.rpega.com/10.20.204.19:16020 failed on local exception:
javax.security.sasl.SaslException: Failure to initialize security context
[Caused by org.ietf.jgss.GSSException, major code: 11, minor code: 0
 major string: General failure, unspecified at GSSAPI level
 minor string: Cannot get credential for principal default principal]
[Caused by javax.security.sasl.SaslException: Failure to initialize security
context [Caused by org.ietf.jgss.GSSException, major code: 11, minor code: 0
 major string: General failure, unspecified at GSSAPI level
 minor string: Cannot get credential for principal default
principal]]

This is the same error that consistently happens before I used the
UserGroupInformation.getCurrentUser().doAs() technique.
It seems to me somehow the "login context" was lost occasionally and that is
why the logged in Subject cannot be found.

Not sure how this is relevant to the issue here, but from my debugging sessions,
what I notice is that HBase 1.x performs the PROCESS_TGS step in the same thread
as the initial steps while HBase 2.1 performs the step in a separate thread.

Since my application has been working with HBase 1.x servers (in both Oracle
and IBM JVM's) and my application also works properly with HDFS services in
Kerberos configuration in both Oracle and IBM JVM's, this seems to be a
HBase 2.x issue. (I also tried HBase 2.2 client jars which did not help.)

Any suggestion on how to address or troubleshoot this issue is greatly
appreciated.


Best Regards,

Kai





--
Sent from: http://apache-hbase.679495.n3.nabble.com/HBase-User-f4020416.html



Re: How to avoid write hot spot, While using cross row transactions.

2020-02-04 Thread Josh Elser
They are not dead. I have personally gone through the efforts to keep 
them alive under the Apache Phoenix PMC.


If you have an interest in them, please get involved :)

On 2/3/20 9:13 PM, Kang Minwoo wrote:

I looked around Apache Omid and Apache Tephra.
They seem to be dead.
Are these projects still improving?

Best regards,
Minwoo Kang


From: Kang Minwoo 
Sent: Friday, January 10, 2020 15:37
To: hbase-user
Subject: Re: How to avoid write hot spot, While using cross row transactions.

Thanks for your reply.

I will look around phoenix and tephra.

Best regards,
Minwoo Kang


From: 张铎 (Duo Zhang) 
Sent: Friday, January 10, 2020 15:14
To: hbase-user
Subject: Re: How to avoid write hot spot, While using cross row transactions.

Maybe you need Phoenix?

You need to use special algorithm to get cross region transactions on
HBase. IIRC, Phoenix has a sub project call Txxx, which implements the
algorithm described in the google percolator paper.

Thanks.

Reid Chan  于2020年1月10日周五 下午1:47写道:


I think you need some more coding work to fulfill atomicity in a cross-region
scenario, with the aid of some third-party software, like ZooKeeper.

AFAIK, the Procedure framework in the Master may also have the ability to do that, but
I'm not sure of the details or whether it supports client-customized
procedures (I remember the answer is negative).

Last but not least, what about trying Phoenix?



--

Best regards,
R.C




From: Kang Minwoo 
Sent: 10 January 2020 12:51
To: user@hbase.apache.org
Subject: How to avoid write hot spot, While using cross row transactions.

Hello, users.

I use MultiRowMutationEndpoint coprocessor for cross row transactions.
It has a constraint that rows must be located in the same region.
I removed random hash bytes in the row key.
After that, I suffer write hot-spot.

But cross row transactions are a core feature in my application. When I
put a new data row, I put an index row.

Before I use MultiRowMutationEndpoint coprocessor, I had a mismatch
between the data row and the index row.

Is there any best practice in that situation?
I want to avoid write hot-spot and use an index.


Best regards,
Minwoo Kang



[ANNOUNCE] Creation of user-zh mailing list

2020-01-31 Thread Josh Elser

Hi,

Apache HBase is blessed to have a diverse community made up of 
individuals from around the world. On behalf of the HBase PMC, I'm 
pleased to announce the creation of a new mailing list 
user...@hbase.apache.org.


The intent of this mailing list is to act as a place for users to ask 
questions about Apache HBase. Individuals who feel more comfortable 
communicating in Chinese should feel welcome to ask questions in Chinese 
on this list.


We hope that this list will be well received by our users and that it 
will help those who do not speak English as a first language engage with 
others more effectively. As with all things, please be mindful that 
questions may take some time to be answered on this list as only a 
subset of our HBase community can communicate in Chinese effectively.


The mailing list is available NOW, and the website will be updated 
shortly to include this mailing list[1]. Users can subscribe to this 
list via the standard approach: mailto:user-zh-subscr...@hbase.apache.org.


- Josh (on behalf of the HBase PMC)

[1] https://hbase.apache.org/mail-lists.html


Re: replication fix and hbase 1.4.13?

2020-01-27 Thread Josh Elser
That fix is already slated for 1.4.13. There would be an exceptional 
issue for it to not be included at this point. The "-1" you saw is just 
from an automated, nightly test, which is not always perfect.


The general cadence for releases is about 1 month. It looks like the 
Christmas season got in the way and we're due for 1.4.13 soon. You can 
watch the chatter on the dev mailing list to stay in the loop.


Your question about "stable" is not so easy to answer. It's a label 
assigned by the HBase developers to try to give a simple answer to a 
very complex question. Developers decide on what we feel is acceptable 
to call "stable", and will move it when we think it's 
appropriate/necessary. Our goal is to try to make the best experience 
for our users to save you all pain/heartache/grief.


One more thing I'd ask is that you remember that we're all volunteers 
here. Day jobs, families, and all sorts of other things can get in the 
way. This doesn't mean that something isn't, per se, urgent or critical 
to "do" (fix, investigate, release), but just that we don't have enough 
bandwidth to do everything we're asked to do.


Specifically around releases, we're always looking for more people to 
help drive the release process. Those who can corral Jira issues, do 
testing, and stage release candidates are very welcome and desired to 
help make our releases happen on a regular cadence. If you have the 
time/resources to help out, let us know on the dev list.


- Josh

On 1/27/20 11:04 AM, Whitney Jackson wrote:

Hi,

I've been running 1.4.12 with replication and experiencing the "random
region server aborts" as described here:

https://issues.apache.org/jira/browse/HBASE-23169

The underlying problem and fix (woohoo!) seems to be here:

https://issues.apache.org/jira/browse/HBASE-23205

I do see there is a failed jdk7 check noted at the end of 23205. Will that
prevent the fix from getting into 1.4.13? Also, when is 1.4.13 likely to
drop?

On a related note, how does the "stable" label work? I'm running 1.4.12 in
large part because it has that label. But as I discovered it also has this
known bug with a core feature (replication). It seems serious to me but the
urgency on fixing it seems to be low. That surprises me given the
importance of the bug and the "stable" label. Would I be better off keeping
up with the latest and greatest 2.x if my biggest concern is stability?

Whitney



Re: 【HBase】Hive queries return duplicate HBase data

2020-01-15 Thread Josh Elser

Hi Peng,

While we recognize that the Apache communities are global communities 
where people speak all languages, the ASF requests that communication is 
done in English[1]


Could you translate your original message for us, please?

[1] https://www.apache.org/foundation/policies/conduct#diversity-statement

On 1/15/20 8:06 AM, Peng Luo wrote:

Hi all,

    There is only one row in HBase; querying by the row_key returns only one row.

    A Hive external table created over this HBase table returns 2 identical rows for the same data.

*1. HBase table creation statement:*

hbase(main):003:0> describe 'test:table_name1'

COLUMN FAMILIES DESCRIPTION

{NAME => 'cf', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', 
REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '1', TTL => 
'FOREVER',


MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'FALSE', BLOCKSIZE => 
'65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}


1 row(s) in 0.1370 seconds

*2. HBase has only one record*

hbase(main):002:0> get ' test:table_name1','7772809'

COLUMN CELL

  cf:id timestamp=1579067194137, 
value=777280


*3. Hive table creation statement*

drop table `test.hive_table_name1`;

CREATE EXTERNAL TABLE `test.hive_table_name1`(`id` string )

ROW FORMAT SERDE

   'org.apache.hadoop.hive.hbase.HBaseSerDe'

STORED BY

   'org.apache.hadoop.hive.hbase.HBaseStorageHandler'

WITH SERDEPROPERTIES ('hbase.columns.mapping'=':key')

    TBLPROPERTIES ('hbase.table.name'=' test:table_name1')

*4. Hive query result*

Which aspects should I look into now to try to locate where this problem is?



Re: How to avoid write hot spot, While using cross row transactions.

2020-01-10 Thread Josh Elser
Minor clarification -- Phoenix out of the box doesn't actually need 
Tephra or Omid to support transactional index updates, but both of them 
are options you can choose to use.


The implementation of this has recently changed as well -- read up at 
https://issues.apache.org/jira/browse/PHOENIX-5156


On 1/10/20 1:14 AM, 张铎(Duo Zhang) wrote:

Maybe you need Phoenix?

You need to use a special algorithm to get cross-region transactions on
HBase. IIRC, Phoenix has a sub-project called Txxx, which implements the
algorithm described in the Google Percolator paper.

Thanks.

Reid Chan  于2020年1月10日周五 下午1:47写道:


I think you need some more coding work to fulfill atomicity in a cross-region
scenario, with the aid of some third-party software, like ZooKeeper.

AFAIK, the Procedure framework in the Master may also have the ability to do that, but
I'm not sure of the details or whether it supports client-customized
procedures (I remember the answer is negative).

Last but not least, what about trying Phoenix?



--

Best regards,
R.C




From: Kang Minwoo 
Sent: 10 January 2020 12:51
To: user@hbase.apache.org
Subject: How to avoid write hot spot, While using cross row transactions.

Hello, users.

I use MultiRowMutationEndpoint coprocessor for cross row transactions.
It has a constraint that rows must be located in the same region.
I removed random hash bytes in the row key.
After that, I suffer write hot-spot.

But cross row transactions are a core feature in my application. When I
put a new data row, I put an index row.

Before I use MultiRowMutationEndpoint coprocessor, I had a mismatch
between the data row and the index row.

Is there any best practice in that situation?
I want to avoid write hot-spot and use an index.


Best regards,
Minwoo Kang





Re: Completing a bulk load from HFiles stored in S3

2019-11-12 Thread Josh Elser
Thanks for the info, Austin. I'm guessing that's how 1.x works since you 
mention EMR?


I think this code has changed in 2.x with the SecureBulkLoad stuff 
moving into "core" (instead of external as a coproc endpoint).


On 11/12/19 10:39 AM, Austin Heyne wrote:
Sorry for the late reply. You should be able to bulk load files from S3 
as it will detect that they're not the same filesystem and have the 
regionservers copy the files locally and then up to HDFS. This is 
related to a problem I reported a while ago when using HBase on S3 with 
EMR.


https://issues.apache.org/jira/browse/HBASE-20774

-Austin

On 11/1/19 8:04 AM, Wellington Chevreuil wrote:
Ah yeah, didn't realise it would assume the same FS internally. Indeed, no way
to have rename working between different FSes.

On Thu, Oct 31, 2019 at 16:25, Josh Elser  
wrote:


Short answer: no, it will not work and you need to copy it to HDFS 
first.


IIRC, the bulk load code is ultimately calling a filesystem rename from
the path you provided to the proper location in the hbase.rootdir's
filesystem. I don't believe that an `fs.rename` is going to work across
filesystems because you can't do this atomically, which HDFS guarantees
for the rename method [1]

Additionally, for Kerberos-secured clusters, the server-side bulk load
logic expects that the filesystem hosting your hfiles is HDFS (in order
to read the files with the appropriate authentication). This fails right
now, but is something our PeterS is looking at.

[1]

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_rename.28Path_src.2C_Path_d.29 



On 10/31/19 6:55 AM, Wellington Chevreuil wrote:

I believe you can specify your s3 path for the hfiles directly, as hdfs
FileSystem does support s3a scheme, but you would need to add your s3
access and secret key to your completebulkload configuration.

On Wed, Oct 30, 2019 at 19:43, Gautham Acharya <
gauth...@alleninstitute.org> wrote:

If I have Hfiles stored in S3, can I run CompleteBulkLoad and 
provide an

S3 Endpoint to run a single command, or do I need to first copy the S3
Hfiles to HDFS first? The documentation is not very clear.



Re: HBase table snapshots compatibility between 1.0 and 2.1

2019-11-12 Thread Josh Elser

Hey Shuai,

You're likely to get some more traction with this question via 
contacting Cloudera's customer support channels. We try to keep this 
forum focused on Apache HBase versions.


If you are not seeing records after restoring, it sounds like there is 
some (missing?) metadata in the old version which is not handled in the 
newer versions.


As far as your procedures, you could combine your 
create/disable/restore_snapshot to just use clone_snapshot instead. 
However, if there is some incompatibility, this is of little consequence.


You could try to use CopyTable instead of snapshots.
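
For reference, the clone_snapshot suggestion above expressed through the Java Admin API (the snapshot and table names are taken from the example in the question below); this is only a sketch and assumes the exported snapshot already exists on C2:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CloneSnapshotExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin()) {
      // Creates and assigns table 't1' from the exported snapshot in one step,
      // replacing the create/disable/restore_snapshot/enable sequence.
      admin.cloneSnapshot("t1s1", TableName.valueOf("t1"));
    }
  }
}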

On 11/11/19 1:47 AM, Shuai Lin wrote:

Hi all,

TL;DR Could table snapshots taken in hbase 1.0 be used in hbase 2.1?

We have an existing production hbase 1.0 cluster (CDH 5.4) , and we're
setting up a new cluster with hbase 2.1 (CDH 6.3). Let's call the old
cluster C1 and new one C2.

To migrate the existing data from C1 to C2, we plan to use the "snapshot +
replication" approach (snapshot would capture the existing part, and
replication would do the incremental part) . However when I was testing the
feasibility of this approach locally, I found that the snapshot could be
successfully exported to C2, but the restored table on C2 has no data.

Here is a minimal reproducible example:

1. on C1: take the snapshot and export it to C2

hbase shell:
 create "t1", {"NAME"=>"f1", "REPLICATION_SCOPE" => 1}
 put "t1", "r1", "f1:c1", "value"
 snapshot 't1', 't1s1'

sudo -u hdfs hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot
-snapshot t1s1 \
 -copy-to hdfs://c2:8020/hbase -mappers 1

2. Then on C2 restore the table

hbase shell:
 create "t1", {"NAME"=>"f1", "REPLICATION_SCOPE" => 1}
 disable "t1"
 restore_snapshot "t1s1"
 enable "t1"
 scan "t1"

All these steps succeeds, except that the final "scan" command shows no
data at all. Also worth noting that on the master web ui on C2 it shows the
table t1 has two regions and one is not assigned - It shall have only one
region obviously.

So my question is:  Could table snapshots taken in hbase 1.0 be used in
hbase 2.1?
- If yes, anything I'm doing wrong here?
- If no, is there any workaround? (e.g. performing some preprocessing on
the snapshot data on hbase 2.1 side before restoring it?)

If this can't work, the only alternative way to migrate the data is to
install hbase 1.0 on C2 (so it could use the snapshot from C1), and upgrade
it to hbase 2.1 after restoring the snapshot. I'd like to avoid going
this way as much as possible because it would be too cumbersome.

Any information would be much appreciated, thx!



Re: Completing a bulk load from HFiles stored in S3

2019-10-31 Thread Josh Elser

Short answer: no, it will not work and you need to copy it to HDFS first.

IIRC, the bulk load code is ultimately calling a filesystem rename from 
the path you provided to the proper location in the hbase.rootdir's 
filesystem. I don't believe that an `fs.rename` is going to work across 
filesystems because you can't do this atomically, which HDFS guarantees 
for the rename method [1]


Additionally, for Kerberos-secured clusters, the server-side bulk load 
logic expects that the filesystem hosting your hfiles is HDFS (in order 
to read the files with the appropriate authentication). This fails right 
now, but is something our PeterS is looking at.


[1] 
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/filesystem/filesystem.html#boolean_rename.28Path_src.2C_Path_d.29
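
As a rough sketch of the "copy to HDFS first, then bulk load" flow, the bucket, paths, and table name below are placeholders; note that the LoadIncrementalHFiles class moved to the org.apache.hadoop.hbase.tool package in 2.x, so adjust the import for newer versions:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class S3ToHdfsBulkLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Path s3Dir = new Path("s3a://my-bucket/hfiles/my_table");   // placeholder
    Path hdfsDir = new Path("hdfs:///tmp/bulkload/my_table");   // placeholder
    FileSystem s3 = FileSystem.get(URI.create(s3Dir.toString()), conf);
    FileSystem hdfs = FileSystem.get(URI.create(hdfsDir.toString()), conf);
    // Copy the HFile directory into the same filesystem as hbase.rootdir.
    FileUtil.copy(s3, s3Dir, hdfs, hdfsDir, false, conf);
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Admin admin = conn.getAdmin()) {
      TableName tn = TableName.valueOf("my_table");
      try (Table table = conn.getTable(tn);
           RegionLocator locator = conn.getRegionLocator(tn)) {
        new LoadIncrementalHFiles(conf).doBulkLoad(hdfsDir, admin, table, locator);
      }
    }
  }
}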


On 10/31/19 6:55 AM, Wellington Chevreuil wrote:

I believe you can specify your s3 path for the hfiles directly, as hdfs
FileSystem does support s3a scheme, but you would need to add your s3
access and secret key to your completebulkload configuration.

On Wed, Oct 30, 2019 at 19:43, Gautham Acharya <
gauth...@alleninstitute.org> wrote:


If I have Hfiles stored in S3, can I run CompleteBulkLoad and provide an
S3 Endpoint to run a single command, or do I need to first copy the S3
Hfiles to HDFS first? The documentation is not very clear.





Re: Maintaining materialized views in Hbase

2019-09-25 Thread Josh Elser
You might get some more traction on user@phoenix since you're not really 
asking an HBase specific question here.


Phoenix doesn't have any native capabilities to create/maintain 
materialized views for you, but, if your data sets infrequently change, 
you could manage that aspect on your own.
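
To make the "manage it yourself" idea concrete, here is a minimal sketch of publishing an offline-computed aggregate into a small lookup table; the table, family, and row names are made up for illustration.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PublishColumnMean {
  public static void main(String[] args) throws Exception {
    double mean = 42.5; // value produced by the offline recompute job
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table aggTable = conn.getTable(TableName.valueOf("matrix_mean"))) {
      // One row per matrix column; reads become a single, low-latency Get.
      Put put = new Put(Bytes.toBytes("gene_00017"));
      put.addColumn(Bytes.toBytes("a"), Bytes.toBytes("mean"), Bytes.toBytes(mean));
      aggTable.put(put);
    }
  }
}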


On 9/20/19 1:18 PM, Gautham Acharya wrote:

Hi,

Currently I'm using Hbase to store large, sparse matrices of 50,000 columns by 10+ 
million rows of integers.

This matrix is used for fast, random access - we need to be able to fetch 
random row/column subsets, as well as entire columns. We also want to very 
quickly fetch aggregates (Mean, median, etc) on this matrix.

The data does not change very often for these matrices, so pre-computing is 
very feasible here. What I would like to do is maintain a column store (store 
the column names as row keys, and a compressed list of all the row values) for 
the use case where we select an entire column. Additionally, I would like to 
maintain a separate table for each precomputed aggregate (median table, mean 
table, etc).

The query time for all these use cases needs to be low latency - under 100ms.

When the data does change for a certain matrix, it would be nice to easily 
update the optimized table. Ideally, I would like the column store/aggregation 
tables to just be materialized views of the original matrix. It doesn't look 
like Apache Phoenix supports materialized views. It looks like Hive does, but 
unfortunately Hive doesn't normally offer low latency queries.

Maybe Hive can create the materialized view, and we can just query the 
underlying Hbase store for lower latency responses?

What would be a good solution for this?

--gautham




Re: HBase Scan consumes high cpu

2019-09-10 Thread Josh Elser
Deletes are held in memory. They represent data you have to traverse 
until that data is flushed out to disk. When you write a new cell with a 
qualifier of 10, that sorts, lexicographically, "early" with respect to 
the other qualifiers you've written.


By that measure, if you are only scanning for the first column in this 
row which you've loaded with deletes, it would make total sense to me 
that the first case is slow and the second case is fast.


Can you please share exactly how you execute your "query" for both(all) 
scenarios?
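
For comparison, here is a hedged sketch of how such a single-row, column-range read is commonly expressed; the table name, row key, and bounds are placeholders mirroring the test described in the quoted message below.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class ColumnRangeRead {
  public static void main(String[] args) throws Exception {
    try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table table = conn.getTable(TableName.valueOf("my_table"))) {   // placeholder
      Get get = new Get(Bytes.toBytes("the-single-rowkey"));             // placeholder
      get.setFilter(new ColumnRangeFilter(
          Bytes.toBytes("1499000"), true,     // min column, inclusive
          Bytes.toBytes("1499010"), false));  // max column, exclusive
      Result result = table.get(get);
      System.out.println("cells returned: " + result.size());
    }
  }
}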


On 9/10/19 11:35 AM, Solvannan R M wrote:

Hi,

We have been using HBase (1.4.9) for a case where timeseries data is 
continuously inserted and deleted (high churn) against a single rowkey. The column 
keys would represent timestamp more or less. When we scan this data using 
ColumnRangeFilter for a recent time-range, scanner for the stores (memstore & 
storefiles) has to go through contiguous deletes, before it reaches the requested 
timerange data. While using this scan, we could notice 100% cpu usages in single 
core by the regionserver process.

So, for our case, most of the cells with older timestamps will be in deleted 
state. While traversing these deleted cells, the regionserver process causing 
100% cpu usage in single core.

We tried to trace the code for scan and we observed the following behaviour.

1. While scanner is initialized, it seeked all the store-scanners to the start 
of the rowkey.
2. Then it traverses the deleted cells and discards it (as it was deleted) one 
by one.
3. When it encounters a valid cell (put type), it applies the filter and it 
returns SEEK_TO_NEXT_USING_HINT.
4. Now the scanner seeks to the required key directly and returning the results 
quickly then.

For confirming the mentioned behaviour, we have done a test:
1. We have populated a single rowkey with column qualifier as a range of 
integers of 0 to 150 with random data.
2. We then deleted the column qualifier range of 0 to 1499000.
3. Now the data is only in memsore. No store file exists.
4. Now we scanned the rowkey with ColumnRangeFilter[1499000, 1499010).
5. The query took 12 seconds to execute. During this query, a single core is 
completely used
6. Then we put a new cell with qualifier 10.
7. Executed the same query, it took 0.018 seconds to execute.

Kindly check this and advise !.

Regards,
Solvannan R M



Re: Access request to dev and users slack channel

2019-07-22 Thread Josh Elser

Luoc was added already.

It's not clear to me if Nestor was also asking for an invitation, but I 
sent one to them anyways.


On 7/21/19 8:11 PM, Néstor Boscán wrote:

On Sun, Jul 21, 2019 at 10:33 AM luoc  wrote:


Hello,


I am interested in starting to make contributions, and I want to request access
to the Hbase slack channels for the email address
luocoo...@qq.com


Thank you.




Re: Reporting Error: FSHLog

2019-06-12 Thread Josh Elser

Hi Rebekah,

By default, images get stripped out of emails.

Any chance you could host these on some external service or, better yet, 
provide the errors you're seeing in text form?


If I had to guess, I'd make sure that you have the Hadoop 3.1.2 JARs 
included on the HBase classpath. By default, you'll get Hadoop 2.x JARs 
with our binary tarball. If this is what happens, you'll want to build 
your own binary tarball using the src release. Something close to `mvn 
package assembly:single -Dhadoop.profile=3.0 
-Dhadoop-three.version=3.1.2 -DskipTests`. The resulting tarball will be 
in `hbase-assembly/target`.


- Josh

On 6/12/19 10:35 AM, Rebekah K. wrote:

Hello,

I was recently trying to install and run hbase with my hadoop 
installation and was wanting to report the following error I ran into 
and how I was able to solve it...


Hadoop Version: Hadoop 3.1.2
Hbase Version: 2.1.5

I was able to get Hbase up and running and bring up the Hbase shell. I 
was able to successfully run a 'list' command right off the bat but when 
I tried to run a 'create' command I would get the following error


image.png
I would also get the following error in my regionserver logs...
image.png
I found from your guy's git hub 
(https://github.com/apache/hbase/blob/master/src/main/asciidoc/_chapters/troubleshooting.adoc#trouble.rs.startup) 
a related error with FSHLog to add the following to the hbase config file...



<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>


After adding that and resetting hbase, the error no longer occurred.

Thank you guys for your help, just wanted to report this in case it is 
useful :)


Best,

Rebekah Kambara


Re: Disk hot swap for data node while hbase use short-circuit

2019-06-01 Thread Josh Elser
Reminds me of https://issues.apache.org/jira/browse/HBASE-21915 too. 
Agree with Wei-Chiu that I'd start by ruling out HDFS issues first, and 
then start worrying about HBase issues :)


On 6/1/19 8:05 PM, Wei-Chiu Chuang wrote:

I think I found a similar bug report that matches your symptom: HDFS-12204
 (Dfsclient Do not close
file descriptor when using shortcircuit)

On Wed, May 29, 2019 at 11:37 PM Kang Minwoo 
wrote:


I think these files are opened for reads, because the block is finalized.

---
ls -al /proc/regionserver_pid/fd
902 -> /data_path/current/finalized/~/blk_1 (deleted)
946 -> /data_path/current/finalized/~/blk_2 (deleted)
947 -> /data_path/current/finalized/~/blk_3.meta (deleted)
---

I think it is not an HBase bug. This is because DFSClient checks for stale fds
when the fetch method is invoked.

Best regards,
Minwoo Kang


From: Wei-Chiu Chuang 
Sent: Wednesday, May 29, 2019 20:51
To: user@hbase.apache.org
Subject: Re: Disk hot swap for data node while hbase use short-circuit

Do you have a list of files that was being opened? I'd like to know if
those are files opened for writes or for reads.

If you are on the more recent version of Hadoop (2.8.0 and above),
there's a HDFS command to interrupt ongoing writes to DataNodes (HDFS-9945
)


https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HDFSCommands.html#dfsadmin
hdfs dfsadmin -evictWriters

Looking at HDFS hotswap implementation, it looks like DataNode doesn't
interrupt writers when a volume is removed. That sounds like a bug.

On Tue, May 28, 2019 at 9:39 PM Kang Minwoo 
wrote:


Hello, Users.

I use JBOD for the data nodes. Sometimes a disk in a data node has a
problem.

The first time, I shut down every instance, including the data node and region
server, on the machine that had the disk problem.
But it is not a good solution, so I improved the process.

When I detect a disk problem on a server, I just perform a disk hot swap.

But the system administrators complain about some FDs that are still open, so they
cannot remove the disk.
The regionserver holds the FDs; I use the short-circuit reads feature. (HBase version
1.2.9)

When we first met this issue, we force-unmounted the disk and remounted it.
But after this process, the kernel reported errors[1].

So we avoid this issue by purging the stale FDs.

I think this issue is common.
I want to know how hbase-users deal with this issue.

Thank you very much for sharing your experience.

Best regards,
Minwoo Kang

[1]:


https://www.thegeekdiary.com/xfs_log_force-error-5-returned-xfs-error-centos-rhel-7/








Re: Scan vs TableInputFormat to process data

2019-06-01 Thread Josh Elser

Hi Guillermo,

Yes, you are missing something.

TableInputFormat uses the Scan API just like Spark would.

Bypassing the RegionServer and reading from HFiles directly is 
accomplished by using the TableSnapshotInputFormat. You can only read 
from HFiles directly when you are using a Snapshot, as there are 
concurrency issues WRT the lifecycle of HFiles managed by HBase. It is 
not safe to try to read HFiles underneath HBase on your own unless you are 
confident you understand all the edge cases in how HBase manages files.
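
As a hedged sketch of the snapshot-based path (MapReduce flavor; the same input format can also be handed to Spark via newAPIHadoopRDD), with the snapshot name, restore directory, and mapper logic as placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotScanJob {
  // Trivial mapper for the sketch; replace with real processing logic.
  public static class NoopMapper extends TableMapper<ImmutableBytesWritable, Result> {
    @Override
    protected void map(ImmutableBytesWritable row, Result value, Context ctx) {
      // no-op
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = Job.getInstance(conf, "scan-from-snapshot");
    job.setJarByClass(SnapshotScanJob.class);
    TableMapReduceUtil.initTableSnapshotMapperJob(
        "my_snapshot",                       // snapshot created beforehand (assumption)
        new Scan(),
        NoopMapper.class,
        ImmutableBytesWritable.class,
        Result.class,
        job,
        true,                                // ship HBase jars with the job
        new Path("/tmp/snapshot-restore"));  // scratch dir on the same FS as hbase.rootdir
    job.setNumReduceTasks(0);
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}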


On 5/29/19 2:54 AM, Guillermo Ortiz Fernández wrote:

Just to be sure: if I execute a Scan inside Spark, the execution is going
through RegionServers and I get all the features of HBase/Scan (filters and
so on), and all the parallelization is handled by the RegionServers (even
though I'm running the program with Spark).
If I use TableInputFormat, I read all the column families (even if I don't
want to), with no prior filtering either; it just opens the files of an hbase
table and processes them completely. All the parallelization is in Spark and
doesn't use HBase at all; it just reads from HDFS the files that HBase stored
for a specific table.

Am I missing something?



NoSQL Day on May 21st in Washington D.C.

2019-05-09 Thread Josh Elser
For those of you in/around the Washington D.C. area, NoSQL day is fast 
approaching.


If you've not already signed up, please check out the agenda and 
consider joining us for a fun and technical day with lots of talks from 
Apache committers and big names in industry:


https://dataworkssummit.com/nosql-day-2019/

For those still on the fence, please use the code NSD50 to get 50% off the 
registration fee.


Thanks and see you there!

- Josh


Re: Why HBase client retry even though AccessDeniedException

2019-05-07 Thread Josh Elser

Sounds like a bug to me.

On 5/7/19 5:52 AM, Kang Minwoo wrote:

Why do not use "doNotRetry" value in RemoteWithExtrasException?


From: Kang Minwoo 
Sent: Tuesday, May 7, 2019 18:23
To: user@hbase.apache.org
Subject: Why HBase client retry even though AccessDeniedException

Hello User.

(HBase version: 1.2.9)

Recently, I am testing about DoNotRetryIOException.

I expected when RegionServer send a DoNotRetryIOException (or 
AccessDeniedException), Client does not retry.
But, In Spark or MR, Client retries even though they receive 
AccessDeniedException.

Here is a call stack.

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed 
after attempts={}, exceptions: {time}, null, java.net.SocketTimeoutException: 
{detail info}
...
Caused by: 
org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.security.AccessDeniedException):
 org.apache.hadoop.hbase.security.AccessDeniedException: the client is not 
authorized
 at (... coprocessor throw AccessDeniedException)
 at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$50.call(RegionCoprocessorHost.java:1300)
 at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost$RegionOperation.call(RegionCoprocessorHost.java:1673)
 at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperation(RegionCoprocessorHost.java:1749)
 at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.execOperationWithResult(RegionCoprocessorHost.java:1722)
 at 
org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.preScannerOpen(RegionCoprocessorHost.java:1295)
 at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2468)
 at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:33770)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2216)
 at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
 at 
org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
 at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
 at java.lang.Thread.run(Thread.java:748)

 at 
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272)
 at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
 at 
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
 at 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34216)
 at 
org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:400)
 ... 10 more

The client cannot be aware of the AccessDeniedException because the exception is 
a RemoteWithExtrasException.
I wonder if it is a bug.

Best regards,
Minwoo Kang



Re: HBase connection refused after random time delays

2019-04-10 Thread Josh Elser
Are you reading the log messages? I'm really struggling to understand 
what is unclear given what you just included.



2019-04-04 04:27:14,029 FATAL [db-2:16000.activeMasterManager] master.HMaster: 
Failed to become active master
org.apache.hadoop.security.AccessControlException: Permission denied: user=root, 
access=WRITE, inode="/hbase":hduser:supergroup:drwxr-xr-x



org.apache.hadoop.security.AccessControlException: Permission denied: user=root, 
access=WRITE, inode="/hbase/WALs":hduser:supergroup:drwxr-xr-x


HDFS has a notion of permissions like every other filesystem. You 
started HBase as "root" (which is in itself is a bad idea), and "root" 
does not have permission to write to HDFS as you've configured it. Fix 
the HDFS permissions so that HBase can actually write to it.


Re: HBase connection refused after random time delays

2019-04-08 Thread Josh Elser




On 4/7/19 10:44 PM, melank...@synergentl.com wrote:



On 2019/04/04 15:15:37, Josh Elser  wrote:

Looks like your RegionServer process might have died if you can't
connect to its RPC port.

Did you look in the RegionServer log for any mention of an ERROR or
FATAL log message?

On 4/4/19 8:20 AM, melank...@synergentl.com wrote:

I have installed Hadoop single node 
http://intellitech.pro/tutorial-hadoop-first-lab/ and Hbase 
http://intellitech.pro/hbase-installation-on-ubuntu/  successfully. I am using 
a Java agent to connect to HBase. After a random time period HBase stops 
working and the Java agent gives the following error message.

Call exception, tries=7, retries=7, started=8321 ms ago, cancelled=false, 
msg=Call to db-2.c.xxx-dev.internal/xx.xx.0.21:16201 failed on connection 
exception: 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
 Connection refused: db-2.c.xxx-dev.internal/xx.xx.0.21:16201, details=row 
'xxx,001:155390400,99' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, 
hostname=db-2.c.xxx-dev.internal,16201,1553683263844, seqNum=-1
Here are the Hbase and zookeeper logs

hbase-hduser-regionserver-db-2.log

[main] zookeeper.ZooKeeperMain: Processing delete 2019-03-30 02:11:44,089 DEBUG 
[main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Reading reply 
sessionid:0x169bd98c099006e, packet:: clientPath:null serverPath:null 
finished:false header:: 1,2 replyHeader:: 1,300964,0 request:: 
'/hbase/rs/db-2.c.stl-cardio-dev.internal%2C16201%2C1553683263844,-1 response:: 
null
hbase-hduser-zookeeper-db-2.log

server.FinalRequestProcessor: sessionid:0x169bd98c099004a type:getChildren 
cxid:0x28e3ad zxid:0xfffe txntype:unknown reqpath:/hbase/splitWAL
my hbase-site.xml file is as follows


<configuration>
  <!-- Here you have to set the path where you want HBase to store its files. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <!-- Here you have to set the path where you want HBase to store its built in zookeeper files. -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>${hbase.tmp.dir}/zookeeper</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
when I restart the Hbase it will start working again and stop working after few 
days. I am wondering what would be the fix for this.

Thanks.
BR,
Melanka


Hi Josh,

Sorry for the late reply. I restarted the Hbase on 05/04/2019 and it was again 
down on 06/04/2019 at 00.06 AM.

Log from  hbase-root-regionserver-db-2 is as follows.

2019-04-04 04:42:26,047 DEBUG [main-SendThread(localhost:2181)] 
zookeeper.ClientCnxn: Reading reply sessionid:0x169d86a879b00bf, packet:: 
clientPath:null serverPath:null finished:false header:: 67,2  replyHeader:: 
67,776370,0  request:: 
'/hbase/rs/db-2.c.stl-cardio-dev.internal%2C16201%2C1554352093266,-1  
response:: null
2019-04-04 04:42:26,047 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: 
regionserver:16201-0x169d86a879b00bf, quorum=localhost:2181, baseZNode=/hbase 
Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, 
path=/hbase/rs/db-2.c.stl-cardio-dev.internal,16201,1554352093266
2019-04-04 04:42:26,047 DEBUG [main-EventThread] zookeeper.ZooKeeperWatcher: 
regionserver:16201-0x169d86a879b00bf, quorum=localhost:2181, baseZNode=/hbase 
Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, 
path=/hbase/rs
2019-04-04 04:42:26,050 DEBUG 
[regionserver/db-2.c.xxx-dev.internal/xx.xxx.0.21:16201] zookeeper.ZooKeeper: 
Closing session: 0x169d86a879b00bf
2019-04-04 04:42:26,050 DEBUG 
[regionserver/db-2.c.xxx-dev.internal/xx.xx.0.21:16201] zookeeper.ClientCnxn: 
Closing client for session: 0x169d86a879b00bf
2019-04-04 04:42:26,056 DEBUG [main-SendThread(localhost:2181)] 
zookeeper.ClientCnxn: Reading reply sessionid:0x169d86a879b00bf, packet:: 
clientPath:null serverPath:null finished:false header:: 68,-11  replyHeader:: 
68,776371,0  request:: null response:: null
2019-04-04 04:42:26,056 DEBUG 
[regionserver/db-2.c.xxx-dev.internal/xx.xxx.0.21:16201] zookeeper.ClientCnxn: 
Disconnecting client for session: 0x169d86a879b00bf
2019-04-04 04:42:26,056 INFO  
[regionserver/db-2.c.xxx-dev.internal/xxx.xxx.0.21:16201] zookeeper.ZooKeeper: 
Session: 0x169d86a879b00bf closed
2019-04-04 04:42:26,056 INFO  
[regionserver/db-2.c.xxx-dev.internal/xxx.xxx.0.21:16201] 
regionserver.HRegionServer: stopping server 
db-2.c.xxx-dev.internal,16201,1554352093266; zookeeper connection closed.
2019-04-04 04:42:26,056 INFO  
[regionserver/db-2.c.xxx-dev.internal/xxx.0.21:16201] 
regionserver.HRegionServer: regionserver/db-2.c.xxx-dev.internal/xxx.0.21:16201 
exiting
2019-04-04 04:42:26,056 ERROR [main] regionserver.HRegionServerCommandLine: 
Region server exiting
java.lang.RuntimeException: HRegionServer Aborted
 at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.sta

Re: HBase connection refused after random time delays

2019-04-04 Thread Josh Elser
Looks like your RegionServer process might have died if you can't 
connect to its RPC port.


Did you look in the RegionServer log for any mention of an ERROR or 
FATAL log message?


On 4/4/19 8:20 AM, melank...@synergentl.com wrote:

I have installed Hadoop single node 
http://intellitech.pro/tutorial-hadoop-first-lab/ and Hbase 
http://intellitech.pro/hbase-installation-on-ubuntu/  successfully. I am using 
a Java agent to connect to HBase. After a random time period HBase stops 
working and the Java agent gives the following error message.

Call exception, tries=7, retries=7, started=8321 ms ago, cancelled=false, 
msg=Call to db-2.c.xxx-dev.internal/xx.xx.0.21:16201 failed on connection 
exception: 
org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException:
 Connection refused: db-2.c.xxx-dev.internal/xx.xx.0.21:16201, details=row 
'xxx,001:155390400,99' on table 'hbase:meta' at 
region=hbase:meta,,1.1588230740, 
hostname=db-2.c.xxx-dev.internal,16201,1553683263844, seqNum=-1
Here are the Hbase and zookeeper logs

hbase-hduser-regionserver-db-2.log

[main] zookeeper.ZooKeeperMain: Processing delete 2019-03-30 02:11:44,089 DEBUG 
[main-SendThread(localhost:2181)] zookeeper.ClientCnxn: Reading reply 
sessionid:0x169bd98c099006e, packet:: clientPath:null serverPath:null 
finished:false header:: 1,2 replyHeader:: 1,300964,0 request:: 
'/hbase/rs/db-2.c.stl-cardio-dev.internal%2C16201%2C1553683263844,-1 response:: 
null
hbase-hduser-zookeeper-db-2.log

server.FinalRequestProcessor: sessionid:0x169bd98c099004a type:getChildren 
cxid:0x28e3ad zxid:0xfffe txntype:unknown reqpath:/hbase/splitWAL
my hbase-site.xml file is as follows


<configuration>
  <!-- Here you have to set the path where you want HBase to store its files. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>localhost</value>
  </property>
  <!-- Here you have to set the path where you want HBase to store its built in zookeeper files. -->
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>${hbase.tmp.dir}/zookeeper</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
when I restart the Hbase it will start working again and stop working after few 
days. I am wondering what would be the fix for this.

Thanks.
BR,
Melanka



2 weeks remaining for NoSQL Day abstract submission

2019-04-04 Thread Josh Elser
There are just *two weeks* remaining to submit abstracts for NoSQL Day 
2019, in Washington D.C. on May 21st. Abstracts are due April 19th.


https://dataworkssummit.com/nosql-day-2019/

Abstracts don't need to be more than a paragraph or two. Please take the time 
sooner rather than later to submit your abstract. Of course, those talks which 
are selected will receive a complimentary pass to attend the event.


Please reply to a single user list or to me directly with any questions. 
Thanks!


- Josh


[CVE-2019-0212] Apache HBase REST Server incorrect user authorization

2019-03-27 Thread Josh Elser

CVE-2019-0212: HBase REST Server incorrect user authorization

Description: In all previously released Apache HBase 2.x versions, 
authorization was incorrectly applied to users of the HBase REST server. 
Requests sent to the HBase REST server were executed with the 
permissions of the REST server itself, not with the permissions of the 
end-user. This issue is only relevant when HBase is configured with 
Kerberos authentication, HBase authorization is enabled, and the REST 
server is configured with SPNEGO authentication. This issue does not 
extend beyond the HBase REST server.


Versions affected: 2.0.0-2.0.4, 2.1.0-2.1.3

Mitigation: Stop the HBase REST server until your installation is 
upgraded to HBase 2.0.5, 2.1.4, or any other later release. Upon 
upgrading to a newer version, no other action is required.


Credit: This issue was discovered by Gaurav Kanade

- The Apache HBase PMC


Re: : Investigate hbase superuser permissions in the face of quota violation-HBASE-17978

2019-03-13 Thread Josh Elser
A superuser should be able to still initiate a compaction: 
https://issues.apache.org/jira/browse/HBASE-17978


If the compaction didn't actually happen, that's a problem.

On 3/13/19 3:09 AM, Uma wrote:

-- Forwarded message -
From: Uma 
Date: Wed 13 Mar, 2019, 6:54 AM
Subject: Investigate hbase superuser permissions in the face of quota
violation-HBASE-17978
To: user-subscr...@hbase.apache.org 


Hi Users,

I observed that, in a case where a quota policy that disallowed
compaction was enabled, a superuser is able to issue a compaction command and no error is
thrown to the user. But the compaction is not actually happening for that table. In
the debug log the below message is printed:

“as an active space quota violation policy disallows compactions.”

Is it correct behaviour?



Thanks,

Uma




Re: HBase client spent most time in ThreadPoolExecutor

2019-02-27 Thread Josh Elser

Do you have any regions in transition? Does HBCK report any problems?

It sounds to me that a client is stuck polling meta to look for the 
location of a Region which it cannot find for some reason. Finding the 
location of a Region from meta should not take more than a second.
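
If it helps, here is a minimal sketch of timing a single meta lookup from the client
(HBase 1.x/2.x API; the table name and row are placeholders), forcing a round trip
instead of using the cached location:

import java.io.IOException;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.util.Bytes;

public class MetaLookupTiming {
  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         RegionLocator locator = conn.getRegionLocator(TableName.valueOf("mytable"))) {
      long start = System.nanoTime();
      // reload=true bypasses the client-side cache and goes back to hbase:meta
      HRegionLocation loc = locator.getRegionLocation(Bytes.toBytes("somerow"), true);
      long elapsedMs = (System.nanoTime() - start) / 1_000_000;
      System.out.println(loc.getServerName() + " located in " + elapsedMs + " ms");
    }
  }
}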


On 2/27/19 12:34 AM, Kang Minwoo wrote:

The meta scan is so slow.
When I invoked the `regionLocator.getAllRegionLocations()` method, it threw an 
`org.apache.hadoop.hbase.client.RetriesExhaustedException: Can't get the 
locations` exception.

Best regards,
Minwoo Kang


From: Kang Minwoo 
Sent: Wednesday, February 27, 2019, 13:48
To: user@hbase.apache.org
Subject: Re: HBase client spent most time in ThreadPoolExecutor

Thank you for your reply.

I looked through the thread dumps.

My client has 256 connections,
but only one connection's state is RUNNABLE; the others are TIMED_WAITING 
(parking).

Best regards,
Minwoo Kang


보낸 사람: Josh Elser 
보낸 날짜: 2019년 2월 27일 수요일 01:32
받는 사람: user@hbase.apache.org
제목: Re: HBase client spent most time in ThreadPoolExecutor

Minwoo,

You have found an idle thread in the threadpool that is waiting for
work. This is not the source of your slowness. The thread is polling the
internal queue of work, waiting for the next "unit" of something to do.

You should exclude threads like these from your analysis.

On 2/26/19 8:32 AM, Kang Minwoo wrote:

Hello Users,

I have a question.

My client complains to me that HBase scans take too much time,
so I started to debug.

I profiled the HBase client application using hprof.
The application spent most of its time in the stack traces below.


java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:Unknown line)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:Unknown line)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:Unknown line)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:Unknown line)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:Unknown line)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:Unknown line)
java.lang.Thread.run(Thread.java:Unknown line)
---

---
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.waitForWork(RpcClientImpl.java:Unknown line)
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:Unknown line)
---

They spent 92% of the total time.

I don't understand why they spent a lot of time.
Do you have any idea?

Best regards,
Minwoo Kang



Re: HBase client spent most time in ThreadPoolExecutor

2019-02-26 Thread Josh Elser

Minwoo,

You have found an idle thread in the threadpool that is waiting for 
work. This is not the source of your slowness. The thread is polling the 
internal queue of work, waiting for the next "unit" of something to do.


You should exclude threads like these from your analysis.

On 2/26/19 8:32 AM, Kang Minwoo wrote:

Hello Users,

I have a question.

My client complains to me that HBase scans take too much time,
so I started to debug.

I profiled the HBase client application using hprof.
The application spent most of its time in the stack traces below.


java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:Unknown line)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:Unknown line)
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:Unknown line)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:Unknown line)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:Unknown line)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:Unknown line)
java.lang.Thread.run(Thread.java:Unknown line)
---

---
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.waitForWork(RpcClientImpl.java:Unknown line)
org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.run(RpcClientImpl.java:Unknown line)
---

They spent 92% of the total time.

I don't understand why they spent a lot of time.
Do you have any idea?

Best regards,
Minwoo Kang



Re: Custom security check

2019-02-19 Thread Josh Elser

Hi Jagan,

Right now, Authorization checks inside of the RegionServer aren't 
well-quantified, but it is possible. One example of software that does 
this today is Apache Ranger.


However, your plan to provide custom client-side data is going to take a 
bit more effort as you'll also need to figure out client-side logic to 
compute and (somehow) send this extra data with every RPC. I would guess 
that this would take some aggressive hacking, but I haven't looked at 
the client-side code with this in mind before.
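
For what it's worth, one existing hook that may reduce the hacking: Get, Scan and 
Put extend OperationWithAttributes, so a client can attach arbitrary attributes that 
a RegionObserver coprocessor can read server-side. Below is a minimal sketch of that 
shape (HBase 2.x coprocessor API; the attribute name and the token check are 
hypothetical, and the external lookup to e.g. Redis is left as a stub). The client 
would set the attribute with get.setAttribute("x-access-token", tokenBytes), and 
similar hooks (prePut, preScannerOpen, ...) would be needed for other operation types:

import java.io.IOException;
import java.util.List;
import java.util.Optional;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessor;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.coprocessor.RegionObserver;
import org.apache.hadoop.hbase.security.AccessDeniedException;

public class TokenCheckObserver implements RegionCoprocessor, RegionObserver {
  // Hypothetical attribute name chosen for this sketch
  private static final String TOKEN_ATTR = "x-access-token";

  @Override
  public Optional<RegionObserver> getRegionObserver() {
    return Optional.of(this);
  }

  @Override
  public void preGetOp(ObserverContext<RegionCoprocessorEnvironment> ctx,
                       Get get, List<Cell> results) throws IOException {
    byte[] token = get.getAttribute(TOKEN_ATTR);
    if (token == null || !isValid(token)) {
      // Aborts the operation and propagates an error back to the client
      throw new AccessDeniedException("Missing or invalid access token");
    }
  }

  private boolean isValid(byte[] token) {
    // Stub: a real implementation would consult the external token store
    return false;
  }
}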


I think your best course of action is to look at the code yourself and 
come back with more specific questions as to what exists in HBase. 
Perhaps you can start looking at what Ranger does and go from there.


Good luck!

On 2/19/19 3:19 AM, Jagan R wrote:

Dear All,

Is there any way we can plug-in custom security checks in HBase?

We are exploring ways to do security checks at a finer level, where we
want to control/check whether the user can access given data or not. Once we do
the authorization checks, we want to pass an access token to the HBase region
servers with any request.

In the region server, we want to check for the validity of the access token
via a call to the data source (say redis) where the access token is stored.

So we want to pass an additional argument, say a token, with the HBase requests and
get a handle for validating the token before performing the operation. If
the token is invalid, we should abort the operation and throw an error to the
client.

Regards,
Jagan



Re: hbase version

2019-02-01 Thread Josh Elser
Compared to 2.0.4, I believe you'll be better off moving onto HBase 
2.1.2 at this point. IIRC, the consensus was to shift focus onto 2.1 
(eventually, 2.2, and onward) instead of letting people get stuck on 
"old" versions.


In general, I'd expect the 1.4 line to be rather bullet-proof, but 
getting onto HBase 2 will benefit you in the long run.


I'd refer you to the Hadoop documentation to get the "stable" Hadoop 
release, or refer to the section in the HBase book which covers this 
information as well.


On 2/1/19 5:17 AM, Michael wrote:

I'm able to switch from a test cluster to a bigger cluster (some 8 to 10
machines). The test cluster runs hbase version 1.2.6 with hadoop 2.7.2

What is the actual recommended stable version to upgrade to, 1.4.9 with
hadoop 2.9.2?

How stable are 2.0.4 or 2.1.2?

Thanks
  Michael








Re: HBase-Connectors Library on Maven

2018-11-28 Thread Josh Elser

Hi Davis,

I don't think we have a release planned yet for the hbase-connectors 
library. I know our mighty Stack has been doing lots of the heavy 
lifting around lately.


If you're interested/willing, I'm sure we'd all be gracious if you have 
the cycles to help out testing what we have in the repository now and 
help us drive towards a first official release?


If you're interested, I'd suggest we move over to the 
d...@hbase.apache.org mailing list and see if we can sketch out a plan of 
action.


Thanks for the nice note, regardless!

On 11/28/18 3:27 PM, Silverman, Davis wrote:

Hello, I noticed that the HBase-Connectors library[1] has recently been moved 
into its own repo on GitHub. However, there is no Maven release for this, and I 
would love to use it! When is there going to be an official release on Maven 
Central for this library?


I am looking to use the spark connector, and there is seemingly no recently 
updated library to do this on Maven. If there is another way, please let me 
know! Thanks!


[1]: https://github.com/apache/hbase-connectors








Re: Master-Master Replication problem in hbase2.0

2018-10-18 Thread Josh Elser

Please do not cross-post lists. I've dropped dev@hbase.

This doesn't seem like a replication issue. As you have described it, it 
reads more like a data-correctness issue. However, I'd guess that it's 
more related to timestamps than to an issue on your cluster.


If there was no error, HBase guarantees that the write was successful. 
If the write was not visible in the cluster after you made it, that must 
mean that it was not the "latest" cell for that row+column.


Use a RAW=>true `scan` the next time you see this. I would guess that 
there was some clock skew on your nodes which left the update in #4 
"masking" the update from #5. With a raw-scan, you should see all of the 
versions of the cell.
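
For reference, here is a minimal sketch of such a raw scan from the Java client 
(HBase 2.x API; table 'T' and row 'r1' are taken from the steps above). A raw scan 
returns every stored cell version, including delete markers and cells masked by a 
newer timestamp, so any clock skew between the two clusters becomes visible:

import java.io.IOException;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RawScanExample {
  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         Table table = conn.getTable(TableName.valueOf("T"))) {
      Scan scan = new Scan()
          .withStartRow(Bytes.toBytes("r1"))
          .withStopRow(Bytes.toBytes("r1"), true)  // just the one row
          .setRaw(true)                            // keep deletes and masked versions
          .readAllVersions();
      try (ResultScanner rs = table.getScanner(scan)) {
        for (Result r : rs) {
          for (Cell c : r.listCells()) {
            System.out.println(Bytes.toString(CellUtil.cloneQualifier(c))
                + " ts=" + c.getTimestamp()
                + " value=" + Bytes.toString(CellUtil.cloneValue(c)));
          }
        }
      }
    }
  }
}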


On 10/18/18 7:36 AM, Winter wrote:

Cluster A and B-both are HDP 3.0.0(hadoop3.0, hbase2.0);

I want to replicate a table named T (with a single CF named 'f') between clusters A and B 
in both directions; that is, changes to table T on A are synchronized to B, and changes to 
table T on B are synchronized to A. I configured replication on both cluster A and B for 
table T using 'add_peer' and 'enable_table_replication' in the HBase shell (first A to B, 
then B to A). Then I ran the following test in the HBase shell:
1. Put a record by typing "put 'T','r1','f:a','1'" on A, then scan table T; no problem, the 
record can be found on both A and B.
2. Put a record by typing "put 'T','r2','f:a','1'" on B; no problem, found on both A and B.
3. Put a record by typing "put 'T','r1','f:a','2'" to update the value to '2' on A; no 
problem, updated successfully on both A and B.
4. Put a record by typing "put 'T','r1','f:a','3'" to update the value to '3' on B; no 
problem, updated successfully on both A and B.
5. Put a record by typing "put 'T','r1','f:a','4'" to update the value to '4' on A; here the 
problem appears: there is no update on either A or B, meaning this 'put' did not take 
effect. The value is still '3' on A and B, although the HBase shell does not give an error 
when I 'put'.
6. After about 1 minute, I typed this 'put' again to update the value to '4' on A; now it is 
successful, and the value is updated to '4' on both A and B.

Is this a bug in v2.0? May I ask what the reason is? Is there anything I missed in the 
configuration? Thanks.







Re: Regions Stuck PENDING_OPEN

2018-10-03 Thread Josh Elser
Thanks so much for sharing this, Austin! This is a wonderful write-up of 
what you did. I'm sure it will be frequented from the archives.


If you have the old state from meta saved and have the ability to 
sanitize some of the data/server-names, we might be able to give you 
some suggestions about what happened. Let me know if that would be 
helpful. If we can't get to the bottom of how this happened, maybe we 
can figure out why hbck couldn't fix it.


On 10/3/18 12:53 PM, Austin Heyne wrote:
Josh: Thanks for all your help! You got us going down a path that led 
to a solution.


Thought I would just follow up with what we ended up doing (I do not 
recommend anyone attempt this).


When this problem started I'm not sure what was wrong with the 
hbase:meta table, but one of the steps we tried was to migrate the 
database to a new cluster. I believe this cleared up the original issue 
we were having but for some reason, perhaps because of the initial 
problem, the migration didn't work correctly. This left us with 
references to servers in hbase:meta that didn't exist, for regions that 
were marked as PENDING_OPEN. When HBase would come online it was 
behaving as if it had already asked those region servers to open those 
regions so they were not getting reassigned to good servers. Another 
symptom of this was that there were dead region servers listed in the 
HBase web UI from the previous cluster (the servers that were listed in 
the hbase:meta).


Since in the hbase:meta table regions were marked as PENDING_OPEN on 
region servers that didn't exist we were unable to close or move the 
regions since master couldn't communicate with the region server. For 
some reason hbck -fix wasn't able to repair the assignments or didn't 
realize it needed to. This might be due to some other meta 
inconsistencies like overlapping regions or duplicated start keys. I'm 
unsure why it couldn't clear things up.


To repair this we initially backed up the meta directory in s3 while 
everything was offline. Then while HBase was online and the tables were 
disabled we used a Scala REPL to rewrite the hbase:meta entries for each 
effected region (~4500 regions); replacing the 'server' and 'sn' with 
valid values and setting 'state' to 'OPEN'. We then flushed/compacted 
the meta table and took down HBase. After nuking /hbase in ZK we brought 
everything back up. There initially was a lot of churn with region 
assignments but after things settled everything was online. I think this 
worked because of the state the meta table was in when HBase stopped. I 
think it looked like a crash and HBase went through it's normal repair 
cycle of re-opening regions and using previous assignments.


Like I said, I don't recommend manually rewriting the hbase:meta table 
but it did work for us.


Thanks,
Austin

On 10/01/2018 01:28 PM, Josh Elser wrote:
That seems pretty wrong. The Master should know that old RS's are no 
longer alive and not try to assign regions to them. I don't have much 
familiarity with 1.4 to say if, hypothetically, that might be fixed in 
a release 1.4.5-1.4.7.


I don't have specific suggestions, but can share how I'd approach it.

I'd pick one specific region and try to trace the logic around just 
that one region. Start with the state in hbase:meta -- see if there is 
a column in meta for this old server. Expand out to WALs in HDFS. 
Since you can wipe ZK and this still happens, it seems clear it's not 
coming from ZK data.


Compare the data you find with what DEBUG logging in the Master says, 
see if you can figure out some more about how the Master chose to make 
the decision it did. That will help lead you to what the appropriate 
"fix" should be.


On 10/1/18 10:46 AM, Austin Heyne wrote:
I'm running HBase 1.4.4 on EMR. In following your suggestions I 
realized that the master is trying to assign the regions to 
dead/non-existent region servers. While trying to fix this problem I 
had killed the EMR cluster and started a new one. It's still trying 
to assign some regions to those region servers in the previous 
cluster. I tried to manually move one of the regions to a good region 
server but I'm getting 'ERROR: No route to host' when I try to close 
the region.


I've tried nuking the /hbase directory in Zookeeper but that didn't 
seem to help so I'm not sure where it's getting these references from.


-Austin


On 09/30/2018 02:38 PM, Josh Elser wrote:
First off: You're on EMR? What version of HBase you're using? (Maybe 
Zach or Stephen can help here too). Can you figure out the 
RegionServer(s) which are stuck opening these PENDING_OPEN regions? 
Can you get a jstack/thread-dump from those RS's?


In terms of how the system is supposed to work: the PENDING_OPEN 
state for a Region "R" ...

Re: Regions Stuck PENDING_OPEN

2018-10-01 Thread Josh Elser
That seems pretty wrong. The Master should know that old RS's are no 
longer alive and not try to assign regions to them. I don't have much 
familiarity with 1.4 to say if, hypothetically, that might be fixed in a 
release 1.4.5-1.4.7.


I don't have specific suggestions, but can share how I'd approach it.

I'd pick one specific region and try to trace the logic around just that 
one region. Start with the state in hbase:meta -- see if there is a 
column in meta for this old server. Expand out to WALs in HDFS. Since 
you can wipe ZK and this still happens, it seems clear it's not coming 
from ZK data.


Compare the data you find with what DEBUG logging in the Master says, 
see if you can figure out some more about how the Master chose to make 
the decision it did. That will help lead you to what the appropriate 
"fix" should be.


On 10/1/18 10:46 AM, Austin Heyne wrote:
I'm running HBase 1.4.4 on EMR. In following your suggestions I realized 
that the master is trying to assign the regions to dead/non-existent 
region servers. While trying to fix this problem I had killed the EMR 
cluster and started a new one. It's still trying to assign some regions 
to those region servers in the previous cluster. I tried to manually 
move one of the regions to a good region server but I'm getting 'ERROR: 
No route to host' when I try to close the region.


I've tried nuking the /hbase directory in Zookeeper but that didn't seem 
to help so I'm not sure where it's getting these references from.


-Austin


On 09/30/2018 02:38 PM, Josh Elser wrote:
First off: You're on EMR? What version of HBase you're using? (Maybe 
Zach or Stephen can help here too). Can you figure out the 
RegionServer(s) which are stuck opening these PENDING_OPEN regions? 
Can you get a jstack/thread-dump from those RS's?


In terms of how the system is supposed to work: the PENDING_OPEN state 
for a Region "R" means: the active Master has asked a RegionServer to 
open R. That RS should have an active thread which is trying to open 
R. Upon success, the state of R will move from PENDING_OPEN to OPEN. 
Otherwise, the Master will try to assign R again.


In absence of any custom coprocessors (including Phoenix), this would 
mean some subset of RegionServers are in a bad state. Figuring out 
what those RS's are trying to do will be the next step in figuring out 
why they're stuck like that. It might be obvious from the UI, or you 
might have to look at hbase:meta or the master log to figure it out.


One caveat, it's possible that the Master is just not doing the right 
thing as described above. If the steps described above don't seem to 
be matching what your system is doing, you might have to look closer 
at the Master log. Make sure you have DEBUG on to get anything of 
value out of the system.


On 9/30/18 1:43 PM, Austin Heyne wrote:
I'm having a strange problem that my usual bag of tricks is having 
trouble sorting out. On Friday queries stopped returning for some 
reason. You could see them come in and there would be a resource 
utilization spike that would fade out after an appropriate amount of 
time, however, the query would never actually return. This could be 
related to our client code but I wasn't able to dig into it since 
this was the middle of the day on a production system. Since this had 
happened before and bouncing HBase cleared it up, I proceeded to 
disable tables and restart HBase. Upon bringing HBase backup a few 
thousand regions are stuck in PENDING_OPEN state and refuse to move 
from that state. I've run hbck -repair a number of times under a few 
conditions (even the offline repair), have deleted everything out of 
/hbase in zookeeper and even migrated the cluster to new servers 
(EMR) with no luck. When I spin HBase up the regions are already at 
PENDING_OPEN even though the tables are offline.


Any ideas on what's going on here would be a huge help.

Thanks,
Austin





Re: Regions Stuck PENDING_OPEN

2018-09-30 Thread Josh Elser
First off: You're on EMR? What version of HBase you're using? (Maybe 
Zach or Stephen can help here too). Can you figure out the 
RegionServer(s) which are stuck opening these PENDING_OPEN regions? Can 
you get a jstack/thread-dump from those RS's?


In terms of how the system is supposed to work: the PENDING_OPEN state 
for a Region "R" means: the active Master has asked a RegionServer to 
open R. That RS should have an active thread which is trying to open R. 
Upon success, the state of R will move from PENDING_OPEN to OPEN. 
Otherwise, the Master will try to assign R again.


In absence of any custom coprocessors (including Phoenix), this would 
mean some subset of RegionServers are in a bad state. Figuring out what 
those RS's are trying to do will be the next step in figuring out why 
they're stuck like that. It might be obvious from the UI, or you might 
have to look at hbase:meta or the master log to figure it out.


One caveat, it's possible that the Master is just not doing the right 
thing as described above. If the steps described above don't seem to be 
matching what your system is doing, you might have to look closer at the 
Master log. Make sure you have DEBUG on to get anything of value out of 
the system.


On 9/30/18 1:43 PM, Austin Heyne wrote:
I'm having a strange problem that my usual bag of tricks is having 
trouble sorting out. On Friday queries stopped returning for some reason. 
You could see them come in and there would be a resource utilization 
spike that would fade out after an appropriate amount of time, however, 
the query would never actually return. This could be related to our 
client code but I wasn't able to dig into it since this was the middle 
of the day on a production system. Since this had happened before and 
bouncing HBase cleared it up, I proceeded to disable tables and restart 
HBase. Upon bringing HBase back up, a few thousand regions are stuck in 
PENDING_OPEN state and refuse to move from that state. I've run hbck 
-repair a number of times under a few conditions (even the offline 
repair), have deleted everything out of /hbase in zookeeper and even 
migrated the cluster to new servers (EMR) with no luck. When I spin 
HBase up the regions are already at PENDING_OPEN even though the tables 
are offline.


Any ideas on what's going on here would be a huge help.

Thanks,
Austin



Re: HConnection in TIMED_WATING

2018-09-28 Thread Josh Elser
That thread is a part of the ThreadPool that HConnection uses and that 
thread is simply waiting for a task to execute. It's not indicative of 
any problem.


See how the thread is inside of a call to LinkedBlockingQueue#poll()

On 9/28/18 3:02 AM, Lalit Jadhav wrote:

While load testing the application, I collected a thread dump when the
application was at its highest utilization and found that an hconnection thread was
in TIMED_WAITING (the log below occurred continuously):

hconnection-0x52cf832d-shared--pool1-t110 - priority:5 -
threadId:0x5651030a9800 - nativeId:0x11b - state:TIMED_WAITING
stackTrace:
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0003f8200ca0> (a
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None--

Can anyone explain what's going wrong here?

Regards,
Lalit Jadhav,
Database Group Lead.



Re: Migrating from Apache Cassandra to Hbase

2018-09-11 Thread Josh Elser
Please be patient in getting a response to questions you post to this 
list, as we're all volunteers.


On 9/8/18 2:16 AM, onmstester onmstester wrote:
Hi,

Currently I'm using Apache Cassandra as the backend for my RESTful application. With a cluster of 30 nodes (each having 12 cores, 64 GB RAM and 6 TB of disk, of which 50% is used), write and read throughput is more than satisfactory for us.

The input is a fixed set of long and int columns which we need to query by every column, so having 8 columns there should be 8 tables, per the Cassandra query-plan recommendation. The Cassandra keyspace schema would be something like this: Table 1 (timebucket, col1, ..., col8, primary key(timebucket, col1)) to handle select * from input where timebucket = X and col1 = Y ... Table 8 (timebucket, col1, ..., col8, primary key(timebucket, col8)).

So for each input row there would be 8X inserts in Cassandra (not considering RF), and using a TTL of 12 months the production cluster should keep about 2 petabytes of data. With the recommended node density for a Cassandra cluster (2 TB per node), I need a cluster with more than 1000 nodes (which I cannot afford).

So, long story short: I'm looking for an alternative to Apache Cassandra for this application. How would HBase solve these problems:



1. 8X data redundancy due to needed queries 


HBase provides one intrinsic "index" over the data in your table and 
that is the "rowkey". If you need to access the same data 8 different 
ways, you would need to come up with 8 indexes.


FWIW, this is not what I commonly see. Usually there are 2 or 3 lookups 
that need to happen in the "fast path", not 8. Perhaps you need to take 
another look at your application needs?
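
To make the tradeoff concrete, here is a minimal sketch (hypothetical table, family 
and column names; HBase 2.x client API) of maintaining one extra "index" table for a 
single lookup column, with the index rowkey built from timebucket + column value + 
natural key. One such table, and one extra write per row, is needed for every 
additional lookup pattern, which is where the 8x write amplification comes from:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class IndexedWriteSketch {
  private static final byte[] CF = Bytes.toBytes("f");  // hypothetical column family

  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         Table data = conn.getTable(TableName.valueOf("input_data"));
         Table idxCol1 = conn.getTable(TableName.valueOf("input_idx_col1"))) {
      long timebucket = 20180911L;
      long col1 = 42L;
      String naturalKey = "row-0001";

      // Data table: rowkey = natural key, all columns stored once.
      Put dataPut = new Put(Bytes.toBytes(naturalKey));
      dataPut.addColumn(CF, Bytes.toBytes("col1"), Bytes.toBytes(col1));
      data.put(dataPut);

      // Index table for col1: a prefix Scan on (timebucket, col1) answers
      // "select * from input where timebucket = X and col1 = Y".
      byte[] idxKey = Bytes.add(Bytes.toBytes(timebucket), Bytes.toBytes(col1),
          Bytes.toBytes(naturalKey));
      Put idxPut = new Put(idxKey);
      idxPut.addColumn(CF, Bytes.toBytes("k"), Bytes.toBytes(naturalKey));
      idxCol1.put(idxPut);
    }
  }
}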


2. Nodes with large data density (30 TB of data on each node if No. 1 cannot be solved in HBase): how would HBase handle compaction and node join/remove problems when there are only 5 * 6 TB 7200 RPM SATA disks available on each node? How much empty space does HBase need for the temporary files of compaction? 


HBase uses a distributed filesystem to ensure that data is available to 
be read by any RegionServer. Obviously, that filesystem needs to have 
sufficient capacity to write a new file which is approximately the sum 
of the file sizes being compacted.


3. Also, I read in some documents (including DataStax's) that HBase is more 
of an offline, data-lake backend that should not be used as a web 
application backend needing a response-time QoS of under a few seconds. 
Thanks in advance.


Sounds like marketing trash to me. The entire premise around HBase's 
architecture is:


* Low latency random writes/updates
* Low latency random reads
* High throughput writes via batch tools (e.g. Bulk loading)

IIRC, many early adopters of HBase were using it in the critical-path 
for web applications.


Re: questions regarding hbase major compaction

2018-09-10 Thread Josh Elser

1. Yes
2. HDFS NN pressure, read slow down, general poor performance
3. Default configuration is weekly, if you don't explicitly know some 
reasons why weekly doesn't work, this is what you should follow ;)

4. No

I would be surprised if you need to do anything special with S3, but I 
don't know for sure.
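
As a side note, a major compaction can also be requested from the Java Admin API 
rather than the shell; a minimal sketch follows (the table name is a placeholder), 
and per point 4 above, reads and writes continue to be served while it runs:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class TriggerMajorCompaction {
  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      // Asynchronously requests a major compaction of every region of the table
      admin.majorCompact(TableName.valueOf("mytable"));
    }
  }
}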


On 9/10/18 2:19 PM, Antonio Si wrote:

Hello,

As I understand, the deleted records in hbase files do not get removed
until a major compaction is performed.

I have a few questions regarding major compaction:

1.   If I set a TTL and/or a max number of versions, will records older than the TTL
   or expired versions still be in the HBase files until a major compaction is performed?
   Is my understanding correct?

2.   If a major compaction is never performed on a table, besides the size of the table
   continuing to increase, eventually we will have too many HBase files and the
   cluster will slow down.
   Are there any other implications?

3.   Is there any guidelines about how often should we run major compaction?

4.   During major compaction, do we need to pause all read/write operations
until major
   compaction is finished?

   I realize that if using S3 as the storage, after I run a major compaction there are
   inconsistencies between the S3 metadata and the S3 file system, and I need to run an
   "emrfs sync" to synchronize them after the major compaction is completed. Does this
   mean I need to pause all read/write operations during this period?

Thanks.

Antonio.



Re: Query on rowkey distribution || Does RS and number of Region related with each other

2018-09-04 Thread Josh Elser
Manjeet -- you are still missing the fact that if you do not split your 
table into multiple regions, your data will not be distributed.


Why do you think that your rowkey design means you can't split your table?

On 9/3/18 6:09 AM, Manjeet Singh wrote:

Hi Josh

Sharing steps and my findings for better understanding:


I have tested the table creation policies below (considering that I am 100%
aware of pre-splitting but can't use it, given our rowkey design).

I have to opt for some different policy which can evenly distribute the data to
all Regions.

#1
hbase org.apache.hadoop.hbase.util.RegionSplitter test_table HexStringSplit
-c 10 -f f1
alter 'test_table', { NAME => 'si', DATA_BLOCK_ENCODING => 'FAST_DIFF' }
alter 'test_table', {NAME => 'si', COMPRESSION => 'SNAPPY'}


#2
create 'TEST_TABLE_KeyPrefixRegionSplitPolicy', {NAME => 'si'}, CONFIG =>
{'KeyPrefixRegionSplitPolicy.prefix_length'=> '5'}
alter 'TEST_TABLE_KeyPrefixRegionSplitPolicy', { NAME => 'si',
DATA_BLOCK_ENCODING => 'FAST_DIFF' }
alter 'TEST_TABLE_KeyPrefixRegionSplitPolicy', {NAME => 'si', COMPRESSION => 'SNAPPY'}



#3 Currently I am considering it and want to distribute data only based on the
rowkey
create 'TEST_TABLE','si',{ NAME => 'si', COMPRESSION => 'SNAPPY' }
alter 'TEST_TABLE', { NAME => 'si', DATA_BLOCK_ENCODING => 'FAST_DIFF' }
alter 'TEST_TABLE', {NAME => 'si', COMPRESSION => 'SNAPPY' }


Thanks
Manjeet Singh



On Fri, Aug 31, 2018 at 6:49 PM, Josh Elser  wrote:


I'd like to remind you again that we're all volunteers and we're helping
you because we choose to do so. Antagonizing those who are helping you is a
great way to stop receiving any free help.

If you do not create more than one Region, HBase will not distribute your
data on more than one RegionServer. Full stop.


On 8/30/18 2:16 PM, Manjeet Singh wrote:


Hi Elser

I have clearly been talking about the rowkey; why do you say I am talking about data? See below
what I have said about the rowkey:

SALT_ID_DayStartTimestamp_DayEndTimeStamp_IDTimeStamp

The problem is that you are not understanding the question and are just telling me
what you know; even on Slack you are saying the same thing.
The question is simple: if I put a salt (which can be any arbitrary char or generated
hash, anything) at the beginning of the rowkey, why is my data not getting
distributed?
Please note this is not a pre-split table.

Thanks
Manjeet Singh

On Thu, Aug 30, 2018 at 9:11 PM Josh Elser  wrote:

As I've been trying to explain in Slack:


1. Are you including the salt in the data that you are writing, such
that you are spreading the data across all Regions per their boundaries?
Or, as I think you are, just creating split points with this arbitrary
"salt" and not including it when you write data?

If, as I am assuming, you are not, all of your data will go into the
first or last region. If you are still not getting my point, I'd suggest
that you share the exact splitpoints and one rowkey that you are writing
to HBase. That will make it quite clear if my guess is correct or not.

2. The number of Regions controls the number of RegionServers that will
be involved with reads/writes against that table. This is a calculation
that you need to figure out based on your cluster configuration and the
magnitude of your workload.

On 8/30/18 1:11 AM, Manjeet Singh wrote:


Hi All,



I have two Question

*Question 1 : *

I want to understand how rowkey distribution happen if I create my table
with out applying any policy but opting prefix salting.

Example I have rowkey like

SALT_ID_DayStartTimestamp_DayEndTimeStamp_IDTimeStamp

So it will look like as below

*_99_1516838400_1516924800_1516865160

Question is : now I can not see that my data is getting distributed only
because of salt.

So does I have only choice of pre splitting? Or do I have any other


option?



I have seen two more approaches

i.e.

hbase org.apache.hadoop.hbase.util.RegionSplitter test_table


HexStringSplit


-c 10 -f f1

I guess its scope is limited as number of region created at the time


table


creation and it will fix? Not sure.

and

*UniformSplit
<


https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbas
e/util/RegionSplitter.UniformSplit.html


*



*Second 2: Does number of split point anywhere related to the number of


RS


in cluster, If yes what is the calculation? *












Re: HBase unable to connect to zookeeper error

2018-08-31 Thread Josh Elser
If it was related to maxClientCnxns, you would see sessions being 
torn-down and recreated in HBase on that node, as well as a clear 
message in the ZK server log that it's denying requests because the 
number of outstanding connections from that host exceeds the limit.


ConnectionLoss is a transient ZooKeeper state; more often than not, I 
see this manifest as a result of unplanned pauses in HBase itself. 
Typically this is a result of JVM garbage collection pauses, other times 
from Linux kernel/OS-level pauses. The former you can diagnose via the 
standard JVM GC logging mechanisms, the latter usually via your syslog 
or dmesg.


When looking for unexpected pauses, remember that you also need to look 
at what was happening in ZK. A JVM GC pause in ZK would exhibit the same 
kind of symptoms in HBase.


One final suggestion is to correlate it against other batch jobs (e.g. 
YARN, Spark) which may be running on the same node. It's possible that 
the node is not experiencing any explicit problems, but there is some 
transient workload which happens to run and slows things down.


Have fun digging!

On 8/31/18 3:19 PM, Srinidhi Muppalla wrote:

Hello all,

Our production application has recently experienced a very high spike in the 
following exception along with very large read times to our hbase cluster.

“org.apache.hadoop.hbase.shaded.org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/meta-region-server
    at org.apache.hadoop.hbase.shaded.org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.hadoop.hbase.shaded.org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.hadoop.hbase.shaded.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:623)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionState(MetaTableLocator.java:487)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionLocation(MetaTableLocator.java:168)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:605)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:585)
    at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:564)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1211)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1178)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.relocateRegion(ConnectionManager.java:1152)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1357)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1181)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:305)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:156)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:326)
    at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:301)
    at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:166)
    at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:161)
    at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:794)
    at ...”

This error is not happening consistently as some reads to our table are 
happening successfully, so I am unable to narrow the issue down to a single 
configuration or connectivity failure.

Things I’ve tried are:
Using hbase zkcli to connect to our zookeeper server from the master instance. 
It is able to successfully connect and when running ‘ls’, the 
“/hbase/meta-region-server” znode is present.
Checking the number of connections that are occurring to our zookeeper instance 
using the HBase web UI. The number of connections is currently 162. I double 
checked our hbase config and the value for 
‘hbase.zookeeper.property.maxClientCnxns’ is 300.

Any insight into the cause or other steps that I could take to debug this issue 
would be greatly appreciated.

Thank you,
Srinidhi



Re: HMerge Status

2018-08-31 Thread Josh Elser

Hey JMS,

No, that's not my understanding. I'm not sure how the Normalizer would 
change the size of the Regions but keep the number of Regions the same :)


IIRC, the Normalizer works by looking at adjacent Regions, merging them 
together when their size is under the given threshold. The caveat is 
that it does this relatively slowly to avoid causing duress on users 
actively doing things on the system.


Having an active mode (do merges fast) and passive mode (do merges 
slowly) sounds like a nice addition now that I think about it.
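
For anyone following along, here is a minimal sketch of turning it on through the 
1.x-era Admin API (the table name is a placeholder); the normalizer chore has to be 
switched on cluster-wide and each table opted in:

import java.io.IOException;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class EnableNormalizer {
  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      admin.setNormalizerRunning(true);                  // enable the chore
      TableName tn = TableName.valueOf("mytable");
      HTableDescriptor htd = admin.getTableDescriptor(tn);
      htd.setNormalizationEnabled(true);                 // opt this table in
      admin.modifyTable(tn, htd);
      admin.normalize();                                 // request a pass now
    }
  }
}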


On 8/31/18 9:22 AM, Jean-Marc Spaggiari wrote:

If I'm not mistaken the Normalizer will keep the same number of regions,
but will uniform the size, right? So if the goal is to reduce the number of
region, the Normalizer might not help?

JMS

Le ven. 31 août 2018 à 09:16, Josh Elser  a écrit :


There's the Region Normalizer which I'd presume would be in an HBase 1.4
release

https://issues.apache.org/jira/browse/HBASE-13103

On 8/30/18 3:50 PM, Austin Heyne wrote:

I'm using HBase 1.4.4 (AWS/EMR) and I'm looking for an automated
solution because I believe there are going to be a few hundred if not
thousand merges. It's also challenging to find candidate pairs.

-Austin


On 08/30/2018 03:45 PM, Jean-Marc Spaggiari wrote:

Hi Austin,

Which version are you using? Why not just using the shell merge command?

JMS

Le jeu. 30 août 2018 à 15:41, Austin Heyne  a écrit :


We're currently sitting at a very high number of regions due to an
initially poor value for hbase.regionserver.regionSplitLimit and would
like to rein in our region count. Additionally, we have a
spatio-temporal key structure and our region pre-splitting was done
evenly, without regard to the spatial distribution of our data and thus
have a lot of small and empty regions we'd like to clean up. I've found
the HMerge class [1], and it seems it would do something reasonable for
our use case. However, it's marked as Private and doesn't seem to be
used anywhere so I thought I'd ask if anyone knows the status of this
class and how safe it is.

Thanks,
Austin

[1]



https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java












Re: Query on rowkey distribution || Does RS and number of Region related with each other

2018-08-31 Thread Josh Elser
I'd like to remind you again that we're all volunteers and we're helping 
you because we choose to do so. Antagonizing those who are helping you 
is a great way to stop receiving any free help.


If you do not create more than one Region, HBase will not distribute 
your data on more than one RegionServer. Full stop.


On 8/30/18 2:16 PM, Manjeet Singh wrote:

Hi Elser

I have clearly been talking about the rowkey; why do you say I am talking about data? See below
what I have said about the rowkey:

SALT_ID_DayStartTimestamp_DayEndTimeStamp_IDTimeStamp

The problem is that you are not understanding the question and are just telling me
what you know; even on Slack you are saying the same thing.
The question is simple: if I put a salt (which can be any arbitrary char or generated
hash, anything) at the beginning of the rowkey, why is my data not getting
distributed?
Please note this is not a pre-split table.

Thanks
Manjeet Singh

On Thu, Aug 30, 2018 at 9:11 PM Josh Elser  wrote:


As I've been trying to explain in Slack:

1. Are you including the salt in the data that you are writing, such
that you are spreading the data across all Regions per their boundaries?
Or, as I think you are, just creating split points with this arbitrary
"salt" and not including it when you write data?

If, as I am assuming, you are not, all of your data will go into the
first or last region. If you are still not getting my point, I'd suggest
that you share the exact splitpoints and one rowkey that you are writing
to HBase. That will make it quite clear if my guess is correct or not.

2. The number of Regions controls the number of RegionServers that will
be involved with reads/writes against that table. This is a calculation
that you need to figure out based on your cluster configuration and the
magnitude of your workload.

On 8/30/18 1:11 AM, Manjeet Singh wrote:

Hi All,



I have two Question

*Question 1 : *

I want to understand how rowkey distribution happen if I create my table
with out applying any policy but opting prefix salting.

Example I have rowkey like

SALT_ID_DayStartTimestamp_DayEndTimeStamp_IDTimeStamp

So it will look like as below

*_99_1516838400_1516924800_1516865160

Question is : now I can not see that my data is getting distributed only
because of salt.

So does I have only choice of pre splitting? Or do I have any other

option?


I have seen two more approaches

i.e.

hbase org.apache.hadoop.hbase.util.RegionSplitter test_table

HexStringSplit

-c 10 -f f1

I guess its scope is limited as number of region created at the time

table

creation and it will fix? Not sure.

and

*UniformSplit
<

https://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.UniformSplit.html

*



*Second 2: Does number of split point anywhere related to the number of

RS

in cluster, If yes what is the calculation? *








Re: HMerge Status

2018-08-31 Thread Josh Elser
There's the Region Normalizer which I'd presume would be in an HBase 1.4 
release


https://issues.apache.org/jira/browse/HBASE-13103

On 8/30/18 3:50 PM, Austin Heyne wrote:
I'm using HBase 1.4.4 (AWS/EMR) and I'm looking for an automated 
solution because I believe there are going to be a few hundred if not 
thousand merges. It's also challenging to find candidate pairs.


-Austin


On 08/30/2018 03:45 PM, Jean-Marc Spaggiari wrote:

Hi Austin,

Which version are you using? Why not just using the shell merge command?

JMS

Le jeu. 30 août 2018 à 15:41, Austin Heyne  a écrit :


We're currently sitting at a very high number of regions due to an
initially poor value for hbase.regionserver.regionSplitLimit and would
like to rein in our region count. Additionally, we have a
spatio-temporal key structure and our region pre-splitting was done
evenly, without regard to the spatial distribution of our data and thus
have a lot of small and empty regions we'd like to clean up. I've found
the HMerge class [1], and it seems it would do something reasonable for
our use case. However, it's marked as Private and doesn't seem to be
used anywhere so I thought I'd ask if anyone knows the status of this
class and how safe it is.

Thanks,
Austin

[1]

https://github.com/apache/hbase/blob/branch-1.4/hbase-server/src/main/java/org/apache/hadoop/hbase/util/HMerge.java 








Re: Query on rowkey distribution || Does RS and number of Region related with each other

2018-08-30 Thread Josh Elser

As I've been trying to explain in Slack:

1. Are you including the salt in the data that you are writing, such 
that you are spreading the data across all Regions per their boundaries? 
Or, as I think you are, just creating split points with this arbitrary 
"salt" and not including it when you write data?


If, as I am assuming, you are not, all of your data will go into the 
first or last region. If you are still not getting my point, I'd suggest 
that you share the exact splitpoints and one rowkey that you are writing 
to HBase. That will make it quite clear if my guess is correct or not.


2. The number of Regions controls the number of RegionServers that will 
be involved with reads/writes against that table. This is a calculation 
that you need to figure out based on your cluster configuration and the 
magnitude of your workload.
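
To make both points concrete, here is a minimal sketch (HBase 2.x client API; the 
bucket count and the table/family names are assumptions) in which the salt is 
derived from the row's own key, written as part of the rowkey, and the table is 
pre-split on those same salt prefixes. If the salt is not part of the written 
rowkey, the split points accomplish nothing:

import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class SaltedRowkeySketch {
  private static final int BUCKETS = 16;  // assumed number of salt buckets

  // The salt is computed from the natural key and prepended to the rowkey
  // that is actually written.
  static byte[] saltedKey(String naturalKey) {
    int bucket = (naturalKey.hashCode() & Integer.MAX_VALUE) % BUCKETS;
    return Bytes.toBytes(String.format("%02d_%s", bucket, naturalKey));
  }

  public static void main(String[] args) throws IOException {
    try (Connection conn = ConnectionFactory.createConnection();
         Admin admin = conn.getAdmin()) {
      // Pre-split on the same salt prefixes used at write time.
      byte[][] splits = new byte[BUCKETS - 1][];
      for (int i = 1; i < BUCKETS; i++) {
        splits[i - 1] = Bytes.toBytes(String.format("%02d", i));
      }
      admin.createTable(
          TableDescriptorBuilder.newBuilder(TableName.valueOf("test_table"))
              .setColumnFamily(ColumnFamilyDescriptorBuilder.of("si"))
              .build(),
          splits);

      try (Table table = conn.getTable(TableName.valueOf("test_table"))) {
        String naturalKey = "99_1516838400_1516924800_1516865160";  // example key from above
        Put put = new Put(saltedKey(naturalKey));
        put.addColumn(Bytes.toBytes("si"), Bytes.toBytes("q"), Bytes.toBytes("value"));
        table.put(put);
      }
    }
  }
}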


On 8/30/18 1:11 AM, Manjeet Singh wrote:

Hi All,



I have two Question

*Question 1 : *

I want to understand how rowkey distribution happens if I create my table
without applying any policy but opting for prefix salting.

Example I have rowkey like

SALT_ID_DayStartTimestamp_DayEndTimeStamp_IDTimeStamp

So it will look like as below

*_99_1516838400_1516924800_1516865160

The question is: I cannot see that my data is getting distributed just
because of the salt.

So is pre-splitting my only choice, or do I have any other option?

I have seen two more approaches

i.e.

hbase org.apache.hadoop.hbase.util.RegionSplitter test_table HexStringSplit
-c 10 -f f1

I guess its scope is limited, as the number of regions is set at table
creation time and will then be fixed? Not sure.

and

*UniformSplit
*



*Question 2: Is the number of split points related in any way to the number of RS
in the cluster? If yes, what is the calculation?*



Re: Phoenix CsvBulkLoadTool fails with java.sql.SQLException: ERROR 103 (08004): Unable to establish connection

2018-08-20 Thread Josh Elser

(-cc user@hbase, +bcc user@hbase)

How about the rest of the stacktrace? You didn't share the cause.

On 8/20/18 1:35 PM, Mich Talebzadeh wrote:


This was working fine before my Hbase upgrade to 1.2.6

I have Hbase version 1.2.6 and Phoenix 
version apache-phoenix-4.8.1-HBase-1.2-bin


This command, bulk loading into HBase through Phoenix, now fails:

HADOOP_CLASSPATH=${HOME}/jars/hbase-protocol-1.2.6.jar:${HBASE_HOME}/conf hadoop 
jar ${HBASE_HOME}/lib/phoenix-4.8.1-HBase-1.2-client.jar 
org.apache.phoenix.mapreduce.CsvBulkLoadTool --table ${TABLE_NAME} 
--input hdfs://rhes75:9000/${REFINED_HBASE_SUB_DIR}/${FILE_NAME}_${dir}.txt


hadoop jar 
/data6/hduser/hbase-1.2.6/lib/phoenix-4.8.1-HBase-1.2-client.jar 
org.apache.phoenix.mapreduce.CsvBulkLoadTool --table 
MARKETDATAHBASEBATCH --input 
hdfs://rhes75:9000//data/prices/2018-08-20_refined/populate_Phoenix_table_MARKETDATAHBASEBATCH_2018-08-20.txt
+ 
HADOOP_CLASSPATH=/home/hduser/jars/hbase-protocol-1.2.6.jar:/data6/hduser/hbase-1.2.6/conf



With the following error

2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:java.library.path=/home/hduser/hadoop-3.1.0/lib
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:java.io.tmpdir=/tmp
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:java.compiler=
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:os.name =Linux
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:os.arch=amd64
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:os.version=3.10.0-862.3.2.el7.x86_64
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:user.name =hduser
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:user.home=/home/hduser
2018-08-20 18:29:47,248 INFO  [main] zookeeper.ZooKeeper: Client 
environment:user.dir=/data6/hduser/streaming_data/2018-08-20
2018-08-20 18:29:47,249 INFO  [main] zookeeper.ZooKeeper: Initiating 
client connection, connectString=rhes75:2181 sessionTimeout=9 
watcher=hconnection-0x493d44230x0, quorum=rhes75:2181, baseZNode=/hbase
2018-08-20 18:29:47,261 INFO  [main-SendThread(rhes75:2181)] 
zookeeper.ClientCnxn: Opening socket connection to server 
rhes75/50.140.197.220:2181 . Will not 
attempt to authenticate using SASL (unknown error)
2018-08-20 18:29:47,264 INFO  [main-SendThread(rhes75:2181)] 
zookeeper.ClientCnxn: Socket connection established to 
rhes75/50.140.197.220:2181 , initiating session
2018-08-20 18:29:47,281 INFO  [main-SendThread(rhes75:2181)] 
zookeeper.ClientCnxn: Session establishment complete on server 
rhes75/50.140.197.220:2181 , sessionid = 
0x1002ea99eed0077, negotiated timeout = 4
Exception in thread "main" java.sql.SQLException: ERROR 103 (08004): 
Unable to establish connection.
     at 
org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:455)


Any thoughts?

Thanks

Dr Mich Talebzadeh





Re: Region Server Crashes with below ERROR

2018-08-13 Thread Josh Elser

Nothing in here indicates why the RegionServers actually failed.

If the RegionServer crashed, there is very likely a log message at 
FATAL. You want to find that to understand what actually caused it.


On 8/13/18 4:22 PM, Adep, Karankumar (ETW - FLEX) wrote:

Hi,

The region server crashes with the ERROR below. It looks like some issue with the GC 
configuration?

2018-08-09 14:54:26,106 INFO 
org.apache.hadoop.hbase.regionserver.RSRpcServices: Scanner 2847160913172185436 
lease expired on region 
pi,ea00,1519782669886.393add40963aadf9d6a3ceeceaee1106.
2018-08-09 14:54:26,104 INFO org.apache.hadoop.hbase.util.JvmPauseMonitor: 
Detected pause in JVM or host machine (eg GC): pause of approximately 6173ms
GC pool 'ParNew' had collection(s): count=2 time=114ms
GC pool 'ConcurrentMarkSweep' had collection(s): count=1 time=6262ms
2018-08-09 14:54:26,100 INFO org.apache.hadoop.hbase.ScheduledChore: Chore: 
CompactionChecker missed its start time
2018-08-09 14:54:26,106 INFO 
org.apache.hadoop.hbase.regionserver.RSRpcServices: Scanner 2847160913172185429 
lease expired on region 
pi,9c00,1519782669886.962035d50bed0ead188dc55484587c5c.
2018-08-09 14:54:26,106 INFO 
org.apache.hadoop.hbase.regionserver.RSRpcServices: Scanner 2847160913172185442 
lease expired on region 
pt,b800,1519942253050.7cb5a194271207df2456220c1bfacda8.

Thank You,
Karan Adep | Platform Operations Team



Re: [DISCUSS] Separate Git Repository for HBCK2

2018-07-25 Thread Josh Elser
Thanks, Umesh. Seems like you're saying it's not a problem now, but 
you're not sure if it would become a problem. Regardless of that, it's a 
goal to not be version-specific (and thus, we can have generic hbck-v1 
and hbck-v2 tools). LMK if I misread, please :)


One more thought, it would be nice to name this repository as 
"operator-tools" or similar (instead of hbck). A separate repo on its 
own release cadence is a nice vehicle for random sorts of recovery, 
slice-and-dice, one-off tools. I think HBCK is one example of 
administrator/operator tooling we provide (certainly, the most used), 
but we have the capacity to provide more than just that.


On 7/24/18 5:55 PM, Umesh Agashe wrote:

Thanks Stack, Josh and Andrew for your suggestions and concerns.

I share Stack's suggestions. This would be similar to hbase-thirdparty. The
new repo could be hbase-hbck/hbase-hbck2. As this tool will be used by
hbase users/ developers, hbase JIRA can be used for hbck issues.

bq. How often does HBCK need to re-use methods and constants from code
in hbase-common, hbase-server, etc?
bq. Is it a goal to firm up API stability around this shared code.

bq. If we do this can we also move out hbck version 1?

As HBCK2 tool will be freshly written, we can try to achieve this goal. I
think it's a great idea to move hbck1 to the new repo as well, though I think it's
more involved for hbck1, as the existing code already uses what it can from the
hbase-common and hbase-server etc. modules.

bq. How often does HBCK make decisions on how to implement a correction
based on some known functionality (e.g. a bug) in a specific version(s)
of HBase. Concretely, would HBCK need to make corrections to an HBase
installation that are specific to a subset of HBase 2.x.y versions that
may not be valid for other 2.x.y versions?

I see if this happens too often, compatibility metrics will be complicated.

Thanks,
Umesh


On Tue, Jul 24, 2018 at 10:27 AM Andrew Purtell  wrote:


If we do this can we also move out hbck version 1? It would be really weird
in my opinion to have v2 in a separate repo but v1 shipping with the 1.x
releases. That would be a source of understandable confusion.

I believe our compatibility guidelines allow us to upgrade interface
annotations from private to LP or Public and from LP to Public. These are
not changes that impact source or binary compatibility. They only change
the promises we make going forward about their stability. I believe we can
allow these in new minors, so we could potentially move hbck out in a
1.5.0.


On Mon, Jul 23, 2018 at 4:46 PM Stack  wrote:


On Thu, Jul 19, 2018 at 2:09 PM Umesh Agashe




wrote:


Hi,

I've had the opportunity to talk about HBCK2 with a few of you. One of

the

suggestions is to to have a separate git repository for HBCK2. Lets

discuss

about it.

In the past when bugs were found in hbck, there is no easy way to

release

patched version of just hbck (without patching HBase). If HBCK2 has a
separate git repo, HBCK2 versions will not be tightly related to HBase
versions. Fixing and releasing hbck2, may not require patching HBase.
Though tight coupling will be somewhat loosened, HBCK2 will still

depend

on

HBase APIs/ code. Caution will be required going forward regarding
compatibility.

What you all think?



I think this the way to go.

We'd make a new hbase-hbck2 repo as we did for hbase-thirdparty?

We'd use the hbase JIRA for hbase-hbck2 issues?

We'd make hbase-hbck2 releases on occasion that the PMC voted on?

Sounds great!
St.Ack

Thanks,

Umesh

JIRA:  https://issues.apache.org/jira/browse/HBASE-19121.
Doc:





https://docs.google.com/document/d/1NxSFu4TKQ6lY-9J5qsCcJb9kZOnkfX66KMYsiVxBy0Y/edit?usp=sharing







--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
- A23, Crosstalk





Re: how to get the last row inserted into a hbase table

2018-07-11 Thread Josh Elser

Unless you are including the date+time in the rowKey yourself, no.

HBase has exactly one index for fast lookups, and that is the rowKey. 
Any other query operation is (essentially) an exhaustive search.


On 7/11/18 12:07 PM, Ming wrote:

Hi, all,

  


Is there a way to get the last row put/deleted into an HBase table?

In other words, how can I tell the last time an HBase table was changed? I was
trying to check the HDFS file stats, but HBase has a memstore, so that is not
a good way, and the HFile locations are internal to HBase.

  


My purpose is to quickly check the last modified timestamp for a given HBase
table.

  


Thanks,

Ming




Re: HBase 2.0.1 with Hadoop 2.8.4 causes NoSuchMethodException

2018-07-05 Thread Josh Elser
You might also need hbase.wal.meta_provider=filesystem (if you haven't 
already realized that)


On 7/2/18 5:43 PM, Andrey Elenskiy wrote:


<property>
  <name>hbase.wal.provider</name>
  <value>filesystem</value>
</property>


Seems to fix it, but would be nice to actually try the fanout wal with
hadoop 2.8.4.

On Mon, Jul 2, 2018 at 1:03 PM, Andrey Elenskiy 
wrote:


Hello, we are running HBase 2.0.1 with official Hadoop 2.8.4 jars and
hadoop 2.8.4 client (http://central.maven.org/maven2/org/apache/hadoop/
hadoop-client/2.8.4/). Got the following exception on regionserver which
brings it down:

18/07/02 18:51:06 WARN concurrent.DefaultPromise: An exception was thrown by 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete()
java.lang.Error: Couldn't properly initialize access to HDFS internals. Please 
update your WAL Provider to not make use of the 'asyncfs' provider. See 
HBASE-16110 for more information.
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.(FanOutOneBlockAsyncDFSOutputSaslHelper.java:268)
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.initialize(FanOutOneBlockAsyncDFSOutputHelper.java:661)
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper.access$300(FanOutOneBlockAsyncDFSOutputHelper.java:118)
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:720)
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputHelper$13.operationComplete(FanOutOneBlockAsyncDFSOutputHelper.java:715)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:507)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:500)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:479)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:420)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
  at 
org.apache.hbase.thirdparty.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
  at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.fulfillConnectPromise(AbstractEpollChannel.java:638)
  at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:676)
  at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:552)
  at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:394)
  at 
org.apache.hbase.thirdparty.io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
  at 
org.apache.hbase.thirdparty.io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
  at java.lang.Thread.run(Thread.java:748)
  Caused by: java.lang.NoSuchMethodException: 
org.apache.hadoop.hdfs.DFSClient.decryptEncryptedDataEncryptionKey(org.apache.hadoop.fs.FileEncryptionInfo)
  at java.lang.Class.getDeclaredMethod(Class.java:2130)
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.createTransparentCryptoHelper(FanOutOneBlockAsyncDFSOutputSaslHelper.java:232)
  at 
org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutputSaslHelper.(FanOutOneBlockAsyncDFSOutputSaslHelper.java:262)
  ... 18 more

  FYI, we don't have encryption enabled. Let me know if you need more info
about our setup.





CVE-2018-8025 on Apache HBase

2018-06-22 Thread Josh Elser
CVE-2018-8025 describes an issue in Apache HBase that affects the 
optional "Thrift 1" API server when running over HTTP. There is a 
race-condition which could lead to authenticated sessions being 
incorrectly applied to users, e.g. one authenticated user would be 
considered a different user or an unauthenticated user would be treated 
as an authenticated user.


https://issues.apache.org/jira/browse/HBASE-20664 implements a fix for 
this issue, and this fix is contained in the following releases of 
Apache HBase:


* 1.2.6.1
* 1.3.2.1
* 1.4.5
* 2.0.1

This vulnerability affects all 1.x and 2.x release lines (except 1.0.0).

- The Apache HBase PMC


Re: Alternative to native client c++

2018-06-21 Thread Josh Elser

Use `mvn package`, not `compile`.
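
For example, from the repository root, something like this (a sketch; add
whatever flags your environment needs):

  mvn clean install -DskipTests

`install` (or `package`) builds every module jar, including test jars such as
hbase-zookeeper's, which `compile` alone never produces; that is what the
enforcer error quoted below is complaining about.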

On 6/21/18 10:41 AM, Andrzej wrote:

On 21.06.2018 at 19:01, Andrzej wrote:

Is there any alternative for controlling HBase quickly from C++ sources?
Or only the Java client?
The Native Client C++ (HBASE-14850) sources are old and do not match the folly
library (Futures.h)




Now I first tried compiling hbase-client in the master branch with Maven...
next, I tried to compile the whole repository with Maven, with the
command "mvn compile":

[INFO] 


[INFO] Reactor Summary:
[INFO]
[INFO] Apache HBase ... SUCCESS 
[01:41 min]
[INFO] Apache HBase - Checkstyle .. SUCCESS [ 
0.753 s]
[INFO] Apache HBase - Build Support ... SUCCESS [ 
0.233 s]
[INFO] Apache HBase - Error Prone Rules ... SUCCESS [ 
11.211 s]
[INFO] Apache HBase - Annotations . SUCCESS [ 
0.327 s]
[INFO] Apache HBase - Build Configuration . SUCCESS [ 
0.419 s]
[INFO] Apache HBase - Shaded Protocol . SUCCESS [ 
53.398 s]
[INFO] Apache HBase - Common .. SUCCESS [ 
25.850 s]
[INFO] Apache HBase - Metrics API . SUCCESS [ 
2.835 s]
[INFO] Apache HBase - Hadoop Compatibility  SUCCESS [ 
3.432 s]
[INFO] Apache HBase - Metrics Implementation .. SUCCESS [ 
7.296 s]
[INFO] Apache HBase - Hadoop Two Compatibility  SUCCESS [ 
12.773 s]
[INFO] Apache HBase - Protocol  SUCCESS [ 
18.311 s]
[INFO] Apache HBase - Client .. SUCCESS [ 
10.863 s]
[INFO] Apache HBase - Zookeeper ... SUCCESS [ 
4.104 s]
[INFO] Apache HBase - Replication . FAILURE [ 
2.351 s]

[INFO] Apache HBase - Resource Bundle . SKIPPED
[INFO] Apache HBase - HTTP  SKIPPED
[INFO] Apache HBase - Procedure ... SKIPPED
[INFO] Apache HBase - Server .. SKIPPED
[INFO] Apache HBase - MapReduce ... SKIPPED
[INFO] Apache HBase - Testing Util  SKIPPED
[INFO] Apache HBase - Thrift .. SKIPPED
[INFO] Apache HBase - RSGroup . SKIPPED
[INFO] Apache HBase - Shell ... SKIPPED
[INFO] Apache HBase - Coprocessor Endpoint  SKIPPED
[INFO] Apache HBase - Backup .. SKIPPED
[INFO] Apache HBase - Integration Tests ... SKIPPED
[INFO] Apache HBase - Rest  SKIPPED
[INFO] Apache HBase - Examples  SKIPPED
[INFO] Apache HBase - Shaded .. SKIPPED
[INFO] Apache HBase - Shaded - Client (with Hadoop bundled) SKIPPED
[INFO] Apache HBase - Shaded - Client . SKIPPED
[INFO] Apache HBase - Shaded - MapReduce .. SKIPPED
[INFO] Apache HBase - External Block Cache  SKIPPED
[INFO] Apache HBase - Spark ... SKIPPED
[INFO] Apache HBase - Spark Integration Tests . SKIPPED
[INFO] Apache HBase - Assembly  SKIPPED
[INFO] Apache HBase Shaded Packaging Invariants ... SKIPPED
[INFO] Apache HBase Shaded Packaging Invariants (with Hadoop bundled) 
SKIPPED

[INFO] Apache HBase - Archetypes .. SKIPPED
[INFO] Apache HBase - Exemplar for hbase-client archetype . SKIPPED
[INFO] Apache HBase - Exemplar for hbase-shaded-client archetype SKIPPED
[INFO] Apache HBase - Archetype builder ... SKIPPED
[INFO] 


[INFO] BUILD FAILURE
[INFO] 


[INFO] Total time: 04:19 min
[INFO] Finished at: 2018-06-21T19:37:37+02:00
[INFO] Final Memory: 100M/458M
[INFO] 

[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce 
(hadoop-profile-min-maven-min-java-banned-xerces) on project 
hbase-replication: Execution 
hadoop-profile-min-maven-min-java-banned-xerces of goal 
org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce failed: 
org.apache.maven.shared.dependency.graph.DependencyGraphBuilderException: Could 
not resolve following dependencies: 
[org.apache.hbase:hbase-zookeeper:jar:tests:3.0.0-SNAPSHOT (test)]: 
Could not resolve dependencies for project 
org.apache.hbase:hbase-replication:jar:3.0.0-SNAPSHOT: Could not find 
artifact org.apache.hbase:hbase-zookeeper:jar:tests:3.0.0-SNAPSHOT in 
project.local 
(file:/home/andrzej/code/hbase/hbase-replication/src/site/resources/repo) -> 
[Help 1]

[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with

Re: Problem starting region server with Hbase version hbase-2.0.0

2018-06-08 Thread Josh Elser
You shouldn't be putting the phoenix-client.jar on the HBase server 
classpath.


There is specifically the phoenix-server.jar which is specifically built 
to be included in HBase (to avoid issues such as these).


Please remove all phoenix-client jars and provide the 
phoenix-5.0.0-server jar instead.
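
For example, something along these lines (paths and the exact jar name are
illustrative; check your Phoenix 5.0.0-alpha distribution for the server jar it
ships):

  # remove the client jar(s) from the HBase server classpath
  rm $HBASE_HOME/lib/phoenix-*-client.jar
  # copy in the server jar shipped with Phoenix, then restart HBase
  cp phoenix-5.0.0-alpha-HBase-2.0-server.jar $HBASE_HOME/lib/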


On 6/7/18 5:06 PM, Mich Talebzadeh wrote:

Thanks.

under $HBASE_HOME/lib for version 2 I swapped the phoenix client jar file
as below

phoenix-5.0.0-alpha-HBase-2.0-client.jar_ori
phoenix-4.8.1-HBase-1.2-client.jar

I then started HBASE-2 that worked fine.

For Hbase clients, i.e. the Hbase  connection from edge nodes etc, I will
keep using HBASE-1.2.6 which is the stable version and it connects
successfully to Hbase-2. This appears to be a working solution for now.

Regards

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 7 June 2018 at 21:03, Sean Busbey  wrote:


Your current problem is caused by this phoenix jar:



hduser@rhes75: /data6/hduser/hbase-2.0.0> find ./ -name '*.jar' -print
-exec jar tf {} \; | grep -E "\.jar$|StreamCapabilities" | grep -B 1
StreamCapabilities
./lib/phoenix-5.0.0-alpha-HBase-2.0-client.jar
org/apache/hadoop/hbase/util/CommonFSUtils$StreamCapabilities.class
org/apache/hadoop/fs/StreamCapabilities.class
org/apache/hadoop/fs/StreamCapabilities$StreamCapability.class


I don't know what version of Hadoop it's bundling or why, but it's one
that includes the StreamCapabilities interface, so HBase takes that to
mean it can check on capabilities. Since Hadoop 2.7 doesn't claim to
implement any, HBase throws its hands up.

I'd recommend you ask on the phoenix list how to properly install
phoenix such that you don't need to copy the jars into the HBase
installation. Hopefully the jar pointed out here is meant to be client
facing only and not installed into the HBase cluster.


On Thu, Jun 7, 2018 at 2:38 PM, Mich Talebzadeh
 wrote:

Hi,

Under Hbase Home directory I get

hduser@rhes75: /data6/hduser/hbase-2.0.0> find ./ -name '*.jar' -print
-exec jar tf {} \; | grep -E "\.jar$|StreamCapabilities" | grep -B 1
StreamCapabilities
./lib/phoenix-5.0.0-alpha-HBase-2.0-client.jar
org/apache/hadoop/hbase/util/CommonFSUtils$StreamCapabilities.class
org/apache/hadoop/fs/StreamCapabilities.class
org/apache/hadoop/fs/StreamCapabilities$StreamCapability.class
--
./lib/hbase-common-2.0.0.jar
org/apache/hadoop/hbase/util/CommonFSUtils$StreamCapabilities.class

for Hadoop home directory I get nothing

hduser@rhes75: /home/hduser/hadoop-2.7.3> find ./ -name '*.jar' -print
-exec jar tf {} \; | grep -E "\.jar$|StreamCapabilities" | grep -B 1
StreamCapabilities


Dr Mich Talebzadeh



LinkedIn * https://www.linkedin.com/profile/view?id=

AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

*




http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising

from

such loss, damage or destruction.



On 7 June 2018 at 15:39, Sean Busbey  wrote:


Somehow, HBase is getting confused by your installation and thinks it
can check for wether or not the underlying FileSystem implementation
(i.e. HDFS) provides hflush/hsync even though that ability is not
present in Hadoop 2.7. Usually this means there's a mix of Hadoop
versions on the classpath. While you do have both Hadoop 2.7.3 and
2.7.4, that mix shouldn't cause this kind of failure[1].

Please run this command and copy/paste the output in your HBase and
Hadoop installation directories:

find . -name '*.jar' -print -exec jar tf {} \; | grep -E
"\.jar$|StreamCapabilities" | grep -B 1 StreamCapabilities



[1]: As an aside, you should follow the guidance in our reference
guide from the section "Replace the Hadoop Bundled With HBase!" in the
Hadoop chapter: http://hbase.apache.org/book.html#hadoop

But as I mentioned, I don't think it's the underlying cause in this case.


On Thu, Jun 7, 2018 at 8:41 AM, Mich Talebzadeh
 wrote:

Hi,

Please find below

*bin/hbase version*
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/data6/hduser/hbase-2.0.0/lib/phoenix-5.0.0-alpha-

HBase-2.0-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in
[jar:file:/dat

Re: HBase - REST API - Table Schema PUT vs POST

2018-05-14 Thread Josh Elser

Yep, you got it :)

Easy doc fix we can get in place.

On 5/14/18 2:25 PM, Kevin Risden wrote:

Looks like this might have triggered
https://issues.apache.org/jira/browse/HBASE-20581

Kevin Risden

On Mon, May 14, 2018 at 8:46 AM, Kevin Risden  wrote:


We are using HDP 2.5 with HBase 1.2.x. We think we found that the PUT vs
POST documentation on the HBase book [1] website is incorrect.

POST - Create a new table, or replace an existing table’s schema

PUT - Update an existing table with the provided schema fragment



This contradicts what is in the original HBase 1.2 API javadocs [2].

PUT /<table>/schema

POST /<table>/schema
Uploads table schema. PUT or POST creates table as necessary. PUT fully
replaces schema. POST modifies schema (add or modify column family). Supply
the full table schema for PUT or a well formed schema fragment for POST in
the desired encoding. Set Content-Type header to text/xml if the desired
encoding is XML. Set Content-Type header to application/json if the desired
encoding is JSON. Set Content-Type header to application/x-protobuf if the
desired encoding is protobufs. If not successful, returns appropriate HTTP
error status code. If successful, returns HTTP 200 status.



The result of the two conflicting documentation pages is that PUT either
updates or replaces and POST either updates or replaces the table schema.
This can cause problems like setting the table max versions back to the
default of 1.
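
For illustration, a hedged sketch of the difference (host, port, table and
family names are placeholders; the REST server default port is 8080):

  # PUT: full replacement of the schema; anything not supplied reverts to defaults
  curl -X PUT -H "Content-Type: text/xml" \
    -d '<TableSchema name="mytable"><ColumnSchema name="cf" VERSIONS="3"/></TableSchema>' \
    http://resthost:8080/mytable/schema

  # POST: merge the fragment into the existing schema (add or modify a column family)
  curl -X POST -H "Content-Type: text/xml" \
    -d '<TableSchema name="mytable"><ColumnSchema name="cf" VERSIONS="3"/></TableSchema>' \
    http://resthost:8080/mytable/schema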

Does this make sense? Is it possible the documentation is incorrect here?

The newest versions of HBase apidocs point to the HBase book. I have not
checked if the behavior changed between HBase versions.

1. https://hbase.apache.org/book.html#_rest
2. https://hbase.apache.org/1.2/apidocs/org/apache/hadoop/
hbase/rest/package-summary.html#operation_create_schema

Kevin Risden





Re: pheonix client

2018-04-19 Thread Josh Elser

This question is better asked on the Phoenix users list.

The phoenix-client.jar is the one you need and is unique from the 
phoenix-core jar. Logging frameworks are likely not easily 
relocated/shaded to avoid issues which is why you're running into this.


Can you provide the error you're seeing with the play framework? 
Specifics here will help..


On 4/19/18 1:56 PM, Lian Jiang wrote:

I am using HDP 2.6 HBase and Phoenix. I created a Play REST service using
HBase as the backend. However, I have trouble getting a working Phoenix
client.

I tried the phoenix-client.jar given by HDP but its logging dependency
conflicts with Play's. Then I tried:

libraryDependencies += "org.apache.phoenix" % "phoenix-core" %
"4.13.1-HBase-1.1"

libraryDependencies += "org.apache.phoenix" % "phoenix-server-client" %
"4.7.0-HBase-1.1"

libraryDependencies += "org.apache.phoenix" % "phoenix-queryserver-client"
% "4.13.1-HBase-1.1"

None of them worked: "No suitable driver found".

Any idea will be highly appreciated!



HBaseCon 2018 CFP extended until Friday

2018-04-17 Thread Josh Elser
We've received some requests to extend the CFP a few more days. The new 
day of closing will be this Friday 2018/04/20, end of day.


Please keep them coming in!

On 4/15/18 9:23 PM, Josh Elser wrote:
The HBaseCon 2018 call for proposals is scheduled to close Monday, April 
16th. If you have an idea for a talk, make sure you get it submitted ASAP!


Submit your talks at https://easychair.org/conferences/?conf=hbasecon2018

If you need more information, please see 
https://hbase.apache.org/hbasecon-2018/


- Josh (on behalf of the HBase PMC)


One more day for HBaseCon 2018 abstracts!

2018-04-15 Thread Josh Elser
The HBaseCon 2018 call for proposals is scheduled to close Monday, April 
16th. If you have an idea for a talk, make sure you get it submitted ASAP!


Submit your talks at https://easychair.org/conferences/?conf=hbasecon2018

If you need more information, please see 
https://hbase.apache.org/hbasecon-2018/


- Josh (on behalf of the HBase PMC)


Re: HBase Developer - Questions

2018-04-11 Thread Josh Elser

(-to dev, +bcc dev, +to user)

Hi Stefano,

Moving your question over to the user@ mailing list as it's not so much 
about development of HBase, instead development when using HBase.


Q1: what do you mean by the "latest field"? Are you talking about the 
latest version of a Cell for a column (which is row + cf/cq)?


When you write a new cell without explicitly setting a timestamp, it 
will use the current time from that RegionServer which is effectively 
making it the "latest".
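
Concretely (a sketch only; "table" is an already-opened
org.apache.hadoop.hbase.client.Table handle, and the family/qualifier come from
your example below):

  byte[] row = Bytes.toBytes("row1");
  byte[] cf2 = Bytes.toBytes("CF2");
  Put p = new Put(row);
  // No timestamp supplied: the RegionServer stamps the cell with its current time,
  // so this write becomes the "latest" version of CF2:QUAL4.
  p.addColumn(cf2, Bytes.toBytes("QUAL4"), Bytes.toBytes("01218"));
  // Explicit timestamp variant: only "latest" if the value is numerically the
  // highest timestamp stored for that column.
  // p.addColumn(cf2, Bytes.toBytes("QUAL4"), 1523350296746L, Bytes.toBytes("01218"));
  table.put(p);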


Q2: Not sure here :)

On 4/10/18 9:33 AM, Stefano Manca wrote:

Dear,

My name is Stefano, I am a young software developer.
In recent months I have been using HBase as the database in a Hadoop cluster, to write
data from Apache Spark with the Scala language.
May I ask you some questions for specific information?

*Q1*: Is there a way to insert/update a specific cell as the latest field in
the same record?

Example:
ROW COLUMN+CELL
row1  column=CF1:QUAL1, timestamp=1523350296746, value=0.0
row1  column=CF1:QUAL2, timestamp=1523350296746, value=test
row1  column=CF1:QUAL3, timestamp=1523350296746, value=2700
row1  *column=CF2:QUAL4, timestamp=1523350296746, value=01218*
row1  column=CF2:QUAL5, timestamp=1523350296746, value=example

I would like to be sure that the value of the key QUAL4 will be inserted as the
last field, with respect to the other values of the same rowkey (row1).

*Q2*: What is the best way to write in bulk with Spark 2.2? Is the
HBaseContext class available to use the bulkload with Hfiles?


Thank you very much in advance.

Best regards,



*__Stefano Manca*

*Phone: +39 349216059*
*E-mail: ste.manc...@gmail.com *



Re: One week left for HBaseCon 2018 abstracts!

2018-04-09 Thread Josh Elser

Oh, and the most important part:

Submit your talks here: https://easychair.org/conferences/?conf=hbasecon2018

On 4/9/18 10:26 PM, Josh Elser wrote:

Hi folks!

A gentle reminder that the HBaseCon 2018 call for proposals remains open 
for just one more week -- until April 16th. The event is held in San 
Jose, CA on June 18th.


We have some great proposals already submitted, but we look forward 
to many, many more. All levels of complexity, content, and audience are 
welcome, with just a few paragraphs required to tell us about what you 
want to speak about.


Please feel free to reach out if there are any questions!

- Josh (on behalf of the HBase PMC)


One week left for HBaseCon 2018 abstracts!

2018-04-09 Thread Josh Elser

Hi folks!

A gentle reminder that the HBaseCon 2018 call for proposals remains open 
for just one more week -- until April 16th. The event is held in San 
Jose, CA on June 18th.


We have some great proposals already submitted, but we look forward 
to many, many more. All levels of complexity, content, and audience are 
welcome, with just a few paragraphs required to tell us about what you 
want to speak about.


Please feel free to reach out if there are any questions!

- Josh (on behalf of the HBase PMC)


[ANNOUNCE] Apache HBase Thirdparty 2.1.0 released

2018-03-28 Thread Josh Elser

All,

It's my pleasure to announce the 2.1.0 release of the Apache HBase 
Thirdparty project. This project is used by the Apache HBase project to 
encapsulate a number of dependencies that HBase relies upon and ensure 
that they are properly isolated from HBase users, e.g. Google Protocol 
Buffers, Google Guava, and the like.


This 2.1.0 release contains a number of updates in support of the 
upcoming Apache HBase 2.0.0 release. The release is available through 
dist.a.o[1] (as well as the mirror framework) and Maven central[2]. 
Release notes are also available [3].


- Josh (on behalf of the Apache HBase PMC)

[1] https://dist.apache.org/repos/dist/release/hbase/hbase-thirdparty-2.1.0/
[2] 
http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.hbase.thirdparty%22%20AND%20v%3A%222.1.0%22
[3] 
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12310753&version=12342967


Re: Does HBase bulkload support Incremental data?

2018-03-27 Thread Josh Elser

Yes, you can bulk load into a table which already contains data.

The ideal case is that you generate HFiles which map exactly to the 
distribution of Regions on your HBase cluster. However, given that we 
know that Region boundaries can change, the bulk load client 
(LoadIncrementalHFiles) has the ability to handle HFiles which no longer 
fit into a single Region. This is done client-side and then the 
resulting files are automatically resubmitted.


Beware: this is a very expensive and slow process (e.g. consider how 
long it would take to rewrite 100GB of data in a single process because 
you did not use the correct Region split points when creating the data). 
Most bulk loading issues I encounter are related to incorrect split 
points being used which causes the bulk load process to take hours to 
days to complete (instead of seconds to minutes).
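
For example, when generating the HFiles with MapReduce, the usual way to get the
split points right is to let HFileOutputFormat2 configure the partitioner from
the live table. A sketch (connection/job setup omitted; "conf" and "tableName"
are assumed to exist, and the surrounding method throws IOException):

  // imports: org.apache.hadoop.mapreduce.Job, org.apache.hadoop.hbase.client.*,
  //          org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
  Job job = Job.getInstance(conf, "prepare-hfiles");
  try (Connection conn = ConnectionFactory.createConnection(conf);
       Table table = conn.getTable(tableName);
       RegionLocator locator = conn.getRegionLocator(tableName)) {
    // Sets the TotalOrderPartitioner to the table's *current* region boundaries,
    // so the generated HFiles line up with the regions at load time.
    HFileOutputFormat2.configureIncrementalLoad(job, table, locator);
  }
  // After the job finishes, load the output directory, e.g. (class moved to the
  // ...tool... package in 2.x):
  //   hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <hfile-dir> <table>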


On 3/27/18 9:15 AM, Jone Zhang wrote:

Does HBase bulkload support Incremental data?
How does it work if the incremental data key-range overlap with the data
already exists?

Thanks



[ANNOUNCE] HBaseCon 2018 CFP now open!

2018-03-24 Thread Josh Elser

All,

I'm pleased to announce HBaseCon 2018 which is to be held in San Jose, 
CA on June 18th.


A call for proposals is available now[1], and we encourage all HBase 
users and developers to contribute a talk and plan to attend the event 
(however, event registration is not yet available).


Please find all available details for the event at [2], and feel free to 
ask the d...@hbase.apache.org mailing list or myself any questions.


Thanks and start planning those talks!

- Josh (on behalf of the HBase PMC)

[1] https://easychair.org/conferences/?conf=hbasecon2018
[2] https://hbase.apache.org/hbasecon-2018/


Re: Understanding log message

2018-03-09 Thread Josh Elser
There was an HBase RPC connection from a client at the host (identified 
by the IP:port you redacted). IIRC, the "read count=-1" is essentially 
saying that the server tried to read from the socket and read no data 
which means that the client has hung up. There were 33 other outstanding 
HBase RPC connections after this connection was closed.


It has nothing to do with ZooKeeper.

On 3/9/18 4:19 PM, A D wrote:

All,

I am trying to understand the below message that I see in RS logs:

DEBUG org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.listener,port=16020: 
DISCONNECTING client x.x.x.x:x because read count=-1. Number of active 
connections: 33

Is this disconnect because of ZooKeeper connections limit maxing out?
Is "Number of active connections: 33" message is just information?

Thank you,
AD
ask...@live.com



Re: Region not initializing in 2.0.0-beta-1

2018-03-01 Thread Josh Elser
Yeah, definitely watch out what version you built HBase using as that 
affects the Hadoop jars you get on the classpath. I think I've seen 
issues using Hadoop 2.7, 2.8, and 3.0 jars with the wrong "server-side" 
version. One of those cases where you want to get the exact version 
lining up :)


On 3/1/18 7:17 AM, sahil aggarwal wrote:

Upgrading to hadoop 2.7 worked. Now i have cluster up :)

Will try fixing the asynchbase.

Thanks guys.

On 1 March 2018 at 16:42, sahil aggarwal  wrote:


Past that error now after moving to branch-2; getting the following this time.
Looks like it doesn't work with Hadoop 2.6; it should work with 2.7 I guess, as
it is built with the same.

  ERROR [regionserver/perf-cosmos-hdn-a-364357:16020] regions
erver.HRegionServer: * ABORTING region server perf-cosmos-hdn-a-364357,
16020,15
19901697928: Unhandled: org/apache/hadoop/fs/StorageType *
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StorageType
 at org.apache.hadoop.hbase.io.asyncfs.AsyncFSOutputHelper.
createOutput(Asyn
cFSOutputHelper.java:51)
 at org.apache.hadoop.hbase.regionserver.wal.
AsyncProtobufLogWriter.initOutp
ut(AsyncProtobufLogWriter.java:168)
 at org.apache.hadoop.hbase.regionserver.wal.
AbstractProtobufLogWriter.init(
AbstractProtobufLogWriter.java:167)
 at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.
createAsyncWriter(AsyncFS
WALProvider.java:99)
 at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.
createWriterInstance
(AsyncFSWAL.java:612)
 at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.
createWriterInstance
(AsyncFSWAL.java:124)
 at org.apache.hadoop.hbase.regionserver.wal.
AbstractFSWAL.rollWriter(Abstra
ctFSWAL.java:759)
 at org.apache.hadoop.hbase.regionserver.wal.
AbstractFSWAL.rollWriter(Abstra
ctFSWAL.java:489)
 at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.<
init>(AsyncFSWAL.ja
va:251)
 at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createWAL(
AsyncFSWALProvi
der.java:69)
 at org.apache.hadoop.hbase.wal.AsyncFSWALProvider.createWAL(
AsyncFSWALProvi
der.java:44)
 at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(
AbstractFSWALPr
ovider.java:138)
 at org.apache.hadoop.hbase.wal.AbstractFSWALProvider.getWAL(
AbstractFSWALPr
ovider.java:57)
 at org.apache.hadoop.hbase.wal.WALFactory.getWAL(WALFactory.
java:252)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.
getWAL(HRegionServer.
java:2074)
 at org.apache.hadoop.hbase.regionserver.HRegionServer.
buildServerLoad(HRegi
onServer.java:1300)


On 1 March 2018 at 06:18, stack  wrote:


Thank you for the offer of help. Just trying it and talking out loud when
there are problems is a great help. Please try the tip of branch-2. It should
be beta-2 soon.

Tsdb won't work against hbase2. One of us needs to fix asynchbase to do a
reverse scan instead of closestBefore. I filed an issue a while back but
they are too busy over there. Having tsdb working would be a great help in
testing hbase2.

Thanks for help sahil,
S



On Feb 28, 2018 8:17 AM, "sahil aggarwal"  wrote:

Stack,

It didn't work with zookeeper.version=3.4.10 too.

If that's the case then I will try what Ted suggested, i.e. trying out a 2.0
SNAPSHOT.

Moreover, while I am at it, can I help you guys with testing anything else
that you may have in mind, or any other grunt work to give you more room to
work on the stable release?

Otherwise I was just gonna do the perf test of off heap block cache with
openTSDB.


Thanks,
Sahil

On 28 February 2018 at 20:59, Stack  wrote:


Any progress Sahil?

There was an issue fixed where we'd write the clusterid with the server zk
client but then would have trouble picking it up with the new zk read-only
client seen in tests, fixed subsequent to beta-1. This looks like it.

Thanks for trying the beta.

S




On Thu, Feb 22, 2018 at 11:39 PM, sahil aggarwal wrote:

It doesn't have hbase-shaded-client in classpath. I realized my build was run
with -Dzookeeper.version=3.4.6 but in pom we have 3.4.10. I am gonna try
rebuilding it with 3.4.10.

On 23 February 2018 at 00:29, Josh Elser  wrote:


This sounds like something I've seen in the past but was unable to get past. I
think I was seeing it when the hbase-shaded-client was on the classpath. Could
you see if the presence of that artifact makes a difference one way or another?


On 2/22/18 12:52 PM, sahil aggarwal wrote:


Yes, it is a clean setup.

Here are logs on region startup

2018-02-22 22:17:22,259 DEBUG [main] zookeeper.ClientCnxn:
zookeeper.disableAutoWatchReset is false
2018-02-22 22:17:22,401 INFO  [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Opening socket connection to server

perf-zk-1/

10.33.225.67:2181. Will not attempt to authenticate using SASL

(unknown

error)
2018-02-22 22:17:22,407 INFO  [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Socket connection estab

Re: Region not initializing in 2.0.0-beta-1

2018-02-22 Thread Josh Elser
This sounds like something I've seen in the past but was unable to get 
past. I think I was seeing it when the hbase-shaded-client was on the 
classpath. Could you see if the presence of that artifact makes a 
difference one way or another?


On 2/22/18 12:52 PM, sahil aggarwal wrote:

Yes, it is a clean setup.

Here are logs on region startup

2018-02-22 22:17:22,259 DEBUG [main] zookeeper.ClientCnxn:
zookeeper.disableAutoWatchReset is false
2018-02-22 22:17:22,401 INFO  [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Opening socket connection to server perf-zk-1/
10.33.225.67:2181. Will not attempt to authenticate using SASL (unknown
error)
2018-02-22 22:17:22,407 INFO  [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Socket connection established to perf-zk-1/
10.33.225.67:2181, initiating session
2018-02-22 22:17:22,409 DEBUG [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Session establishment request sent on perf-zk-1/
10.33.225.67:2181
2018-02-22 22:17:22,415 INFO  [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Session establishment complete on server perf-zk-1/
10.33.225.67:2181, sessionid = 0x36146d5de4467de, negotiated timeout = 2
2018-02-22 22:17:22,423 DEBUG [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, packet::
clientPath:null serverPath:null finished:false header:: 1,3  replyHeader::
1,111751355931,0  request:: '/hbase-unsecure2/master,T  response::
s{111750564873,111750564873,1519309851875,1519309851875,0,0,0,171496145189271239,74,0,111750564873}
2018-02-22 22:17:22,426 DEBUG [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, packet::
clientPath:null serverPath:null finished:false header:: 2,4  replyHeader::
2,111751355931,0  request:: '/hbase-unsecure2/master,T  response::
#000146d61737465723a36303030304c11fff11646ffe12effd450425546a25a18706572662d636f736d6f732d686e6e2d612d33363433363010ffe0ffd4318ff9fff88ffb2ffefff9b2c10018ffeaffd43,s{111750564873,111750564873,1519309851875,1519309851875,0,0,0,171496145189271239,74,0,111750564873}
2018-02-22 22:17:22,428 DEBUG [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, packet::
clientPath:null serverPath:null finished:false header:: 3,3  replyHeader::
3,111751355931,0  request:: '/hbase-unsecure2/running,T  response::
s{111750565002,111750565002,1519309853317,1519309853317,0,0,0,0,59,0,111750565002}
2018-02-22 22:17:22,430 DEBUG [main-SendThread(perf-zk-1:2181)]
zookeeper.ClientCnxn: Reading reply sessionid:0x36146d5de4467de, packet::
clientPath:null serverPath:null finished:false header:: 4,4  replyHeader::
4,111751355931,0  request:: '/hbase-unsecure2/running,T  response::
#000146d61737465723a363030303021ffea7f3eff8a28576450425546a1c546875204665622032322032303a30303a3533204953542032303138,s{111750565002,111750565002,1519309853317,1519309853317,0,0,0,0,59,0,111750565002}
2018-02-22 22:17:22,459 DEBUG [main] ipc.RpcExecutor: Started 0
default.FPBQ.Fifo handlers, qsize=10 on port=16020
2018-02-22 22:17:22,475 DEBUG [main] ipc.RpcExecutor: Started 0
priority.FPBQ.Fifo handlers, qsize=2 on port=16020
2018-02-22 22:17:22,478 DEBUG [main] ipc.RpcExecutor: Started 0
replication.FPBQ.Fifo handlers, qsize=1 on port=16020
2018-02-22 22:17:22,524 INFO  [main] util.log: Logging initialized @3325ms
2018-02-22 22:17:22,625 INFO  [main] http.HttpRequestLog: Http request log
for http.requests.regionserver is not defined
2018-02-22 22:17:22,651 INFO  [main] http.HttpServer: Added global filter
'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2018-02-22 22:17:22,651 INFO  [main] http.HttpServer: Added global filter
'clickjackingprevention'
(class=org.apache.hadoop.hbase.http.ClickjackingPreventionFilter)
2018-02-22 22:17:22,654 INFO  [main] http.HttpServer: Added filter
static_user_filter
(class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter)
to context regionserver
2018-02-22 22:17:22,654 INFO  [main] http.HttpServer: Added filter
static_user_filter
(class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter)
to context static
2018-02-22 22:17:22,654 INFO  [main] http.HttpServer: Added filter
static_user_filter
(class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter)
to context logs
2018-02-22 22:17:22,691 INFO  [main] http.HttpServer: Jetty bound to port
60030
2018-02-22 22:17:22,693 INFO  [main] server.Server: jetty-9.3.19.v20170502
2018-02-22 22:17:22,765 INFO  [main] handler.ContextHandler: Started
o.e.j.s.ServletContextHandler@7435a578
{/logs,file:///var/log/hbase/,AVAILABLE}
2018-02-22 22:17:22,765 INFO  [main] handler.ContextHandler: Started
o.e.j.s.ServletContextHandler@13047d7d
{/static,file:///usr/lib/hbase/hbase-webapps/static/,AVAILABLE}
2018-02-22 22:17:22,912 INFO  [main] handler.ContextHandler: Started
o.e.j.w.

[ANNOUNCE] Apache Phoenix 5.0.0-alpha released

2018-02-15 Thread Josh Elser
The Apache Phoenix PMC is happy to announce the release of Phoenix 
5.0.0-alpha for Apache Hadoop 3 and Apache HBase 2.0. The release is 
available for download at here[1].


Apache Phoenix enables OLTP and operational analytics in Hadoop for low 
latency applications by combining the power of standard SQL and JDBC 
APIs with full ACID transaction capabilities as well as the flexibility 
of late-bound, schema-on-read capabilities provided by HBase.


This is a "preview" release of Apache Phoenix 5.0.0. This release is 
specifically designed for users who want to use the newest versions of 
Hadoop and HBase while the quality of Phoenix is still incubating. This 
release should be of sufficient quality for most users, but it is not of 
the same quality as most Phoenix releases.


Please refer to the release notes[2] of this release for a full list of 
known issues. The Phoenix developers would be extremely receptive to any 
and all that use this release and report any issues as this will 
directly increase the quality of the 5.0.0 release.


-- The Phoenix PMC

[1] https://phoenix.apache.org/download.html
[2] https://phoenix.apache.org/release_notes.html


Re: Inconsistent rows exported/counted when looking at a set, unchanged past time frame.

2018-02-12 Thread Josh Elser

Hi Andrew,

Yes. The answer is, of course, that you should see consistent results 
from HBase if there are no mutations in flight to that table. Whether 
you're reading "current" or "back-in-time", as long as you're not 
dealing with raw scans (where compactions may persist delete 
tombstones), this should hold just the same.


Are you modifying older cells with newer data when you insert data? 
Remember that MAX_VERSIONS for a table defaults to 1. Consider the 
following:


* Timestamps are of the form "tX", and t1 < t2 < t3 < ..
* You are querying from the time range: [t1, t5].
* You have a cell for "row1" with at t3 with value "foo".
* RowCounter over [t1, t5] would return "1"
* Your ingest writes a new cell for "row1" of "bar" at t6.
* RowCounter over [t1, t5] would return "0" normally, or "1" if you use 
RAW scans ***

* A compaction would run over the region containing "row1"
* RowCounter over [t1, t5] would return "0" (RAW or normal)
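
If you want to see that behaviour for a single row, a quick sketch (T1/T5 and
the "table" handle are placeholders, and the surrounding method throws
IOException):

  // imports: org.apache.hadoop.hbase.client.*, org.apache.hadoop.hbase.util.Bytes
  Scan scan = new Scan(Bytes.toBytes("row1"),
                       Bytes.add(Bytes.toBytes("row1"), new byte[] {0})); // just the one row
  scan.setTimeRange(T1, T5 + 1);  // TimeRange is [min, max)
  scan.setRaw(true);              // also surface cells shadowed by newer versions or deletes
  scan.setMaxVersions();          // lift the 1-version limit for the RAW view
  try (ResultScanner rs = table.getScanner(scan)) {
    Result r = rs.next();         // empty once a compaction has removed the old cell
  }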

It's also possible that you're hitting some sort of bug around missing 
records at query time. I'm not sure what the CDH versions you're using 
line up to, but there have certainly been issues in the past around 
query-time data loss (e.g. scans on RegionServers stop prematurely 
before all of the data is read).


Good luck!

*** Going off of memory here. I think this is how it works, but you 
should be able to test easily ;)


On 2/9/18 5:30 PM, Andrew Kettmann wrote:

A simpler question would be this:

Given:


   *   a set timeframe in the past (2-3 days roughly a year ago)
   *   we are NOT removing records from the table at all
   *   We ARE inserting into this table actively

Should I expect two consecutive runs of the rowcounter mapreduce job to return 
an identical number?


Andrew Kettmann
Consultant, Platform Services Group

From: Andrew Kettmann
Sent: Thursday, February 08, 2018 11:35 AM
To: user@hbase.apache.org
Subject: Inconsistent rows exported/counted when looking at a set, unchanged 
past time frame.

First the version details:

Running HBASE/Yarn/HDFS using Cloudera manager 5.12.1.
Hbase: Version 1.2.0-cdh5.8.0
HDFS/YARN: Hadoop 2.6.0-cdh5.8.0
Hbck and hdfs fsck return healthy

15 nodes, sized down recently from 30 (other service requirements reduced. 
Solr, etc)


The simplest example of the inconsistency is using rowcounter. If I run the 
same mapreduce job twice in a row, I get different counts:

hbase org.apache.hadoop.hbase.mapreduce.Driver rowcounter 
-Dmapreduce.map.speculative=false TABLENAME --starttime=148590720 
--endtime=148605840

Looking at 
org.apache.hadoop.hbase.mapreduce.RowCounter$RowCounterMapper$Counters:
Run 1: 4876683
Run 2: 4866351

Similarly with exports of the same date/time. Consecutive runs of the export 
get different results:
hbase org.apache.hadoop.hbase.mapreduce.Export \
-Dmapred.map.tasks.speculative.execution=false \
-Dmapred.reduce.tasks.speculative.execution=false \
TABLENAME \
HDFSPATH 1 148590720 148605840

 From Map Input/output records:
Run 1: 4296778
Run 2: 4297307

None of the results show anything for spilled records, no failed maps. 
Sometimes the row count increases, sometimes it decreases. We aren’t using any 
row filter queries, we just want to export chunks of the data for a specific 
time range. This table is actively being read/written to, but I am asking about 
a date range in early 2017 in this case, so that should have no impact I would 
have thought. Another point is that the rowcount job and the export return 
ridiculously different numbers. There should be no older versions of rows 
involved as we are set to only keep the newest, and I can confirm that there 
are rows that are consistently missing from the exports. Table definition is 
below.

hbase(main):001:0> describe 'TABLENAME'
Table TABLENAME is ENABLED
TABLENAME
COLUMN FAMILIES DESCRIPTION
{NAME => 'text', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'ROW', REPLICATION_SCOPE => 
'0', COMPRESSION => 'SNAPPY', VERSIONS => '1', MIN_VERSIONS => '0', TTL => 'FOREVER', 
KEEP_DELETED_CELLS => 'FALSE', BLO
CKSIZE => '65536', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
1 row(s) in 0.2800 seconds

Any advice/suggestions would be greatly appreciated, are some of my assumptions 
wrong regarding import/export and that it should be consistent given consistent 
date/times?


Andrew Kettmann
Platform Services Group



Re: HBase Thrift - HTTP - Kerberos & SPNEGO

2018-01-11 Thread Josh Elser

Hey Kevin!

Looks like you got some good changes in here.

IMO, the HBase Thrift2 "implementation" makes more sense to me (I'm sure 
there was a reason for having HTTP be involved at one point, but Thrift 
today has the ability to do all of this RPC work for us). I'm not sure 
what the HBase API implementations look like between the two versions.


If you'd like to open up a JIRA and throw up a patch, you'd definitely 
have my attention if no one else's :)


On 1/11/18 9:31 AM, Kevin Risden wrote:

I'm not 100% sure this should be posted to user list, but starting here
before dev list/JIRA.

I've been working on setting up the Hue HBase and it requires HBase Thrift
v1 server. To support impersonation/proxyuser, the documentation states
that this must be done with HTTP and not binary mode. The cluster has
Kerberos and so the final setup ends up being HBase Thrift in HTTP mode
with Kerberos.

While setting up the HBase Thrift server with HTTP, there were a
significant amount of 401 errors where the HBase Thrift wasn't able to
handle the incoming Kerberos request. Documentation online is sparse when
it comes to setting up the principal/keytab for HTTP Kerberos.

I noticed that the HBase Thrift HTTP implementation was missing SPNEGO
principal/keytab like other Thrift based servers (HiveServer2). It looks
like HiveServer2 Thrift implementation and HBase Thrift v1 implementation
were very close to the same at one point. I made the following changes to
HBase Thrift v1 server implementation to make it work:
* add SPNEGO principal/keytab if in HTTP mode
* return 401 immediately if no authorization header instead of waiting for
try/catch down in program flow

The code changes are available here:
https://github.com/risdenk/hortonworks-hbase-release/compare/HDP-2.5.3.126-base...fix_hbase_thrift_spnego

Does this seem like the right approach?

The same types of changes should apply to master as well. If this looks
reasonable, I can create a JIRA and generate patch against Apache HBase
master.

Side note: I saw the notes about HBase Thrift v1 was meant to go away at
some point but looks like it is still being depended on.

Kevin Risden



Re: Not able to use HBaseTestingUtility with CDH 5.7

2018-01-05 Thread Josh Elser
There is no such artifact with the groupId & artifactId 
org.apache.hbase:hbase for Apache HBase. I assume would be the same for CDH.


You need the test jar from hbase-server if you want the 
HBaseTestingUtility class.
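
Concretely, the dependency would look something like this (the version property
is a placeholder for whatever CDH/HBase version you target):

  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>${hbase.version}</version>
    <type>test-jar</type>
    <scope>test</scope>
  </dependency>

You will typically also want the regular hbase-server and hbase-client jars on
the classpath alongside it.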


On 1/5/18 10:23 AM, Debraj Manna wrote:

Cross posting from


- stackoverflow


- cloudera forum




I am trying to use HBaseTestingUtility with CDH 5.7 as mentioned in the
below blog and github

-

http://blog.cloudera.com/blog/2013/09/how-to-test-hbase-applications-using-popular-tools/
- https://github.com/sitaula/HBaseTest

I have modified my pom.xml for CDH 5.7 like below

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>HBaseTest</groupId>
  <artifactId>Test</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <description>Test Project</description>
  <repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
  </repositories>
  <properties>
    <hadoop.version>2.6.0-cdh5.7.1</hadoop.version>
    <hbase.version>1.2.0-cdh5.7.0</hbase.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>2.6.0-mr1-cdh5.7.0.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <version>${hbase.version}</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
      <scope>compile</scope>
    </dependency>
    <dependency>
      <groupId>org.mockito</groupId>
      <artifactId>mockito-all</artifactId>
      <version>1.9.5</version>
    </dependency>
    <dependency>
      <groupId>org.apache.mrunit</groupId>
      <artifactId>mrunit</artifactId>
      <version>0.9.0-incubating</version>
      <classifier>hadoop2</classifier>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase</artifactId>
      <version>${hbase.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>${hadoop.version}</version>
      <type>test-jar</type>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-hdfs</artifactId>
      <version>${hadoop.version}</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>

But on trying to do mvn clean install it is failing with the below error

[ERROR] Failed to execute goal on project Test: Could not resolve
dependencies for project HBaseTest:Test:jar:0.0.1-SNAPSHOT: The
following artifacts could not be resolved:
org.apache.hbase:hbase:jar:1.2.0-cdh5.7.0,
org.apache.hbase:hbase:jar:tests:1.2.0-cdh5.7.0: Failure to find
org.apache.hbase:hbase:jar:1.2.0-cdh5.7.0

Can someone let me know what is going wrong?



Re: WAL Fsync - how do you live without it

2017-12-12 Thread Josh Elser
Redundant power supplies are probably your next-best bet for running 
without fsync (hsync). Something that can prevent a node from going down 
hard will mitigate this issue for the most part.


The importance of this is often a multi-variable equation. The small 
chance for data loss that exists with hflush can be a non-issue for some 
(e.g. ability to replay original data, acceptability of minor data loss, 
knowledge to rebuild hbase:meta as required)


On 12/7/17 6:37 PM, Daniel Połaczański wrote:

Hi,
fsync mode on WAL is not supported currently. So theoretically it is
possible that in case of power failure the data is lost because DataNodes
didn't flush it to physical disk.

I know that probability is small and data center should have additional
power source, but still it is possible. How do you deal with it? Do you
care about it? How to check after the disaster later from which moment the
data is lost? How often DataNodes flush buffers to disk?
What is your solution?

Regards



Re: Running major compactions in controlled manner

2017-11-02 Thread Josh Elser

Thanks for sharing, Sahil.

A couple of thoughts at a glance:

* You should add a LICENSE to your project so people know how they can 
(re)use your project.
* You have a dependency against 1.0.3, and I see at least one thing that 
will not work against 2.0.0. Would be great if you wrote up what 
versions you tested it against and expect it to function.


On 11/2/17 4:32 AM, sahil aggarwal wrote:

Hi,

Sharing a small utility I wrote to run major compactions in a tightly
controlled manner to delete data. It has proved to be very helpful so far in
compacting a cluster having around ~60k regions without impacting prod load.

Please do share if you have anything better :)

Thanks,
Sahil



Re: How to stream data out of hbase

2017-10-24 Thread Josh Elser
The most reliably way (read-as, likely to continue working across HBase 
releases) would probably be to implement a custom ReplicationEndpoint.


This would abstract away the logic behind "tail'ing of WALs" and give 
you some nicer APIs to leverage. Beware that this would still be a 
rather significant undertaking that would likely require you to dig into 
HBase internals to get correct.


On 10/24/17 4:02 PM, yeshwanth kumar wrote:

Hi

I am searching for a way to stream data from HBase.
One way to do it is with filters, but then I need to query HBase continuously.
Another way is to read directly from the WAL (I am searching for sample code,
and I found the WALReader and WAL.Entry APIs; can I use them directly without
any side effects?).

can anyone suggest me a good way to stream data out of hbase, as the write
happens, i want the same data to be pushed to another data source.
please let me know


-Yeshwanth
Can you Imagine what I would do if I could do all I can - Art of War



Re: Configuring HBASE in HDP with version HBASE-1.2.6

2017-09-19 Thread Josh Elser

Lalit,

Typically, questions about "vendor products" are best reserved for their 
respective forums. This question is not relevant to the Apache HBase 
community.


Please consider asking your question on 
https://community.hortonworks.com/ instead.


- Josh

On 9/19/17 2:17 AM, Lalit Jadhav wrote:

Hi,

  I am using *HDP-2.4* with *HBASE-1.1.2*. My question is

1. Can I configure *HBASE-1.2.6* in HDP also can I use Ambari UI for
monitoring.
2. If not I need a UI monitor. Please Suggest any.



Re: Offheap config question for Hbase 1.1.2

2017-09-12 Thread Josh Elser
FWIW, last I looked into this, 
https://issues.apache.org/jira/browse/HBASE-15154 would be the long-term 
solution to the Master also requiring the MaxDirectMemorySize 
configuration (even when it is not acting as a RegionServer).


Obviously, it's a lower priority fix as there is a simple workaround.
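
For anyone hitting this, the workaround is just the JVM flag in hbase-env.sh
(sizes below are placeholders; size them to your bucket cache):

  # hbase-env.sh
  export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize=8g"
  # and, until HBASE-15154 lands, the Master needs it too even when it hosts no regions
  export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:MaxDirectMemorySize=1g"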

On 9/12/17 12:50 PM, Arul Ramachandran wrote:

Ted,

For #1, I was following:

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_data-access/content/ch_managing-hbase.html#ref-db219cd6-c586-49c1-bc56-c9c1c5475276.1

++

For #2, I got the error "Caused by: java.lang.OutOfMemoryError: Direct
buffer memory" when starting hbase master.

https://community.hortonworks.com/questions/129825/hbase-bucket-cache-causing-hmaster-to-go-down.html

describes the fix.

++

On Tue, Sep 12, 2017 at 9:10 AM, Ted Yu  wrote:


Looks like the config you meant should be hbase.bucketcache.size

As the refguide says:

A float that EITHER represents a percentage of total heap memory size to
give to the cache (if < 1.0) OR, it is the total capacity in megabytes of
BucketCache. Default: 0.0

If you specify the size as capacity, -XX:MaxDirectMemorySize should be
bigger than the capacity.

For #2, did you encounter some error ?

Cheers

On Tue, Sep 12, 2017 at 8:52 AM, Arul Ramachandran 
wrote:


In HBase 1.1.2, I am setting up bucket cache. I set MaxDirectMemory size
greater than hbase.bucket.cache.size  - only then it would work.

1) Does HBASE_REGIONSERVER_OPTS -XX:MaxDirectMemorySize needs to be

greater

than hbase.bucket.cache.size?
2) It seems with hbase 1.1.2, HBASE_MASTER_OPTS also needs the
  -XX:MaxDirectMemorySize setting?

IIRC, in Hbase 0.98, I had to set -XX:MaxDirectMemorySize less than
hbase.bucket.cache.size --and-- I did not have to set
  -XX:MaxDirectMemorySize for HBASE_MASTER_OPTS.


Thanks,
Arul







Re: Fast search by any column

2017-08-31 Thread Josh Elser

Well put, Dave! And yes, same for Phoenix.

HBase provides exactly one "index" (on rowkey). Thus it's the 
application's responsibility to build other indexes to support different 
kinds of lookups. This is exactly what projects like Trafodion and 
Phoenix (not to mention others) do.
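
As a concrete (simplified) illustration of a hand-rolled index, assuming
hypothetical tables "users" and "users_by_email" (opened as the Table handles
usersTable and indexTable) and a family "cf":

  byte[] cf = Bytes.toBytes("cf");
  byte[] userRow = Bytes.toBytes("user-42");

  // 1. Write the data row.
  Put data = new Put(userRow);
  data.addColumn(cf, Bytes.toBytes("email"), Bytes.toBytes("alice@example.com"));
  usersTable.put(data);

  // 2. Write an index row: the indexed value is the key, the data row key is the value.
  Put index = new Put(Bytes.toBytes("alice@example.com"));
  index.addColumn(cf, Bytes.toBytes("row"), userRow);
  indexTable.put(index);

  // 3. "Search by email" is then a point Get on the index table followed by a Get on
  //    the data table: two indexed lookups instead of a full table scan.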


On 8/30/17 2:09 PM, Dave Birdsall wrote:

Trafodion (and I think Phoenix also) both support secondary indexes. So you can 
create indexes on any attribute that you wish to search upon.

Trafodion in addition allows one to pack multiple columns (logically speaking) 
into the HBase key. It also has a feature that allows intelligent use of 
multiple-column indexes. For example, if I have an index on STATE, CITY, and I 
have a predicate of the form CITY = 'St. Louis' (without a predicate on STATE), 
Trafodion can implicitly materialize the distinct STATE values efficiently and 
access CITY = 'St. Louis' directly using that index. So one does not have to 
create as many indexes as one might otherwise. This feature is useful when the 
table continues to grow or is updated; fewer indexes requires less overhead for 
index maintenance. But if the data set is static, you may as well just create 
indexes until your heart is content (space permitting of course).

-Original Message-
From: Andrzej [mailto:borucki_andr...@wp.pl]
Sent: Wednesday, August 30, 2017 11:03 AM
To: user@hbase.apache.org
Subject: Re: Fast search by any column

On 30.08.2017 at 19:54, Dave Birdsall wrote:

As Josh Elser mentioned, you might try Apache Phoenix.
You could try any SQL-on-HBase solution, actually. Apache Trafodion 
(incubating) is another example.


As I understand it, Apache Phoenix and Apache Trafodion are higher layers than
HBase and both use HBase. How can they search fast, since HBase does not allow
this?



Re: Need help with Row Key design

2017-08-30 Thread Josh Elser

You may find Apache Phoenix to be of use as you explore your requirements.

Phoenix provides a much higher-level API which provides logic to build 
composite rowkeys (e.g. primary key constraints over multiple columns) 
for you automatically. This would help you iterate much faster as you 
better understand the storage and query requirements of your application.
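
To make that concrete, a hand-built composite rowkey in plain HBase (outside of
Phoenix) is just a byte concatenation of the filter fields you query by most
often; a rough sketch with entirely made-up fields and delimiters:

  // imports: org.apache.hadoop.hbase.util.Bytes
  // Most-selective / always-present fields go first; the key is only useful for
  // queries that constrain a leading prefix of it.
  byte[] rowKey = Bytes.add(
      Bytes.toBytes("tenantA" + "\u0000"),            // field 1 (always known at query time)
      Bytes.toBytes("deviceX" + "\u0000"),            // field 2
      Bytes.toBytes(Long.MAX_VALUE - eventTimeMillis)); // field 3: reversed time, newest first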


On 8/30/17 8:21 AM, deepaksharma25 wrote:

Hello,
I am new to HBase and currently evaluating it for one of the requirements
we have from a customer.
We are going to write TBs of data into HBase daily and we need to fetch
specific data based on filters.
  
I came to know that it is very important to design the row key in such a
manner that HBase can effectively use it to fetch the data from the specific
node instead of scanning through all the records in the database, based on the
type of row key we design.
  
The problem with our requirement is that we don't have any specific field
which can be used to define the rowkey. We have around 7-8 fields available
on the frontend, which can be used to filter the records from HBase.

Can you please suggest what the design of my row key should be, which will
help in faster retrieval of the data from TBs of data?
Attaching here the sample screen I am referring in this
 .
  
Thanks,

Deepak Sharma



--
Sent from: http://apache-hbase.679495.n3.nabble.com/HBase-User-f4020416.html


