Some people can probably shed more light. I recall no HBase built-in
construct to limit or filter the audit logs, and I agree they can be very
verbose depending on the access operations.
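One common workaround is not an HBase construct at all, just log4j plumbing in conf/log4j.properties: the audit entries go through the SecurityLogger.* category, which the stock config routes to the SecurityAuth.audit appender, and the stock file ships a commented-out TRACE line for AccessController. A sketch, assuming the stock layout (exact keys and levels vary by version/distribution):

```properties
# SecurityLogger.* is routed to the SecurityAuth.audit appender by the
# stock log4j.properties (via ${hbase.security.logger}).
log4j.category.SecurityLogger=${hbase.security.logger}

# Per-operation audit entries from AccessController come from this logger;
# dialing it back from TRACE (or leaving the stock TRACE line commented out)
# drops the per-put/per-get flood.
log4j.logger.SecurityLogger.org.apache.hadoop.hbase.security.access.AccessController=INFO
```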
We can propose two approaches to assess and fix the issues:
1. Some low-hanging cleanup and simplification of
Hi,
Just to give some background: I recently enabled audit logs in HBase, and I
can see that all operations done in HBase are being logged in
SecurityAuth.audit. Our application runs as the hbase user, and every put
and get done by this user is also being audited. Due to this, the audit logs
are flooding
I'll paste a thread dump later, writing this from my phone :-)
So the same issue has happened at different times for different regions,
but I couldn't see that the region in question was the one being compacted,
either this time or earlier. Although I might have missed an earlier
correlation in
I think it is for HBase itself. But I'll have to wait for more details, as
they haven't shared the source code with us. I imagine they want to do a
bunch more testing and other process stuff.
Saad
On Wed, Feb 28, 2018 at 9:45 PM Ted Yu wrote:
> Did the vendor say
Did the vendor say whether the patch is for hbase or some other component?
Thanks
On Wed, Feb 28, 2018 at 6:33 PM, Saad Mufti wrote:
> Thanks for the feedback, so you guys are right the bucket cache is getting
> disabled due to too many I/O errors from the underlying
One additional data point: I tried to manually re-assign the region in
question from the shell; for some reason that caused the region server to
restart, and the region did get assigned to another region server. But then
the problem moved to that region server almost immediately.
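For reference, a manual reassignment from the hbase shell looks like the session below. The encoded region name and target server are placeholders; the target is 'host,port,startcode' and can be omitted to let the master pick a destination:

```
hbase(main):001:0> # move takes the region's encoded name plus an optional
hbase(main):002:0> # target server; omit the target and the master chooses.
hbase(main):003:0> move 'ENCODED_REGIONNAME', 'example-host,16020,1519900000000'
```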
Does that just
bq. timing out trying to obtain write locks on rows in that region.
Can you confirm that the region under contention was the one being major
compacted?
Can you pastebin a thread dump so that we can get a better idea of the
scenario?
For the region being compacted, how long would the compaction
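The failure mode quoted above is essentially a bounded wait on a per-row lock. A toy illustration with plain java.util.concurrent, nothing HBase-specific; the 500 ms bound here just stands in for the server's row-lock wait:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class RowLockTimeoutDemo {
    public static void main(String[] args) throws Exception {
        ReentrantLock rowLock = new ReentrantLock();

        // A long-running holder (think: a stuck or slow write pinning the row).
        Thread holder = new Thread(() -> {
            rowLock.lock();
            try {
                TimeUnit.SECONDS.sleep(2);
            } catch (InterruptedException ignored) {
            } finally {
                rowLock.unlock();
            }
        });
        holder.start();
        TimeUnit.MILLISECONDS.sleep(100); // let the holder grab the lock first

        // A second writer waits a bounded time, then gives up with a timeout,
        // which is what the client sees as the failed-to-obtain-lock error.
        boolean acquired = rowLock.tryLock(500, TimeUnit.MILLISECONDS);
        System.out.println(acquired ? "lock acquired"
                                    : "timed out waiting for row lock");
        if (acquired) {
            rowLock.unlock();
        }
        holder.join();
    }
}
```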
Thanks, see my other reply. We have a patch from the vendor, but until it
gets promoted to open source we still don't know the real underlying cause.
You're right, though: the cache got disabled due to too many I/O errors in a
short timespan.
Cheers.
Saad
On Mon, Feb 26, 2018 at 12:24 AM,
Thanks for the feedback; you guys are right, the bucket cache is getting
disabled due to too many I/O errors from the underlying files making up the
bucket cache. We still do not know the exact underlying cause, but we are
working with our vendor to test a patch they provided that seems to have
Hi,
We are running HBase 1.4.0 on Amazon EMR. We are currently seeing a
situation where sometimes a particular region gets into a state where a
lot of write requests to any row in that region time out, saying they failed
to obtain a lock on a row in the region, and eventually they
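The bound being hit there is the server-side row-lock wait, which is configurable in hbase-site.xml (hbase.rowlock.wait.duration, 30000 ms by default in 1.x). A sketch for triage only; raising it just papers over the underlying contention:

```xml
<!-- hbase-site.xml: how long a mutation waits for a row lock before the
     "failed to obtain a lock" timeout fires. 30000 ms is the default;
     shown here only to make the knob visible for triage. -->
<property>
  <name>hbase.rowlock.wait.duration</name>
  <value>30000</value>
</property>
```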
Thank you for the offer of help. Just trying it and talking out loud when
you hit problems is a great help. Please try the tip of branch-2. It should
be beta-2 soon.
Tsdb won't work against hbase2. One of us needs to fix asynchbase to do
reverse scan instead of closestBefore. I filed an issue a while back
Stack,
It didn't work with zookeeper.version=3.4.10 either.
If that's the case, then I will try what Ted suggested, i.e. trying out a
2.0 SNAPSHOT.
Moreover, while I am at it, can I help you guys test anything else you may
have in mind, or do any other grunt work to give you guys
more room
Any progress Sahil?
There was an issue, seen in tests and fixed subsequent to beta-1, where
we'd write the clusterid with the server zk client but then have trouble
picking it up with the new zk read-only client. This looks like it.
Thanks for trying the beta.
S
On Thu, Feb 22, 2018