[jira] [Reopened] (HBASE-23595) HMaster abort when write to meta failed

2021-05-14 Thread Esteban Gutierrez (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez reopened HBASE-23595:
---

> HMaster abort when write to meta failed
> ---
>
> Key: HBASE-23595
> URL: https://issues.apache.org/jira/browse/HBASE-23595
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.2.2
>Reporter: Lijin Bin
>Priority: Major
>
> RegionStateStore
> {code}
>   private void updateRegionLocation(RegionInfo regionInfo, State state, Put 
> put)
>   throws IOException {
> try (Table table = 
> master.getConnection().getTable(TableName.META_TABLE_NAME)) {
>   table.put(put);
> } catch (IOException e) {
>   // TODO: Revist Means that if a server is loaded, then we will 
> abort our host!
>   // In tests we abort the Master!
>   String msg = String.format("FAILED persisting region=%s state=%s",
> regionInfo.getShortNameToLog(), state);
>   LOG.error(msg, e);
>   master.abort(msg, e);
>   throw e;
> }
>   }
> {code}
> When a regionserver carrying meta stops or crashes, and the 
> ServerCrashProcedure has not yet started processing it, writes to meta will 
> fail and abort the master.
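A hedged sketch of one possible mitigation, with stand-in types (the `MetaTable` interface, method names, and retry parameters below are illustrative, not the actual HBase API): retry the meta write with backoff while the meta-carrying server recovers, and only escalate once retries are exhausted.

```java
// Sketch only: stand-in types, not the HBase API. Illustrates retrying a meta
// write during meta reassignment instead of aborting the master on first failure.
import java.io.IOException;

public class MetaWriteRetrySketch {
    interface MetaTable { void put(String row) throws IOException; }

    // Retry the put a bounded number of times with linear backoff before giving up.
    static void updateRegionLocation(MetaTable meta, String row, int maxAttempts,
                                     long backoffMs) throws IOException, InterruptedException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                meta.put(row);
                return;                       // success: no abort needed
            } catch (IOException e) {
                last = e;                     // meta may be moving; wait and retry
                Thread.sleep(backoffMs * attempt);
            }
        }
        throw last;                           // only escalate after retries are exhausted
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fails twice (meta not yet reassigned), then succeeds.
        MetaTable flaky = row -> { if (++calls[0] < 3) throw new IOException("meta unavailable"); };
        updateRegionLocation(flaky, "region-1", 5, 1L);
        System.out.println("attempts=" + calls[0]);
    }
}
```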



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HBASE-19352) Port HADOOP-10379: Protect authentication cookies with the HttpOnly and Secure flags

2020-09-03 Thread Esteban Gutierrez (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-19352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-19352.
---
Fix Version/s: 2.2.6
   2.4.0
   2.3.3
   3.0.0-alpha-1
 Tags: security
   Resolution: Fixed

> Port HADOOP-10379: Protect authentication cookies with the HttpOnly and 
> Secure flags
> 
>
> Key: HBASE-19352
> URL: https://issues.apache.org/jira/browse/HBASE-19352
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>Priority: Major
> Fix For: 3.0.0-alpha-1, 2.3.3, 2.4.0, 2.2.6
>
> Attachments: HBASE-19352.master.v0.patch
>
>
> This came via a security scanner; since we have a fork of HttpServer2 in 
> HBase, we should include the fix there too.





[jira] [Resolved] (HBASE-24041) [regression] Increase RESTServer buffer size back to 64k

2020-03-27 Thread Esteban Gutierrez (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-24041.
---
Fix Version/s: 2.2.5
   2.4.0
   2.3.0
   3.0.0
   Resolution: Fixed

> [regression]  Increase RESTServer buffer size back to 64k
> -
>
> Key: HBASE-24041
> URL: https://issues.apache.org/jira/browse/HBASE-24041
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 3.0.0, 2.2.0, 2.3.0, 2.4.0
>Reporter: Esteban Gutierrez
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 2.4.0, 2.2.5
>
>
> HBASE-14492 is no longer present in our current releases after HBASE-12894. 
> Unfortunately our RESTServer does not extend HttpServer, which means that 
> {{DEFAULT_MAX_HEADER_SIZE}} is not being set and HTTP requests with a very 
> large header can still cause connection issues for clients. A quick fix is 
> simply to add the settings to the {{HttpConfiguration}} configuration object. 
> A long-term solution would be to refactor the services that create an HTTP 
> server and normalize all configuration settings across them.
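A minimal sketch of the 64k guardrail (stand-in class and method names; with Jetty the real fix would presumably be along the lines of setting the request header size on the {{HttpConfiguration}} object):

```java
// Sketch only: illustrates what restoring DEFAULT_MAX_HEADER_SIZE buys.
// Requests whose header section exceeds the configured maximum are rejected
// up front instead of causing connection issues later.
public class HeaderSizeSketch {
    static final int DEFAULT_MAX_HEADER_SIZE = 64 * 1024;  // 64k, as before the regression

    // Accept the request only if its raw header block fits the configured limit.
    static boolean acceptsHeader(byte[] rawHeaders, int maxHeaderSize) {
        return rawHeaders.length <= maxHeaderSize;
    }

    public static void main(String[] args) {
        byte[] small = new byte[16 * 1024];   // ordinary request
        byte[] huge  = new byte[128 * 1024];  // very large header, past the 64k limit
        System.out.println(acceptsHeader(small, DEFAULT_MAX_HEADER_SIZE) + " "
            + acceptsHeader(huge, DEFAULT_MAX_HEADER_SIZE));
    }
}
```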





[jira] [Created] (HBASE-24041) [regression] Increase RESTServer buffer size back to 64k

2020-03-24 Thread Esteban Gutierrez (Jira)
Esteban Gutierrez created HBASE-24041:
-

 Summary: [regression]  Increase RESTServer buffer size back to 64k
 Key: HBASE-24041
 URL: https://issues.apache.org/jira/browse/HBASE-24041
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.2.0, 3.0.0, 2.3.0, 2.4.0
Reporter: Esteban Gutierrez


HBASE-14492 is no longer present in our current releases after HBASE-12894. 
Unfortunately our RESTServer does not extend HttpServer, which means that 
{{DEFAULT_MAX_HEADER_SIZE}} is not being set and HTTP requests with a very 
large header can still cause connection issues for clients. A quick fix is 
simply to add the settings to the {{HttpConfiguration}} configuration object. A 
long-term solution would be to refactor the services that create an HTTP 
server and normalize all configuration settings across them.





[jira] [Created] (HBASE-22926) REST server should return 504 Gateway Timeout Error on scanner timeout

2019-08-26 Thread Esteban Gutierrez (Jira)
Esteban Gutierrez created HBASE-22926:
-

 Summary: REST server should return 504 Gateway Timeout Error on 
scanner timeout
 Key: HBASE-22926
 URL: https://issues.apache.org/jira/browse/HBASE-22926
 Project: HBase
  Issue Type: Bug
  Components: REST
Affects Versions: 2.2.0, 2.1.0, 3.0.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


Currently, when a scanner timeout occurs on the RS side, the client will get a 
RetriesExhaustedException that makes the client fail; however, from the REST 
server's point of view that is just an IOE:

org.apache.hadoop.hbase.rest.ScannerResultGenerator#next
{code}
} else {
Result result = null;
try {
  result = scanner.next();
} catch (UnknownScannerException e) {
  throw new IllegalArgumentException(e);
} catch (TableNotEnabledException tnee) {
  throw new IllegalStateException(tnee);
} catch (TableNotFoundException tnfe) {
  throw new IllegalArgumentException(tnfe);
} catch (IOException e) {
  LOG.error(StringUtils.stringifyException(e));
}
{code}

Now, with that empty result, the REST server will handle this as an HTTP 204 
(No Content) response back to the client:

org.apache.hadoop.hbase.rest.ScannerInstanceResource#get
{code}
...
  Cell value = null;
  try {
value = generator.next();
  } catch (IllegalStateException e) {
...
  } catch (IllegalArgumentException e) {
...
  }
...
if (value == null) {
if (LOG.isTraceEnabled()) {
  LOG.trace("generator exhausted");
}
// respond with 204 (No Content) if an empty cell set would be
// returned
if (count == limit) {
  return Response.noContent().build();
}
break;
{code}

Obviously this is wrong, since a RetriesExhaustedException is most likely due 
to a failure on the RS side. The correct behavior would be to return a 504 
Gateway Timeout error.
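A minimal sketch of the proposed mapping, with stand-in types (the `Scanner` interface and `statusFor` helper below are hypothetical, not the actual REST classes): a RetriesExhaustedException surfaces as 504 rather than collapsing into an empty result.

```java
// Sketch only: stand-in types, not the HBase REST classes. Illustrates surfacing
// a scanner timeout as HTTP 504 instead of swallowing the IOException into a
// null result that later turns into a misleading 204.
import java.io.IOException;

public class ScannerTimeoutSketch {
    // Stand-in for org.apache.hadoop.hbase.client.RetriesExhaustedException.
    static class RetriesExhaustedException extends IOException {
        RetriesExhaustedException(String m) { super(m); }
    }
    interface Scanner { String next() throws IOException; }

    static final int SC_NO_CONTENT = 204;
    static final int SC_GATEWAY_TIMEOUT = 504;

    // Returns the HTTP status the REST layer should send for one next() call.
    static int statusFor(Scanner scanner) {
        try {
            return scanner.next() == null ? SC_NO_CONTENT : 200;
        } catch (RetriesExhaustedException e) {
            return SC_GATEWAY_TIMEOUT;   // server-side timeout, not "no content"
        } catch (IOException e) {
            return 500;                  // other I/O failures are a server error
        }
    }

    public static void main(String[] args) {
        Scanner timedOut = () -> { throw new RetriesExhaustedException("scanner expired"); };
        Scanner empty = () -> null;   // genuinely exhausted scanner
        System.out.println(statusFor(timedOut) + " " + statusFor(empty));
    }
}
```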








[jira] [Created] (HBASE-22253) An AuthenticationTokenSecretManager leader won't step down if another RS claims to be a leader

2019-04-16 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-22253:
-

 Summary: An AuthenticationTokenSecretManager leader won't step 
down if another RS claims to be a leader
 Key: HBASE-22253
 URL: https://issues.apache.org/jira/browse/HBASE-22253
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 2.1.0, 3.0.0, 2.2.0
Reporter: Esteban Gutierrez


We ran into a situation where a rogue Lily HBase Indexer [SEP 
Consumer|https://github.com/NGDATA/hbase-indexer/blob/master/hbase-sep/hbase-sep-impl/src/main/java/com/ngdata/sep/impl/SepConsumer.java#L169]
 sharing the same {{zookeeper.znode.parent}} claimed to be the 
AuthenticationTokenSecretManager leader for an HBase cluster. This situation is 
undesirable since the leader running on the HBase cluster doesn't step down 
when the rogue leader registers in the HBase cluster, and both will start 
rolling keys with the same IDs, causing authentication errors. Even though a 
reasonable "fix" is to point to a different {{zookeeper.znode.parent}}, we 
should make sure that we step down as leader correctly.






[jira] [Resolved] (HBASE-22019) Ability to remotely connect to hbase when hbase/zook is hosted on dynamic IP addresses

2019-03-08 Thread Esteban Gutierrez (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-22019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-22019.
---
Resolution: Invalid

Thanks for reporting this, [~toopt4]. Please reach out to the HBase user [mailing 
list|https://hbase.apache.org/mailing-lists.html] for this type of problem, 
since many users have been able to connect to HBase clusters without any 
problem regardless of whether HBase is on a different network. 

> Ability to remotely connect to hbase when hbase/zook is hosted on dynamic IP 
> addresses
> --
>
> Key: HBASE-22019
> URL: https://issues.apache.org/jira/browse/HBASE-22019
> Project: HBase
>  Issue Type: New Feature
>  Components: IPC/RPC, Zookeeper
>Reporter: t oo
>Priority: Major
>
> Our team's need for this is purely for remote connections (i.e. personal 
> laptops) to HBase (hosted on EC2), since hbase connections under the covers 
> connect to zookeeper (also running on EC2) and attempt to resolve the 
> hostname (not DNS!) of the machine running zookeeper. From what I've read, 
> others are facing the issue:
> https://forums.aws.amazon.com/thread.jspa?threadID=119915
> https://stackoverflow.com/questions/30751187/unable-to-connect-to-hbase-stand-alone-server-from-windows-remote-client
> https://sematext.com/opensee/m/HBase/YGbbw6MGk1B9nCv?subj=Re:+Remote+Java+client+connection+into+EC2+instance
> https://community.cloudera.com/t5/Storage-Random-Access-HDFS/Problem-in-connectivity-between-HBase-amp-JAVA/td-p/1693
> https://stackoverflow.com/questions/9413481/hbase-node-could-not-be-reached-from-hbase-java-api-client
> https://groups.google.com/forum/#!topic/opentsdb/3w4FCnPYRDg
> Between ec2s I don't get the below error because I can edit /etc/hosts to add 
> the host name below but don't have root/admin access on other machines to do 
> the same. Problem is if we have 100s of users wanting to connect to hbase 
> data then they would all face this /etc/hosts issue.
> Example of the error:
> 19/03/01 17:02:14 WARN client.ConnectionUtils: Can not resolve 
> ip-10x.com, please check your network
> java.net.UnknownHostException: ip-10x.com: Name or service not known
>  at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
>  at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929)
>  at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324)
>  at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>  at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>  at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>  at java.net.InetAddress.getByName(InetAddress.java:1077)
>  at 
> org.apache.hadoop.hbase.client.ConnectionUtils.getStubKey(ConnectionUtils.java:233)
>  at 
> org.apache.hadoop.hbase.client.ConnectionImplementation.getClient(ConnectionImplementation.java:1189)
>  at 
> org.apache.hadoop.hbase.client.ClientServiceCallable.setStubByServiceName(ClientServiceCallable.java:44)
>  at 
> org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:229)
>  at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
>  at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
>  at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
>  at 
> org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1066)
>  at 
> org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:389)
>  at org.apache.hadoop.hbase.client.HBaseAdmin$6.rpcCall(HBaseAdmin.java:437)
>  at org.apache.hadoop.hbase.client.HBaseAdmin$6.rpcCall(HBaseAdmin.java:434)
>  at 
> org.apache.hadoop.hbase.client.RpcRetryingCallable.call(RpcRetryingCallable.java:58)
>  at 
> org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3055)
>  at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3047)
>  at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:434)





[jira] [Created] (HBASE-21134) Add guardrails to cell tags in order to avoid the tags length to overflow

2018-08-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-21134:
-

 Summary: Add guardrails to cell tags in order to avoid the tags 
length to overflow 
 Key: HBASE-21134
 URL: https://issues.apache.org/jira/browse/HBASE-21134
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.5.0
Reporter: Esteban Gutierrez


We found that per-cell tags can easily overflow and cause failures while 
reading HFiles. If a mutation carries more than 32KB in its tags byte array, 
we should reject the operation on the client side (proactively) and on the 
server side as we deserialize the request.

{code}
2018-08-21 11:08:45,387 ERROR 
org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed 
Request = regionName=table1,,1534870486680.9112ca53504084152da5e28116f40ec2., 
storeName=c1, fileCount=4, fileSize=254.2 K (138.0 K, 33.5 K, 34.0 K, 48.7 K), 
priority=1, time=8555785624243
java.lang.IllegalStateException: Invalid currTagsLen -20658. Block offset: 0, 
block length: 44912, position: 0 (without header).
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV3$ScannerV3.checkTagsLen(HFileReaderV3.java:226)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV3$ScannerV3.readKeyValueLen(HFileReaderV3.java:251)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.updateCurrBlock(HFileReaderV2.java:956)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.seekTo(HFileReaderV2.java:919)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:304)
at 
org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:200)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:350)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:269)
at 
org.apache.hadoop.hbase.regionserver.StoreScanner.(StoreScanner.java:231)
at 
org.apache.hadoop.hbase.regionserver.compactions.Compactor.createScanner(Compactor.java:414)
at 
org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:91)
at 
org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:125)
at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1247)
at 
org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1915)
at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.doCompaction(CompactSplitThread.java:529)
at 
org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:566)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
{code}
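A hedged sketch of the proposed guardrail (stand-in names, not the HBase API): reject a mutation whose serialized tags would overflow the signed length field, which is what produces a negative tags length like the -20658 in the trace above.

```java
// Sketch only: stand-in for the proposed client/server-side check. A tags array
// longer than what the signed serialized length field can hold is rejected up
// front instead of overflowing to a negative length at read time.
public class TagLengthGuardrailSketch {
    // Largest tags length that still fits the serialized length field (~32KB).
    static final int MAX_TAGS_LENGTH = Short.MAX_VALUE;  // 32767

    static void checkTagsLength(byte[] tags) {
        if (tags != null && tags.length > MAX_TAGS_LENGTH) {
            throw new IllegalArgumentException(
                "tags length " + tags.length + " exceeds max " + MAX_TAGS_LENGTH);
        }
    }

    public static void main(String[] args) {
        checkTagsLength(new byte[1024]);                 // small tags: accepted
        try {
            checkTagsLength(new byte[40_000]);           // would overflow a short
            System.out.println("accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");
        }
    }
}
```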






[jira] [Created] (HBASE-20761) FSReaderImpl#readBlockDataInternal can fail to switch to HDFS checksums in some edge cases

2018-06-20 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-20761:
-

 Summary: FSReaderImpl#readBlockDataInternal can fail to switch to 
HDFS checksums in some edge cases
 Key: HBASE-20761
 URL: https://issues.apache.org/jira/browse/HBASE-20761
 Project: HBase
  Issue Type: Bug
  Components: HFile
Reporter: Esteban Gutierrez


One of our users reported this problem on HBase 1.2 before and after 
HBASE-11625:

{code}
Caused by: java.io.IOException: On-disk size without header provided is 131131, 
but block header contains 0. Block offset: 2073954793, data starts with: 
\x00\x00\x00\x00\x00\x00\x00\x0\
0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.validateOnDiskSizeWithoutHeader(HFileBlock.java:526)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.access$700(HFileBlock.java:92)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1699)
at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1542)
at 
org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:445)
at 
org.apache.hadoop.hbase.util.CompoundBloomFilter.contains(CompoundBloomFilter.java:100)
{code}

The problem occurs when we read a block without HDFS checksums enabled and, 
due to data corruption, end up with an empty headerBuf while trying to read 
the block before reaching the HDFS checksum failover code. This causes further 
attempts to read the block to fail, since we will still retry the corrupt 
replica instead of reporting it and trying a different one. 










[jira] [Created] (HBASE-20604) ProtobufLogReader#readNext can incorrectly loop to the same position in the stream until the WAL is rolled

2018-05-18 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-20604:
-

 Summary: ProtobufLogReader#readNext can incorrectly loop to the 
same position in the stream until the WAL is rolled
 Key: HBASE-20604
 URL: https://issues.apache.org/jira/browse/HBASE-20604
 Project: HBase
  Issue Type: Bug
  Components: Replication, wal
Affects Versions: 3.0.0, 2.1.0, 1.5.0
Reporter: Esteban Gutierrez


Every time we call {{ProtobufLogReader#readNext}} we consume the input stream 
associated with the {{FSDataInputStream}} from the WAL that we are reading. 
Under certain conditions, e.g. when using encryption at rest 
({{CryptoInputStream}}), the stream can return partial data, which can cause a 
premature EOF that makes {{inputStream.getPos()}} return the same original 
position, causing {{ProtobufLogReader#readNext}} to retry the reads until the 
WAL is rolled.

The side effect of this issue is that {{ReplicationSource}} can get stuck 
until the WAL is rolled, causing replication delays of up to an hour in some 
cases.
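A minimal sketch of the underlying pitfall (stand-in stream, not the WAL reader): when the stream can return partial data, a read loop must treat a short read as "keep reading", not as EOF.

```java
// Sketch only: illustrates why a single read() is not enough. A stream such as
// CryptoInputStream may return fewer bytes than requested without being at EOF,
// so the reader must loop until the buffer is full before concluding anything.
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullySketch {
    // Read exactly len bytes, looping over partial reads; return false on true EOF.
    static boolean readFully(InputStream in, byte[] buf, int len) throws IOException {
        int off = 0;
        while (off < len) {
            int n = in.read(buf, off, len - off);
            if (n < 0) return false;      // genuine end of stream
            off += n;                     // partial read: keep going, not an EOF
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // A stream that hands out at most 3 bytes per read, like a chunky cipher stream.
        InputStream chunky = new ByteArrayInputStream(new byte[10]) {
            @Override public int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
        byte[] buf = new byte[10];
        System.out.println(readFully(chunky, buf, 10));
    }
}
```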






[jira] [Created] (HBASE-19572) RegionMover should use the configured default port number and not the one from HConstants

2017-12-20 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-19572:
-

 Summary: RegionMover should use the configured default port number 
and not the one from HConstants
 Key: HBASE-19572
 URL: https://issues.apache.org/jira/browse/HBASE-19572
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


The issue I ran into in HBASE-19499 was due to RegionMover not using the port 
configured in {{hbase-site.xml}}. The tool should use the value from the 
configuration before falling back to the hardcoded value 
{{HConstants.DEFAULT_REGIONSERVER_PORT}}.
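A minimal sketch of the intended lookup order, using `java.util.Properties` as a stand-in for the HBase `Configuration` (16020 is assumed here as the recent-release value of `HConstants.DEFAULT_REGIONSERVER_PORT`):

```java
// Sketch only: stand-in for the HBase Configuration. Illustrates preferring the
// port configured in hbase-site.xml over the hardcoded HConstants default.
import java.util.Properties;

public class RegionMoverPortSketch {
    static final int DEFAULT_REGIONSERVER_PORT = 16020;  // assumed HConstants default

    // Use hbase.regionserver.port from the loaded configuration when present,
    // only falling back to the hardcoded constant.
    static int regionServerPort(Properties conf) {
        String v = conf.getProperty("hbase.regionserver.port");
        return v != null ? Integer.parseInt(v) : DEFAULT_REGIONSERVER_PORT;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        int before = regionServerPort(conf);            // falls back to default
        conf.setProperty("hbase.regionserver.port", "22001");
        int after = regionServerPort(conf);             // configured value wins
        System.out.println(before + " " + after);
    }
}
```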





[jira] [Resolved] (HBASE-19499) RegionMover#stripMaster in RegionMover needs to handle HBASE-18511 gracefully

2017-12-20 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-19499.
---
Resolution: Not A Bug

> RegionMover#stripMaster in RegionMover needs to handle HBASE-18511 gracefully
> -
>
> Key: HBASE-19499
> URL: https://issues.apache.org/jira/browse/HBASE-19499
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Esteban Gutierrez
>
> Probably this is the first of a few issues found during some tests with 
> RegionMover. After HBASE-13014 we ship the new RegionMover tool, but it 
> currently assumes that the master will be hosting regions, so it attempts to 
> remove the master from the list, and that causes an issue similar to this:
> {code}
> 17/12/12 11:01:06 WARN util.RegionMover: Could not remove master from list of 
> RS
> java.lang.Exception: Server host1.example.com:22001 is not in list of online 
> servers(Offline/Incorrect)
>   at 
> org.apache.hadoop.hbase.util.RegionMover.stripServer(RegionMover.java:818)
>   at 
> org.apache.hadoop.hbase.util.RegionMover.stripMaster(RegionMover.java:757)
>   at 
> org.apache.hadoop.hbase.util.RegionMover.access$1800(RegionMover.java:78)
>   at 
> org.apache.hadoop.hbase.util.RegionMover$Unload.call(RegionMover.java:339)
>   at 
> org.apache.hadoop.hbase.util.RegionMover$Unload.call(RegionMover.java:314)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> Basically





[jira] [Created] (HBASE-19499) RegionMover#stripMaster is not longer necessary in RegionMover

2017-12-12 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-19499:
-

 Summary: RegionMover#stripMaster is not longer necessary in 
RegionMover
 Key: HBASE-19499
 URL: https://issues.apache.org/jira/browse/HBASE-19499
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez


Probably this is the first of a few issues found during some tests with 
RegionMover. After HBASE-13014 we ship the new RegionMover tool, but it 
currently assumes that the master will be hosting regions, so it attempts to 
remove the master from the list, and that causes an issue similar to this:

{code}
17/12/12 11:01:06 WARN util.RegionMover: Could not remove master from list of RS
java.lang.Exception: Server host1.example.com:22001 is not in list of online 
servers(Offline/Incorrect)
at 
org.apache.hadoop.hbase.util.RegionMover.stripServer(RegionMover.java:818)
at 
org.apache.hadoop.hbase.util.RegionMover.stripMaster(RegionMover.java:757)
at 
org.apache.hadoop.hbase.util.RegionMover.access$1800(RegionMover.java:78)
at 
org.apache.hadoop.hbase.util.RegionMover$Unload.call(RegionMover.java:339)
at 
org.apache.hadoop.hbase.util.RegionMover$Unload.call(RegionMover.java:314)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

Basically





[jira] [Created] (HBASE-19391) Calling HRegion#initializeRegionInternals from a region replica can still re-create a region directory

2017-11-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-19391:
-

 Summary: Calling HRegion#initializeRegionInternals from a region 
replica can still re-create a region directory
 Key: HBASE-19391
 URL: https://issues.apache.org/jira/browse/HBASE-19391
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


This is a follow-up to HBASE-18024. There is still a chance that attempting 
to open a region that is not the default region replica can re-create a region 
directory already GC'd by the CatalogJanitor, causing inconsistencies with 
hbck.





[jira] [Created] (HBASE-19390) Revert to older version of Jetty 9.3

2017-11-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-19390:
-

 Summary: Revert to older version of Jetty 9.3 
 Key: HBASE-19390
 URL: https://issues.apache.org/jira/browse/HBASE-19390
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


As discussed in HBASE-19256, we will have to temporarily revert to Jetty 9.3 
due to existing issues with 9.4 and Hadoop 3. Once HBASE-19256 is resolved we 
can move back to 9.4.





[jira] [Created] (HBASE-19352) Port HADOOP-10379: Protect authentication cookies with the HttpOnly and Secure flags

2017-11-27 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-19352:
-

 Summary: Port HADOOP-10379: Protect authentication cookies with 
the HttpOnly and Secure flags
 Key: HBASE-19352
 URL: https://issues.apache.org/jira/browse/HBASE-19352
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


This came via a security scanner; since we have a fork of HttpServer2 in 
HBase, we should include the fix there too.





[jira] [Resolved] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH

2017-11-20 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-18987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-18987.
---
Resolution: Later

Solving as later since we could only do this with a new HFile format.

> Raise value of HConstants#MAX_ROW_LENGTH
> 
>
> Key: HBASE-18987
> URL: https://issues.apache.org/jira/browse/HBASE-18987
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 1.0.0, 2.0.0
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>Priority: Minor
> Attachments: HBASE-18987.master.001.patch, 
> HBASE-18987.master.002.patch
>
>
> Short.MAX_VALUE hasn't been a problem for a long time, but one of our 
> customers ran into an edge case where the midKey used for the split point 
> was very close to Short.MAX_VALUE. When the split is submitted, we attempt 
> to create the two new daughter regions and we name those regions via 
> {{HRegionInfo.createRegionName()}} so they can be added to META. 
> Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the 
> startKey, the {{Put}} will fail because the row key length now fails 
> checkRow, causing the split to fail.
> I tried a couple of alternatives to address this problem, e.g. truncating 
> the startKey, but the number of changes in the code isn't justified for this 
> edge condition. Since we already use {{Integer.MAX_VALUE - 1}} for 
> {{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be OK to use the same limit 
> for the maximum row key. 





[jira] [Created] (HBASE-19309) Lower HConstants#MAX_ROW_LENGTH as guardrail in order to avoid HBASE-18987

2017-11-20 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-19309:
-

 Summary: Lower HConstants#MAX_ROW_LENGTH as guardrail in order to 
avoid HBASE-18987
 Key: HBASE-19309
 URL: https://issues.apache.org/jira/browse/HBASE-19309
 Project: HBase
  Issue Type: Bug
  Components: HFile, regionserver
Reporter: Esteban Gutierrez


As discussed in HBASE-18987, a problem with a row near the maximum row size 
(Short.MAX_VALUE) is that when a split happens, the midkey could be that row, 
and the Put created to add the new entry in META will exceed the maximum row 
size, since the new row key also includes the table name; that causes the 
split to abort. Since it is not possible to raise that row key size limit in 
HFileV3, a reasonable solution is to reduce the maximum row key size so it 
cannot exceed Short.MAX_VALUE.
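A rough sketch of the arithmetic behind the guardrail (the name components and suffix length below are illustrative): a midkey already at Short.MAX_VALUE makes the daughter's META row key exceed the checkRow limit once the table name and separators are prepended.

```java
// Sketch only: the META row key for a daughter region is roughly
// tableName + "," + startKey + "," + suffix, so a split point at the row-key
// maximum pushes the resulting META row past the checkRow limit.
public class RowLengthGuardrailSketch {
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;  // 32767, the checkRow limit

    // Approximate length of the daughter's META row key (separators counted as 1 byte).
    static int metaRowLength(String tableName, int midKeyLength, int suffixLength) {
        return tableName.length() + 1 + midKeyLength + 1 + suffixLength;
    }

    public static void main(String[] args) {
        // A midKey already at the maximum: the META row for the daughter overflows.
        int len = metaRowLength("table1", MAX_ROW_LENGTH, 15);
        System.out.println(len > MAX_ROW_LENGTH);
    }
}
```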





[jira] [Created] (HBASE-18987) Raise value of HConstants#MAX_ROW_LENGTH

2017-10-11 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-18987:
-

 Summary: Raise value of HConstants#MAX_ROW_LENGTH
 Key: HBASE-18987
 URL: https://issues.apache.org/jira/browse/HBASE-18987
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.0.0, 2.0.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor


Short.MAX_VALUE hasn't been a problem for a long time, but one of our 
customers ran into an edge case where the midKey used for the split point was 
very close to Short.MAX_VALUE. When the split is submitted, we attempt to 
create the two new daughter regions and we name those regions via 
{{HRegionInfo.createRegionName()}} so they can be added to META. 
Unfortunately, since {{HRegionInfo.createRegionName()}} uses midKey as the 
startKey, the {{Put}} will fail because the row key length now fails checkRow, 
causing the split to fail.

I tried a couple of alternatives to address this problem, e.g. truncating the 
startKey, but the number of changes in the code isn't justified for this edge 
condition. Since we already use {{Integer.MAX_VALUE - 1}} for 
{{HConstants#MAXIMUM_VALUE_LENGTH}}, it should be OK to use the same limit for 
the maximum row key. 





[jira] [Created] (HBASE-18563) Fix RAT License complaint about website jenkins scripts

2017-08-10 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-18563:
-

 Summary: Fix RAT License complaint about website jenkins scripts
 Key: HBASE-18563
 URL: https://issues.apache.org/jira/browse/HBASE-18563
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez
Priority: Trivial


{{2 Unknown Licenses

*

Files with unapproved licenses:

  dev-support/jenkins-scripts/check-website-links.sh
  dev-support/jenkins-scripts/generate-hbase-website.sh

*
}}





[jira] [Created] (HBASE-18177) FanOutOneBlockAsyncDFSOutputHelper fails to compile against Hadoop 3

2017-06-06 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-18177:
-

 Summary: FanOutOneBlockAsyncDFSOutputHelper fails to compile 
against Hadoop 3
 Key: HBASE-18177
 URL: https://issues.apache.org/jira/browse/HBASE-18177
 Project: HBase
  Issue Type: Bug
  Components: wal
Reporter: Esteban Gutierrez


After HDFS-10996, ClientProtocol#create() needs to specify the erasure coding 
policy to use. In the meantime we should add a workaround to 
FanOutOneBlockAsyncDFSOutputHelper so it can compile against both Hadoop 3 and 
Hadoop 2.





[jira] [Created] (HBASE-18025) CatalogJanitor should collect outdated RegionStates from the AM

2017-05-10 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-18025:
-

 Summary: CatalogJanitor should collect outdated RegionStates from 
the AM
 Key: HBASE-18025
 URL: https://issues.apache.org/jira/browse/HBASE-18025
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


I don't think this will matter in the long run for HBase 2, but at least in 
branch-1 and the current master we keep copies of the region states in 
multiple places in the master, and these copies include information like the 
HRI. A problem that we have observed is that when region replicas are being 
used and there is a split, the region replica of the parent doesn't get 
collected from the region states, and when the balancer tries to assign the 
old parent region replica, the RegionServer will create a new HRI with the 
details of the parent, causing an inconsistency (see HBASE-18024).





[jira] [Created] (HBASE-18024) HRegion#initializeRegionInternals should not re-create .hregioninfo file when the region directory no longer exists

2017-05-10 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-18024:
-

 Summary: HRegion#initializeRegionInternals should not re-create 
.hregioninfo file when the region directory no longer exists
 Key: HBASE-18024
 URL: https://issues.apache.org/jira/browse/HBASE-18024
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment, regionserver
Affects Versions: 1.2.5, 1.3.1, 1.4.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


When a RegionServer attempts to open a region, during initialization the RS 
tries to open the {{/data///.hregioninfo}} file; however, if the 
{{.hregioninfo}} file doesn't exist, the RegionServer will create a new one in 
{{HRegionFileSystem#checkRegionInfoOnFilesystem}}. A side effect of that is 
that tools like hbck will incorrectly report an inconsistency due to the 
presence of this new {{.hregioninfo}} file.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17799) HBCK region boundaries check can return false negatives when IOExceptions are thrown

2017-03-17 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17799:
-

 Summary: HBCK region boundaries check can return false negatives 
when IOExceptions are thrown
 Key: HBASE-17799
 URL: https://issues.apache.org/jira/browse/HBASE-17799
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 2.0.0, 1.4.0, 1.3.1, 1.2.5
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


When enabled, HBaseFsck#checkRegionBoundaries will crawl all HFiles across all 
namespaces and tables when {{-boundaries}} is specified. However, if an 
IOException is thrown by accessing a corrupt HFile, an un-handled HLink, or for 
any other reason, we will only log the exception, stop crawling the HFiles, 
and potentially report the wrong result.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17756) We should have better introspection of HFiles

2017-03-07 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17756:
-

 Summary: We should have better introspection of HFiles
 Key: HBASE-17756
 URL: https://issues.apache.org/jira/browse/HBASE-17756
 Project: HBase
  Issue Type: Brainstorming
  Components: HFile
Reporter: Esteban Gutierrez


[~saint@gmail.com] suggested using DataSketches 
(https://datasketches.github.io) in order to write additional statistics to the 
HFiles. This could be used to improve our split decisions, for troubleshooting, 
or potentially to do other interesting analysis without having to perform full 
table scans. The statistics could be stored as part of the HFile, but we could 
initially improve the visibility of the data by adding some statistics to 
HFilePrettyPrinter.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17755) CellBasedKeyBlockIndexReader#midkey should exhaust search of the target middle key on skewed regions

2017-03-07 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17755:
-

 Summary: CellBasedKeyBlockIndexReader#midkey should exhaust search 
of the target middle key on skewed regions
 Key: HBASE-17755
 URL: https://issues.apache.org/jira/browse/HBASE-17755
 Project: HBase
  Issue Type: Bug
  Components: HFile
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


We have always been returning the middle key of the block index regardless 
of the distribution of the data in an HFile. A side effect of that approach is 
that when millions of rows share the same key it's quite easy to run into a 
situation where the start key or the end key is equal to the middle key, making 
that HFile nearly impossible to split until enough data is written into the 
region and the middle key shifts to another row, or until an operator uses a 
custom split point in order to split that region. 

Instead we should exhaust the search of the middle key in the block index in 
order to be able to split an HFile earlier when possible, even if our edge case 
is to serve a region that could hold a single key with millions of versions of 
a row or millions of qualifiers on the same row.
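The exhaustive-search idea can be illustrated with a hedged sketch. This is a 
simplification, not the actual CellBasedKeyBlockIndexReader API: the key type 
and index layout are stand-ins. Instead of blindly returning the middle 
block-index entry, we walk outward from the middle until we find a key strictly 
between the first and last keys, so a skewed HFile can still yield a usable 
split point.

```java
import java.util.List;

public class MidKeySketch {
    // Walk outward from the middle of the block index looking for a key
    // strictly between the first and last keys; return null only when the
    // file is genuinely unsplittable (one row dominates the whole index).
    static String midKey(List<String> indexKeys) {
        int mid = indexKeys.size() / 2;
        String first = indexKeys.get(0);
        String last = indexKeys.get(indexKeys.size() - 1);
        for (int off = 0; off <= indexKeys.size() / 2; off++) {
            for (int i : new int[] { mid - off, mid + off }) {
                // Skip the first and last entries: they can never be
                // strictly between themselves.
                if (i <= 0 || i >= indexKeys.size() - 1) continue;
                String k = indexKeys.get(i);
                if (k.compareTo(first) > 0 && k.compareTo(last) < 0) return k;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Skewed index: most entries share the same row key "a", so the
        // naive midkey (index size/2) equals the start key.
        System.out.println(midKey(List.of("a", "a", "a", "c", "d")));
    }
}
```

With an index like {{["a", "a", "a", "c", "d"]}} the naive midkey is "a" 
(equal to the start key and thus useless as a split point), while the 
exhaustive search finds "c".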





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-17679) Log arguments passed to hbck

2017-02-22 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-17679.
---
Resolution: Duplicate

Duplicate of HBASE-12678. Perhaps PrintingErrorReporter sends that information 
to stdout while our log4j properties make the console write to stderr. 

> Log arguments passed to hbck
> 
>
> Key: HBASE-17679
> URL: https://issues.apache.org/jira/browse/HBASE-17679
> Project: HBase
>  Issue Type: Bug
>Reporter: Esteban Gutierrez
>Assignee: Esteban Gutierrez
>Priority: Trivial
>
> Sometimes hbck arguments get lost and we only end up with the output of hbck. 
> hbck should log some basic info about the arguments passed to it for better 
> supportability. Additional server-side logging will be added later for HBase 
> Admin calls in a different JIRA.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17679) Log arguments passed to hbck

2017-02-22 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17679:
-

 Summary: Log arguments passed to hbck
 Key: HBASE-17679
 URL: https://issues.apache.org/jira/browse/HBASE-17679
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Trivial


Sometimes hbck arguments get lost and we only end up with the output of hbck. 
hbck should log some basic info about the arguments passed to it for better 
supportability. Additional server-side logging will be added later for HBase 
Admin calls in a different JIRA.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17622) Add hbase-metrics package to TableMapReduceUtil

2017-02-09 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17622:
-

 Summary: Add hbase-metrics package to TableMapReduceUtil
 Key: HBASE-17622
 URL: https://issues.apache.org/jira/browse/HBASE-17622
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez
Priority: Trivial


HBASE-9774 recently moved our metrics to their own package; unfortunately, 
running an MR job against snapshots will fail since 
org.apache.hadoop.hbase.metrics.impl.FastLongHistogram is not present in the 
classpath and is needed by the ClientSideRegionScanner (HStore keeps track of 
cache statistics from LruBlockCache).  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17544) Expose metrics for the CatalogJanitor

2017-01-25 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17544:
-

 Summary: Expose metrics for the CatalogJanitor
 Key: HBASE-17544
 URL: https://issues.apache.org/jira/browse/HBASE-17544
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez


Currently there is no way to know what the CatalogJanitor is doing except 
through the logs. We should have better visibility into when the CatalogJanitor 
last ran, how long it took to scan meta, the number of merged and parent 
regions cleaned on the last run, and whether it is in maintenance mode (see 
HBASE-16008). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17305) Two active HBase Masters can run at the same time under certain circumstances

2016-12-13 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17305:
-

 Summary: Two active HBase Masters can run at the same time under 
certain circumstances 
 Key: HBASE-17305
 URL: https://issues.apache.org/jira/browse/HBASE-17305
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Critical


This needs a little more investigation, but we found a very narrow edge case: 
when the active master is restarted and a stand-by master tries to become 
active, the original active master can become the active master again just 
before the stand-by master passes the point of no return in its transition, 
leaving two active masters running at the same time. Assuming the clocks on 
both masters were accurate to the millisecond, this race happened in less 
than 85ms. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-9913) weblogic deployment project implementation under the mapreduce hbase reported a NullPointerException

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-9913.
--
Resolution: Duplicate

Fixed in HBASE-12491

> weblogic deployment project implementation under the mapreduce hbase reported 
> a NullPointerException
> 
>
> Key: HBASE-9913
> URL: https://issues.apache.org/jira/browse/HBASE-9913
> Project: HBase
>  Issue Type: Bug
>  Components: hadoop2, mapreduce
>Affects Versions: 0.94.10
> Environment: weblogic windows
>Reporter: 刘泓
> Attachments: TableMapReduceUtil.class, TableMapReduceUtil.java
>
>
> java.lang.NullPointerException
>   at java.io.File.(File.java:222)
>   at java.util.zip.ZipFile.(ZipFile.java:75)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.updateMap(TableMapReduceUtil.java:617)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.findOrCreateJar(TableMapReduceUtil.java:597)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:557)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.addDependencyJars(TableMapReduceUtil.java:518)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:144)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:221)
>   at 
> org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.initTableMapperJob(TableMapReduceUtil.java:87)
>   at 
> com.easymap.ezserver6.map.source.hbase.convert.HBaseMapMerge.beginMerge(HBaseMapMerge.java:163)
>   at 
> com.easymap.ezserver6.app.servlet.EzMapToHbaseService.doPost(EzMapToHbaseService.java:32)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
>   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
>   at 
> weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
>   at 
> weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
>   at 
> weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:292)
>   at 
> weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:175)
>   at 
> weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3594)
>   at 
> weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
>   at 
> weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
>   at 
> weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2202)
>   at 
> weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2108)
>   at 
> weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1432)
>   at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
>   at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)
> > 
> My project is deployed under WebLogic 11, and when I run an HBase mapreduce 
> job it throws a NullPointerException. I found that 
> TableMapReduceUtil.findContainingJar() returns null, so I debugged it: 
> url.getProtocol() returns "zip", but the file is a jar file, so the condition 
> if ("jar".equals(url.getProtocol())) is never entered. I added a condition to 
> also accept the "zip" type.
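The reporter's workaround can be sketched as follows. This is a hypothetical 
simplification, not the actual TableMapReduceUtil code: the helper name and 
path handling are illustrative. The core of the fix is accepting the "zip" 
protocol that WebLogic uses for classpath URLs alongside "jar".

```java
import java.net.URL;

public class FindJarSketch {
    // Extract the jar path from a classpath resource URL. WebLogic reports
    // the protocol as "zip" even though the underlying file is a regular
    // jar, so we accept both protocols (the reporter's workaround).
    static String pathFromClasspathUrl(URL url) {
        String protocol = url.getProtocol();
        if ("jar".equals(protocol) || "zip".equals(protocol)) {
            String path = url.getPath();
            // Strip the leading "file:" and the "!/entry" suffix.
            if (path.startsWith("file:")) {
                path = path.substring("file:".length());
            }
            int bang = path.indexOf('!');
            return bang >= 0 ? path.substring(0, bang) : path;
        }
        return null; // not a jar-like URL; caller falls back to other logic
    }

    public static void main(String[] args) throws Exception {
        URL u = new URL("jar:file:/tmp/hbase.jar!/org/apache/hadoop/hbase/HConstants.class");
        System.out.println(pathFromClasspathUrl(u));
    }
}
```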



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-9925) Don't close a file if doesn't EOF while replicating

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-9925.
--
Resolution: Later

Resolving for later; we should first fix other replication bottlenecks before 
we hit contention from the NN.

> Don't close a file if doesn't EOF while replicating
> ---
>
> Key: HBASE-9925
> URL: https://issues.apache.org/jira/browse/HBASE-9925
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.0, 0.96.0
>Reporter: Himanshu Vashishtha
>
> While doing replication, we open and close the WAL file _every_ time we read 
> entries to send. We could open/close the reader only when we hit EOF. That 
> would alleviate some NN load, especially on a write heavy cluster.
> This came while discussing our current open/close heuristic in replication 
> with [~jdcryans].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-9940) PerformanceEvaluation should have a test with many table options on (Bloom, compression, FAST_DIFF, etc.)

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-9940.
--
Resolution: Fixed

Most of the features requested by [~jmspaggi] are already present in 
PerformanceEvaluation. Created HBASE-17116 to address missing feature to 
configure block size.


> PerformanceEvaluation should have a test with many table options on (Bloom, 
> compression, FAST_DIFF, etc.)
> -
>
> Key: HBASE-9940
> URL: https://issues.apache.org/jira/browse/HBASE-9940
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, test
>Affects Versions: 0.96.0, 0.94.13
>Reporter: Jean-Marc Spaggiari
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17116) [PerformanceEvaluation] Add option to configure block size

2016-11-16 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17116:
-

 Summary: [PerformanceEvaluation] Add option to configure block size
 Key: HBASE-17116
 URL: https://issues.apache.org/jira/browse/HBASE-17116
 Project: HBase
  Issue Type: Bug
  Components: tooling
Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.5
Reporter: Esteban Gutierrez
Priority: Trivial


Followup from HBASE-9940 to add option to configure block size for 
PerformanceEvaluation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-9968) Cluster is non operative if the RS carrying -ROOT- is expiring after deleting -ROOT- region transition znode and before adding it to online regions.

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-9968.
--
Resolution: Won't Fix

We no longer have {{-ROOT-}}

> Cluster is non operative if the RS carrying -ROOT- is expiring after deleting 
> -ROOT- region transition znode and before adding it to online regions.
> 
>
> Key: HBASE-9968
> URL: https://issues.apache.org/jira/browse/HBASE-9968
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Affects Versions: 0.94.11
>Reporter: rajeshbabu
>Assignee: rajeshbabu
>
> When we check whether the dead server is carrying root or meta, first we 
> check whether a transition znode for the region exists. In this case it got 
> deleted, so from ZooKeeper we cannot find the region location. 
> {code}
> try {
>   data = ZKAssign.getData(master.getZooKeeper(), hri.getEncodedName());
> } catch (KeeperException e) {
>   master.abort("Unexpected ZK exception reading unassigned node for 
> region="
> + hri.getEncodedName(), e);
> }
> {code}
> Now we will check from the AssignmentManager whether its in online regions or 
> not
> {code}
> ServerName addressFromAM = getRegionServerOfRegion(hri);
> boolean matchAM = (addressFromAM != null &&
>   addressFromAM.equals(serverName));
> LOG.debug("based on AM, current region=" + hri.getRegionNameAsString() +
>   " is on server=" + (addressFromAM != null ? addressFromAM : "null") +
>   " server being checked: " + serverName);
> {code}
> From the AM we will get null because, while adding a region to the online 
> regions, we check whether the RS is in the online servers; if it is not, we 
> do not add the region to the online regions.
> {code}
>   if (isServerOnline(sn)) {
> this.regions.put(regionInfo, sn);
> addToServers(sn, regionInfo);
> this.regions.notifyAll();
>   } else {
> LOG.info("The server is not in online servers, ServerName=" + 
>   sn.getServerName() + ", region=" + regionInfo.getEncodedName());
>   }
> {code}
> Even though the dead regionserver was carrying the ROOT region, the check 
> returns false. After that, the ROOT region is never assigned.
> Here are the logs
> {code}
> 2013-11-11 18:04:14,730 INFO 
> org.apache.hadoop.hbase.catalog.RootLocationEditor: Unsetting ROOT region 
> location in ZooKeeper
> 2013-11-11 18:04:14,775 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
> was found (or we are ignoring an existing plan) for -ROOT-,,0.70236052 so 
> generated a random one; hri=-ROOT-,,0.70236052, src=, 
> dest=HOST-10-18-40-69,60020,1384173244404; 1 (online=1, available=1) 
> available servers
> 2013-11-11 18:04:14,809 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> -ROOT-,,0.70236052 to HOST-10-18-40-69,60020,1384173244404
> 2013-11-11 18:04:18,375 DEBUG 
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
> Looked up root region location, 
> connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@12133926;
>  serverName=HOST-10-18-40-69,60020,1384173244404
> 2013-11-11 18:04:26,213 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENED, server=HOST-10-18-40-69,60020,1384173244404, 
> region=70236052/-ROOT-
> 2013-11-11 18:04:26,213 INFO 
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
> event for -ROOT-,,0.70236052 from HOST-10-18-40-69,60020,1384173244404; 
> deleting unassigned node
> 2013-11-11 18:04:31,553 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: based on AM, current 
> region=-ROOT-,,0.70236052 is on server=null server being checked: 
> HOST-10-18-40-69,60020,1384173244404
> 2013-11-11 18:04:31,561 DEBUG org.apache.hadoop.hbase.master.ServerManager: 
> Added=HOST-10-18-40-69,60020,1384173244404 to dead servers, submitted 
> shutdown handler to be executed, root=false, meta=false
> {code}
> {code}
> 2013-11-11 18:04:32,323 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: The znode of region 
> -ROOT-,,0.70236052 has been deleted.
> 2013-11-11 18:04:32,323 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: The server is not in online 
> servers, ServerName=HOST-10-18-40-69,60020,1384173244404, region=70236052
> 2013-11-11 18:04:32,323 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: The master has opened the 
> region -ROOT-,,0.70236052 that was online on 
> HOST-10-18-40-69,60020,1384173244404
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-6205) Support an option to keep data of dropped table for some time

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-6205.
--
Resolution: Later

Resolving for later; we already have the archive and snapshots, and we could 
take care of this after HBASE-14439.

> Support an option to keep data of dropped table for some time
> -
>
> Key: HBASE-6205
> URL: https://issues.apache.org/jira/browse/HBASE-6205
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.94.0, 0.95.2
>Reporter: chunhui shen
>Assignee: chunhui shen
> Attachments: HBASE-6205.patch, HBASE-6205v2.patch, 
> HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch
>
>
> A user may drop a table accidentally because of buggy code or other 
> reasons.
> Unfortunately, it happened in our environment because one user confused the 
> production cluster with the testing cluster.
> So, a suggestion: do we need to support an option to keep the data of a 
> dropped table for some time, e.g. 1 day?
> In the patch:
> We make a new dir named .trashtables in the root dir.
> In the DeleteTableHandler, we move files in the dropped table's dir to the 
> trash table dir instead of deleting them directly.
> And we create a new class, TrashCleaner, which periodically cleans dropped 
> tables once they time out.
> The default keep time for dropped tables is 1 day, and the check period is 1 hour.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3991) Add Util folder for Utility Scripts

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3991.
--
Resolution: Won't Fix

No progress on this in 5 years. We tend to unify things in the main hbase 
script or the hbase shell, and in some cases, like region_mover.rb, we ended up 
creating better tooling.

> Add Util folder for Utility Scripts
> ---
>
> Key: HBASE-3991
> URL: https://issues.apache.org/jira/browse/HBASE-3991
> Project: HBase
>  Issue Type: Brainstorming
>  Components: scripts, util
>Affects Versions: 0.92.0
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>
> This JIRA is to start discussion around adding some sort of 'util' folder to 
> HBase for common operational scripts.  We're starting to write a lot of HBase 
> analysis utilities that we'd love to share with open source, but don't want 
> to clutter the 'bin' folder, which seems like it should be reserved for 
> start/stop tasks.  If we add a 'util' folder, how do we keep it from becoming 
> a cesspool of half-baked & duplicated operational hacks?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3975) NoServerForRegionException stalls write pipeline

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3975.
--
Resolution: Fixed

The new async client is taking care of this.

> NoServerForRegionException stalls write pipeline
> 
>
> Key: HBASE-3975
> URL: https://issues.apache.org/jira/browse/HBASE-3975
> Project: HBase
>  Issue Type: Bug
>  Components: Client
>Affects Versions: 0.89.20100924, 0.90.3, 0.92.0
>Reporter: Nicolas Spiegelberg
>Assignee: Nicolas Spiegelberg
>
> When we process a batch of puts, the current algorithm basically goes like 
> this:
> 1. Find all servers for the Put requests
> 2. Partition Puts by servers
> 3. Make requests
> 4. Collect success/error results
> If we throw an IOE in step 1 or 2, we will abort the whole batch operation.  
> In our case, this was a NoServerForRegionException due to region 
> rebalancing.  However, the asynchronous put case normally has requests going 
> to a wide variety of servers.  We should fail only the put requests that 
> throw an IOE in step 1 but continue to try all the put requests that succeed 
> at this stage.
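The proposed behavior for step 1 can be sketched generically; the partition 
helper and the locate function below are hypothetical stand-ins for the HBase 
client internals, not its actual API. The point is to record per-put lookup 
failures instead of aborting the whole batch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PartialBatchSketch {
    // Step 1-2 of the batch algorithm: locate a server for each put and
    // partition by server. Puts whose lookup throws (e.g. a
    // NoServerForRegionException) are collected into `failed` rather than
    // aborting the remaining puts.
    static <K> Map<String, List<K>> partition(
            List<K> puts, Function<K, String> locate, List<K> failed) {
        Map<String, List<K>> byServer = new HashMap<>();
        for (K put : puts) {
            try {
                String server = locate.apply(put);
                byServer.computeIfAbsent(server, s -> new ArrayList<>()).add(put);
            } catch (RuntimeException e) {
                failed.add(put); // fail this put only; keep the rest
            }
        }
        return byServer;
    }

    public static void main(String[] args) {
        List<String> failed = new ArrayList<>();
        Map<String, List<String>> byServer = partition(
            List.of("put-a", "put-b", "put-c"),
            k -> {
                if (k.equals("put-b")) throw new RuntimeException("no server for region");
                return "rs1";
            },
            failed);
        System.out.println(byServer.get("rs1") + " failed=" + failed);
    }
}
```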



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3854) [thrift] broken thrift examples

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3854.
--
Resolution: Later

Resolving as later for now. We should fix coverage on the hbase-examples 
module. At least the code generation for php, perl and others seems to work.

> [thrift] broken thrift examples
> ---
>
> Key: HBASE-3854
> URL: https://issues.apache.org/jira/browse/HBASE-3854
> Project: HBase
>  Issue Type: Bug
>  Components: Thrift
>Affects Versions: 0.20.0
>Reporter: Alexey Diomin
>Priority: Minor
>
> We introduced the NotFound exception in HBASE-1292, but dropped it in HBASE-1367.
> As a result:
> 1. incorrect docs in Hbase.thrift and, consequently, in the generated Java code and javadoc
> 2. broken examples in src/examples/thrift/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3792) TableInputFormat leaks ZK connections

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3792.
--
Resolution: Won't Fix

> TableInputFormat leaks ZK connections
> -
>
> Key: HBASE-3792
> URL: https://issues.apache.org/jira/browse/HBASE-3792
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.1
> Environment: Java 1.6.0_24, Mac OS X 10.6.7
>Reporter: Bryan Keller
> Attachments: patch0.90.4, tableinput.patch
>
>
> The TableInputFormat creates an HTable using a new Configuration object, and 
> it never cleans it up. When running a Mapper, the TableInputFormat is 
> instantiated and the ZK connection is created. While this connection is not 
> explicitly cleaned up, the Mapper process eventually exits and thus the 
> connection is closed. Ideally the TableRecordReader would close the 
> connection in its close() method rather than relying on the process to die 
> for connection cleanup. This is fairly easy to implement by overriding 
> TableRecordReader, and also overriding TableInputFormat to specify the new 
> record reader.
> The leak occurs when the JobClient is initializing and needs to retrieve the 
> splits. To get the splits, it instantiates a TableInputFormat. Doing so 
> creates a ZK connection that is never cleaned up. Unlike the mapper, however, 
> my job client process does not die, thus the ZK connections accumulate.
> I was able to fix the problem by writing my own TableInputFormat that does 
> not initialize the HTable in the getConf() method and does not have an HTable 
> member variable. Rather, it has a variable for the table name. The HTable is 
> instantiated where needed and then cleaned up. For example, in the 
> getSplits() method, I create the HTable, then close the connection once the 
> splits are retrieved. I also create the HTable when creating the record 
> reader, and I have a record reader that closes the connection when done.
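The fix the reporter describes boils down to a per-call resource pattern. A 
minimal, HBase-free sketch (the Conn class below is a stand-in for an HTable 
and its ZK connection, not the real API): open the connection only inside the 
method that needs it, e.g. getSplits(), and close it before returning instead 
of holding it in a member variable.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PerCallResourceSketch {
    // Tracks how many "connections" are currently open, so the demo can
    // verify that nothing leaks.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Stand-in for an HTable and its ZK connection; purely illustrative.
    static class Conn implements AutoCloseable {
        Conn() { OPEN.incrementAndGet(); }
        int splits() { return 4; }
        @Override public void close() { OPEN.decrementAndGet(); }
    }

    // Fixed pattern: the connection lives only for the duration of the call.
    // The leaky pattern would create Conn eagerly in a member field and
    // never close it.
    static int getSplits() {
        try (Conn c = new Conn()) {
            return c.splits();
        }
    }

    public static void main(String[] args) {
        int n = getSplits();
        System.out.println(n + " splits, " + OPEN.get() + " open connections");
    }
}
```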



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3791) Display total number of zookeeper connections on master.jsp

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3791.
--
Resolution: Fixed

zk.jsp (ZKUtil.dump(), see HBASE-2692) in the Master UI already provides the 
total number of open connections to ZK.

> Display total number of zookeeper connections on master.jsp
> ---
>
> Key: HBASE-3791
> URL: https://issues.apache.org/jira/browse/HBASE-3791
> Project: HBase
>  Issue Type: Improvement
>  Components: Zookeeper
>Affects Versions: 0.90.2
>Reporter: Ted Yu
> Attachments: 3791.patch
>
>
> Quite often, a user needs to telnet to ZooKeeper and type 'stats' to get the 
> connections, or count the connections on zk.jsp
> We should display the total number of connections beside the link to zk.jsp 
> on master.jsp



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3786) Enhance MasterCoprocessorHost to include notification of balancing of each region

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3786.
--
Resolution: Won't Fix

HBASE-4552 was closed and as [~apurtell] stated in HBASE-3529 with NGDATA's 
hbase-indexer we have some indexing functionality that relies on our 
replication infra.

> Enhance MasterCoprocessorHost to include notification of balancing of each 
> region
> -
>
> Key: HBASE-3786
> URL: https://issues.apache.org/jira/browse/HBASE-3786
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 0.90.2
>Reporter: Ted Yu
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3782) Multi-Family support for bulk upload tools causes File Not Found Exception

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3782.
--
Resolution: Won't Fix

Should be fixed by atomic bulk loading from HBASE-4552

> Multi-Family support for bulk upload tools causes File Not Found Exception
> --
>
> Key: HBASE-3782
> URL: https://issues.apache.org/jira/browse/HBASE-3782
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.3
>Reporter: Nichole Treadway
> Attachments: HBASE-3782.patch
>
>
> I've been testing HBASE-1861 in 0.90.2, which adds multi-family support for 
> bulk upload tools.
> I found that when running the importtsv program, some reduce tasks fail with 
> a File Not Found exception if there are no keys in the input data which fall 
> into the region assigned to that reduce task.  From what I can determine, an 
> output directory is created in the write() method and expected to exist in 
> the writeMetaData() method. If there are no keys to be written for that 
> reduce task, write() is never called and the output directory is never 
> created, but writeMetaData() expects it to exist, hence the FnF exception:
> 2011-03-17 11:52:48,095 WARN org.apache.hadoop.mapred.TaskTracker: Error 
> running child
> java.io.FileNotFoundException: File does not exist: 
> hdfs://master:9000/awardsData/_temporary/_attempt_201103151859_0066_r_00_0
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:468)
>   at 
> org.apache.hadoop.hbase.regionserver.StoreFile.getUniqueFile(StoreFile.java:580)
>   at 
> org.apache.hadoop.hbase.mapreduce.HFileOutputFormat$1.writeMetaData(HFileOutputFormat.java:186)
>   at 
> org.apache.hadoop.hbase.mapreduce.HFileOutputFormat$1.close(HFileOutputFormat.java:247)
>   at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:567)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Simply checking if the file exists should fix the issue. 
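The suggested existence check can be sketched with plain java.nio.file; this is 
a simplified stand-in for the Hadoop FileSystem calls in 
HFileOutputFormat#writeMetaData, and the method name and return value are 
illustrative. The guard skips finalization when the reducer's attempt directory 
was never created because no keys fell into its region.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WriteMetaDataSketch {
    // Returns false (instead of failing) when the attempt directory is
    // missing, i.e. this reduce task received no keys and wrote nothing.
    static boolean writeMetaData(Path attemptDir) throws IOException {
        if (!Files.exists(attemptDir)) {
            // Empty reduce task: nothing to finalize, nothing to fail on.
            return false;
        }
        // ... the real code would now derive a unique store-file name
        // under attemptDir and write the metadata ...
        return true;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeMetaData(Paths.get("/nonexistent-attempt-dir")));
    }
}
```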



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3778) HBaseAdmin.create doesn't create empty boundary keys

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3778.
--
Resolution: Duplicate

> HBaseAdmin.create doesn't create empty boundary keys
> 
>
> Key: HBASE-3778
> URL: https://issues.apache.org/jira/browse/HBASE-3778
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.2
>Reporter: Ted Dunning
> Attachments: HBASE-3778.patch
>
>
> In my ycsb stuff, I have code that looks like this:
> {code}
> String startKey = "user102000";
> String endKey = "user94000";
> admin.createTable(descriptor, startKey.getBytes(), endKey.getBytes(), 
> regions);
> {code}
> The result, however, is a table where the first and last regions have defined 
> first and last keys rather than empty keys.
> The patch I am about to attach fixes this, I think.  I have some worries 
> about other uses of Bytes.split, however, and would like some eyes on this 
> patch.  Perhaps we need a new dialect of split.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-3725) HBase increments from old value after delete and write to disk

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3725.
--
Resolution: Resolved

Resolving per last comment from [~larsh]

> HBase increments from old value after delete and write to disk
> --
>
> Key: HBASE-3725
> URL: https://issues.apache.org/jira/browse/HBASE-3725
> Project: HBase
>  Issue Type: Bug
>  Components: io, regionserver
>Affects Versions: 0.90.1
>Reporter: Nathaniel Cook
>Assignee: ShiXing
> Attachments: HBASE-3725-0.92-V1.patch, HBASE-3725-0.92-V2.patch, 
> HBASE-3725-0.92-V3.patch, HBASE-3725-0.92-V4.patch, HBASE-3725-0.92-V5.patch, 
> HBASE-3725-0.92-V6.patch, HBASE-3725-Test-v1.patch, HBASE-3725-v3.patch, 
> HBASE-3725.patch
>
>
> Deleted row values are sometimes used as starting points for new increments.
> To reproduce:
> Create a row "r". Set column "x" to some default value.
> Force hbase to write that value to the file system (such as restarting the 
> cluster).
> Delete the row.
> Call table.incrementColumnValue with "some_value"
> Get the row.
> The returned value in the column was incremented from the old value before 
> the row was deleted instead of being initialized to "some_value".
> Code to reproduce:
> {code}
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.HColumnDescriptor;
> import org.apache.hadoop.hbase.HTableDescriptor;
> import org.apache.hadoop.hbase.client.Delete;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.HBaseAdmin;
> import org.apache.hadoop.hbase.client.HTableInterface;
> import org.apache.hadoop.hbase.client.HTablePool;
> import org.apache.hadoop.hbase.client.Increment;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.util.Bytes;
> public class HBaseTestIncrement
> {
>   static String tableName  = "testIncrement";
>   static byte[] infoCF = Bytes.toBytes("info");
>   static byte[] rowKey = Bytes.toBytes("test-rowKey");
>   static byte[] newInc = Bytes.toBytes("new");
>   static byte[] oldInc = Bytes.toBytes("old");
>   /**
>* This code reproduces a bug with increment column values in hbase
>* Usage: First run part one by passing '1' as the first arg
>*Then restart the hbase cluster so it writes everything to disk
>*Run part two by passing '2' as the first arg
>*
>* This will result in the old deleted data being found and used for 
> the increment calls
>*
>* @param args
>* @throws IOException
>*/
>   public static void main(String[] args) throws IOException
>   {
>   if("1".equals(args[0]))
>   partOne();
>   if("2".equals(args[0]))
>   partTwo();
>   if ("both".equals(args[0]))
>   {
>   partOne();
>   partTwo();
>   }
>   }
>   /**
>* Creates a table and increments a column value 10 times by 10 each 
> time.
>* Results in a value of 100 for the column
>*
>* @throws IOException
>*/
>   static void partOne()throws IOException
>   {
>   Configuration conf = HBaseConfiguration.create();
>   HBaseAdmin admin = new HBaseAdmin(conf);
>   HTableDescriptor tableDesc = new HTableDescriptor(tableName);
>   tableDesc.addFamily(new HColumnDescriptor(infoCF));
>   if(admin.tableExists(tableName))
>   {
>   admin.disableTable(tableName);
>   admin.deleteTable(tableName);
>   }
>   admin.createTable(tableDesc);
>   HTablePool pool = new HTablePool(conf, Integer.MAX_VALUE);
>   HTableInterface table = pool.getTable(Bytes.toBytes(tableName));
>   //Increment uninitialized column
>   for (int j = 0; j < 10; j++)
>   {
>   table.incrementColumnValue(rowKey, infoCF, oldInc, 
> (long)10);
>   Increment inc = new Increment(rowKey);
>   inc.addColumn(infoCF, newInc, (long)10);
>   table.increment(inc);
>   }
>   Get get = new Get(rowKey);
>   Result r = table.get(get);
>   System.out.println("initial values: new " + 
> Bytes.toLong(r.getValue(infoCF, newInc)) + " old " + 
> Bytes.toLong(r.getValue(infoCF, oldInc)));
>   }
>   /**
>* First deletes the data then increments the column 10 times by 1 each 
> time
>*
>* Should result in 

[jira] [Resolved] (HBASE-3432) [hbck] Add "remove table" switch

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3432.
--
Resolution: Won't Fix

closing as stale. Not seen in a long time.

> [hbck] Add "remove table" switch
> 
>
> Key: HBASE-3432
> URL: https://issues.apache.org/jira/browse/HBASE-3432
> Project: HBase
>  Issue Type: New Feature
>  Components: util
>Affects Versions: 0.89.20100924
>Reporter: Lars George
>Priority: Minor
>
> This happened before and I am not sure how the new Master improves on it 
> (this stuff is only available between the lines or buried in some people's 
> heads - one other thing I wish for is a better place to communicate what 
> each patch improves). Just so we do not miss it, there is an issue where 
> sometimes disabling large tables simply times out and the table gets stuck in 
> limbo. 
> From the CDH User list:
> {quote}
> On Fri, Jan 7, 2011 at 1:57 PM, Sean Sechrist  wrote:
> To get them out of META, you can just scan '.META.' for that table name, and 
> delete those rows. We had to do that a few months ago.
> -Sean
> That did it.  For the benefit of others, here's code.  Beware the literal 
> table names, run at your own peril.
> {quote}
> {code}
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.client.Delete;
> import org.apache.hadoop.hbase.client.Result;
> import org.apache.hadoop.hbase.client.MetaScanner;
> import org.apache.hadoop.hbase.util.Bytes;
> public class CleanFromMeta {
> public static class Cleaner implements MetaScanner.MetaScannerVisitor {
> public HTable meta = null;
> public Cleaner(Configuration conf) throws IOException {
> meta = new HTable(conf, ".META.");
> }
> public boolean processRow(Result rowResult) throws IOException {
> String r = new String(rowResult.getRow());
> if (r.startsWith("webtable,")) {
> meta.delete(new Delete(rowResult.getRow()));
> System.out.println("Deleting row " + rowResult);
> }
> return true;
> }
> }
> public static void main(String[] args) throws Exception {
> String tname = ".META.";
> Configuration conf = HBaseConfiguration.create();
> MetaScanner.metaScan(conf, new Cleaner(conf), 
>  Bytes.toBytes("webtable"));
> }
> }
> {code}
> I suggest to move this into HBaseFsck. I do not like personally to have these 
> JRuby scripts floating around that may or may not help. This should be 
> available if a user gets stuck and knows what he is doing (they can delete 
> from .META. anyways). Maybe a "\-\-disable-table  \-\-force" or 
> so? But since disable is already in the shell we could add an "\-\-force" 
> there? Or add a "\-\-delete-table " to the hbck?





[jira] [Resolved] (HBASE-3307) Add checkAndPut to the Thrift API

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-3307.
--
Resolution: Duplicate

dup of HBASE-10960

> Add checkAndPut to the Thrift API
> -
>
> Key: HBASE-3307
> URL: https://issues.apache.org/jira/browse/HBASE-3307
> Project: HBase
>  Issue Type: Improvement
>  Components: Thrift
>Affects Versions: 0.89.20100924
>Reporter: Chris Tarnas
>Priority: Minor
>  Labels: thrift
>
> It would be very useful to have the checkAndPut method available via the 
> Thrift API. This would both allow for easier atomic updates as well as cut 
> down on at least one Thrift roundtrip for quite a few common tasks. 





[jira] [Resolved] (HBASE-2535) split hostname format should be consistent with tasktracker for locality

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-2535.
--
Resolution: Duplicate

resolved by HBASE-7693

> split hostname format should be consistent with tasktracker for locality
> 
>
> Key: HBASE-2535
> URL: https://issues.apache.org/jira/browse/HBASE-2535
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Affects Versions: 0.20.4
>Reporter: John Sichi
>
> I was running a mapreduce job (via Hive) against HBase, and noticed that I 
> wasn't getting any locality (the input split location and the task tracker 
> machine in the job tracker UI were always different, and "Rack-local map 
> tasks" in the job counters was 0).
> I tracked this down to a discrepancy in the way hostnames were being compared.
> The task tracker detail had a Host like
> /f/s/1.2.3.4/h.s.f.com.
> (with trailing dot)
> But the Input Split Location had
> /f/s/1.2.3.4/h.s.f.com
> (without trailing dot)
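A hedged sketch of the comparison fix (illustrative helper, not the actual MapReduce code): normalize both network locations by dropping the trailing dot of a fully-qualified DNS name before comparing, so the tasktracker host and the input split location match.

```java
public class HostnameCompare {
    // Drops a single trailing dot from a network location so that
    // "/f/s/1.2.3.4/h.s.f.com." and "/f/s/1.2.3.4/h.s.f.com" compare equal.
    static String normalize(String location) {
        return location.endsWith(".")
            ? location.substring(0, location.length() - 1)
            : location;
    }

    public static void main(String[] args) {
        String tracker = "/f/s/1.2.3.4/h.s.f.com."; // trailing dot, as in the UI
        String split = "/f/s/1.2.3.4/h.s.f.com";    // no trailing dot
        System.out.println(normalize(tracker).equals(normalize(split))); // true
    }
}
```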





[jira] [Resolved] (HBASE-2434) Add scanner caching option to Export and write buffer option for Import

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-2434.
--
Resolution: Won't Fix

No longer relevant, superseded by the buffered mutator and [~yangzhe1991]'s 
rationalization on sizing and timing scanners.

> Add scanner caching option to Export and write buffer option for Import
> ---
>
> Key: HBASE-2434
> URL: https://issues.apache.org/jira/browse/HBASE-2434
> Project: HBase
>  Issue Type: Improvement
>  Components: util
>Affects Versions: 0.20.3
>Reporter: Ted Yu
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> An option of number of rows to fetch every time we hit a region server should 
> be added to mapreduce.Export so that createSubmittableJob() calls 
> s.setCaching() with the specified value.
> Also, an option of write buffer size should be added to mapreduce.Import so 
> that we can set write buffer. Sample calls:
> +table.setAutoFlush(false);
> +table.setWriteBufferSize(desired_buffer_size);





[jira] [Resolved] (HBASE-2376) Add special SnapshotScanner which presents view of all data at some time in the past

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-2376.
--
Resolution: Later

Equivalent functionality can be achieved by using HBASE-4536 and HBASE-4071; 
if you think this is still necessary, please re-open.

> Add special SnapshotScanner which presents view of all data at some time in 
> the past
> 
>
> Key: HBASE-2376
> URL: https://issues.apache.org/jira/browse/HBASE-2376
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, regionserver
>Affects Versions: 0.20.3
>Reporter: Jonathan Gray
>Assignee: Pritam Damania
>
> In order to support a particular kind of database "snapshot" feature which 
> doesn't require copying data, we came up with the idea for a special 
> SnapshotScanner that would present a view of your data at some point in the 
> past.  The primary use case for this would be to be able to recover 
> particular data/rows (but not all data, like a global rollback) should they 
> have somehow been messed up (application fault, application bug, user error, 
> etc.).





[jira] [Resolved] (HBASE-2213) HCD should only have those fields explicitly set by user while creating tables

2016-11-16 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-2213.
--
Resolution: Won't Fix

Stale, re-open if you consider this needs to be implemented.

> HCD should only have those fields explicitly set by user while creating tables
> --
>
> Key: HBASE-2213
> URL: https://issues.apache.org/jira/browse/HBASE-2213
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.20.3
>Reporter: ryan rawson
>
> right now we take the default HCD fields and 'snapshot' them into every HCD.  
> So things like 'BLOCKCACHE' and 'FILESIZE' are in every table, even if they 
> don't differ from the defaults.  If the default changes in a 
> meaningful/important way, the user is left with the unenviable task of (a) 
> determining this happened and (b) actually going through and 
> disabling/altering the tables to fix it.





[jira] [Created] (HBASE-17058) Lower epsilon used for jitter verification from HBASE-15324

2016-11-09 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17058:
-

 Summary: Lower epsilon used for jitter verification from 
HBASE-15324
 Key: HBASE-17058
 URL: https://issues.apache.org/jira/browse/HBASE-17058
 Project: HBase
  Issue Type: Bug
  Components: Compaction
Affects Versions: 1.2.4, 1.1.7, 2.0.0, 1.3.0, 1.4.0
Reporter: Esteban Gutierrez


The current epsilon used is 1E-6, which is too big: it might overflow the 
desiredMaxFileSize. A trivial fix is to lower the epsilon to 2^-52 or even 
2^-53. An option to consider is to shift the jitter so it always decrements 
hbase.hregion.max.filesize (MAX_FILESIZE) instead of increasing the size of 
the region and having to deal with the round-off.
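A small demonstration of the scale difference (plain Java, illustrative numbers): 1E-6 of a 10 GB region size is about 10 KB, while 2^-52 — the machine epsilon for doubles — is far below one byte at that scale.

```java
public class JitterEpsilon {
    public static void main(String[] args) {
        long maxFileSize = 10L * 1024 * 1024 * 1024; // a 10 GB region size
        double bigEps = 1e-6;                        // epsilon from HBASE-15324
        double machineEps = Math.pow(2, -52);        // proposed smaller epsilon

        // Size slack each epsilon can hide when applied to the file size:
        System.out.println((long) (maxFileSize * bigEps));     // ~10 KB of slack
        System.out.println((long) (maxFileSize * machineEps)); // 0 bytes of slack

        // 2^-52 is exactly the spacing of doubles around 1.0:
        System.out.println(Math.ulp(1.0) == machineEps);       // true
    }
}
```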






[jira] [Created] (HBASE-17007) Move ZooKeeper logging to its own log file

2016-11-02 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-17007:
-

 Summary: Move ZooKeeper logging to its own log file
 Key: HBASE-17007
 URL: https://issues.apache.org/jira/browse/HBASE-17007
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Trivial


ZooKeeper logging can be too verbose. Lets move ZooKeeper logging to a 
different log file.
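One possible log4j sketch (the appender name, file name, and sizes here are illustrative, not from an actual patch):

```properties
# Route ZooKeeper client logging to its own rolling file (illustrative names/paths)
log4j.logger.org.apache.zookeeper=INFO, ZKLOG
log4j.additivity.org.apache.zookeeper=false
log4j.appender.ZKLOG=org.apache.log4j.RollingFileAppender
log4j.appender.ZKLOG.File=${hbase.log.dir}/zookeeper.log
log4j.appender.ZKLOG.MaxFileSize=10MB
log4j.appender.ZKLOG.MaxBackupIndex=5
log4j.appender.ZKLOG.layout=org.apache.log4j.PatternLayout
log4j.appender.ZKLOG.layout.ConversionPattern=%d{ISO8601} %-5p [%t] %c{2}: %m%n
```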





[jira] [Created] (HBASE-16774) [shell] Add coverage to TestShell when ZooKeeper is not reachable

2016-10-05 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-16774:
-

 Summary: [shell] Add coverage to TestShell when ZooKeeper is not 
reachable
 Key: HBASE-16774
 URL: https://issues.apache.org/jira/browse/HBASE-16774
 Project: HBase
  Issue Type: Improvement
  Components: shell, test
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


While testing a couple of things in master I noticed that after some of the 
changes done in HBASE-16117 the hbase shell would die when there is no 
ZooKeeper server up or if we get another ZK exception. This is to add coverage 
to test the shell when ZK is not up or if we get another exception.






[jira] [Created] (HBASE-16450) Shell tool to dump replication queues

2016-08-18 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-16450:
-

 Summary: Shell tool to dump replication queues 
 Key: HBASE-16450
 URL: https://issues.apache.org/jira/browse/HBASE-16450
 Project: HBase
  Issue Type: Improvement
  Components: Replication
Affects Versions: 1.2.2, 1.1.5, 2.0.0, 1.3.0
Reporter: Esteban Gutierrez


Currently there is no way to dump a list of the configured peers and the 
replication queues when replication is enabled. Unfortunately, the HBase master 
only offers an option to dump the whole content of the znodes but no details 
on the queues being processed on each RS.





[jira] [Created] (HBASE-16379) [replication] Minor improvement to replication/copy_tables_desc.rb

2016-08-08 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-16379:
-

 Summary: [replication] Minor improvement to 
replication/copy_tables_desc.rb
 Key: HBASE-16379
 URL: https://issues.apache.org/jira/browse/HBASE-16379
 Project: HBase
  Issue Type: Improvement
  Components: Replication, shell
Affects Versions: 1.2.2, 1.3.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Trivial


copy_tables_desc.rb is helpful for quickly setting up a table remotely based 
on an existing schema. However, it copies all tables by default. Now you can 
pass a list of tables as an optional third argument, and it will also display 
which table descriptors were copied.





[jira] [Created] (HBASE-15612) Minor doc improvements to CellCounter and RowCounter documentation

2016-04-07 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-15612:
-

 Summary: Minor doc improvements to CellCounter and RowCounter 
documentation
 Key: HBASE-15612
 URL: https://issues.apache.org/jira/browse/HBASE-15612
 Project: HBase
  Issue Type: Improvement
  Components: documentation, mapreduce
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Trivial


Both the Javadoc and the HBase Book need to reflect that it is possible to 
specify an optional time range in the command line arguments.






[jira] [Created] (HBASE-15511) ClusterStatus should be able

2016-03-21 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-15511:
-

 Summary: ClusterStatus should be able 
 Key: HBASE-15511
 URL: https://issues.apache.org/jira/browse/HBASE-15511
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez








[jira] [Created] (HBASE-15489) Improve handling of hbase.rpc.protection configuration mismatch when using replication

2016-03-18 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-15489:
-

 Summary: Improve handling of hbase.rpc.protection configuration 
mismatch when using replication
 Key: HBASE-15489
 URL: https://issues.apache.org/jira/browse/HBASE-15489
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez


This probably should be a sub-task of a major revamp of how we report 
replication metrics in the HBase UI.

After switching {{hbase.rpc.protection}} to {{privacy}} in one cluster, I 
didn't immediately notice there was a mismatch across my other clusters, which 
caused replication to stop. Ideally, when this happens we should log a better 
message and show the mismatch in the RegionServer and Master UIs.
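For reference, the setting that has to match on every cluster involved in replication looks like this in hbase-site.xml:

```xml
<!-- Must agree across replicating clusters; a silent mismatch (e.g. "privacy"
     here but "authentication" on a peer) stalls replication. -->
<property>
  <name>hbase.rpc.protection</name>
  <value>privacy</value> <!-- one of: authentication, integrity, privacy -->
</property>
```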








[jira] [Created] (HBASE-14952) hbase-assembly has hbase-external-blockcache missing

2015-12-08 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14952:
-

 Summary: hbase-assembly has hbase-external-blockcache missing
 Key: HBASE-14952
 URL: https://issues.apache.org/jira/browse/HBASE-14952
 Project: HBase
  Issue Type: Bug
  Components: build, dependencies
Affects Versions: 1.2.0
Reporter: Esteban Gutierrez
Assignee: Sean Busbey
Priority: Blocker


After generating a tarball we noticed that hbase-external-blockcache was 
missing.





[jira] [Created] (HBASE-14500) Remove load of deprecated MOB ruby scripts after HBASE-14227

2015-09-28 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14500:
-

 Summary: Remove load of deprecated MOB ruby scripts after 
HBASE-14227
 Key: HBASE-14500
 URL: https://issues.apache.org/jira/browse/HBASE-14500
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez








[jira] [Created] (HBASE-14405) region_mover.rb should verify the location of the region that is being scanned by isSuccessfulScan()

2015-09-10 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14405:
-

 Summary: region_mover.rb should verify the location of the region 
that is being scanned by isSuccessfulScan()
 Key: HBASE-14405
 URL: https://issues.apache.org/jira/browse/HBASE-14405
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0, 1.2.1, 1.0.3, 1.1.3, 0.98.16
Reporter: Esteban Gutierrez


When we call isSuccessfulScan() to verify whether a region can be scanned, we 
never verify that the scanner is returning a result from the expected 
RegionServer. For example, when unloading regions from a RS, the scan should 
initially be served by the source RS, and after the region is moved it should 
be served by a different RS; when loading regions we should verify the 
location in a similar way.





[jira] [Created] (HBASE-14358) Parent region is not removed from regionstates after a successful split

2015-09-03 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14358:
-

 Summary: Parent region is not removed from regionstates after a 
successful split
 Key: HBASE-14358
 URL: https://issues.apache.org/jira/browse/HBASE-14358
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.2.1, 1.0.3, 1.1.3
Reporter: Esteban Gutierrez
Priority: Critical


Ran into this while trying to find out why region_mover.rb was not catching an 
exception after a region was split. Digging further, I found that the problem 
is in the Master's handling of region state, since we don't remove the old 
state after the split is successful:

{code}
2015-09-03 02:56:49,255 INFO org.apache.hadoop.hbase.master.AssignmentManager: 
Ignored moving region not assigned: {ENCODED => 
9a4930ed41dc7013d9956240e6f5c03e, NAME => 
'u,user3605,1432797255754.9a4930ed41dc7013d9956240e6f5c03e.', STARTKEY => 
'user3605', ENDKEY => 'user3723'}, {9a4930ed41dc7013d9956240e6f5c03e 
state=SPLIT, ts=1441273152561, 
server=a2209.halxg.cloudera.com,22101,1441243232790}
{code}

I don't think the problem is happening in the master branch, but I've been 
able to confirm it is happening on branch-1 and branch-1.2.
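A toy model of the missing cleanup (illustrative names, not the actual AssignmentManager code): once a split completes, the parent's entry should be removed from the state map rather than left behind in SPLIT state where later move requests trip over it.

```java
import java.util.HashMap;
import java.util.Map;

public class RegionStatesSketch {
    enum State { OPEN, SPLITTING, SPLIT }

    // Toy model of the master's region-state map: a successful split drops
    // the parent's entry and registers the two daughters as OPEN.
    static Map<String, State> afterSplit(Map<String, State> states,
                                         String parent, String daughterA, String daughterB) {
        states.remove(parent); // the cleanup step this issue is about
        states.put(daughterA, State.OPEN);
        states.put(daughterB, State.OPEN);
        return states;
    }

    public static void main(String[] args) {
        Map<String, State> states = new HashMap<>();
        states.put("9a4930ed41dc7013d9956240e6f5c03e", State.SPLITTING);
        afterSplit(states, "9a4930ed41dc7013d9956240e6f5c03e", "daughterA", "daughterB");
        System.out.println(states.containsKey("9a4930ed41dc7013d9956240e6f5c03e")); // false
        System.out.println(states.size()); // 2: only the daughters remain
    }
}
```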






[jira] [Created] (HBASE-14354) Minor improvements for usage of the mlock agent

2015-09-01 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14354:
-

 Summary: Minor improvements for usage of the mlock agent
 Key: HBASE-14354
 URL: https://issues.apache.org/jira/browse/HBASE-14354
 Project: HBase
  Issue Type: Bug
  Components: hbase, regionserver
Reporter: Esteban Gutierrez
Priority: Trivial


1. MLOCK_AGENT points to the wrong path in hbase-config.sh

When the mlock agent is built, the binary is installed under 
$HBASE_HOME/lib/native and not under $HBASE_HOME/native.

2. By default we pass $HBASE_REGIONSERVER_UID to the agent options, which 
causes the mlock agent to attempt a setuid in order to mlock the memory of the 
RS process. We should only pass that user if it is specified in the 
environment, not by default. (The agent currently handles this gracefully.)






[jira] [Created] (HBASE-14347) Add a switch to DynamicClassLoader to be disabled and make that the default

2015-08-31 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14347:
-

 Summary: Add a switch to DynamicClassLoader to be disabled and 
make that the default
 Key: HBASE-14347
 URL: https://issues.apache.org/jira/browse/HBASE-14347
 Project: HBase
  Issue Type: Bug
  Components: Client, defaults, regionserver
Affects Versions: 2.0.0, 1.2.0, 1.1.2, 0.98.15, 1.0.3
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


Since HBASE-1936 we have had the option to load jars dynamically from HDFS or 
the local filesystem by default. However, hbase.dynamic.jars.dir points to a 
directory that could be world-writable, which potentially opens a security 
problem on both the client side and the RS. We should consider having a switch 
to enable or disable this option, and it should be off by default.
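A hedged sketch of what such a switch could look like in hbase-site.xml (the property name here is illustrative, not necessarily what a patch would use):

```xml
<property>
  <name>hbase.use.dynamic.jars</name> <!-- illustrative property name -->
  <value>false</value> <!-- off by default: do not load jars from hbase.dynamic.jars.dir -->
</property>
```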





[jira] [Created] (HBASE-14076) ResultSerialization and MutationSerialization can throw InvalidProtocolBufferException when serializing a cell larger than 64MB

2015-07-14 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14076:
-

 Summary: ResultSerialization and MutationSerialization can throw 
InvalidProtocolBufferException when serializing a cell larger than 64MB
 Key: HBASE-14076
 URL: https://issues.apache.org/jira/browse/HBASE-14076
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


This was reported in CRUNCH-534, but it is a problem with how we handle 
deserialization of large Cells (> 64MB) in ResultSerialization and 
MutationSerialization.

The fix is just re-using what was done in HBASE-13230.





[jira] [Created] (HBASE-14060) Add FuzzyRowFilter to ParseFilter

2015-07-11 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14060:
-

 Summary: Add FuzzyRowFilter to ParseFilter
 Key: HBASE-14060
 URL: https://issues.apache.org/jira/browse/HBASE-14060
 Project: HBase
  Issue Type: Bug
  Components: Filters, Scanners, Usability
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


FuzzyRowFilter is not currently exposed in ParseFilter. I think it would be 
nice to have it there.





[jira] [Created] (HBASE-14059) We should add a RS to the dead servers list if admin calls fail more than a threshold

2015-07-10 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-14059:
-

 Summary: We should add a RS to the dead servers list if admin 
calls fail more than a threshold
 Key: HBASE-14059
 URL: https://issues.apache.org/jira/browse/HBASE-14059
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, rpc
Affects Versions: 0.98.13
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Critical


I ran into this problem twice this week: calls from the HBase master to a RS 
can time out when the RS call queue size has been maxed out. However, since 
the RS is not dead (its ephemeral znode is still present), the master keeps 
attempting to perform admin tasks like opening or closing a region, but those 
operations eventually fail once we run out of retries, or the assignment 
manager attempts to re-assign to other RSs. As side effects of this I've seen 
master operations fully blocked, and RITs, since we can neither close the 
region nor open it in a new location while the RS is not considered dead.

A potential solution for this is to add the RS to the list of dead RSs after 
certain number of calls from the master to the RS fail.

I've only noticed the problem in 0.98.x, but it should be present in all 
versions.
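A minimal sketch of the proposed threshold logic (a toy model, not the actual ServerManager code): count consecutive failed admin calls per RS and treat the server as dead once a threshold is crossed, even if its ephemeral znode is still present.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

public class FailureThresholdTracker {
    // Consecutive failed admin calls per RS, keyed by server name.
    private final ConcurrentMap<String, AtomicInteger> failures = new ConcurrentHashMap<>();
    private final int threshold;

    FailureThresholdTracker(int threshold) { this.threshold = threshold; }

    /** Returns true when this failure pushes the server over the threshold. */
    boolean recordFailure(String server) {
        return failures.computeIfAbsent(server, s -> new AtomicInteger())
                       .incrementAndGet() >= threshold;
    }

    /** A successful call resets the count for that server. */
    void recordSuccess(String server) { failures.remove(server); }

    public static void main(String[] args) {
        FailureThresholdTracker tracker = new FailureThresholdTracker(3);
        System.out.println(tracker.recordFailure("rs1")); // false
        System.out.println(tracker.recordFailure("rs1")); // false
        System.out.println(tracker.recordFailure("rs1")); // true -> expire the RS
    }
}
```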






[jira] [Created] (HBASE-13729) old hbase.regionserver.global.memstore.upperLimit is ignored if present

2015-05-20 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13729:
-

 Summary: old hbase.regionserver.global.memstore.upperLimit is 
ignored if present
 Key: HBASE-13729
 URL: https://issues.apache.org/jira/browse/HBASE-13729
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.1.0, 1.0.1, 2.0.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Critical


If hbase.regionserver.global.memstore.upperLimit is present, we should use it 
instead of hbase.regionserver.global.memstore.size. The current implementation 
of HeapMemorySizeUtil.getGlobalMemStorePercent() assumes that if 
hbase.regionserver.global.memstore.size is not defined then it should use the 
old configuration; however, it should be the other way around.

This has a large impact, especially when doing a rolling upgrade of a cluster 
where the memstore upper limit has been changed from the default.
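The intended precedence can be sketched as follows (a plain Map stands in for Hadoop's Configuration; illustrative, not the actual HeapMemorySizeUtil code):

```java
import java.util.HashMap;
import java.util.Map;

public class MemstoreLimitPrecedence {
    static final String OLD_KEY = "hbase.regionserver.global.memstore.upperLimit";
    static final String NEW_KEY = "hbase.regionserver.global.memstore.size";
    static final float DEFAULT = 0.4f;

    // Intended precedence per this issue: an explicitly set deprecated key
    // wins over the new key's default.
    static float getGlobalMemStorePercent(Map<String, String> conf) {
        if (conf.containsKey(OLD_KEY)) {
            return Float.parseFloat(conf.get(OLD_KEY));
        }
        return conf.containsKey(NEW_KEY) ? Float.parseFloat(conf.get(NEW_KEY)) : DEFAULT;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(OLD_KEY, "0.25"); // operator tuned only the pre-1.0 key
        System.out.println(getGlobalMemStorePercent(conf)); // 0.25, not the 0.4 default
    }
}
```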





[jira] [Created] (HBASE-13714) Add tracking of the total response queue size

2015-05-19 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13714:
-

 Summary: Add tracking of the total response queue size
 Key: HBASE-13714
 URL: https://issues.apache.org/jira/browse/HBASE-13714
 Project: HBase
  Issue Type: Improvement
  Components: master, metrics, regionserver, rpc
Affects Versions: 2.0.0, 1.0.2, 1.2.0
Reporter: Esteban Gutierrez


I noticed this behavior while working on HBASE-13694:
Once we are done processing a request, we decrement the call queue size on the 
RPC server. However, responses can be very large, and sometimes sending them 
can take a long time. Since we don't keep track of the response queue via 
metrics, it is hard to spot when the responses are using too many resources on 
the RS.

Ideally we should track via metrics how much data the RS has in flight in the 
response queue, and not just log when the size of a response exceeds a 
threshold (e.g. hbase.ipc.warn.response.size or hbase.ipc.warn.response.time).
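A minimal sketch of the proposed metric (illustrative, not the actual RpcServer code): an atomic counter that grows when a response is queued and shrinks as it is fully written out.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ResponseQueueMetrics {
    // Bytes of responses queued but not yet fully written back to clients.
    private final AtomicLong responseQueueSize = new AtomicLong();

    /** Called when a response is handed to the responder; returns the new total. */
    long responseQueued(long bytes)  { return responseQueueSize.addAndGet(bytes); }

    /** Called once the response has been fully written; returns the new total. */
    long responseWritten(long bytes) { return responseQueueSize.addAndGet(-bytes); }

    public static void main(String[] args) {
        ResponseQueueMetrics m = new ResponseQueueMetrics();
        m.responseQueued(64L * 1024 * 1024); // a 64 MB response starts draining slowly
        m.responseQueued(512);               // a small response queued behind it
        m.responseWritten(512);              // the small one flushes quickly
        System.out.println(m.responseQueued(0)); // 67108864 bytes still in flight
    }
}
```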






[jira] [Created] (HBASE-13694) CallQueueSize is incorrectly decremented after the response is sent

2015-05-15 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13694:
-

 Summary: CallQueueSize is incorrectly decremented after the 
response is sent
 Key: HBASE-13694
 URL: https://issues.apache.org/jira/browse/HBASE-13694
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver, rpc
Affects Versions: 2.0.0, 1.1.0, 1.0.2, 1.2.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


We should decrement the CallQueueSize as soon as we no longer need the call 
around, e.g. after {{RpcServer.CurCall.set(null)}}; otherwise we are just 
pushing back other client requests while we send the response back to the 
client that made the original call.
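The ordering difference can be illustrated with a toy model (not the actual RpcServer code): what other clients observe in the call queue while a response is being written depends on whether the decrement happens before or after the write.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CallQueueAccounting {
    final AtomicLong callQueueSize = new AtomicLong();

    // Returns the queue size observed while the response is being written.
    // The buggy ordering decrements after the write; the fix decrements before.
    long process(long callSize, boolean decrementBeforeSend) {
        callQueueSize.addAndGet(callSize);     // call admitted to the queue
        // ... handler runs, call result computed ...
        if (decrementBeforeSend) callQueueSize.addAndGet(-callSize);
        long duringSend = callQueueSize.get(); // what other clients see mid-write
        if (!decrementBeforeSend) callQueueSize.addAndGet(-callSize);
        return duringSend;
    }

    public static void main(String[] args) {
        CallQueueAccounting a = new CallQueueAccounting();
        System.out.println(a.process(1024, false)); // 1024: slot held during the write
        System.out.println(a.process(1024, true));  // 0: slot freed before the write
    }
}
```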






[jira] [Created] (HBASE-13495) Create metrics for purged calls, abandoned calls and other RPC failures

2015-04-17 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13495:
-

 Summary: Create metrics for purged calls, abandoned calls and 
other RPC failures
 Key: HBASE-13495
 URL: https://issues.apache.org/jira/browse/HBASE-13495
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


Similar to HBASE-13477, this is aimed at adding metrics to keep track of how 
many calls are abandoned, purged, or otherwise dropped before the call is 
executed. This would be helpful to track the rate of channel-closed exceptions 
when 100s or 1000s of clients disconnect or the calls are not correctly formed.





[jira] [Created] (HBASE-13484) [docs] docs need to be specific about hbase.bucketcache.size range

2015-04-15 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13484:
-

 Summary: [docs] docs need to be specific about 
hbase.bucketcache.size range
 Key: HBASE-13484
 URL: https://issues.apache.org/jira/browse/HBASE-13484
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


This is not 100% clear for users:

If hbase.bucketcache.size is between 0.0 and 1.0, it is a percentage of the 
max heap. But if the value is 1 or above, the value is expressed in megabytes:

From CacheConfig.getBucketCache():

{noformat}
  long bucketCacheSize = (long) (bucketCachePercentage < 1 ? mu.getMax() * 
bucketCachePercentage :
  bucketCachePercentage * 1024 * 1024);
{noformat}
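The interpretation can be restated as a small function mirroring the quoted logic (illustrative; note the second branch is megabytes, since the configured value is multiplied by 1024 * 1024 to get bytes):

```java
public class BucketCacheSizing {
    // Mirrors the quoted CacheConfig logic: a value below 1 is a fraction of
    // the max heap; a value of 1 or more is a size in megabytes.
    static long bucketCacheBytes(float configured, long maxHeapBytes) {
        return (long) (configured < 1
            ? maxHeapBytes * configured
            : configured * 1024 * 1024);
    }

    public static void main(String[] args) {
        long maxHeap = 8L * 1024 * 1024 * 1024;              // 8 GB max heap
        System.out.println(bucketCacheBytes(0.5f, maxHeap));  // 4294967296: half the heap
        System.out.println(bucketCacheBytes(4096f, maxHeap)); // 4294967296: 4096 MB
    }
}
```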






[jira] [Created] (HBASE-13483) [docs] onheap is not a valid bucket cache IO engine.

2015-04-15 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13483:
-

 Summary: [docs] onheap is not a valid bucket cache IO engine.
 Key: HBASE-13483
 URL: https://issues.apache.org/jira/browse/HBASE-13483
 Project: HBase
  Issue Type: Bug
  Components: documentation
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez



From the HBase book: 
http://hbase.apache.org/book.html#hbase_default_configurations
:
{code}
hbase.bucketcache.ioengine
Description
Where to store the contents of the bucketcache. One of: *onheap*, offheap, or 
file. If a file, set it to file:PATH_TO_FILE. See 
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/io/hfile/CacheConfig.html
 for more information.
{code}

Instead of onheap it should be heap.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-13461) RegionSever Hlog flush BLOCKED on hbase-0.96.2-hadoop2

2015-04-14 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-13461.
---
Resolution: Invalid

 RegionSever Hlog flush BLOCKED  on  hbase-0.96.2-hadoop2
 

 Key: HBASE-13461
 URL: https://issues.apache.org/jira/browse/HBASE-13461
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.96.2
 Environment: hbase-0.96.2-hadoop2   hadoop2.2.0
Reporter: zhangjg

 I try to dump  thread stack below:
 "RpcServer.handler=63,port=60020" daemon prio=10 tid=0x7fdcddc5d000 
 nid=0x5f9 waiting for monitor entry [0x7fd289194000]
java.lang.Thread.State: BLOCKED (on object monitor)
 at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:98)
 - waiting to lock 0x7fd36c023728 (a 
 org.apache.hadoop.hdfs.DFSOutputStream)
 at 
 org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:59)
 at java.io.DataOutputStream.write(DataOutputStream.java:90)
 - locked 0x7fd510cfdc28 (a 
 org.apache.hadoop.hdfs.client.HdfsDataOutputStream)
 at 
 com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833)
 at 
 com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843)
 at 
 com.google.protobuf.AbstractMessageLite.writeDelimitedTo(AbstractMessageLite.java:91)
 at 
 org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.append(ProtobufLogWriter.java:87)
 at 
 org.apache.hadoop.hbase.regionserver.wal.FSHLog$LogSyncer.hlogFlush(FSHLog.java:1026)
 at 
 org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1075)
 - locked 0x7fd2d9bbfad0 (a java.lang.Object)
 at 
 org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1240)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:5593)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2315)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2028)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4094)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3380)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3284)
 at 
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26935)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2185)
 at 
 org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1889)
 "RpcServer.handler=12,port=60020" daemon prio=10 tid=0x7fdcddf2c800 
 nid=0x5c6 in Object.wait() [0x7fd28c4c7000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream.waitForAckedSeqno(DFSOutputStream.java:1803)
 - locked 0x7fd45857c540 (a java.util.LinkedList)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1697)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1590)
 at 
 org.apache.hadoop.hdfs.DFSOutputStream.sync(DFSOutputStream.java:1575)
 at 
 org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:121)
 at 
 org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.sync(ProtobufLogWriter.java:135)
 at 
 org.apache.hadoop.hbase.regionserver.wal.FSHLog.syncer(FSHLog.java:1098)
 at 
 org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1240)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.syncOrDefer(HRegion.java:5593)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutation(HRegion.java:2315)
 at 
 org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2028)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4094)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3380)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3284)
 at 
 org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26935)
 at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2185)
 at 
 org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1889)
 "RpcServer.handler=11,port=60020" daemon prio=10 tid=0x7fdcdd9e1000 
 nid=0x5c5 in Object.wait() [0x7fd28c5c8000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at 

[jira] [Created] (HBASE-13403) Make waitOnSafeMode configurable in MasterFileSystem

2015-04-03 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13403:
-

 Summary: Make waitOnSafeMode configurable in MasterFileSystem
 Key: HBASE-13403
 URL: https://issues.apache.org/jira/browse/HBASE-13403
 Project: HBase
  Issue Type: Bug
  Components: master
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor


We currently wait for whatever value is configured in 
hbase.server.thread.wakefrequency, or the default 10 seconds. We should have a 
configuration to control how long we wait until HDFS is no longer in safe 
mode, since tuning the existing hbase.server.thread.wakefrequency property for 
that can have adverse side effects. My proposal is to add a new property 
called hbase.master.waitonsafemode and start with the current default.
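A minimal sketch of the proposed wait loop, with a dedicated poll interval (the helper names and the stub interface are hypothetical, not the MasterFileSystem API):

```java
public class SafeModeWait {
    // Stub for "is HDFS still in safe mode?" so the sketch is self-contained.
    interface SafeModeCheck { boolean inSafeMode(); }

    // Poll at a dedicated, configurable interval (the proposed
    // hbase.master.waitonsafemode) instead of reusing
    // hbase.server.thread.wakefrequency.
    static int waitOnSafeMode(SafeModeCheck fs, long intervalMs, int maxPolls) {
        int polls = 0;
        while (fs.inSafeMode() && polls < maxPolls) {
            polls++;
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return polls;
    }
}
```

Decoupling the interval this way lets operators shorten master startup checks without touching the cluster-wide wake frequency.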



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-13407) Add a configurable jitter to MemStoreFlusher#FlushHandler in order to smooth write latency

2015-04-03 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13407:
-

 Summary: Add a configurable jitter to MemStoreFlusher#FlushHandler 
in order to smooth write latency
 Key: HBASE-13407
 URL: https://issues.apache.org/jira/browse/HBASE-13407
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez


There is a very interesting behavior that I can reproduce consistently with 
many workloads from HBase 0.98 to HBase 1.0 since hbase.hstore.flusher.count 
was set to 2 by default: when writes are evenly distributed across regions, 
memstores grow and flush at about the same rate, causing spikes in IO and CPU. The 
side effect of those spikes is a loss in throughput, which in some cases can be 
above 10%, impacting write metrics. When the flushes get out of sync, the spikes 
lower and throughput is very stable. Reverting hbase.hstore.flusher.count 
to 1 doesn't help much with write-heavy workloads, since we end up with a large 
flush queue that can eventually block writers.

Adding a small configurable jitter hbase.server.thread.wakefrequency.jitter.pct 
(a percentage of the hbase.server.thread.wakefrequency frequency) can help to 
stagger the writes from FlushHandler to HDFS and smooth the write latencies 
when the memstores are flushed in multiple threads. 
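A sketch of how such a jitter could be computed (the semantics of the proposed property are assumed from the description above; this is not the actual FlushHandler code):

```java
import java.util.concurrent.ThreadLocalRandom;

public class FlushJitter {
    // Perturb the wake frequency by up to +/- (jitterPct * wakeFrequencyMs)
    // so that multiple FlushHandler threads drift out of phase over time.
    static long jitteredWakeMs(long wakeFrequencyMs, double jitterPct) {
        double jitter = (ThreadLocalRandom.current().nextDouble() * 2 - 1) * jitterPct;
        return Math.max(1L, Math.round(wakeFrequencyMs * (1 + jitter)));
    }

    public static void main(String[] args) {
        // With hbase.server.thread.wakefrequency = 10s and a 10% jitter,
        // each handler wakes somewhere in roughly [9s, 11s].
        for (int i = 0; i < 3; i++) {
            System.out.println(jitteredWakeMs(10_000L, 0.10));
        }
    }
}
```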





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-13392) Hbase master dying out in 1.0 distributed mode with hadoop 2.6 HA

2015-04-02 Thread Esteban Gutierrez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Esteban Gutierrez resolved HBASE-13392.
---
Resolution: Invalid

 Hbase master dying out in 1.0 distributed mode with hadoop 2.6 HA
 --

 Key: HBASE-13392
 URL: https://issues.apache.org/jira/browse/HBASE-13392
 Project: HBase
  Issue Type: Brainstorming
  Components: hadoop2, hbase
Affects Versions: 1.0.0
 Environment: rhel 6 64bit
Reporter: sridhararao mutluri
Priority: Minor

 The HBase master is dying out quickly in cluster mode. The HMaster log shows 
 the error below:
 2015-04-02 03:43:43,588 FATAL [vxa1:16020.activeMasterManager] 
 master.HMaster: Failed to become active master
 java.net.ConnectException: Call From vxa1.cloud.com/10.1.178.86 to 
 vxa1.cloud.com:9000 failed on connection exception: 
 java.net.ConnectException: Connection refused; For more details see:  
 http://wiki.apache.org/hadoop/ConnectionRefused
 Please suggest any solution.
 The hadoop 2.6 HA core-site.xml fs.defaultFS shows only the cluster name, 
 whereas hbase-site.xml shows hostname:9000.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-13266) test-patch.sh can return false positives for zombie tests from tests running on the same host

2015-03-17 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13266:
-

 Summary: test-patch.sh can return false positives for zombie tests 
from tests running on the same host
 Key: HBASE-13266
 URL: https://issues.apache.org/jira/browse/HBASE-13266
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


Just saw this here 
https://builds.apache.org/job/PreCommit-HBASE-Build/13271//consoleFull

{code}
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 01:27 h
[INFO] Finished at: 2015-03-16T23:58:30+00:00
[INFO] Final Memory: 93M/844M
[INFO] 
Suspicious java process found - waiting 30s to see if there are just slow to 
stop
There are 1 zombie tests, they should have been killed by surefire but survived
 BEGIN zombies jstack extract
2015-03-16 23:59:03
Full thread dump Java HotSpot(TM) Server VM (23.25-b01 mixed mode):

"Attach Listener" daemon prio=10 tid=0xaa400800 nid=0x17cc waiting on condition 
[0x]
   java.lang.Thread.State: RUNNABLE

"IPC Client (47) connection to 0.0.0.0/0.0.0.0:4324 from jenkins" daemon 
prio=10 tid=0xa8d03400 nid=0x1759 in Object.wait() [0xa9c7d000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xde1987c8 (a org.apache.hama.ipc.Client$Connection)
at org.apache.hama.ipc.Client$Connection.waitForWork(Client.java:533)
- locked 0xde1987c8 (a org.apache.hama.ipc.Client$Connection)
at org.apache.hama.ipc.Client$Connection.run(Client.java:577)
...
   java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at 
org.apache.hama.bsp.TestBSPTaskFaults.tearDown(TestBSPTaskFaults.java:618)
at junit.framework.TestCase.runBare(TestCase.java:140)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
{code}

The jstack is being collected from a Hama test running on the same host:




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-13224) Fix minor formatting issue in AuthResult#toContextString

2015-03-12 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13224:
-

 Summary: Fix minor formatting issue in AuthResult#toContextString
 Key: HBASE-13224
 URL: https://issues.apache.org/jira/browse/HBASE-13224
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors, security
Affects Versions: 1.0.0, 2.0.0
Reporter: Esteban Gutierrez
Priority: Trivial


Now that we handle namespace permissions, AuthResult#toContextString is not 
correctly formatted:
{code}
Access denied for user esteban; reason: Insufficient permissions; remote 
address: /10.20.30.1; request: createTable; context: (user=esteban@XXX, 
scope=defaultaction=CREATE)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-13208) Patch build should match the patch file name and not the whole relative URL in findBranchNameFromPatchName

2015-03-11 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13208:
-

 Summary: Patch build should match the patch file name and not the 
whole relative URL in findBranchNameFromPatchName
 Key: HBASE-13208
 URL: https://issues.apache.org/jira/browse/HBASE-13208
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez
Priority: Trivial


In HBASE-13193 we saw that the patch got applied to the wrong branch. The 
problem is that findBranchNameFromPatchName matches a regex containing wildcard 
symbols against the whole URL: in this case the regex is 0.94 and the 
relativePatchURL is /jira/secure/attachment/12703942/HBASE-13193-v4.patch, where 
0394 is a match.

Thanks to [~jonathan.lawlor] for reporting this.
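The false positive is easy to reproduce with plain java.util.regex (the values are taken from the description above):

```java
import java.util.regex.Pattern;

public class BranchMatch {
    public static void main(String[] args) {
        Pattern branch = Pattern.compile("0.94");  // '.' matches any character
        String url = "/jira/secure/attachment/12703942/HBASE-13193-v4.patch";

        // Matching against the whole relative URL: "0394" inside the
        // attachment id 12703942 satisfies the regex "0.94".
        System.out.println(branch.matcher(url).find());       // true

        // Matching only against the patch file name avoids the false positive.
        String fileName = url.substring(url.lastIndexOf('/') + 1);
        System.out.println(branch.matcher(fileName).find());  // false
    }
}
```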



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-13105) [hbck] Add option to reconstruct hbase:namespace if corrupt

2015-02-25 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-13105:
-

 Summary: [hbck] Add option to reconstruct hbase:namespace if 
corrupt
 Key: HBASE-13105
 URL: https://issues.apache.org/jira/browse/HBASE-13105
 Project: HBase
  Issue Type: Bug
Reporter: Esteban Gutierrez


If the HFile containing the namespaces gets corrupted, we don't have a way to 
gracefully fix it. hbck should handle this in a similar way to 
OfflineMetaRepair.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12983) HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled

2015-02-06 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12983:
-

 Summary: HBase book mentions hadoop.ssl.enabled when it should be 
hbase.ssl.enabled
 Key: HBASE-12983
 URL: https://issues.apache.org/jira/browse/HBASE-12983
 Project: HBase
  Issue Type: Bug
  Components: documentation
Reporter: Esteban Gutierrez


In the HBase book we say the following:

{quote}
A default HBase install uses insecure HTTP connections for web UIs for the 
master and region servers. To enable secure HTTP (HTTPS) connections instead, 
set *hadoop.ssl.enabled* to true in hbase-site.xml. This does not change the 
port used by the Web UI. To change the port for the web UI for a given HBase 
component, configure that port’s setting in hbase-site.xml. These settings are:
{quote}

The property should be *hbase.ssl.enabled* instead. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12984) SSL cannot be used by the InfoPort in branch-1

2015-02-06 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12984:
-

 Summary: SSL cannot be used by the InfoPort in branch-1
 Key: HBASE-12984
 URL: https://issues.apache.org/jira/browse/HBASE-12984
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0, 2.0.0, 1.1.0
Reporter: Esteban Gutierrez
Priority: Blocker


Setting {{hbase.ssl.enabled}} to {{true}} doesn't enable SSL on the InfoServer. 
Found that the problem is down in the InfoServer and HttpConfig, in how we set 
up the protocol in the HttpServer:

{code}
for (URI ep : endpoints) {
  Connector listener = null;
  String scheme = ep.getScheme();
  if ("http".equals(scheme)) {
    listener = HttpServer.createDefaultChannelConnector();
  } else if ("https".equals(scheme)) {
    SslSocketConnector c = new SslSocketConnectorSecure();
    c.setNeedClientAuth(needsClientAuth);
    c.setKeyPassword(keyPassword);
{code}

It depends what end points have been added by the InfoServer:

{code}
builder
  .setName(name)
  .addEndpoint(URI.create("http://" + bindAddress + ":" + port))
  .setAppDir(HBASE_APP_DIR).setFindPort(findPort).setConf(c);
{code}

Basically we always use http and we never check via HttpConfig whether 
{{hbase.ssl.enabled}} was set to true.
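A minimal sketch of the kind of fix implied here: derive the endpoint scheme from the ssl flag instead of hardcoding http (the class and method names are hypothetical, not the InfoServer API):

```java
import java.net.URI;

public class InfoServerEndpoint {
    // Choose the scheme from the configured ssl flag (i.e. what HttpConfig
    // would report for hbase.ssl.enabled) rather than always using "http".
    static URI endpointFor(String bindAddress, int port, boolean sslEnabled) {
        String scheme = sslEnabled ? "https" : "http";
        return URI.create(scheme + "://" + bindAddress + ":" + port);
    }

    public static void main(String[] args) {
        System.out.println(endpointFor("0.0.0.0", 16010, true));  // https://0.0.0.0:16010
    }
}
```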



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12956) Binding to 0.0.0.0 is broken after HBASE-10569

2015-02-02 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12956:
-

 Summary: Binding to 0.0.0.0 is broken after HBASE-10569
 Key: HBASE-12956
 URL: https://issues.apache.org/jira/browse/HBASE-12956
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Esteban Gutierrez


After the Region Server and Master code was merged, we lost the functionality 
to bind to 0.0.0.0 via hbase.regionserver.ipc.address, and znodes now get 
created with the wildcard address, which means that the RSs and the master 
cannot be reached at the address they advertise. Thanks to [~dimaspivak] for 
reporting the issue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12950) Extend the truncate command to handle region ranges and not just the whole table

2015-01-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12950:
-

 Summary: Extend the truncate command to handle region ranges and 
not just the whole table
 Key: HBASE-12950
 URL: https://issues.apache.org/jira/browse/HBASE-12950
 Project: HBase
  Issue Type: New Feature
  Components: Region Assignment, regionserver, shell
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez


We have seen many times during the last few years that when key prefixes are 
time based and the access pattern consists only of writes to recent KVs, we can 
end up with tens of thousands of regions, and some of those regions will no 
longer be used. Even if users use TTLs and data is eventually deleted, we still 
have the old regions around, and only performing an online merge can help 
reduce the excess of regions. Extending the truncate command to also handle 
region ranges can help users that experience this issue to trim the old regions 
if required.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12826) Expose draining servers into ClusterStatus

2015-01-08 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12826:
-

 Summary: Expose draining servers into ClusterStatus
 Key: HBASE-12826
 URL: https://issues.apache.org/jira/browse/HBASE-12826
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez


We currently keep track of dead, live and in-transition RegionServers; I think 
we should also expose the list of servers that are being decommissioned via 
draining.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12806) [hbck] move admin.create() to HBaseTestingUtility.createTable in TestHBaseFsck

2015-01-05 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12806:
-

 Summary: [hbck] move admin.create() to 
HBaseTestingUtility.createTable in TestHBaseFsck
 Key: HBASE-12806
 URL: https://issues.apache.org/jira/browse/HBASE-12806
 Project: HBase
  Issue Type: Bug
  Components: hbck
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor


TestHBaseFsck should wait until all regions have been assigned after the table 
has been created.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12792) [backport] HBASE-5835: Catch and handle NotServingRegionException when close region attempt fails

2014-12-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12792:
-

 Summary: [backport] HBASE-5835: Catch and handle 
NotServingRegionException when close region attempt fails
 Key: HBASE-12792
 URL: https://issues.apache.org/jira/browse/HBASE-12792
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.26
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Trivial
 Fix For: 0.94.27


This one is around in 0.94 and it's a low hanging fruit: we get a 
NotServingRegionException if the region is not found when we attempt to close it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12793) [hbck] closeRegionSilentlyAndWait() should log cause of IOException and retry until hbase.hbck.close.timeout expires

2014-12-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12793:
-

 Summary: [hbck] closeRegionSilentlyAndWait() should log cause of 
IOException and retry until  hbase.hbck.close.timeout expires
 Key: HBASE-12793
 URL: https://issues.apache.org/jira/browse/HBASE-12793
 Project: HBase
  Issue Type: Bug
  Components: hbck
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Minor


This is a subtask of HBASE-12131 in order to handle network partitions gracefully.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12458) Improve CellCounter command line parsing

2014-11-11 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12458:
-

 Summary: Improve CellCounter command line parsing
 Key: HBASE-12458
 URL: https://issues.apache.org/jira/browse/HBASE-12458
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez
Priority: Minor


Command line option parsing in CellCounter is different from other tools like 
CopyTable, RowCounter or VerifyReplication. It should be consistent with the 
other tools.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12447) Add support for setTimeRange for CopyTable, RowCounter and CellCounter

2014-11-07 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12447:
-

 Summary: Add support for setTimeRange for CopyTable, RowCounter 
and CellCounter
 Key: HBASE-12447
 URL: https://issues.apache.org/jira/browse/HBASE-12447
 Project: HBase
  Issue Type: Improvement
Reporter: Esteban Gutierrez
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12380) Too many attempts to open a region can crash the RegionServer

2014-10-29 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12380:
-

 Summary: Too many attempts to open a region can crash the 
RegionServer
 Key: HBASE-12380
 URL: https://issues.apache.org/jira/browse/HBASE-12380
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Esteban Gutierrez
Priority: Critical


Noticed this while trying to fix faulty test while working on a fix for 
HBASE-12219:
{code}
Tests in error:
  TestRegionServerNoMaster.testMultipleOpen:237 » Service java.io.IOException: 
R...
  TestRegionServerNoMaster.testCloseByRegionServer:211-closeRegionNoZK:201 » 
Service
{code}

Initially I thought the problem was in my patch for HBASE-12219, but I noticed 
that the issue was occurring on the 7th attempt to open the region. However, I 
was able to reproduce the same problem in the master branch after increasing 
the number of requests in testMultipleOpen():

{code}
2014-10-29 15:03:45,043 INFO  [Thread-216] regionserver.RSRpcServices(1334): 
Receiving OPEN for the 
region:TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca.,
 which we are already trying to OPEN - ignoring this new request for this 
region.
Submitting openRegion attempt: 16 
2014-10-29 15:03:45,044 INFO  [Thread-216] regionserver.RSRpcServices(1311): 
Open TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca.
2014-10-29 15:03:45,044 INFO  
[PostOpenDeployTasks:025198143197ea68803e49819eae27ca] 
hbase.MetaTableAccessor(1307): Updated row 
TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca. with 
server=192.168.1.105,63082,1414620220789
Submitting openRegion attempt: 17 
2014-10-29 15:03:45,046 ERROR [RS_OPEN_REGION-192.168.1.105:63082-2] 
handler.OpenRegionHandler(88): Region 025198143197ea68803e49819eae27ca was 
already online when we started processing the opening. Marking this new attempt 
as failed
2014-10-29 15:03:45,047 FATAL [Thread-216] regionserver.HRegionServer(1931): 
ABORTING region server 192.168.1.105,63082,1414620220789: Received OPEN for the 
region:TestRegionServerNoMaster,,1414620223682.025198143197ea68803e49819eae27ca.,
 which is already online
2014-10-29 15:03:45,047 FATAL [Thread-216] regionserver.HRegionServer(1937): 
RegionServer abort: loaded coprocessors are: 
[org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
2014-10-29 15:03:45,054 WARN  [Thread-216] regionserver.HRegionServer(1955): 
Unable to report fatal error to master
com.google.protobuf.ServiceException: java.io.IOException: Call to 
/192.168.1.105:63079 failed on local exception: java.io.IOException: Connection 
to /192.168.1.105:63079 is closing. Call id=4, waitTime=2
at 
org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1707)
at 
org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1757)
at 
org.apache.hadoop.hbase.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$BlockingStub.reportRSFatalError(RegionServerStatusProtos.java:8301)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:1952)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.abortRegionServer(MiniHBaseCluster.java:174)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.access$100(MiniHBaseCluster.java:108)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer$2.run(MiniHBaseCluster.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:356)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
at 
org.apache.hadoop.hbase.security.User$SecureHadoopUser.runAs(User.java:277)
at 
org.apache.hadoop.hbase.MiniHBaseCluster$MiniHBaseClusterRegionServer.abort(MiniHBaseCluster.java:165)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.abort(HRegionServer.java:1964)
at 
org.apache.hadoop.hbase.regionserver.RSRpcServices.openRegion(RSRpcServices.java:1308)
at 
org.apache.hadoop.hbase.regionserver.TestRegionServerNoMaster.testMultipleOpen(TestRegionServerNoMaster.java:237)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 

[jira] [Created] (HBASE-12365) [hbck] -fixVersionFile should not require a running master

2014-10-28 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12365:
-

 Summary: [hbck] -fixVersionFile should not require a running master
 Key: HBASE-12365
 URL: https://issues.apache.org/jira/browse/HBASE-12365
 Project: HBase
  Issue Type: Bug
  Components: hbck
Reporter: Esteban Gutierrez


The current logic requires performing something like this in hbck:

{code}
exec {
...
connect();
...
onlineHbck();
...
}
onlineHbck() {
...
offlineHdfsIntegrityRepair();
...
}
{code}

It should be possible to fix {{hbase.version}} without having to connect to the 
master first.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12369) Warn if hbase.bucketcache.size too close or equal to MaxDirectMemorySize

2014-10-28 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12369:
-

 Summary: Warn if hbase.bucketcache.size too close or equal to 
MaxDirectMemorySize
 Key: HBASE-12369
 URL: https://issues.apache.org/jira/browse/HBASE-12369
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Esteban Gutierrez


Our ref guide currently says that it's required to leave some headroom in 
direct memory. However, if hbase.bucketcache.size is too close or equal to 
MaxDirectMemorySize, it can trigger OOMEs:

{code}
2014-10-28 16:14:41,585 INFO  [master//172.16.0.101:16020] 
util.ByteBufferArray: Allocating buffers total=5.00 GB, sizePerBuffer=4 MB, 
count=1280, direct=true
2014-10-28 16:14:41,604 INFO  [172.16.0.101:16020.activeMasterManager] 
master.ServerManager: Waiting for region servers count to settle; currently 
checked in 1, slept for 99 ms, expecting minimum of 2, maximum of 2147483647, 
timeout of 4500 ms, interval of 1500 ms.
2014-10-28 16:14:43,144 INFO  [172.16.0.101:16020.activeMasterManager] 
master.ServerManager: Waiting for region servers count to settle; currently 
checked in 1, slept for 1639 ms, expecting minimum of 2, maximum of 2147483647, 
timeout of 4500 ms, interval of 1500 ms.
2014-10-28 16:14:44,057 INFO  [master//172.16.0.101:16020] 
regionserver.HRegionServer: STOPPED: Failed initialization
2014-10-28 16:14:44,058 ERROR [master//172.16.0.101:16020] 
regionserver.HRegionServer: Failed init
java.lang.OutOfMemoryError: Direct buffer memory
at java.nio.Bits.reserveMemory(Bits.java:658)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
at 
org.apache.hadoop.hbase.util.ByteBufferArray.<init>(ByteBufferArray.java:65)
at 
org.apache.hadoop.hbase.io.hfile.bucket.ByteBufferIOEngine.<init>(ByteBufferIOEngine.java:47)
at 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.getIOEngineFromName(BucketCache.java:310)
at 
org.apache.hadoop.hbase.io.hfile.bucket.BucketCache.<init>(BucketCache.java:218)
at 
org.apache.hadoop.hbase.io.hfile.CacheConfig.getL2(CacheConfig.java:513)
at 
org.apache.hadoop.hbase.io.hfile.CacheConfig.instantiateBlockCache(CacheConfig.java:536)
at 
org.apache.hadoop.hbase.io.hfile.CacheConfig.<init>(CacheConfig.java:213)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.handleReportForDutyResponse(HRegionServer.java:1259)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:818)
at java.lang.Thread.run(Thread.java:724)
{code}

It would be helpful to print a warning when hbase.bucketcache.size is too 
close or equal to MaxDirectMemorySize.
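The proposed warning could be as simple as the following check (the 5% headroom threshold is a made-up example, not anything HBase defines):

```java
public class BucketCacheCheck {
    // Warn when the bucket cache would consume (nearly) all of
    // -XX:MaxDirectMemorySize, leaving no room for other direct buffers.
    static boolean shouldWarn(long bucketCacheBytes, long maxDirectBytes) {
        return bucketCacheBytes >= maxDirectBytes * 0.95;  // <5% headroom left
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        if (shouldWarn(5 * gb, 5 * gb)) {
            System.out.println("WARN: hbase.bucketcache.size is too close to MaxDirectMemorySize");
        }
    }
}
```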



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12219) Use optionally a TTL based cache for FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime()

2014-10-09 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12219:
-

 Summary: Use optionally a TTL based cache for 
FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime()
 Key: HBASE-12219
 URL: https://issues.apache.org/jira/browse/HBASE-12219
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.98.6.1, 0.94.24, 0.99.1
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


Currently table descriptors and tables are cached once they are accessed for 
the first time. Subsequent calls to the master only require a trip to HDFS to 
look up the modified time in order to reload the table descriptors if they 
changed. However, in clusters with a large number of tables or concurrent 
clients, this can be too aggressive on HDFS and the master, causing contention 
when processing other requests. A simple solution is a TTL based cache for 
FSTableDescriptors#getAll() and FSTableDescriptors#TableDescriptorAndModtime() 
that can allow the master to process those calls faster, without causing 
contention and without having to perform a trip to HDFS for every call to 
listtables() or getTableDescriptor().
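The TTL idea can be sketched with a small generic cache (illustrative only; this is not the FSTableDescriptors implementation):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;
        Entry(V value, long loadedAt) { this.value = value; this.loadedAt = loadedAt; }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMs;

    public TtlCache(long ttlMs) { this.ttlMs = ttlMs; }

    // Reload from the slow source (HDFS, in this JIRA's scenario) only when
    // the cached entry is older than the TTL; otherwise serve from memory.
    public V get(K key, Supplier<V> loader) {
        long now = System.currentTimeMillis();
        Entry<V> e = entries.get(key);
        if (e == null || now - e.loadedAt > ttlMs) {
            e = new Entry<>(loader.get(), now);
            entries.put(key, e);
        }
        return e.value;
    }
}
```

With a TTL of a few seconds, a burst of listtables() calls costs at most one HDFS round trip per table per TTL window, at the price of serving descriptors that may be briefly stale.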



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12131) [hbck] undeployRegions should handle gracefully network partitions and other exceptions to avoid the same region deployed multiple times

2014-09-30 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12131:
-

 Summary: [hbck] undeployRegions should handle gracefully network 
partitions and other exceptions to avoid the same region deployed multiple times
 Key: HBASE-12131
 URL: https://issues.apache.org/jira/browse/HBASE-12131
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.94.23
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Critical


If we get an IOE (we currently ignore it) while regions are being undeployed by 
hbck, we should make sure that we don't re-assign that region in the master 
before we know the RS was marked as dead, and optionally let the user confirm 
that action; otherwise we will end up in a split-brain situation with clients 
talking to different RSs serving the same region.

The offending part is here in HBaseFsck.undeployRegions():

{code}
  private void undeployRegions(HbckInfo hi) throws IOException,
      InterruptedException {
    for (OnlineEntry rse : hi.deployedEntries) {
      LOG.debug("Undeploy region " + rse.hri + " from " + rse.hsa);
      try {
        HBaseFsckRepair.closeRegionSilentlyAndWait(admin, rse.hsa, rse.hri);
        offline(rse.hri.getRegionName());
      } catch (IOException ioe) {
        LOG.warn("Got exception when attempting to offline region "
            + Bytes.toString(rse.hri.getRegionName()), ioe);
      }
    }
  }
{code}
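One possible shape of the fix, sketched with simplified types rather than the HBase API (collecting failures and rethrowing after the loop is an assumption, not the committed patch): record every region whose close failed instead of swallowing the IOE, so the caller can refuse to re-assign those regions until the RS is confirmed dead.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative pattern: attempt every close, remember the failures, and fail
 * loudly at the end instead of silently continuing with a partial undeploy.
 */
public class UndeployAll {
  /** Simplified stand-in for the region-close RPC. */
  public interface Closer {
    void close(String region) throws IOException;
  }

  public static void closeAll(List<String> regions, Closer closer) throws IOException {
    List<String> failed = new ArrayList<>();
    for (String region : regions) {
      try {
        closer.close(region);
      } catch (IOException ioe) {
        failed.add(region); // do not re-assign these until the RS is known dead
      }
    }
    if (!failed.isEmpty()) {
      throw new IOException("Failed to undeploy regions: " + failed);
    }
  }
}
```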





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-12099) TestScannerModel fails if using jackson 1.9.13

2014-09-25 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-12099:
-

 Summary: TestScannerModel fails if using jackson 1.9.13
 Key: HBASE-12099
 URL: https://issues.apache.org/jira/browse/HBASE-12099
 Project: HBase
  Issue Type: Bug
  Components: REST
Affects Versions: 2.0.0, 0.98.7, 0.99.1
 Environment: hadoop-2.5.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez


TestScannerModel fails if jackson 1.9.13 is used. (Hadoop 2.5 now uses that 
version, see HADOOP-10104):

{code}
Failed tests:   
testToJSON(org.apache.hadoop.hbase.rest.model.TestScannerModel): 
expected:{batch:100,caching:1000,cacheBlocks:false,endRow:enp5eng=,endTime:1245393318192,maxVersions:2147483647,startRow:YWJyYWNhZGFicmE=,startTime:1245219839331,column:[Y29sdW1uMQ==,Y29sdW1uMjpmb28=],labels:[private,public]}
 but 
was:{startRow:YWJyYWNhZGFicmE=,endRow:enp5eng=,batch:100,startTime:1245219839331,endTime:1245393318192,maxVersions:2147483647,caching:1000,cacheBlocks:false,column:[Y29sdW1uMQ==,Y29sdW1uMjpmb28=],label:[private,public]}
{code}

The problem is that the annotation used for the labels element is 'label' 
instead of 'labels'.
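A sketch of the fix (the field and JAXB annotation are shown as they presumably appear in ScannerModel, for illustration only): rename the serialized element so Jackson emits the plural key the test expects.

```java
// Before: the JAXB annotation names the element "label", so Jackson 1.9.13
// serializes the field under the wrong key:
//   @XmlElement(name = "label")
//   private List<String> labels;

// After: the element name matches the expected JSON key "labels":
@XmlElement(name = "labels")
private List<String> labels;
```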




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-11846) HStore#assertBulkLoadHFileOk should log if a full HFile verification will be performed during a bulkload

2014-08-27 Thread Esteban Gutierrez (JIRA)
Esteban Gutierrez created HBASE-11846:
-

 Summary: HStore#assertBulkLoadHFileOk should log if a full HFile 
verification will be performed during a bulkload
 Key: HBASE-11846
 URL: https://issues.apache.org/jira/browse/HBASE-11846
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.98.6, 0.99.0, 2.0.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
Priority: Trivial


If hbase.hstore.bulkload.verify is set to true in the Region Server, we should 
log that we are about to perform a full scan of the HFiles that are going to be 
bulk loaded. That would help correlate other performance issues with this 
feature when the operator has enabled it.
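A minimal sketch of the proposed log line (the flag name is taken from the description above; HStore's actual logging goes through commons-logging/slf4j, and java.util.logging is used here only to keep the example self-contained):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Illustrative sketch of announcing the full HFile scan before a bulk load. */
public class BulkLoadVerifyLog {
  private static final Logger LOG = Logger.getLogger("HStore");

  /** Returns the logged message, or null when verification is disabled. */
  public static String verifyMessage(String hfilePath, boolean verifyBulkLoads) {
    if (!verifyBulkLoads) {
      return null;
    }
    // Announce up front that a full scan of the HFile is about to happen,
    // so operators can correlate bulk-load slowness with this setting.
    String msg = "Full verification started for bulk load hfile=" + hfilePath
        + " (hbase.hstore.bulkload.verify=true)";
    LOG.log(Level.INFO, msg);
    return msg;
  }
}
```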




--
This message was sent by Atlassian JIRA
(v6.2#6252)

