[jira] [Created] (HBASE-26812) ShortCircuitingClusterConnection fails to close RegionScanners when making short-circuited calls

2022-03-07 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-26812:
-

 Summary: ShortCircuitingClusterConnection fails to close 
RegionScanners when making short-circuited calls
 Key: HBASE-26812
 URL: https://issues.apache.org/jira/browse/HBASE-26812
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.4.9
Reporter: Lars Hofhansl


Just ran into this on the Phoenix side.
We retrieve a Connection via {{RegionCoprocessorEnvironment.createConnection... 
getTable(...)}} and then call get on that table. The Get's key happens to be 
local. Now each call to table.get() leaves an open StoreScanner around forever 
(verified with a memory profiler).

These references are held via 
RegionScannerImpl.storeHeap.scannersForDelayedClose. Eventually the 
RegionServer goes into a GC spiral of death.

The reason appears to be that in this case there is no currentCall context. Some 
time in 2.x the RPC handler/call was made responsible for closing open region 
scanners, but we forgot to handle {{ShortCircuitingClusterConnection}}.

It's not immediately clear how to fix this. But it does make 
ShortCircuitingClusterConnection useless and dangerous. If you use it, you 
*will* create a giant memory leak.
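The mechanics of the leak can be illustrated with a small self-contained sketch (the class and field names below are illustrative stand-ins, not HBase's real API): with no RPC call context to drive cleanup, every short-circuited read parks a scanner on a delayed-close list that nothing ever drains.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the leak: scanners are queued for "delayed close",
// but closing is only ever driven by an RPC call context. Short-circuited
// calls have no such context, so the list grows without bound.
class ShortCircuitLeakSketch {
    static final List<String> scannersForDelayedClose = new ArrayList<>();

    // Models table.get() over a local region: opens a scanner per call.
    static void shortCircuitedGet(String key) {
        scannersForDelayedClose.add("StoreScanner[" + key + "]");
        // With a real RPC call, the handler would close these afterwards.
        // Here there is no currentCall context, so nothing ever does.
    }

    static int leakedScanners() {
        return scannersForDelayedClose.size();
    }
}
```

In the real code path the RPC handler drains the equivalent list when the call completes; the bug is that short-circuited calls never enter that path.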



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-24742.
---
Resolution: Fixed

Also pushed to branch-2 and master.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>    Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0-alpha-1, 1.7.0, 2.4.0
>
> Attachments: 24742-master.txt, hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Reopened] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-24742:
---

Lemme put this into branch-2 and master as well.

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>    Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]





[jira] [Resolved] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-16 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-24742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-24742.
---
Resolution: Fixed

> Improve performance of SKIP vs SEEK logic
> -
>
> Key: HBASE-24742
> URL: https://issues.apache.org/jira/browse/HBASE-24742
> Project: HBase
>  Issue Type: Bug
>  Components: Performance, regionserver
>Affects Versions: 3.0.0-alpha-1, 1.7.0, 2.4.0
>    Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.7.0
>
> Attachments: hbase-1.6-regression-flame-graph.png, 
> hbase-24742-branch-1.txt
>
>
> In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
> slowdown in scanning scenarios.
> We tracked it back to HBASE-17958 and HBASE-19863.
> Both add comparisons to one of the tightest loops HBase has.
> [~bharathv]





[jira] [Created] (HBASE-24742) Improve performance of SKIP vs SEEK logic

2020-07-14 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-24742:
-

 Summary: Improve performance of SKIP vs SEEK logic
 Key: HBASE-24742
 URL: https://issues.apache.org/jira/browse/HBASE-24742
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


In our testing of HBase 1.3 against the current tip of branch-1 we saw a 30% 
slowdown in scanning scenarios.

We tracked it back to HBASE-17958 and HBASE-19863.
Both add comparisons to one of the tightest loops HBase has.

[~bharathv]





[jira] [Resolved] (HBASE-9272) A parallel, unordered scanner

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-9272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-9272.
--
Resolution: Won't Fix

> A parallel, unordered scanner
> -
>
> Key: HBASE-9272
> URL: https://issues.apache.org/jira/browse/HBASE-9272
> Project: HBase
>  Issue Type: New Feature
>    Reporter: Lars Hofhansl
>Priority: Minor
> Attachments: 9272-0.94-v2.txt, 9272-0.94-v3.txt, 9272-0.94-v4.txt, 
> 9272-0.94.txt, 9272-trunk-v2.txt, 9272-trunk-v3.txt, 9272-trunk-v3.txt, 
> 9272-trunk-v4.txt, 9272-trunk.txt, ParallelClientScanner.java, 
> ParallelClientScanner.java
>
>
> The contract of ClientScanner is to return rows in sort order. That limits 
> the order in which regions can be scanned.
> I propose a simple ParallelScanner that does not have this requirement and 
> queries regions in parallel, returning whatever gets returned first.
> This is generally useful for scans that filter a lot of data on the server, 
> or in cases where the client can very quickly react to the returned data.
> I have a simple prototype (it doesn't do error handling right, and might be a 
> bit heavy on the synchronization side - it uses a BlockingQueue to hand data 
> between the client using the scanner and the threads doing the scanning; it 
> also could potentially starve some scanners long enough to time out at the 
> server).
> On the plus side, it's only about 130 lines of code. :)
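The proposal can be sketched in plain Java (no HBase types; all names below are illustrative): one worker per "region" feeding a shared BlockingQueue, so the consumer sees rows in arrival order rather than sort order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Sketch of an unordered parallel scan: each "region" is scanned by its own
// thread, results are handed over via a BlockingQueue, and the consumer
// receives rows in arrival order rather than sort order.
class ParallelScanSketch {
    static List<String> scanAll(List<List<String>> regions) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(regions.size());
        CountDownLatch done = new CountDownLatch(regions.size());
        for (List<String> region : regions) {
            pool.submit(() -> {
                try {
                    for (String row : region) queue.put(row);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();
        pool.shutdown();
        List<String> results = new ArrayList<>();
        queue.drainTo(results);
        return results; // arrival order, not sort order
    }
}
```

A real implementation would stream rows to the caller as they arrive instead of collecting them, which is where the starvation/timeout concern mentioned above comes in.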





[jira] [Resolved] (HBASE-6970) hbase-deamon.sh creates/updates pid file even when that start failed.

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6970.
--
Resolution: Won't Fix

> hbase-deamon.sh creates/updates pid file even when that start failed.
> -
>
> Key: HBASE-6970
> URL: https://issues.apache.org/jira/browse/HBASE-6970
> Project: HBase
>  Issue Type: Bug
>  Components: Usability
>Reporter: Lars Hofhansl
>Priority: Major
>
> We just ran into a strange issue where we could neither start nor stop 
> services with hbase-deamon.sh.
> The problem is this:
> {code}
> nohup nice -n $HBASE_NICENESS "$HBASE_HOME"/bin/hbase \
> --config "${HBASE_CONF_DIR}" \
> $command "$@" $startStop > "$logout" 2>&1 < /dev/null &
> echo $! > $pid
> {code}
> So the pid file is created or updated even when the start of the service 
> failed. The next stop command will then fail, because the pid file has the 
> wrong pid in it.
> Edit: Spelling and more spelling errors.





[jira] [Resolved] (HBASE-13751) Refactoring replication WAL reading logic as WAL Iterator

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-13751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-13751.
---
Resolution: Won't Fix

> Refactoring replication WAL reading logic as WAL Iterator
> -
>
> Key: HBASE-13751
> URL: https://issues.apache.org/jira/browse/HBASE-13751
> Project: HBase
>  Issue Type: Brainstorming
>    Reporter: Lars Hofhansl
>Priority: Major
>
> The current replication code is all over the place.
> A simple refactoring that we could consider is to factor out the part that 
> reads from the WALs. Could be a simple iterator interface with one additional 
> wrinkle: The iterator needs to be able to provide the position (file and 
> offset) of the last read edit.
> Once we have this, we use this as a building block to many other changes in 
> the replication code.
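One possible shape for such an interface, sketched here with made-up names (this is not an actual HBase API), is an edit iterator that also reports the position of the last edit it returned; the in-memory implementation below just stands in for a real WAL reader.

```java
import java.util.Iterator;
import java.util.List;

// Sketch of a WAL-reading iterator that also reports the position
// (file + offset) of the last edit it returned, so replication can
// checkpoint after shipping.
interface WalEditIterator extends Iterator<String> {
    String lastPositionFile();
    long lastPositionOffset();
}

class InMemoryWalIterator implements WalEditIterator {
    private final String file;
    private final List<String> edits;
    private int index = -1; // ordinal stands in for a byte offset

    InMemoryWalIterator(String file, List<String> edits) {
        this.file = file;
        this.edits = edits;
    }

    @Override public boolean hasNext() { return index + 1 < edits.size(); }
    @Override public String next() { return edits.get(++index); }
    @Override public String lastPositionFile() { return file; }
    @Override public long lastPositionOffset() { return index; }
}
```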





[jira] [Resolved] (HBASE-14014) Explore row-by-row grouping options

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14014.
---
Resolution: Won't Fix

> Explore row-by-row grouping options
> ---
>
> Key: HBASE-14014
> URL: https://issues.apache.org/jira/browse/HBASE-14014
> Project: HBase
>  Issue Type: Improvement
>  Components: Replication
>Reporter: Lars Hofhansl
>Priority: Major
>
> See discussion in parent.
> We need to consider the following attributes of WALKey:
> * The cluster ids
> * Table Name
> * write time (here we could use the latest of any batch)
> * seqNum
> As long as we preserve these we can rearrange the cells between WALEdits. 
> Since seqNum is unique this will be a challenge. Currently it is not used, 
> but we shouldn't design anything that prevents us from providing stronger 
> ordering guarantees using seqNum.





[jira] [Resolved] (HBASE-14509) Configurable sparse indexes?

2019-12-24 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14509.
---
Resolution: Won't Fix

> Configurable sparse indexes?
> 
>
> Key: HBASE-14509
> URL: https://issues.apache.org/jira/browse/HBASE-14509
> Project: HBase
>  Issue Type: Brainstorming
>    Reporter: Lars Hofhansl
>Priority: Major
>
> This idea just popped up today and I wanted to record it for discussion:
> What if we kept sparse column indexes per region or HFile or per configurable 
> range?
> I.e. For any given CQ we record the lowest and highest value for a particular 
> range (HFile, Region, or a custom range like the Phoenix guide post).
> By tweaking the size of these ranges we can control the size of the index, vs 
> its selectivity.
> For example if we kept it by HFile we could almost instantly decide whether we 
> need to scan a particular HFile at all to find a particular value in a Cell.
> We could also collect min/max values for each n MB of data, for example when we 
> scan the region the first time. Assuming ranges are large enough we can always 
> keep the index in memory together with the region.
> Kind of a sparse local index. Might be much easier than the buddy region stuff 
> we've been discussing.
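The min/max pruning idea can be sketched in a few lines of self-contained Java (illustrative only; keys are reduced to longs, and a "chunk" stands for an HFile, region, or n-MB range):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a sparse min/max index: one (min, max) entry per chunk
// (e.g. per HFile or per N MB of data). A lookup only needs to scan
// chunks whose range could contain the requested value.
class SparseIndexSketch {
    static class Chunk {
        final long min, max;
        Chunk(long min, long max) { this.min = min; this.max = max; }
    }

    // Returns the indexes of chunks that may contain the value;
    // all other chunks can be skipped without reading them.
    static List<Integer> chunksToScan(List<Chunk> index, long value) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < index.size(); i++) {
            Chunk c = index.get(i);
            if (value >= c.min && value <= c.max) hits.add(i);
        }
        return hits;
    }
}
```

Tweaking the chunk granularity trades index size against selectivity, exactly as described above.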





[jira] [Resolved] (HBASE-23364) HRegionServer sometimes does not shut down.

2019-12-05 Thread Lars Hofhansl (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-23364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-23364.
---
Fix Version/s: 1.6.0
   2.3.0
   3.0.0
   Resolution: Fixed

Committed to branch-1, branch-2, and master.

> HRegionServer sometimes does not shut down.
> ---
>
> Key: HBASE-23364
> URL: https://issues.apache.org/jira/browse/HBASE-23364
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.3.0, 1.6.0
>Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.6.0
>
> Attachments: 23364-branch-1.txt
>
>
> Note that I initially assumed this to be a Phoenix bug. But I tracked it down 
> to HBase.
> 
> I only noticed this recently. Latest build from HBase's branch-1 and latest 
> build from Phoenix' 4.x-HBase-1.5. I don't know yet whether it's a Phoenix 
> or an HBase issue.
> Just filing it here for later reference.
> jstack show this thread as the only non-daemon thread:
> {code:java}
> "pool-11-thread-1" #470 prio=5 os_prio=0 tid=0x558a709a4800 nid=0x238e 
> waiting on condition [0x7f213ad68000]
>java.lang.Thread.State: WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for  <0x00058eafece8> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
> at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
> No other information. Somebody created a thread pool somewhere and forgot to 
> set the threads to daemon or is not shutting down the pool properly.
> Edit: I looked for other reference of the locked objects in the stack dump, 
> but didn't find any.
>  
>  





[jira] [Created] (HBASE-23279) Switch default block encoding to ROW_INDEX_V1

2019-11-11 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-23279:
-

 Summary: Switch default block encoding to ROW_INDEX_V1
 Key: HBASE-23279
 URL: https://issues.apache.org/jira/browse/HBASE-23279
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl


Currently we set both block encoding and compression to NONE.

ROW_INDEX_V1 has many advantages and (almost) no disadvantages (the hfiles are 
slightly larger, about 3% or so). I think that would be a better default than NONE.





[jira] [Created] (HBASE-23240) branch-1 master and regionservers do not start when compiled against Hadoop 3.2.1

2019-10-31 Thread Lars Hofhansl (Jira)
Lars Hofhansl created HBASE-23240:
-

 Summary: branch-1 master and regionservers do not start when 
compiled against Hadoop 3.2.1
 Key: HBASE-23240
 URL: https://issues.apache.org/jira/browse/HBASE-23240
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl


Exception in thread "main" java.lang.NoSuchMethodError: 
com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
 at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
 at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
 at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1679)
 at 
org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
 at 
org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:572)
 at 
org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
 at 
org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
 at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:127)





[jira] [Created] (HBASE-22457) Harden the HBase HFile reader reference counting

2019-05-22 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-22457:
-

 Summary: Harden the HBase HFile reader reference counting
 Key: HBASE-22457
 URL: https://issues.apache.org/jira/browse/HBASE-22457
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


The problem is that any coprocessor hook that replaces a passed scanner without 
closing it can cause an incorrect reference count.
This was bad and wrong before of course, but now it has pretty bad 
consequences, since an incorrect reference count will prevent HFiles from being 
archived indefinitely.

All hooks that are passed a scanner and return a scanner are suspect, since the 
returned scanner may or may not close the passed scanner:
* preCompact
* preCompactScannerOpen
* preFlush
* preFlushScannerOpen
* preScannerOpen
* preStoreScannerOpen
* preStoreFileReaderOpen...? (not sure about this one, it could mess with the 
reader)

I sampled the Phoenix and also the Tephra code, and found a few instances where 
this is happening. For those I filed issues: TEPHRA-300, PHOENIX-5291.
(We're not using Tephra.)

The Phoenix ones should be rare. In our case we are seeing readers with 
refCount > 1000.
Perhaps there are other issues, such as a path where not all exceptions are 
caught and a scanner is left open that way. (Generally I am not a fan of 
reference counting in complex systems - it's too easy to miss something. But 
that's a different discussion. :) )
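The failure mode is easy to model in miniature (a self-contained sketch, not HBase's actual classes): a hook that returns a replacement scanner without closing the one it was passed leaves the reader's count pinned above zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Miniature model of the refcount problem: a reader is pinned while any
// scanner holds a reference. A wrapping hook that drops the passed scanner
// without closing it leaves the count permanently above zero, so the
// "HFile" can never be archived.
class RefCountSketch {
    static class Reader {
        final AtomicInteger refCount = new AtomicInteger();
        Scanner open() { refCount.incrementAndGet(); return new Scanner(this); }
    }

    static class Scanner implements AutoCloseable {
        final Reader reader;
        Scanner(Reader reader) { this.reader = reader; }
        @Override public void close() { reader.refCount.decrementAndGet(); }
    }

    // A "coprocessor hook" that replaces the scanner. The buggy variant
    // drops the passed scanner without closing it.
    static Scanner buggyHook(Reader r, Scanner passed) {
        return r.open();                     // leaks: passed is never closed
    }

    static Scanner correctHook(Reader r, Scanner passed) {
        passed.close();                      // release before replacing
        return r.open();
    }
}
```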

Let's brainstorm some way in which we can harden this.

[~ram_krish], [~anoop.hbase], [~apurtell]




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-22385) Consider "programmatic" HFiles

2019-05-08 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-22385:
-

 Summary: Consider "programmatic" HFiles
 Key: HBASE-22385
 URL: https://issues.apache.org/jira/browse/HBASE-22385
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


For various use cases (among others: mass deletes) it would be great if 
HBase had a mechanism for programmatic HFiles, i.e. HFiles (with HFileScanner 
and Reader) that produce KeyValues just like any other HFile, but where the 
KeyValues are generated or produced by some other means rather than being 
physically read from some storage medium.

In fact this could be a generalization for the various HFiles we have: (Normal) 
HFiles, HFileLinks, HalfStoreFiles, etc.

A simple way could be to allow for storing a classname into the HFile. Upon 
reading the HFile HBase would instantiate an instance of that class and that 
instance is responsible for all further interaction with that HFile. For normal 
HFiles it would just be the normal HFileReader.

(Remember this is Brainstorming)





[jira] [Created] (HBASE-22235) OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible to 3rd party coprocessors

2019-04-13 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-22235:
-

 Summary: OperationStatus.{SUCCESS|FAILURE|NOT_RUN} are not visible 
to 3rd party coprocessors
 Key: HBASE-22235
 URL: https://issues.apache.org/jira/browse/HBASE-22235
 Project: HBase
  Issue Type: Bug
  Components: Coprocessors
Reporter: Lars Hofhansl


This looks like an oversight.

preBatchMutate is useless for some operations due to this.

See also TEPHRA-299.

MiniBatchOperationInProgress has limited visibility for coprocessors. 
OperationStatus and OperationStatusCode should have the same.





[jira] [Created] (HBASE-21856) Consider Causal Replication

2019-02-07 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-21856:
-

 Summary: Consider Causal Replication
 Key: HBASE-21856
 URL: https://issues.apache.org/jira/browse/HBASE-21856
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


We've had various efforts to improve the ordering guarantees for HBase, most 
notably Serial Replication.

I think in many cases guaranteeing a Total Replication Order is not required, 
but a simpler Causal Replication Order is sufficient.
Specifically we would guarantee causal ordering for a single rowkey: any 
changes to a row (Puts, Deletes, etc.) would be replicated in the exact order 
in which they occurred in the source system.

Unlike total ordering this can be accomplished with only local region server 
control.

I don't have a full design in mind, let's discuss here. It should be sufficient 
to do the following:
# RegionServers only adopt the replication queues from other RegionServers for 
regions they (now) own. This requires log splitting for replication.
# RegionServers ship all edits for queues adopted from other servers before any 
of their "own" edits are shipped.

It's probably a bit more involved, but should be much cheaper than the total 
ordering provided by serial replication.
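The two rules amount to a simple shipping discipline, shown here in miniature (illustrative Java, not actual replication code): drain every adopted queue before shipping any of the server's own edits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of rule 2: a region server ships all edits from queues adopted
// from other servers before any of its own, preserving per-row causal
// order for the regions it has taken over.
class CausalShipperSketch {
    static List<String> shipOrder(List<Queue<String>> adopted, Queue<String> own) {
        List<String> shipped = new ArrayList<>();
        for (Queue<String> q : adopted) {      // adopted queues drain first
            while (!q.isEmpty()) shipped.add(q.poll());
        }
        while (!own.isEmpty()) shipped.add(own.poll());
        return shipped;
    }
}
```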





[jira] [Created] (HBASE-21590) Optimize trySkipToNextColumn in StoreScanner a bit

2018-12-12 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-21590:
-

 Summary: Optimize trySkipToNextColumn in StoreScanner a bit
 Key: HBASE-21590
 URL: https://issues.apache.org/jira/browse/HBASE-21590
 Project: HBase
  Issue Type: Task
Reporter: Lars Hofhansl


See latest comment on HBASE-17958





[jira] [Resolved] (HBASE-19034) Implement "optimize SEEK to SKIP" in storefile scanner

2018-12-12 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-19034.
---
Resolution: Won't Fix

Closing as "Won't Fix" as it turns out that doing this optimization at the 
StoreFile (or HFile) scanner level misses the most important opportunity for 
optimization - it's too far down the stack.

> Implement "optimize SEEK to SKIP" in storefile scanner
> --
>
> Key: HBASE-19034
> URL: https://issues.apache.org/jira/browse/HBASE-19034
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Guanghao Zhang
>Priority: Major
>
> {code}
>   protected boolean trySkipToNextRow(Cell cell) throws IOException {
> Cell nextCell = null;
> do { 
>   Cell nextIndexedKey = getNextIndexedKey();
>   if (nextIndexedKey != null && nextIndexedKey != 
> KeyValueScanner.NO_NEXT_INDEXED_KEY
>   && matcher.compareKeyForNextRow(nextIndexedKey, cell) >= 0) { 
> this.heap.next();
> ++kvsScanned;
>   } else {
> return false;
>   }
> } while ((nextCell = this.heap.peek()) != null && 
> CellUtil.matchingRows(cell, nextCell));
> return true;
>   }
> {code}
> When SQM returns a SEEK_NEXT_ROW, the store scanner will seek to the cell of 
> the next row. HBASE-13109 optimized the SEEK into a SKIP when we can read the 
> cell in the currently loaded block, so it will skip by calling heap.next to 
> get to the cell of the next row. But the problem is that it compares too many 
> times with the nextIndexedKey in the while loop. We plan to move the compare 
> outside the loop to reduce the number of compares. One problem is that the 
> nextIndexedKey may change when we call heap.peek, because the current 
> storefile scanner may have changed. So my proposal is to move the "optimize 
> SEEK to SKIP" logic into the storefile scanner. When we call seek on a 
> storefile scanner, it may do a real seek or implement the seek as several 
> skips.
> Any suggestions are welcomed. Thanks.
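The direction discussed here - hoisting the loop-invariant comparison out of the skip loop - can be shown in miniature (a self-contained sketch with cells reduced to ints; this is not the actual StoreScanner code):

```java
// Miniature of the proposed change: the comparison against nextIndexedKey
// is loop-invariant while we stay within the same block, so it can be
// evaluated once instead of on every iteration of the skip loop.
class SkipLoopSketch {
    // Before: the invariant is re-checked on every skipped cell.
    static int skippedWithCompareInLoop(int[] cells, int nextIndexedKey, int row) {
        int skipped = 0, i = 0;
        while (i < cells.length && cells[i] == row) {
            if (nextIndexedKey < row) return skipped; // re-checked every pass
            skipped++; i++;
        }
        return skipped;
    }

    // After: evaluate the invariant once, then run the tight loop.
    static int skippedWithHoistedCompare(int[] cells, int nextIndexedKey, int row) {
        if (nextIndexedKey < row) return 0;           // checked once
        int skipped = 0, i = 0;
        while (i < cells.length && cells[i] == row) { skipped++; i++; }
        return skipped;
    }
}
```

Both variants skip the same cells; the second just avoids the per-cell compare that the quoted description identifies as the cost.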





[jira] [Reopened] (HBASE-20993) [Auth] IPC client fallback to simple auth allowed doesn't work

2018-09-21 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-20993:
---

> [Auth] IPC client fallback to simple auth allowed doesn't work
> --
>
> Key: HBASE-20993
> URL: https://issues.apache.org/jira/browse/HBASE-20993
> Project: HBase
>  Issue Type: Bug
>  Components: Client, IPC/RPC, security
>Affects Versions: 1.2.6, 1.3.2, 1.2.7, 1.4.7
>Reporter: Reid Chan
>Assignee: Jack Bearden
>Priority: Critical
> Fix For: 1.5.0, 1.4.8
>
> Attachments: HBASE-20993.001.patch, 
> HBASE-20993.003.branch-1.flowchart.png, HBASE-20993.branch-1.002.patch, 
> HBASE-20993.branch-1.003.patch, HBASE-20993.branch-1.004.patch, 
> HBASE-20993.branch-1.005.patch, HBASE-20993.branch-1.006.patch, 
> HBASE-20993.branch-1.007.patch, HBASE-20993.branch-1.008.patch, 
> HBASE-20993.branch-1.009.patch, HBASE-20993.branch-1.009.patch, 
> HBASE-20993.branch-1.2.001.patch, HBASE-20993.branch-1.wip.002.patch, 
> HBASE-20993.branch-1.wip.patch, yetus-local-testpatch-output-009.txt
>
>
> It is easily reproducible.
> Client's hbase-site.xml: hadoop.security.authentication:kerberos, 
> hbase.security.authentication:kerberos, 
> hbase.ipc.client.fallback-to-simple-auth-allowed:true; keytab and principal 
> are set correctly.
> With a simple-auth hbase cluster and a kerberized hbase client application, an 
> application trying to r/w/c/d a table will get the following exception:
> {code}
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
>   at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
>   at 
> org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:617)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$700(RpcClientImpl.java:162)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:743)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:740)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:740)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:906)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:873)
>   at 
> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1241)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
>   at 
> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
>   at 
> org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58383)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1592)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1530)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1552)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1581)
>   at 
> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1738)
>   at 
> org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
>   at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:134)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4297)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4289)
>   at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsyncV2(HBaseAdmin.java:753)
>   at 
> org.apache.hadoop.hbase.client.HBaseA

[jira] [Created] (HBASE-21166) Creating a CoprocessorHConnection re-retrieves the cluster id from ZK

2018-09-06 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-21166:
-

 Summary: Creating a CoprocessorHConnection re-retrieves the 
cluster id from ZK
 Key: HBASE-21166
 URL: https://issues.apache.org/jira/browse/HBASE-21166
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.5.0
Reporter: Lars Hofhansl


CoprocessorHConnections are created, for example, during a call of 
CoprocessorHost$Environment.getTable(...). The region server already knows the 
cluster id, yet we're resolving it over and over again.
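One direction for a fix can be sketched as simple memoization (illustrative only, not HBase's API): resolve the cluster id once, cache it, and hand the cached value to every subsequently created connection.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: cache the cluster id after the first (expensive) resolution,
// instead of going back to ZooKeeper for every new connection.
class ClusterIdCacheSketch {
    static final AtomicInteger zkLookups = new AtomicInteger();
    private static volatile String cached;

    // Stand-in for the ZooKeeper round trip.
    static String resolveFromZk() {
        zkLookups.incrementAndGet();
        return "cluster-uuid";
    }

    static String clusterId() {
        String id = cached;
        if (id == null) {
            synchronized (ClusterIdCacheSketch.class) {
                if (cached == null) cached = resolveFromZk();
                id = cached;
            }
        }
        return id;
    }
}
```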





[jira] [Resolved] (HBASE-20446) Allow building HBase 1.x against Hadoop 3.1.x

2018-09-06 Thread Lars Hofhansl (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-20446.
---
  Resolution: Fixed
Release Note: Finally committed this.

> Allow building HBase 1.x against Hadoop 3.1.x
> -
>
> Key: HBASE-20446
> URL: https://issues.apache.org/jira/browse/HBASE-20446
> Project: HBase
>  Issue Type: Improvement
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 1.5.0
>
> Attachments: 20446.txt
>
>
> Simple change, just leaving it here in case somebody needs this.





[jira] [Created] (HBASE-21137) After HBASE-20940 any local index query will open all HFiles of every Region involved in the query

2018-08-31 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-21137:
-

 Summary: After HBASE-20940 any local index query will open all 
HFiles of every Region involved in the query
 Key: HBASE-21137
 URL: https://issues.apache.org/jira/browse/HBASE-21137
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


See HBASE-20940.

[~vishk], [~apurtell]





[jira] [Created] (HBASE-21033) Separate StoreHeap from StoreFileHeap

2018-08-09 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-21033:
-

 Summary: Separate StoreHeap from StoreFileHeap
 Key: HBASE-21033
 URL: https://issues.apache.org/jira/browse/HBASE-21033
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl


Currently KeyValueHeap is used for both: heaps of StoreScanners at the Region 
level as well as heaps of StoreFileScanners (and a MemstoreScanner) at the 
Store level.

This causes various problems:
 # Some incorrect method usage can only be detected at runtime via a runtime 
exception.
 # In profiling sessions it's hard to distinguish the two.
 # It's just not clean :)

 





[jira] [Resolved] (HBASE-6562) Fake KVs are sometimes passed to filters

2018-05-12 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6562.
--
Resolution: Fixed

In 1.4+ this should be fixed.

> Fake KVs are sometimes passed to filters
> 
>
> Key: HBASE-6562
> URL: https://issues.apache.org/jira/browse/HBASE-6562
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>Priority: Minor
> Attachments: 6562-0.94-v1.txt, 6562-0.96-v1.txt, 6562-v2.txt, 
> 6562-v3.txt, 6562-v4.txt, 6562-v5.txt, 6562.txt, minimalTest.java
>
>
> In internal tests at Salesforce we found that fake row keys sometimes are 
> passed to filters (Filter.filterRowKey(...) specifically).
> The KVs are eventually filtered by the StoreScanner/ScanQueryMatcher, but the 
> row key is passed to filterRowKey in RegionScannerImpl *before* that happens.





[jira] [Created] (HBASE-20459) Majority of scan time in HBase-1 spent in size estimation

2018-04-19 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-20459:
-

 Summary: Majority of scan time in HBase-1 spent in size estimation
 Key: HBASE-20459
 URL: https://issues.apache.org/jira/browse/HBASE-20459
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
 Attachments: Screenshot_20180419_162559.png





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-20446) Allow building HBase 1.x against Hadoop 3.1.0

2018-04-17 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-20446:
-

 Summary: Allow building HBase 1.x against Hadoop 3.1.0
 Key: HBASE-20446
 URL: https://issues.apache.org/jira/browse/HBASE-20446
 Project: HBase
  Issue Type: Improvement
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 20446.txt





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HBASE-19631) Allow building HBase 1.5.x against Hadoop 3.0.0

2018-01-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-19631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-19631.
---
Resolution: Fixed

Committed to hbase-1 branch.

> Allow building HBase 1.5.x against Hadoop 3.0.0
> ---
>
> Key: HBASE-19631
> URL: https://issues.apache.org/jira/browse/HBASE-19631
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Major
> Fix For: 1.5.0
>
> Attachments: 19631.txt
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HBASE-19631) Allow building HBase 1.5.x against Hadoop 3.0.0

2017-12-26 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-19631:
-

 Summary: Allow building HBase 1.5.x against Hadoop 3.0.0
 Key: HBASE-19631
 URL: https://issues.apache.org/jira/browse/HBASE-19631
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HBASE-15453) [Performance] Considering reverting HBASE-10015 - reinstate synchronized in StoreScanner

2017-12-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15453.
---
Resolution: Won't Fix

Lemme just close. In 1.3+ it's not an issue anyway (the need to synchronize is 
gone there)

> [Performance] Considering reverting HBASE-10015 - reinstate synchronized in 
> StoreScanner
> 
>
> Key: HBASE-15453
> URL: https://issues.apache.org/jira/browse/HBASE-15453
> Project: HBase
>  Issue Type: Improvement
>  Components: Performance
>    Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
>Priority: Critical
> Attachments: 15453-0.98.txt
>
>
> In HBASE-10015 back then I found that intrinsic locks (synchronized) in 
> StoreScanner are slower than explicit locks.
> I was surprised by this. To make sure I added a simple perf test and many 
> folks ran it on their machines. All found that explicit locks were faster.
> Now... I just ran that test again. On the latest JDK8 I find that now the 
> intrinsic locks are significantly faster:
> (OpenJDK Runtime Environment (build 1.8.0_72-b15))
> Explicit locks:
> 10 runs  mean:2223.6 sigma:72.29412147609237
> Intrinsic locks:
> 10 runs  mean:1865.3 sigma:32.63755505548784
> I confirmed the same with timing some Phoenix scans. We can save a bunch of 
> time by changing this back 
> Arrghhh... So maybe it's time to revert this now...?
> (Note that in trunk due to [~ram_krish]'s work, we do not lock in 
> StoreScanner anymore)
> I'll attach the perf test and a patch that changes lock to synchronized, if 
> some folks could run this on 0.98, that'd be great.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (HBASE-13094) Consider Filters that are evaluated before deletes and see delete markers

2017-12-23 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-13094.
---
Resolution: Won't Fix

No interest. Closing.

> Consider Filters that are evaluated before deletes and see delete markers
> -
>
> Key: HBASE-13094
> URL: https://issues.apache.org/jira/browse/HBASE-13094
> Project: HBase
>  Issue Type: Brainstorming
>  Components: regionserver, Scanners
>Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
> Attachments: 13094-0.98.txt
>
>
> That would be good for full control filtering of all cells, such as needed 
> for some transaction implementations.
> [~ghelmling]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19534) Document risks of RegionObserver.preStoreScannerOpen

2017-12-16 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-19534:
-

 Summary: Document risks of RegionObserver.preStoreScannerOpen
 Key: HBASE-19534
 URL: https://issues.apache.org/jira/browse/HBASE-19534
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl


We just had an outage because we used preStoreScannerOpen, which, in our case, 
created a new StoreScanner. In HBase versions before 1.3 this caused a definite 
memory leak, a reference to the old StoreScanner (if not null) would be held by 
the store until the region is closed.

In 1.3 and later there's no such leak, but the old scanner is still not 
properly closed.

This should be added to the Javadoc and the ZooKeeperScanPolicyObserver example 
should be fixed.
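A hedged sketch of the pattern the Javadoc could recommend (the types and names here are illustrative models, not the actual coprocessor API): if a hook returns a replacement scanner, it should close the scanner it replaces first.

```java
public class ReplaceScannerSketch {
    // Minimal stand-in for a scanner; real StoreScanners are also closeable.
    static class TrackingScanner implements AutoCloseable {
        boolean closed;
        @Override public void close() { closed = true; }
    }

    // Returns the replacement, making sure the replaced scanner does not leak.
    static TrackingScanner replace(TrackingScanner old, TrackingScanner replacement)
            throws Exception {
        if (old != null) {
            // Without this, 'old' is held until region close (pre-1.3) or
            // simply never closed properly (1.3 and later).
            old.close();
        }
        return replacement;
    }

    public static void main(String[] args) throws Exception {
        TrackingScanner original = new TrackingScanner();
        TrackingScanner current = replace(original, new TrackingScanner());
        System.out.println(original.closed);
    }
}
```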



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-19458) Allow building HBase 1.3.x against Hadoop 2.8.2

2017-12-07 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-19458:
-

 Summary: Allow building HBase 1.3.x against Hadoop 2.8.2
 Key: HBASE-19458
 URL: https://issues.apache.org/jira/browse/HBASE-19458
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18228) HBCK improvements

2017-06-16 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-18228:
-

 Summary: HBCK improvements
 Key: HBASE-18228
 URL: https://issues.apache.org/jira/browse/HBASE-18228
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


We just had a prod issue and running HBCK the way we did actually caused more 
problems.
In part HBCK did stuff we did not expect, in part we had little visibility into 
what HBCK was doing, and in part the logging was confusing.

I'm proposing 2 improvements:
1. A dry-run mode. Run, and just list what would have been done.
2. An interactive mode. Run, and for each action request Y/N user input. So 
that a user can opt-out of stuff.

[~jmhsieh], FYI
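A toy illustration of the two proposed modes (flag and method names are hypothetical, not actual HBCK options): dry-run logs the action and skips it; interactive asks for a per-action y/N confirmation.

```java
import java.util.Scanner;

public class HbckModeSketch {
    static boolean shouldApply(String action, boolean dryRun, boolean interactive,
                               Scanner in) {
        if (dryRun) {
            // 1. Dry-run mode: report what would have been done, change nothing.
            System.out.println("[dry-run] would: " + action);
            return false;
        }
        if (interactive) {
            // 2. Interactive mode: let the operator opt out of each action.
            System.out.print(action + "? [y/N] ");
            return in.hasNextLine() && in.nextLine().trim().equalsIgnoreCase("y");
        }
        // Current behavior: just do it.
        return true;
    }

    public static void main(String[] args) {
        Scanner answers = new Scanner("y\nn\n");
        System.out.println(shouldApply("reassign region r1", false, true, answers));
        System.out.println(shouldApply("delete znode z1", false, true, answers));
        System.out.println(shouldApply("anything", true, false, answers));
    }
}
```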



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HBASE-18165) Predicate based deletion during major compactions

2017-06-05 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-18165:
-

 Summary: Predicate based deletion during major compactions
 Key: HBASE-18165
 URL: https://issues.apache.org/jira/browse/HBASE-18165
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


In many cases it is expensive to place a delete per version, column, or family.
HBase should have a way to specify a predicate and remove all Cells matching the 
predicate during the next compactions (major and minor).

Nothing more concrete. The tricky part would be to know when it is safe to 
remove the predicate, i.e. when we can be sure that all Cells matching the 
predicate actually have been removed.

Could potentially use HBASE-12859 for that.
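A toy model of the idea, with cells as plain strings and everything illustrative: the compaction rewrite simply drops cells matching the registered predicate, instead of requiring a delete marker per version, column, or family.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class PredicateCompaction {
    // During a compaction rewrite, keep only cells NOT matching the predicate.
    static List<String> rewrite(List<String> cells, Predicate<String> dropIf) {
        return cells.stream()
                .filter(c -> !dropIf.test(c))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> cells =
            Arrays.asList("row1/f:q/ts=1", "row2/f:q/ts=2", "row3/f:q/ts=3");
        // Stand-in predicate: drop the ts=1 cell without writing any delete marker.
        System.out.println(rewrite(cells, c -> c.endsWith("ts=1")));
    }
}
```

The hard part described above remains: knowing when the predicate itself can be retired, i.e. when every matching cell has been rewritten at least once.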



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-18000) Make sure we always return the scanner id with ScanResponse

2017-05-05 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-18000:
-

 Summary: Make sure we always return the scanner id with 
ScanResponse
 Key: HBASE-18000
 URL: https://issues.apache.org/jira/browse/HBASE-18000
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


Some external tooling (like OpenTSDB) relies on the scanner id to tie 
asynchronous responses back to their requests.

(see comments on HBASE-17489)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: [ANNOUNCE] Apache HBase 1.3.1 is now available for download

2017-04-21 Thread lars hofhansl
Hi Mikhail,

I don't see a 1.3.1 tag, yet.

Thanks.

-- Lars

  From: Mikhail Antonov 
 To: "u...@hbase.apache.org" ; "dev@hbase.apache.org" 
 
 Sent: Friday, April 21, 2017 1:02 AM
 Subject: [ANNOUNCE] Apache HBase 1.3.1 is now available for download
   
The HBase team is happy to announce the immediate availability of HBase 1.3.1!

Apache HBase is an open-source, distributed, versioned, non-relational
database. Apache HBase gives you low latency random access to billions of
rows with millions of columns atop non-specialized hardware. To learn more
about HBase, see https://hbase.apache.org/.

HBase 1.3.1 is the first maintenance release in the HBase 1.3.z release
line, continuing on the theme of bringing a stable, reliable database to
the Hadoop and NoSQL communities. This release includes 68 bugfixes and
improvements since the initial 1.3.0 release.

Notable fixes include:

[HBASE-16630] - Fragmentation in long running Bucket Cache
[HBASE-17059] - SimpleLoadBalancer schedules large amount of invalid region
moves
[HBASE-17060] - Compute region locality in parallel at startup
[HBASE-17227] - FSHLog may roll a new writer successfully with unflushed
entries
[HBASE-17265, HBASE-17275] - Region assignment fixes.

And other important fixes, including the areas of load balancing, region
assignment, replication and write-ahead log.

The full list of resolved issues is available at

https://s.apache.org/hbase-1.3.1-jira-releasenotes

Download through an ASF mirror near you:

http://www.apache.org/dyn/closer.lua/hbase/1.3.1

The relevant checksums files are available at:

    https://www.apache.org/dist/hbase/1.3.1/hbase-1.3.1-src.tar.gz.mds
    https://www.apache.org/dist/hbase/1.3.1/hbase-1.3.1-bin.tar.gz.mds

Project members signature keys can be found at

    https://www.apache.org/dist/hbase/KEYS

PGP signatures are available at:

    https://www.apache.org/dist/hbase/1.3.1/hbase-1.3.1-src.tar.gz.asc
    https://www.apache.org/dist/hbase/1.3.1/hbase-1.3.1-bin.tar.gz.asc

For instructions on verifying ASF release downloads, please see

    https://www.apache.org/dyn/closer.cgi#verify

Questions, comments and problems are always welcome at: dev@hbase.apache.org
.

Thank you!
The HBase Dev Team



   

[jira] [Created] (HBASE-17893) Allow HBase to build against Hadoop 2.8.0

2017-04-07 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-17893:
-

 Summary: Allow HBase to build against Hadoop 2.8.0
 Key: HBASE-17893
 URL: https://issues.apache.org/jira/browse/HBASE-17893
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-remote-resources-plugin:1.5:process (default) on 
project hbase-assembly: Error rendering velocity resource. Error invoking 
method 'get(java.lang.Integer)' in java.util.ArrayList at 
META-INF/LICENSE.vm[line 1671, column 8]: InvocationTargetException: Index: 0, 
Size: 0 -> [Help 1]
{code}

Then in the generated LICENSE:
{code}
This product includes Nimbus JOSE+JWT licensed under the The Apache Software 
License, Version 2.0.

${dep.licenses[0].comments}
Please check  this License for acceptability here:

https://www.apache.org/legal/resolved

If it is okay, then update the list named 'non_aggregate_fine' in the 
LICENSE.vm file.
If it isn't okay, then revert the change that added the dependency.

More info on the dependency:

com.nimbusds
nimbus-jose-jwt
3.9

maven central search
g:com.nimbusds AND a:nimbus-jose-jwt AND v:3.9

project website
https://bitbucket.org/connect2id/nimbus-jose-jwt
project source
https://bitbucket.org/connect2id/nimbus-jose-jwt
{code}

Maybe the problem is just that it says: Apache _Software_ License



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-9739) HBaseClient does not behave nicely when the called thread is interrupted

2017-04-07 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-9739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-9739.
--
Resolution: Won't Fix

Old issue. No update. Closing.

> HBaseClient does not behave nicely when the called thread is interrupted
> 
>
> Key: HBASE-9739
> URL: https://issues.apache.org/jira/browse/HBASE-9739
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>
> Just ran into a scenario where HBaseClient became permanently useless after 
> the thread using it was interrupted.
> The problem is here:
> {code}
>   } catch(IOException e) {
> markClosed(e);
> {code}
> In sendParam(...).
> If the IOException is caused by an interrupt we should not close the 
> connection.
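A hedged sketch of the suggested behavior: classify the IOException from sendParam(...) and only mark the connection closed for genuine I/O failures. shouldCloseConnection is a hypothetical helper, not the actual HBase code.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.nio.channels.ClosedByInterruptException;

public class InterruptAwareClose {
    // Interrupt-induced failures should not poison the shared connection;
    // only real I/O errors warrant markClosed(e).
    static boolean shouldCloseConnection(IOException e) {
        return !(e instanceof InterruptedIOException
                 || e instanceof ClosedByInterruptException);
    }

    public static void main(String[] args) {
        System.out.println(shouldCloseConnection(new IOException("broken pipe")));
        System.out.println(shouldCloseConnection(new InterruptedIOException()));
    }
}
```

The caller would additionally restore the thread's interrupt status (Thread.currentThread().interrupt()) in the keep-open branch.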



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-10145) Table creation should proceed in the presence of a stale znode

2017-04-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-10145.
---
Resolution: Won't Fix

Closing old issue.

> Table creation should proceed in the presence of a stale znode
> --
>
> Key: HBASE-10145
> URL: https://issues.apache.org/jira/browse/HBASE-10145
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>Priority: Minor
>
> HBASE-7600 fixed a race condition where concurrent attempts to create the 
> same table could succeed.
> An unfortunate side effect is that it is now impossible to create a table as 
> long as the table's znode is around, which is an issue when a cluster was 
> wiped at the HDFS level.
> Minor issue as we have discussed this many times before, but it ought to be 
> possible to check whether the table directory exists and if not either create 
> it or remove the corresponding znode.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-10028) Cleanup metrics documentation

2017-04-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-10028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-10028.
---
Resolution: Won't Fix

Closing old issue.

> Cleanup metrics documentation
> -
>
> Key: HBASE-10028
> URL: https://issues.apache.org/jira/browse/HBASE-10028
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>
> The current documentation of the metrics is incomplete and at points incorrect 
> (HDFS latencies are in ns rather than ms for example).
> We should clean this up and add other related metrics as well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-6492) Remove Reflection based Hadoop abstractions

2017-04-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6492.
--
Resolution: Won't Fix

Closing old issue.

> Remove Reflection based Hadoop abstractions
> ---
>
> Key: HBASE-6492
> URL: https://issues.apache.org/jira/browse/HBASE-6492
> Project: HBase
>  Issue Type: Improvement
>    Reporter: Lars Hofhansl
>
> In 0.96 we now have the Hadoop1-compat and Hadoop2-compat projects.
> The reflection we're using to deal with different versions of Hadoop should 
> be removed in favour of using the compat projects.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-5475) Allow importtsv and Import to work truly offline when using bulk import option

2017-04-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-5475.
--
Resolution: Won't Fix

Closing old issue.

> Allow importtsv and Import to work truly offline when using bulk import option
> --
>
> Key: HBASE-5475
> URL: https://issues.apache.org/jira/browse/HBASE-5475
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Lars Hofhansl
>Priority: Minor
>
> Currently importtsv (and now also Import with HBASE-5440) support using 
> HFileOutputFormat for later bulk loading.
> However, currently that cannot be done without having access to the table we're 
> going to import to, because both importtsv and Import need to look up the 
> split points and find the compression setting.
> It would be nice if there would be an offline way to provide the split point 
> and compression setting.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-5311) Allow inmemory Memstore compactions

2017-04-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-5311.
--
Resolution: Won't Fix

Closing old issue.

> Allow inmemory Memstore compactions
> ---
>
> Key: HBASE-5311
> URL: https://issues.apache.org/jira/browse/HBASE-5311
> Project: HBase
>  Issue Type: Improvement
>    Reporter: Lars Hofhansl
> Attachments: InternallyLayeredMap.java
>
>
> Just like we periodically compact the StoreFiles we should also periodically 
> compact the MemStore.
> During these compactions we eliminate deleted cells, expired cells, cells to 
> be removed because of version count, etc., before we even do a memstore flush.
> Besides the optimization that we could get from this, it should also allow us 
> to remove the special handling of ICV, Increment, and Append (all of which 
> use upsert logic to avoid accumulating excessive cells in the Memstore).
> Not targeting this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (HBASE-17877) Replace/improve HBase's byte[] comparator

2017-04-04 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-17877:
-

 Summary: Replace/improve HBase's byte[] comparator
 Key: HBASE-17877
 URL: https://issues.apache.org/jira/browse/HBASE-17877
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


[~vik.karma] did some extensive tests and found that Hadoop's version is faster 
- dramatically faster in some cases.

Patch forthcoming.
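For reference, the semantics being optimized, as a plain sketch: the Hadoop version referenced above is faster mainly because it compares eight bytes at a time via sun.misc.Unsafe; this shows only the unsigned lexicographic ordering, not that optimization.

```java
public class ByteCompare {
    // Semantically what HBase's Bytes.compareTo does.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            // Bytes must be compared as unsigned values to match HBase's sort order.
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        // When one array is a prefix of the other, the shorter sorts first.
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // 0xFF must sort after 0x01 even though it is negative as a signed byte.
        System.out.println(compareBytes(new byte[] {(byte) 0xff}, new byte[] {1}) > 0);
    }
}
```

The byte-at-a-time loop is the hot path a long-at-a-time comparator avoids; the two must agree on every input for the swap to be safe.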



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (HBASE-17440) [0.98] Make sure DelayedClosing chore is stopped as soon as an HConnection is closed

2017-01-09 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-17440.
---
   Resolution: Fixed
Fix Version/s: 0.98.25

Done.

> [0.98] Make sure DelayedClosing chore is stopped as soon as an HConnection is 
> closed
> 
>
> Key: HBASE-17440
> URL: https://issues.apache.org/jira/browse/HBASE-17440
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.24
>    Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
> Fix For: 0.98.25
>
> Attachments: 17440.txt
>
>
> We're seeing many issues with run-away ZK client connections in long-running 
> app servers. 10k or more send or event threads occur frequently.
> While looking around in the code I noticed that the DelayedClosing chore is 
> not immediately stopped when an HConnection is closed. When there's an issue 
> with HBase or ZK and clients reconnect in a tight loop, this can temporarily 
> lead to very many threads running. These will all get cleaned out 
> after at most 60s, but during that time a lot of threads can be created.
> The fix is a one-liner. We'll likely file other issues soon.
> Interestingly branch-1 and beyond do not have this chore anymore, although - 
> at least in branch-1 and later - I still see the ZooKeeperAliveConnection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-17440) [0.98] Make sure DelayedClosing chore is stopped as soon as an HConnection is closed

2017-01-09 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-17440:
-

 Summary: [0.98] Make sure DelayedClosing chore is stopped as soon 
as an HConnection is closed
 Key: HBASE-17440
 URL: https://issues.apache.org/jira/browse/HBASE-17440
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl


We're seeing many issues with run-away ZK client connections in long-running app 
servers. 10k or more send or event threads occur frequently.

While looking around in the code I noticed that the DelayedClosing chore is not 
immediately stopped when an HConnection is closed. When there's an issue with 
HBase or ZK and clients reconnect in a tight loop, this can temporarily lead to 
very many threads running. These will all get cleaned out after at most 60s, 
but during that time a lot of threads can be created.

The fix is a one-liner. We'll likely file other issues soon.

Interestingly branch-1 and beyond do not have this chore anymore, although - at 
least in branch-1 and later - I still see the ZooKeeperAliveConnection.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16115) Missing security context in RegionObserver coprocessor when a compaction/split is triggered manually

2016-12-12 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-16115.
---
Resolution: Won't Fix

> Missing security context in RegionObserver coprocessor when a 
> compaction/split is triggered manually
> 
>
> Key: HBASE-16115
> URL: https://issues.apache.org/jira/browse/HBASE-16115
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.20
>    Reporter: Lars Hofhansl
>
> We ran into an interesting phenomenon which can easily render a cluster 
> unusable.
> We loaded some test data into a test table and forced a manual compaction 
> through the UI. We have some compaction hooks implemented in a region 
> observer, which writes back to another HBase table when the compaction 
> finishes. We noticed that this coprocessor is not set up correctly; it seems 
> the security context is missing.
> The interesting part is that this _only_ happens when the compaction is 
> triggered through the UI. Automatic compactions (major or minor) or ones 
> triggered via the HBase shell (following a kinit) work fine. Only the 
> UI-triggered compactions cause these issues and lead to essentially 
> never-ending compactions, immovable regions, etc.
> Not sure what exactly the issue is, but I wanted to make sure I capture this.
> [~apurtell], [~ghelmling], FYI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12433) Coprocessors not dynamically reordered when reset priority

2016-11-11 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-12433.
---
Resolution: Not A Bug

> Coprocessors not dynamically reordered when reset priority
> --
>
> Key: HBASE-12433
> URL: https://issues.apache.org/jira/browse/HBASE-12433
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors
>Affects Versions: 0.98.7
>Reporter: James Taylor
>
> When modifying the coprocessor priority through the HBase shell, the order of 
> the firing of the coprocessors wasn't changing. It probably would have with a 
> cluster bounce, but if we can make it dynamic easily, that would be 
> preferable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-12570) Improve table configuration sanity checking

2016-11-11 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-12570.
---
Resolution: Duplicate

> Improve table configuration sanity checking
> ---
>
> Key: HBASE-12570
> URL: https://issues.apache.org/jira/browse/HBASE-12570
> Project: HBase
>  Issue Type: Umbrella
>Reporter: James Taylor
>
> See PHOENIX-1473. If a split policy class cannot be resolved, your HBase 
> cluster will be brought down, as each region server that successively attempts 
> to open the region will not find the class and will bring itself down.
> One idea to prevent this would be to fail the CREATE TABLE or ALTER TABLE 
> admin call if the split policy class cannot be found.
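A sketch of the proposed guard: resolve the class while validating the admin request, so a typo fails one CREATE/ALTER TABLE call instead of crashing every region server at region-open time. The method name is illustrative, not an actual HBase API.

```java
public class SplitPolicyCheck {
    // Fail fast if the configured split-policy class cannot be loaded.
    static void checkResolvable(String splitPolicyClassName) {
        if (splitPolicyClassName == null) {
            return; // default policy, nothing to check
        }
        try {
            Class.forName(splitPolicyClassName);
        } catch (ClassNotFoundException e) {
            // Rejecting here keeps the bad descriptor out of meta entirely.
            throw new IllegalArgumentException(
                "Split policy class not found: " + splitPolicyClassName, e);
        }
    }

    public static void main(String[] args) {
        checkResolvable("java.util.ArrayList"); // resolvable: passes silently
        try {
            checkResolvable("com.example.NoSuchPolicy");
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected at admin time");
        }
    }
}
```

The same pre-check generalizes to other pluggable classes in a table descriptor (coprocessors, comparators), which is presumably why this issue became an umbrella.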



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-16765) New SteppingRegionSplitPolicy, avoid too aggressive spread of regions for small tables.

2016-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-16765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-16765.
---
   Resolution: Fixed
 Assignee: Lars Hofhansl
Fix Version/s: 1.1.8
   0.98.24
   1.2.4
   1.4.0
   1.3.0
   2.0.0

> New SteppingRegionSplitPolicy, avoid too aggressive spread of regions for 
> small tables.
> ---
>
> Key: HBASE-16765
> URL: https://issues.apache.org/jira/browse/HBASE-16765
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 0.98.24, 1.1.8
>
> Attachments: 16765-0.98.txt
>
>
> We just did some experiments on some larger clusters and found that while 
> using IncreasingToUpperBoundRegionSplitPolicy generally works well and is 
> very convenient, it does tend to produce too many regions.
> Since the logic is - by design - local, checking the number of regions of the 
> table in question on the local server only, we end up with more regions than 
> necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-14613) Remove MemStoreChunkPool?

2016-10-06 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14613.
---
Resolution: Won't Fix

> Remove MemStoreChunkPool?
> -
>
> Key: HBASE-14613
> URL: https://issues.apache.org/jira/browse/HBASE-14613
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Lars Hofhansl
>Priority: Minor
> Attachments: 14613-0.98.txt, gc.png, writes.png
>
>
> I just stumbled across MemStoreChunkPool. The idea behind it is to reuse chunks 
> of allocations rather than letting the GC handle this.
> Now, it's off by default, and it seems to me to be of dubious value. I'd 
> recommend just removing it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16765) Improve IncreasingToUpperBoundRegionSplitPolicy

2016-10-04 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-16765:
-

 Summary: Improve IncreasingToUpperBoundRegionSplitPolicy
 Key: HBASE-16765
 URL: https://issues.apache.org/jira/browse/HBASE-16765
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


We just did some experiments on some larger clusters and found that while using 
IncreasingToUpperBoundRegionSplitPolicy generally works well and is very 
convenient, it does tend to produce too many regions.

Since the logic is - by design - local, checking the number of regions of the 
table in question on the local server only, we end up with more regions than 
necessary.
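A rough illustration of why the local-only logic over-splits, assuming the commonly described shape of the policy (split size ~ min(maxFileSize, initialSize * regionsOnThisServer^3)); the constants are illustrative and this is not the actual HBase code.

```java
public class SplitSizeSketch {
    // Each server sees only its own region count for the table, so a table
    // spread over many servers keeps splitting at small sizes on every server.
    static long splitSizeBytes(long initialSize, long maxFileSize, int regionsOnServer) {
        if (regionsOnServer == 0) {
            return maxFileSize;
        }
        long cubed = (long) regionsOnServer * regionsOnServer * regionsOnServer;
        return Math.min(maxFileSize, initialSize * cubed);
    }

    public static void main(String[] args) {
        long initial = 256L << 20; // e.g. 2 x a 128 MB flush size (illustrative)
        long max = 10L << 30;      // 10 GB max store file size (illustrative)
        for (int count = 1; count <= 4; count++) {
            System.out.println(count + " region(s) on server -> split at "
                + (splitSizeBytes(initial, max, count) >> 20) + " MB");
        }
    }
}
```

With N servers each holding one region of the table, every server still splits at the small first-step threshold, which is the over-splitting described above; a stepping policy jumps straight to the max size after the first split.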



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-15059) Allow 0.94 to compile against Hadoop 2.7.x

2016-07-18 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15059.
---
   Resolution: Won't Fix
Fix Version/s: (was: 0.94.28)

> Allow 0.94 to compile against Hadoop 2.7.x
> --
>
> Key: HBASE-15059
> URL: https://issues.apache.org/jira/browse/HBASE-15059
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Attachments: 15059-addendum.txt, 15059-v2.txt, 15059.txt
>
>
> Currently HBase 0.94 cannot be compiled against Hadoop 2.7.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-15431) A bunch of methods are hot and too big to be inlined

2016-07-18 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15431.
---
Resolution: Invalid

Giving up on this.

> A bunch of methods are hot and too big to be inlined
> 
>
> Key: HBASE-15431
> URL: https://issues.apache.org/jira/browse/HBASE-15431
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
> Attachments: hotMethods.txt
>
>
> I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
> -XX:+PrintInlining" and then looked for "hot method too big" log lines.
> I'll attach a log of those messages.
> I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods 
> (as long as they're hot), but actually didn't see any improvement.
> In all cases I primed the JVM to make sure the JVM gets a chance to profile 
> the methods and decide whether they're hot or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-16115) Missing security context in RegionObserver coprocessor when a compaction is triggered through the UI

2016-06-25 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-16115:
-

 Summary: Missing security context in RegionObserver coprocessor 
when a compaction is triggered through the UI
 Key: HBASE-16115
 URL: https://issues.apache.org/jira/browse/HBASE-16115
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


We ran into an interesting phenomenon which can easily render a cluster 
unusable.

We loaded some test data into a test table and forced a manual compaction 
through the UI. We have some compaction hooks implemented in a region observer, 
which writes back to another HBase table when the compaction finishes. We 
noticed that this coprocessor is not set up correctly; it seems the security 
context is missing.

The interesting part is that this _only_ happens when the compaction is 
triggered through the UI. Automatic compactions (major or minor) or ones 
triggered via the HBase shell (following a kinit) work fine. Only the 
UI-triggered compactions cause these issues and lead to essentially never-ending 
compactions, immovable regions, etc.

Not sure what exactly the issue is, but I wanted to make sure I capture this.

[~apurtell], [~ghelmling], FYI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HBASE-15881) Allow BZIP2 compression

2016-05-23 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15881:
-

 Summary: Allow BZIP2 compression
 Key: HBASE-15881
 URL: https://issues.apache.org/jira/browse/HBASE-15881
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


BZIP2 is a very efficient compressor in terms of compression rate.
Compression speed is very slow; decompression is equivalent to or faster than 
GZIP.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-14 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15452.
---
Resolution: Invalid

NM... I'm full of @#%^

> Consider removing checkScanOrder from StoreScanner.next
> ---
>
> Key: HBASE-15452
> URL: https://issues.apache.org/jira/browse/HBASE-15452
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
> Attachments: 15452-0.98.txt
>
>
> In looking why we spent so much time in StoreScanner.next when doing a simple 
> Phoenix count\(*) query I came across checkScanOrder. Not only is this a 
> function dispatch (that the JIT would eventually inline), it also requires 
> setting the prevKV member for every Cell encountered.
> Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
> 0.98).
> I will repeat this test on my work machine tomorrow.
> I think we're stable enough to remove that check anyway.





[jira] [Created] (HBASE-15453) Considering reverting HBASE-10015 - reinstate synchronized in StoreScanner

2016-03-13 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15453:
-

 Summary: Considering reverting HBASE-10015 - reinstate 
synchronized in StoreScanner
 Key: HBASE-15453
 URL: https://issues.apache.org/jira/browse/HBASE-15453
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


In HBASE-10015, back then, I found that intrinsic locks (synchronized) in 
StoreScanner are slower than explicit locks.

I was surprised by this. To make sure I added a simple perf test and many folks 
ran it on their machines. All found that explicit locks were faster.

Now... I just ran that test again. On the latest JDK8 I find that now the 
intrinsic locks are significantly faster:

Explicit locks:
10 runs  mean:2223.6 sigma:72.29412147609237

Intrinsic locks:
10 runs  mean:1865.3 sigma:32.63755505548784

I confirmed the same with timing some Phoenix scans. We can save a bunch of 
time by changing this back.

Arrghhh... So maybe it's time to revert this now...?

(Note that in trunk due to [~ram_krish]'s work, we do not lock in StoreScanner 
anymore)
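A perf test of this shape can be sketched as follows. This is an illustrative micro-benchmark under stated assumptions, not the attached test; the class name, iteration count, and structure are all hypothetical:

```java
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative comparison of synchronized vs. ReentrantLock; not the attached perf test. */
public class LockBench {
    private static final ReentrantLock LOCK = new ReentrantLock();
    static long counter; // shared state touched inside the critical section

    /** Time n increments guarded by an explicit ReentrantLock; returns elapsed nanos. */
    static long explicitLock(int n) {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            LOCK.lock();
            try { counter++; } finally { LOCK.unlock(); }
        }
        return System.nanoTime() - start;
    }

    /** Time n increments guarded by an intrinsic monitor; returns elapsed nanos. */
    static long intrinsicLock(int n) {
        Object monitor = LockBench.class;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            synchronized (monitor) { counter++; }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 5_000_000;
        explicitLock(n); intrinsicLock(n); // warm-up so the JIT compiles both paths
        System.out.println("explicit:  " + explicitLock(n) / 1e6 + " ms");
        System.out.println("intrinsic: " + intrinsicLock(n) / 1e6 + " ms");
        // Which one wins varies by JDK version, which is the point of this JIRA.
    }
}
```

As the JIRA notes, the result flipped between JDK releases, so any such numbers are only meaningful for the JVM they were measured on.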

I'll attach the perf test and a patch that changes lock to synchronized, if 
some folks could run this on 0.98, that'd be great.






[jira] [Created] (HBASE-15452) Consider removing checkScanOrder from StoreScanner.next

2016-03-13 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15452:
-

 Summary: Consider removing checkScanOrder from StoreScanner.next
 Key: HBASE-15452
 URL: https://issues.apache.org/jira/browse/HBASE-15452
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


In looking at why we spent so much time in StoreScanner.next when doing a simple 
Phoenix count\(*) query I came across checkScanOrder. Not only is this a 
function dispatch (that the JIT would eventually inline), it also requires 
setting the prevKV member for every Cell encountered.

Removing that logic yields a measurable end-to-end improvement of 5-20% (in 
0.98).
I will repeat this test on my work machine tomorrow.

I think we're stable enough to remove that check anyway.






[jira] [Created] (HBASE-15431) A bunch of methods are hot and too big to be inline

2016-03-08 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15431:
-

 Summary: A bunch of methods are hot and too big to be inline
 Key: HBASE-15431
 URL: https://issues.apache.org/jira/browse/HBASE-15431
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


I ran HBase with "-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions 
-XX:+PrintInlining" and then looked for "hot method too big" log lines.

I'll attach a log of those messages.
I tried to increase -XX:FreqInlineSize to 1010 to inline all these methods (as 
long as they're hot), but actually didn't see any improvement.






[jira] [Resolved] (HBASE-13068) Unexpected client exception with slow scan

2016-01-10 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-13068.
---
Resolution: Cannot Reproduce

Closing this old issue.

> Unexpected client exception with slow scan
> --
>
> Key: HBASE-13068
> URL: https://issues.apache.org/jira/browse/HBASE-13068
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.98.10.1
>Reporter: Lars Hofhansl
>
> I just came across an interesting exception:
> {code}
> Caused by: java.io.IOException: Call 10 not added as the connection 
> newbunny/127.0.0.1:60020/ClientService/lars (auth:SIMPLE)/6 is closing
> at 
> org.apache.hadoop.hbase.ipc.RpcClient$Connection.addCall(RpcClient.java:495)
> at 
> org.apache.hadoop.hbase.ipc.RpcClient.getConnection(RpcClient.java:1534)
> at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1442)
> ... 13 more
> {code}
> Called from here:
> {code}
> at 
> org.apache.hadoop.hbase.client.ScannerCallable.close(ScannerCallable.java:291)
> at 
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:160)
> at 
> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:115)
> at 
> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:91)
> at 
> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:247)
> {code}
> This happened when I scanned with multiple clients against a single region 
> server when all data is filtered at the server by a filter.
> I had 10 clients, the region server has 30 handlers.
> This means the scanners are not getting closed and their lease has to expire.
> The workaround is to increase hbase.ipc.client.connection.maxidletime.
> But it's strange that this *only* happens at close time. And since I am not 
> using up all handlers there shouldn't be any starvation.





[jira] [Created] (HBASE-15084) Remove references to repository.codehaus.org

2016-01-08 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15084:
-

 Summary: Remove references to repository.codehaus.org
 Key: HBASE-15084
 URL: https://issues.apache.org/jira/browse/HBASE-15084
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.27
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.28








[jira] [Resolved] (HBASE-15084) Remove references to repository.codehaus.org

2016-01-08 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15084.
---
Resolution: Fixed

Committed to 0.94.

> Remove references to repository.codehaus.org
> 
>
> Key: HBASE-15084
> URL: https://issues.apache.org/jira/browse/HBASE-15084
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.27
>Reporter: Lars Hofhansl
>    Assignee: Lars Hofhansl
> Fix For: 0.94.28
>
>
> repository.codehaus.org is no longer active.
> A dns-lookup reveals an alias to stop-looking-at.repository-codehaus-org :)
> All repositories have been moved to Maven central, so it can just be removed 
> from the pom.





[jira] [Resolved] (HBASE-14213) Ensure ASF policy compliant headers and correct LICENSE and NOTICE files in artifacts for 0.94

2016-01-07 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14213.
---
Resolution: Fixed

Committed to 0.94.

Thanks again. [~busbey]

> Ensure ASF policy compliant headers and correct LICENSE and NOTICE files in 
> artifacts for 0.94
> --
>
> Key: HBASE-14213
> URL: https://issues.apache.org/jira/browse/HBASE-14213
> Project: HBase
>  Issue Type: Task
>  Components: build
>Reporter: Nick Dimiduk
>Assignee: Sean Busbey
>Priority: Blocker
> Fix For: 0.94.28
>
> Attachments: 14213-LICENSE.txt, 14213-combined.txt, 14213-part1.txt, 
> 14213-part2.txt, 14213-part3.sh, 14213-part4.sh, 14213-part5.sh, 
> HBASE-14213.1.0.94.patch
>
>
> From tail of thread on HBASE-14085, opening a backport ticket for 0.94. Took 
> the liberty of assigning to [~busbey].





[jira] [Resolved] (HBASE-14213) Ensure ASF policy compliant headers and correct LICENSE and NOTICE files in artifacts for 0.94

2016-01-04 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14213.
---
Resolution: Won't Fix

0.94 is EOL'd. We're doing a final release, no more versions after that.

> Ensure ASF policy compliant headers and correct LICENSE and NOTICE files in 
> artifacts for 0.94
> --
>
> Key: HBASE-14213
> URL: https://issues.apache.org/jira/browse/HBASE-14213
> Project: HBase
>  Issue Type: Task
>  Components: build
>Reporter: Nick Dimiduk
>Assignee: Sean Busbey
>Priority: Blocker
>
> From tail of thread on HBASE-14085, opening a backport ticket for 0.94. Took 
> the liberty of assigning to [~busbey].





[jira] [Resolved] (HBASE-15054) Allow 0.94 to compile with JDK8

2016-01-04 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15054.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

> Allow 0.94 to compile with JDK8
> ---
>
> Key: HBASE-15054
> URL: https://issues.apache.org/jira/browse/HBASE-15054
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.28
>
> Attachments: 15054.txt
>
>
> Fix the following two problems:
> # PoolMap
> # InputSampler





[jira] [Resolved] (HBASE-15059) Allow 0.94 to compile against Hadoop 2.7.x

2016-01-04 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-15059.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Pushed to 0.94 only
(added a comment into the pom about the extra build steps)

> Allow 0.94 to compile against Hadoop 2.7.x
> --
>
> Key: HBASE-15059
> URL: https://issues.apache.org/jira/browse/HBASE-15059
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.28
>
> Attachments: 15059-v2.txt, 15059.txt
>
>
> Currently HBase 0.94 cannot be compiled against Hadoop 2.7.





[jira] [Created] (HBASE-15054) Allow 0.94 to compile with JDK8

2015-12-29 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-15054:
-

 Summary: Allow 0.94 to compile with JDK8
 Key: HBASE-15054
 URL: https://issues.apache.org/jira/browse/HBASE-15054
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


Fix the following two problems:
# PoolMap
# InputSampler






[jira] [Resolved] (HBASE-14940) Make our unsafe based ops more safe

2015-12-24 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14940.
---
  Resolution: Fixed
Release Note: Pushed to 0.98 only.

> Make our unsafe based ops more safe
> ---
>
> Key: HBASE-14940
> URL: https://issues.apache.org/jira/browse/HBASE-14940
> Project: HBase
>  Issue Type: Bug
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.3, 0.98.17, 1.0.4
>
> Attachments: HBASE-14940.patch, HBASE-14940_addendum_0.98.patch, 
> HBASE-14940_branch-1.patch, HBASE-14940_branch-1.patch, 
> HBASE-14940_branch-1.patch, HBASE-14940_branch-1.patch, HBASE-14940_v2.patch
>
>
> Thanks for the nice findings [~ikeda]
> This jira solves 3 issues with Unsafe operations and ByteBufferUtils
> 1. We can do sun Unsafe based reads and writes iff the unsafe package is 
> available and the underlying platform has unaligned-access capability. But 
> we were missing the second check.
> 2. Java NIO does a chunk based copy when doing Unsafe copyMemory. The 
> max chunk size is 1 MB. This is done because "A limit is imposed to allow for 
> safepoint polling during a large copy", as mentioned in the comments in Bits.java. 
> We are going to do it the same way.
> 3. In ByteBufferUtils, when Unsafe is not available and ByteBuffers are off 
> heap, we were doing byte by byte operations (read/copy). We can avoid this and 
> do better.
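Point 2, capping each low-level copy at a fixed chunk size so safepoint polls aren't starved, can be sketched generically. The 1 MB constant mirrors the Bits.java comment; the `ChunkedCopy` class is hypothetical and `System.arraycopy` stands in for the real `Unsafe.copyMemory` path:

```java
import java.util.Arrays;

/** Illustrative chunked copy: cap each low-level copy so a large copy cannot delay safepoints. */
public class ChunkedCopy {
    static final int CHUNK = 1024 * 1024; // 1 MB max per copy, as in java.nio.Bits

    public static void copy(byte[] src, int srcOff, byte[] dst, int dstOff, int len) {
        while (len > 0) {
            int n = Math.min(len, CHUNK);
            System.arraycopy(src, srcOff, dst, dstOff, n); // stands in for Unsafe.copyMemory
            srcOff += n;
            dstOff += n;
            len -= n;
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[3 * CHUNK + 7]; // deliberately not a multiple of CHUNK
        for (int i = 0; i < src.length; i++) src[i] = (byte) i;
        byte[] dst = new byte[src.length];
        copy(src, 0, dst, 0, src.length);
        System.out.println(Arrays.equals(src, dst)); // true
    }
}
```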





[jira] [Resolved] (HBASE-14777) Fix Inter Cluster Replication Future ordering issues

2015-11-25 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14777.
---
Resolution: Fixed

Pushed to all branches. Thanks for bearing with me.

> Fix Inter Cluster Replication Future ordering issues
> 
>
> Key: HBASE-14777
> URL: https://issues.apache.org/jira/browse/HBASE-14777
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Bhupendra Kumar Jain
>Assignee: Ashu Pachauri
>Priority: Critical
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14777-alternative.txt, HBASE-14777-1.patch, 
> HBASE-14777-2.patch, HBASE-14777-3.patch, HBASE-14777-4.patch, 
> HBASE-14777-5.patch, HBASE-14777-6.patch, HBASE-14777-addendum.patch, 
> HBASE-14777.patch
>
>
> Replication fails with IndexOutOfBoundsException 
> {code}
> regionserver.ReplicationSource$ReplicationSourceWorkerThread(939): 
> org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint
>  threw unknown exception:java.lang.IndexOutOfBoundsException: Index: 1, Size: 
> 1
>   at java.util.ArrayList.rangeCheck(Unknown Source)
>   at java.util.ArrayList.remove(Unknown Source)
>   at 
> org.apache.hadoop.hbase.replication.regionserver.HBaseInterClusterReplicationEndpoint.replicate(HBaseInterClusterReplicationEndpoint.java:222)
> {code}
> It's happening due to incorrect removal of entries from the replication 
> entries list. 





[jira] [Created] (HBASE-14869) Better request latency histograms

2015-11-21 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14869:
-

 Summary: Better request latency histograms
 Key: HBASE-14869
 URL: https://issues.apache.org/jira/browse/HBASE-14869
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


I just discussed this with a colleague.
The get, put, etc. histograms that each region server keeps are somewhat 
useless (depending on what you want to achieve, of course), as they are 
aggregated and calculated by each region server.

It would be better to record the number of requests in certain latency bands 
in addition to what we do now.
For example the number of gets that took 0-5ms, 6-10ms, 10-20ms, 20-50ms, 
50-100ms, 100-1000ms, > 1000ms, etc. (just as an example; this should be 
configurable).

That way we can do further calculations after the fact, and answer questions 
like: How often did we miss our SLA? Percentage of requests that missed an SLA, 
etc.
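A minimal sketch of such banded counters (the class and method names are hypothetical, not actual HBase metrics code; the band boundaries are the example ones above):

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch of configurable latency bands; not actual HBase metrics code. */
public class LatencyBands {
    // Inclusive upper bounds (ms) of each band; one extra band catches everything larger.
    private final long[] boundsMs;
    private final AtomicLongArray counts;

    public LatencyBands(long... boundsMs) {
        this.boundsMs = boundsMs.clone();
        this.counts = new AtomicLongArray(boundsMs.length + 1); // +1 overflow band
    }

    /** Index of the band a latency falls into (binary search over the bounds). */
    int bandOf(long latencyMs) {
        int i = Arrays.binarySearch(boundsMs, latencyMs);
        return i >= 0 ? i : -i - 1; // exact hit lands in that band; else insertion point
    }

    public void record(long latencyMs) {
        counts.incrementAndGet(bandOf(latencyMs));
    }

    /** Fraction of requests slower than the given band's upper bound, e.g. for SLA checks. */
    public double fractionAboveBand(int band) {
        long total = 0, above = 0;
        for (int i = 0; i < counts.length(); i++) {
            total += counts.get(i);
            if (i > band) above += counts.get(i);
        }
        return total == 0 ? 0.0 : (double) above / total;
    }

    public static void main(String[] args) {
        LatencyBands bands = new LatencyBands(5, 10, 20, 50, 100, 1000);
        for (long l : new long[] {3, 7, 15, 40, 250, 2000}) bands.record(l);
        System.out.println(bands.fractionAboveBand(3)); // share of requests slower than 50ms
    }
}
```

Because only integer counters are kept, the per-band counts can be aggregated across region servers after the fact, which is exactly what the averaged histograms above cannot do.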

Comments?





[jira] [Created] (HBASE-14791) [0.98] CopyTable is extremely slow when moving delete markers

2015-11-10 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14791:
-

 Summary: [0.98] CopyTable is extremely slow when moving delete 
markers
 Key: HBASE-14791
 URL: https://issues.apache.org/jira/browse/HBASE-14791
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.16
Reporter: Lars Hofhansl


We found that some of our copy table jobs run for many hours, even when there 
isn't that much data to copy.

[~vik.karma] did his magic and found that the issue is with copying delete markers 
(we use raw mode to also move deletes across).
Looking at the code in 0.98 it's immediately obvious that deletes (unlike puts) 
are not batched and hence are sent to the other side one by one, causing a network 
RTT for each delete marker.

Looks like trunk is doing the right thing (using BufferedMutators for all 
mutations in TableOutputFormat). So this is likely only a 0.98 (and 1.0, 1.1, 
1.2?) issue.
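The fix amounts to buffering mutations and flushing them in batches, one RPC per batch instead of one per delete marker. A generic sketch of that pattern (the `BatchingSink` class is hypothetical; in HBase the BufferedMutator mentioned above plays this role for real tables):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of write batching; HBase's BufferedMutator is the real equivalent. */
public class BatchingSink<M> {
    private final int batchSize;
    private final Consumer<List<M>> flushRpc; // one "network round trip" per flushed batch
    private final List<M> buffer = new ArrayList<>();
    private int rpcCount = 0;

    public BatchingSink(int batchSize, Consumer<List<M>> flushRpc) {
        this.batchSize = batchSize;
        this.flushRpc = flushRpc;
    }

    /** Buffer a mutation (put or delete alike); flush when the batch is full. */
    public void mutate(M mutation) {
        buffer.add(mutation);
        if (buffer.size() >= batchSize) flush();
    }

    public void flush() {
        if (buffer.isEmpty()) return;
        flushRpc.accept(new ArrayList<>(buffer)); // single round trip for the whole batch
        rpcCount++;
        buffer.clear();
    }

    public int rpcCount() { return rpcCount; }

    public static void main(String[] args) {
        BatchingSink<String> sink = new BatchingSink<>(100, batch -> { /* network send */ });
        for (int i = 0; i < 1000; i++) sink.mutate("delete-" + i);
        sink.flush();
        System.out.println(sink.rpcCount()); // 10 round trips instead of 1000
    }
}
```

With per-mutation RPCs the cost is one network RTT per delete marker, which is exactly the slowdown described above; batching amortizes the RTT over the batch size.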





[jira] [Created] (HBASE-14657) Remove unneeded API from EncodedSeeker

2015-10-20 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14657:
-

 Summary: Remove unneeded API from EncodedSeeker
 Key: HBASE-14657
 URL: https://issues.apache.org/jira/browse/HBASE-14657
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
 Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3


See parent. We do not need getKeyValueBuffer. It's only used for tests, and 
the parent patch fixes all tests to use getKeyValue instead.





[jira] [Created] (HBASE-14628) Save object creation for scanning with FAST_DIFF encoding

2015-10-16 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14628:
-

 Summary: Save object creation for scanning with FAST_DIFF encoding
 Key: HBASE-14628
 URL: https://issues.apache.org/jira/browse/HBASE-14628
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


I noticed that (at least in 0.98 - master is entirely different) we create a 
ByteBuffer just to create a byte[], which is then used to create a KeyValue.

We can save the creation of the ByteBuffer, and hence save allocating an extra 
object for each KV we find, by creating the byte[] directly.

In a Phoenix count\(*) query that saved about 10% of runtime.






[jira] [Created] (HBASE-14613) Remove MemStoreChunkPool?

2015-10-14 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14613:
-

 Summary: Remove MemStoreChunkPool?
 Key: HBASE-14613
 URL: https://issues.apache.org/jira/browse/HBASE-14613
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl
Priority: Minor


I just stumbled across MemStoreChunkPool. The idea behind it is to reuse chunks 
of allocations rather than letting the GC handle this.

Now, it's off by default, and it seems to me to be of dubious value. I'd 
recommend just removing it.





[jira] [Created] (HBASE-14549) Simplify scanner stack reset logic

2015-10-03 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14549:
-

 Summary: Simplify scanner stack reset logic
 Key: HBASE-14549
 URL: https://issues.apache.org/jira/browse/HBASE-14549
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl


Looking at the code, I find that the logic is unnecessarily complex.
We indicate in updateReaders that the scanner stack needs to be reset. Then 
almost all StoreScanner (and derived class) methods need to check and 
actually reset the scanner stack.
Compactions are rare; we should reset the scanner stack in updateReaders, and 
hence avoid needing to check in all methods.

Patch forthcoming.





[jira] [Created] (HBASE-14539) Slight improvement of StoreScanner.optimize

2015-10-02 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14539:
-

 Summary: Slight improvement of StoreScanner.optimize
 Key: HBASE-14539
 URL: https://issues.apache.org/jira/browse/HBASE-14539
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
Priority: Minor


While looking at the code I noticed that StoreScanner.optimize does some 
unnecessary work. This is a very tight loop, and even just looking up a 
reference can throw off the CPU's cache lines. Fixing this saves a few percent 
of performance (not a lot, though).






[jira] [Resolved] (HBASE-14539) Slight improvement of StoreScanner.optimize

2015-10-02 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14539.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.98.x, 1.0.x, 1.1.x, 1.2.x, 1.3, and 2.0.

> Slight improvement of StoreScanner.optimize
> ---
>
> Key: HBASE-14539
> URL: https://issues.apache.org/jira/browse/HBASE-14539
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>Priority: Minor
> Fix For: 2.0.0, 1.3.0, 1.2.1, 1.0.3, 1.1.3, 0.98.15
>
> Attachments: 14539-0.98.txt
>
>
> While looking at the code I noticed that StoreScanner.optimize does some 
> unnecessary work. This is a very tight loop, and even just looking up a 
> reference can throw off the CPU's cache lines. Fixing this saves a few percent 
> of performance (not a lot, though).





[jira] [Created] (HBASE-14509) Configurable sparse indexes?

2015-09-29 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14509:
-

 Summary: Configurable sparse indexes?
 Key: HBASE-14509
 URL: https://issues.apache.org/jira/browse/HBASE-14509
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl


This idea just popped up today and I wanted to record it for discussion:
What if we kept sparse column indexes per region or HFile or per configurable 
range?

I.e. For any given CQ we record the lowest and highest value for a particular 
range (HFile, Region, or a custom range like the Phoenix guide post).

By tweaking the size of these ranges we can control the size of the index, vs 
its selectivity.

For example if we kept it by HFile we could almost instantly decide whether we 
need to scan a particular HFile at all to find a particular value in a Cell.

We can also collect min/max values for each n MB of data, for example when we 
scan the region the first time. Assuming ranges are large enough we can always 
keep the index in memory together with the region.

Kind of a sparse local index. Might be much easier than the buddy region stuff 
we've been discussing.
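A minimal sketch of the per-range min/max pruning described above (all names are hypothetical; a real implementation would persist the pairs in HFile metadata and compare byte[] values, not longs):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a sparse min/max index: one (min, max) pair per range of data. */
public class SparseMinMaxIndex {
    static final class Range {
        final String rangeId;  // e.g. an HFile name or a Phoenix-style guidepost
        final long min, max;   // lowest/highest column value observed in this range
        Range(String id, long min, long max) { this.rangeId = id; this.min = min; this.max = max; }
    }

    private final List<Range> ranges = new ArrayList<>();

    /** Record the min/max observed while writing (or first scanning) a range. */
    public void addRange(String id, long min, long max) {
        ranges.add(new Range(id, min, max));
    }

    /** Only ranges whose [min, max] interval can contain the value need to be scanned. */
    public List<String> rangesToScan(long value) {
        List<String> hits = new ArrayList<>();
        for (Range r : ranges) {
            if (value >= r.min && value <= r.max) hits.add(r.rangeId);
        }
        return hits;
    }

    public static void main(String[] args) {
        SparseMinMaxIndex idx = new SparseMinMaxIndex();
        idx.addRange("hfile-1", 10, 50);
        idx.addRange("hfile-2", 40, 90);
        idx.addRange("hfile-3", 200, 300);
        System.out.println(idx.rangesToScan(45)); // [hfile-1, hfile-2]; hfile-3 is pruned
    }
}
```

The size/selectivity trade-off mentioned above falls out directly: smaller ranges mean more (min, max) pairs but tighter intervals, so fewer false-positive ranges to scan.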





[jira] [Resolved] (HBASE-14489) postScannerFilterRow consumes a lot of CPU

2015-09-26 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14489.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

> postScannerFilterRow consumes a lot of CPU
> --
>
> Key: HBASE-14489
> URL: https://issues.apache.org/jira/browse/HBASE-14489
> Project: HBase
>  Issue Type: Bug
>    Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
>  Labels: performance
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: 14489-0.98.txt, 14489-master.txt
>
>
> During an unrelated test I found that when scanning a tall table with CQ only 
> and filtering most results at the server, 50%(!) of the time is spent in 
> postScannerFilterRow, even though the coprocessor does nothing in that hook.
> We need to find a way not to call this hook when not needed, or to question 
> why we have this hook at all.
> I think [~ram_krish] added the hook (or maybe [~anoop.hbase]). I am also not 
> sure whether Phoenix uses this hook ([~giacomotaylor]?)





[jira] [Created] (HBASE-14489) postScannerFilterRow consumes a lot of CPU

2015-09-24 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14489:
-

 Summary: postScannerFilterRow consumes a lot of CPU
 Key: HBASE-14489
 URL: https://issues.apache.org/jira/browse/HBASE-14489
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


During an unrelated test I found that when scanning a tall table with CQ only 
and filtering most results at the server, 50%(!) of the time is spent in 
postScannerFilterRow, even though the coprocessor does nothing in that hook.

We need to find a way not to call this hook when not needed, or to question why 
we have this hook at all.

I think [~ram_krish] added the hook (or maybe [~anoop.hbase]). I am also not 
sure whether Phoenix uses this hook ([~giacomotaylor]?)






[jira] [Resolved] (HBASE-14418) Make better SEEK vs SKIP decision with seek hints.

2015-09-12 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14418.
---
   Resolution: Invalid
Fix Version/s: (was: 0.98.12)
   (was: 1.1.0)
   (was: 1.0.1)
   (was: 2.0.0)

Yeah. Creating the Cell just to return the hint is what's taking up more time 
than the actual seek. Closing for now. I might revisit this.

[~giacomotaylor], FYI.


> Make better SEEK vs SKIP decision with seek hints.
> --
>
> Key: HBASE-14418
> URL: https://issues.apache.org/jira/browse/HBASE-14418
> Project: HBase
>  Issue Type: Sub-task
>    Reporter: Lars Hofhansl
> Attachments: 13109.txt, 14418-0.98.txt
>
>
> Continuation of parent.
> We can also do this optimization for seek hints. This would allow filters and 
> coprocessors to be more liberal with seek hints, as seeking when unnecessary 
> is less of a perf detriment.
> It's not quite as clear cut, since in order to check, we actually do need to 
> create the seek hint Cell. Then when we actually seek, we need to create it 
> again. Need to test carefully.





[jira] [Created] (HBASE-14418) Make better SEEK vs SKIP decision with seek hints.

2015-09-11 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14418:
-

 Summary: Make better SEEK vs SKIP decision with seek hints.
 Key: HBASE-14418
 URL: https://issues.apache.org/jira/browse/HBASE-14418
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl


Continuation of parent.
We can also do this optimization for seek hints. This would allow filters and 
coprocessors to be more liberal with seek hints, as seeking when unnecessary is 
less of a perf detriment.

It's not quite as clear cut, since in order to check, we actually do need to 
create the seek hint Cell. Then when we actually seek, we need to create it 
again. Need to test carefully.





[jira] [Created] (HBASE-14364) hlog_roll and compact_rs broken in shell

2015-09-03 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14364:
-

 Summary: hlog_roll and compact_rs broken in shell
 Key: HBASE-14364
 URL: https://issues.apache.org/jira/browse/HBASE-14364
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.14
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl


Just noticed that both hlog_roll and compact_rs are broken in shell (at least 
in 0.98).

The hlog_roll command is broken in three ways: (1) it calls admin.rollWALWriter, 
which no longer exists, (2) it tries to pass a ServerName, but the method takes a 
string, and (3) it uses an unqualified ServerName to get a server name, which 
leads to an uninitialized constant error.

compact_rs only has the latter problem.
Patch upcoming.





[jira] [Resolved] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-28 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-14315.
---
Resolution: Fixed

Committed to every branch on the planet.

> Save one call to KeyValueHeap.peek per row
> --
>
> Key: HBASE-14315
> URL: https://issues.apache.org/jira/browse/HBASE-14315
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 2.0.0, 1.1.2, 1.3.0, 0.98.15, 1.2.1, 1.0.3
>
> Attachments: 14315-0.98.txt, 14315-master.txt
>
>
> Another one of my micro optimizations.
> In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, 
> which in my runs of scan heavy loads shows up at the top.
> Based on the run and data this can save between 3 and 10% of runtime.





[jira] [Created] (HBASE-14315) Save one call to KeyValueHeap.peek per row

2015-08-26 Thread Lars Hofhansl (JIRA)
Lars Hofhansl created HBASE-14315:
-

 Summary: Save one call to KeyValueHeap.peek per row
 Key: HBASE-14315
 URL: https://issues.apache.org/jira/browse/HBASE-14315
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl


Another one of my micro optimizations.
In StoreScanner.next(...) we can actually save a call to KeyValueHeap.peek, 
which in my runs of scan heavy loads shows up at the top.

Based on the run and data this can save between 3 and 10% of runtime.





[jira] [Resolved] (HBASE-12853) distributed write pattern to replace ad hoc 'salting'

2015-08-02 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-12853.
---
   Resolution: Invalid
Fix Version/s: (was: 2.0.0)

The discussion has been off topic. We can open a new topic if we have something 
concrete.

> distributed write pattern to replace ad hoc 'salting'
> -
>
> Key: HBASE-12853
> URL: https://issues.apache.org/jira/browse/HBASE-12853
> Project: HBase
>  Issue Type: New Feature
>Reporter: Michael Segel
>
> In reviewing HBASE-11682 (Description of Hot Spotting), one of the issues is 
> that while 'salting' alleviated regional hot spotting, it increased the 
> complexity required to utilize the data.
> Through the use of coprocessors, it should be possible to offer a method 
> which distributes the data on write across the cluster and then manages 
> reading the data, returning a sort-ordered result set, abstracting the 
> underlying process.
> On table creation, a flag is set to indicate that this is a parallel table. 
> On insert into the table, if the flag is set to true then a prefix is added 
> to the key, e.g. "region server#-" or "region server#||", where the region 
> server # is an integer between 1 and the number of region servers defined.
> On read (scan), for each region server defined, a separate scan is created 
> adding the prefix. Since each scan will be in sort order, it's possible to 
> strip the prefix and return the lowest value key from each of the subsets.
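The read path described, one sorted scan per salt prefix merged back into global sort order, can be sketched as follows (class name and the "bucket#-key" format are hypothetical; each inner list stands in for one per-prefix scan's results):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical sketch of the salted read path: N sorted per-prefix scans merged globally. */
public class SaltedScanMerge {
    /** Drop the "bucket#-" salt prefix to recover the logical row key. */
    private static String stripSalt(String key) {
        return key.substring(key.indexOf('-') + 1);
    }

    /** Merge N sorted per-bucket scans into one globally sorted, de-salted result. */
    public static List<String> merge(List<List<String>> bucketScans) {
        // Min-heap of {bucketIndex, positionInBucket} pairs, ordered by de-salted key.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
            Comparator.comparing((int[] e) -> stripSalt(bucketScans.get(e[0]).get(e[1]))));
        for (int b = 0; b < bucketScans.size(); b++) {
            if (!bucketScans.get(b).isEmpty()) heap.add(new int[] {b, 0});
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] e = heap.poll();
            out.add(stripSalt(bucketScans.get(e[0]).get(e[1])));
            // Advance the scan we just consumed from, if it has more rows.
            if (e[1] + 1 < bucketScans.get(e[0]).size()) heap.add(new int[] {e[0], e[1] + 1});
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> scans = List.of(
            List.of("1-apple", "1-pear"),     // scan with salt prefix "1-"
            List.of("2-banana", "2-zebra"));  // scan with salt prefix "2-"
        System.out.println(merge(scans));     // [apple, banana, pear, zebra]
    }
}
```

This is the same heap-merge idea described in the issue: each per-prefix scan is already sorted, so the heap only ever has to compare the current head of each scan.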





Re: Support for upgrades from 0.94

2015-07-31 Thread lars hofhansl
Following our guidelines as stated here: 
https://hbase.apache.org/book.html#hbase.versioning we can remove the upgrade 
path from 2.0. Considering that 0.98 is a major version step over 0.94, we could 
in theory remove it from 1.x as well, but we only established semantic versioning 
with 1.0.0.
So, yes, it seems we can (and I would argue, should) remove the upgrade code in 
question from 2.x.
-- Lars

  From: Lars Francke lars.fran...@gmail.com
 To: dev@hbase.apache.org 
 Sent: Friday, July 31, 2015 3:59 AM
 Subject: Support for upgrades from 0.94
   
Hi,

this is referring to these two issues:
https://issues.apache.org/jira/browse/HBASE-8778
https://issues.apache.org/jira/browse/HBASE-11611

I'm still looking for deprecated stuff that can be cleaned up. We have a
file in the code (FSTableDescriptorMigrationToSubdir) that is used to
migrate pre-0.96 table format to the new version. It has an annotation that
marks it as to-delete for the next major version after 0.96.

HBASE-11611 removed the class UpgradeTo96 without removing the hbase shell
command upgrade which referred to that class. So that means that as of
2.0.0 upgrades from 0.94 are not supported anymore (If I understand that
correctly).

Are we okay with that? If so I'd like to create a JIRA removing the
remnants and documenting this.

Otherwise we need to partially revert HBASE-11611 and probably adapt the
Upgrade class.

Cheers,
Lars


   

Re: [DISCUSSION] Switching from RTC to CTR

2015-07-31 Thread lars hofhansl
I don't really agree that RTC means we do not trust you. It just means that 
changes should be peer reviewed, with which I heartily agree. CTR can work with 
a small group (for example in a branch). For a big project I think it will lead 
to lower quality (and we already have issues with constantly failing tests - 
partly due to the infrastructure, but partly because they are flaky).
That said, I like the idea of leaving this at the discretion of the committer. In 
that case we do not need the specific one-week timeline. For a small fix I think a 
committer can just commit without any review at all (null checks, etc.). For larger 
changes or features the committer should naturally request some review.
Not a fan of codifying too many details; it's better to trust the judgment of 
the committer and state some general guidelines.
So what am I saying?
I think we can state that review is not _required_. Period. Then we could state 
that committers should use good judgment as to when to request feedback.

-- Lars

 From: Andrew Purtell andrew.purt...@gmail.com
 To: dev@hbase.apache.org dev@hbase.apache.org 
Cc: priv...@hbase.apache.org priv...@hbase.apache.org 
 Sent: Thursday, July 30, 2015 6:15 PM
 Subject: Re: [DISCUSSION] Switching from RTC to CTR
   
I appreciate very much the earlier feedback about switching from RTC to
CTR. It helped me think about the essential thing I was after.

I'm thinking of making a formal proposal to adopt this, with a VOTE:

 After posting a patch to JIRA, after one week if there is no review or
veto, a committer can commit their own work.

It's important we discuss this and have a vote because the default
Foundation decision making process (
http://www.apache.org/foundation/voting.html) does not allow what would
amount to lazy consensus when RTC is in effect. Should my proposal pass, we
would arrive at a hybrid policy that is identical to the default Foundation
one *until* one week has elapsed after a code change is proposed. Then, for
a committer, for that one code change, they would be able to operate using
CTR. I think the HBase PMC is empowered to set this kind of policy for our
own project at our option. If you feel I am mistaken about that, please
speak up. Should the vote pass I will run it by board@ for review to be
sure.

We'd document this in the book:
https://hbase.apache.org/book.html#_decisions

Also, looking at https://hbase.apache.org/book.html#_decisions, I don't
think the patch +1 policy should remain because the trial OWNERS concept
hasn't worked out, IMHO. The OWNERS concept requires a set of constantly
present and engaged owners, a resource demand that's hard to square with
the volunteer nature of our community. The amount of time any committer or
PMC member has on this project is highly variable day to day and week to
week.  I'm also thinking of calling a VOTE to significantly revise or
strike this section.

Both of these things have a common root: Volunteer time is a very precious
commodity. Our community's supply of volunteer time fluctuates. I would
like to see committers be able to make progress with their own work even in
periods when volunteer time is in very short supply, or when they are
working on niche concerns that simply do not draw sufficient interest from
other committers. (This is different from work that people think isn't
appropriate - in that case ignoring it so it will go away would no longer
be an option, a veto would be required if you want to stop something.)




On Wed, Jul 29, 2015 at 3:56 PM, Andrew Purtell andrew.purt...@gmail.com
wrote:

 Had this thought after getting back on the road. As an alternative to any
 sweeping change we could do one incremental but very significant thing that
 acknowledges our status as trusted and busy peers: After posting a patch to
 JIRA, after one week if there is no review or veto, a committer can commit
 their own work.


  On Jul 29, 2015, at 2:20 PM, Mikhail Antonov olorinb...@gmail.com
 wrote:
 
  Just curious, I assume if this change is made, would it only apply to
  master branch?
 
  -Mikhail
 
  On Wed, Jul 29, 2015 at 2:09 PM, Andrew Purtell
  andrew.purt...@gmail.com wrote:
  @dev is now CCed
 
  I didn't want to over structure the discussion with too much detail up
 front. I do think CTR without supporting process or boundaries can be more
 problematic than helpful. That could take the form of customarily waiting
 for reviews before commit even under a CTR regime. I think this discussion
 has been great so far. I would also add that CTR moves 'R' from a gating
 requirement per commit (which can be hard to overcome for niche areas or
 when volunteer resources are less available) more to RMs. will be back
 later with more.
 
 
  On Jul 29, 2015, at 1:36 PM, Sean Busbey sean.bus...@gmail.com
 wrote:
 
  I'd also favor having this discussion on dev@.
 
  On Wed, Jul 29, 2015 at 2:29 PM, Gary Helmling ghelml...@gmail.com
 wrote:
 
  This is already a really interesting and meaningful discussion, and is
  

Re: DISCUSSION: lets do a developer workshop on near-term work

2015-07-20 Thread lars hofhansl
Personally, I think that is a reasonable way to test the internal friction of 
the server. I've been doing a lot of tests like that and found a lot of 
inefficiencies in HBase that way. For cases where we return all Cells back to a 
(remote) client, improving the server by 10 or 20% would mostly go unnoticed.

Analytics (aggregates via Phoenix or direct coprocessors) will be more 
important going forward, so improving that part is important.
I completely agree that end-to-end (by which I mean data shipped to the client) 
testing is important; it's just that I'd expect us to work on different areas 
(put Protobufs on a diet, have a streaming protocol, etc).
-- Lars

 From: Andrew Purtell andrew.purt...@gmail.com
 To: dev@hbase.apache.org dev@hbase.apache.org 
 Sent: Saturday, July 18, 2015 11:24 AM
 Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
   
That's not a realistic or useful test scenario, unless the goal is to 
accelerate queries where all cells are filtered at the server. 





 On Jul 18, 2015, at 11:02 AM, Anoop John anoop.hb...@gmail.com wrote:
 
 No Andy. HBASE-11425 has a doc attached to it. At the end of it, we have added
 perf numbers from cluster testing.  This was done using PE get and scan
 tests, filtering all cells at the server (to not consider n/w bandwidth
 constraints)
 
 -Anoop-
 
 On Sat, Jul 18, 2015 at 9:30 PM, Andrew Purtell andrew.purt...@gmail.com
 wrote:
 
 We have some microbenchmarks, not evidence of differences seen from a
 client application. I'm not saying that microbenchmarks are not totally
 necessary and a great start - they are - but that they don't measure an end
 goal. Furthermore unless I've missed one somewhere we don't have a JIRA or
 design doc that states a clear end goal metric like the strawman I threw
 together in my previous mail. A measurable system level goal and some data
 from full cluster testing would go a lot further toward letting all of us
 evaluate the potential and payoff of the work. In the meantime we should
 probably be assembling these changes on a branch instead of in trunk, for
 as long as the goal is not clearly defined and the payoff and potential for
 perf regressions is untested and unknown.
 
 
 On Jul 18, 2015, at 8:05 AM, Anoop John anoop.hb...@gmail.com wrote:
 
 Thanks Andy and Lars.  The parent jira has doc attached which contains
 some
 perf gain numbers..  We will be doing more tests in next 2 weeks (before
 end of this month) and will publish them.  Yes it will be great if it is
 more IST friendly time :-)
 
 -Anoop-
 
 On Fri, Jul 17, 2015 at 9:44 PM, Andrew Purtell 
 andrew.purt...@gmail.com
 wrote:
 
 I can represent your side Ram (and Anoop). I've been known always argue
 both side of a discussion and to never take sides easily (drives some
 folks
 crazy).
 
 I can vouch for this (smile)
 
 I also can offer support for off heaping there. At the same time we do
 have a gap where we can't point to a timeline of improvements (yet,
 anyway)
 with benchmarks showing gains where your goals need them. For example,
 stock HBase in one JVM can address max N GB for response time
 distribution
 D; dev version of HBase in off heap branch can address max N' GB for
 distribution D', where N' > N and D > D' (distribution D' statistically
 shows better/lower response times).
 
 
 
 On Jul 17, 2015, at 6:56 AM, lars hofhansl la...@apache.org wrote:
 
 I'm in favor of anything that improves performance (and preferably
 doesn't set us back into a world that's worse than C due to the lack of
 pointers in Java).Never said I don't like it, it's just that I'm
 perhaps
 asking for more numbers and justification in weighing the pros and cons.
 I can represent your side Ram (and Anoop). I've been known always argue
 both side of a discussion and to never take sides easily (drives some
 folks
 crazy). And Stack's there too, he yell at me where needed :)
 
 Perhaps we can do it a bit later in the evening so there is a fighting
 chance that folks on IST can participate. I know that some of our folks
 on
 IST would love to participate in the backup discussion).
 
 Like Enis, I'm also happy to host. We're in Downtown SF. I'd just need
 an approx. number of folks.
 
 -- Lars
 
    From: ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com
 To: dev@hbase.apache.org dev@hbase.apache.org; lars hofhansl 
 la...@apache.org
 Sent: Wednesday, July 15, 2015 10:10 AM
 Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
 
 Hi
 What time will it be on August 26th?
 @LarsYa. I know that you are not generally in favour of this offheaping
 stuff.  May be if we (from India) can attend this meeting remotely your
 thoughts can be discussed and also the current state of this work.
 RegardsRam
 
 
 On Wed, Jul 15, 2015 at 9:28 PM, lars hofhansl la...@apache.org
 wrote:
 
 Works for me. I'll be back in the Bay Area the week of August 9th.
 We have done a _lot_ of work on backups as well - ours are more
 complicated as we wanted fast per

[jira] [Resolved] (HBASE-12945) Port: New master API to track major compaction completion to 0.98

2015-07-18 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-12945.
---
   Resolution: Won't Fix
Fix Version/s: (was: 0.98.14)

Looks like there's no interest. Closing.

 Port: New master API to track major compaction completion to 0.98
 -

 Key: HBASE-12945
 URL: https://issues.apache.org/jira/browse/HBASE-12945
 Project: HBase
  Issue Type: Sub-task
Reporter: Lars Hofhansl







[jira] [Resolved] (HBASE-5210) HFiles are missing from an incremental load

2015-07-18 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-5210.
--
Resolution: Cannot Reproduce

Closing for now

 HFiles are missing from an incremental load
 ---

 Key: HBASE-5210
 URL: https://issues.apache.org/jira/browse/HBASE-5210
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.90.2
 Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
 RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
Reporter: Lawrence Simpson
 Attachments: HBASE-5210-crazy-new-getRandomFilename.patch


 We run an overnight map/reduce job that loads data from an external source 
 and adds that data to an existing HBase table.  The input files have been 
 loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
 TotalOrderPartitioner) to create HFiles which are subsequently added to the 
 HBase table.  On at least two separate occasions (that we know of), a range 
 of output would be missing for a given day.  The range of keys for the 
 missing values corresponded to those of a particular region.  This implied 
 that a complete HFile somehow went missing from the job.  Further 
 investigation revealed the following:
 Two different reducers (running in separate JVMs and thus separate class 
 loaders) in the same server can end up using the same file names for their 
 HFiles. The scenario is as follows:
  1.  Both reducers start near the same time.
  2.  The first reducer reaches the point where it wants to write its 
 first file.
  3.  It uses the StoreFile class, which contains a static Random object 
 which is initialized by default using a timestamp.
  4.  The file name is generated using the random number generator.
  5.  The file name is checked against other existing files.
  6.  The file is written into temporary files in a directory named
 after the reducer attempt.
  7.  The second reduce task reaches the same point, but its StoreFile
 class (which is now in the file system's cache) gets loaded within the
 time resolution of the OS and thus initializes its Random()
 object with the same seed as the first task.
  8.  The second task also checks for an existing file with the name
 generated by the random number generator and finds no conflict
 because each task is writing files in its own temporary folder.
  9.  The first task finishes and gets its temporary files committed
 to the real folder specified for output of the HFiles.
  10. The second task then reaches its own conclusion and commits its
 files (moveTaskOutputs). The released Hadoop code just overwrites
 any files with the same name. No warning messages or anything.
 The first task's HFiles just go missing.
 Note: The reducers here are NOT different attempts at the same 
 reduce task. They are different reduce tasks, so data is really lost.
 I am currently testing a fix in which I have added code to the Hadoop 
 FileOutputCommitter.moveTaskOutputs method to check for a conflict with
 an existing file in the final output folder and to rename the HFile if
 needed.  This may not be appropriate for all uses of FileOutputFormat.
 So I have put this into a new class which is then used by a subclass of
 HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
 more of a problem due to private declarations.
 I don't know if my approach is the best fix for the problem.  If someone
 more knowledgeable than myself deems that it is, I will be happy to share
 what I have done and by that time I may have some information on the
 results.
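The root cause in the scenario above — two JVMs seeding `Random` from the clock within the same instant — is easy to demonstrate in isolation. The sketch below is illustrative only (it is not the actual StoreFile code; `randomName` is a hypothetical stand-in for the file-name generator):

```java
import java.util.Random;

public class SeedCollision {
    // Hypothetical name generator backed by a caller-supplied Random,
    // mimicking a static Random seeded by a timestamp at class load.
    static String randomName(Random rng) {
        return Long.toHexString(rng.nextLong());
    }

    public static void main(String[] args) {
        long seed = System.currentTimeMillis();
        // Two JVMs loading the class in the same millisecond would get
        // the same seed, hence identical "random" file-name sequences.
        Random reducer1 = new Random(seed);
        Random reducer2 = new Random(seed);
        String name1 = randomName(reducer1);
        String name2 = randomName(reducer2);
        System.out.println(name1.equals(name2)); // true: a collision
    }
}
```

This is why checking for name conflicts only within each task's own temporary folder is insufficient: the collision surfaces only at commit time, when both tasks move their outputs into the shared destination.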





Re: 0.98 patch acceptance criteria discussion

2015-07-17 Thread lars hofhansl
Thanks Andy.
I think the gist of the discussion boils down to this: We generally have two 
goals: (1) follow semver from 1.0.0 onward and (2) avoid losing 
features/improvements when upgrading from an older version to a newer one.
Turns out these two are conflicting unless we follow certain additional 
policies.
The issue at hand was a performance improvement that we added to 0.98, 1.3.0, 
and 2.0.0, but not 1.0.x, 1.1.x, and 1.2.x (x >= 1 in all cases). So when 
somebody would upgrade from 0.98 to (say) 1.1.7 (if/when that's out) that 
improvement would silently be lost.
I think the extra statement we have to make is that only the latest minor 
version of the next major branch is guaranteed to have all the improvements of 
the previous major branch. Or phrased in other words: improvements that are not 
bug fixes will only go into the x.y.0 minor version, but not (by default anyway, 
the RM should use good judgment) into any existing minor version (and thus not 
into a patch version > 0)

If that's OK with everybody we can just state that and move on (and I'll shut 
up :) ).
-- Lars

  From: Andrew Purtell apurt...@apache.org
 To: dev@hbase.apache.org dev@hbase.apache.org 
 Sent: Thursday, July 16, 2015 8:58 AM
 Subject: 0.98 patch acceptance criteria discussion
   
Hi devs,

I'd like to call your attention to an interesting and important discussion
taking place on the tail of HBASE-12596. It starts from here:
https://issues.apache.org/jira/browse/HBASE-12596?focusedCommentId=14628295&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14628295

-- 
Best regards,

  - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)


  

Re: DISCUSSION: lets do a developer workshop on near-term work

2015-07-17 Thread lars hofhansl
I'm in favor of anything that improves performance (and preferably doesn't set 
us back into a world that's worse than C due to the lack of pointers in Java). 
Never said I don't like it; it's just that I'm perhaps asking for more numbers 
and justification in weighing the pros and cons.
I can represent your side Ram (and Anoop). I've been known to always argue both 
sides of a discussion and to never take sides easily (drives some folks crazy). 
And Stack's there too, he'll yell at me where needed :)

Perhaps we can do it a bit later in the evening so there is a fighting chance 
that folks on IST can participate. I know that some of our folks on IST would 
love to participate in the backup discussion.

Like Enis, I'm also happy to host. We're in Downtown SF. I'd just need an 
approx. number of folks.

-- Lars

  From: ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com
 To: dev@hbase.apache.org dev@hbase.apache.org; lars hofhansl 
la...@apache.org 
 Sent: Wednesday, July 15, 2015 10:10 AM
 Subject: Re: DISCUSSION: lets do a developer workshop on near-term work
   
Hi 
What time will it be on August 26th?
@Lars Ya. I know that you are not generally in favour of this offheaping stuff.  
Maybe if we (from India) can attend this meeting remotely, your thoughts can be 
discussed and also the current state of this work.
Regards,
Ram


On Wed, Jul 15, 2015 at 9:28 PM, lars hofhansl la...@apache.org wrote:

Works for me. I'll be back in the Bay Area the week of August 9th.
We have done a _lot_ of work on backups as well - ours are more complicated as 
we wanted fast per-tenant restores, so data is grouped by tenant. Would like 
to sync up on that (hopefully some of the folks who wrote most of the code will 
be in town, I'll check).

Also interested in the Time and offheap parts (although you folks usually 
do not like what I think about the offheap efforts :) ).
Would like to add the following topics:


- Timestamp Resolution. Or making space for more bits in the timestamps 
(happy to cover that, unless it's part of the Time topic)


- Replication. We found that replication cannot keep up with high write 
loads, due to the fact that replication is strictly single threaded per 
regionserver (even though we have multiple region servers on the sink side)


- Spark integration (Ted Malaska?)


OK... Out now to make a bullshit hat.

-- Lars


From: Sean Busbey bus...@cloudera.com
To: dev dev@hbase.apache.org
Sent: Tuesday, July 14, 2015 7:11 PM
Subject: Re: DISCUSSION: lets do a developer workshop on near-term work


I'm planning to be in the Bay area the week of the 24th of August.

--
Sean



On Jul 14, 2015 7:53 PM, Andrew Purtell apurt...@apache.org wrote:

 I can be up in your area in August.

 On Tue, Jul 14, 2015 at 5:31 PM, Stack st...@duboce.net wrote:

  On Tue, Jul 14, 2015 at 3:39 PM, Enis Söztutar enis@gmail.com
 wrote:
 
   Sounds good. It has been a while we did the talk-aton.
  
   I'll be off starting 25 of July, so I prefer something next week if
   possible.
  
   You ever coming back? If so, when? I'm back on 10th of August (Mikhail
 on
  the 20th).
  St.Ack
 
 
 
 
   Enis
  
   On Tue, Jul 14, 2015 at 3:18 PM, Stack st...@duboce.net wrote:
  
Matteo and I were thinking it time devs got together for a pow-wow.
  There
is a bunch of stuff in flight at the moment (see below list) and it
  would
be good to meet and whiteboard, surface goodo ideas that have gone
   dormant
in JIRA, or revisit designs/proposals out in JIRA-attached google doc
   that
need socializing.
   
You can only come if you are wearing your bullshit hat.
   
Topics we'd go over could include:
   
+ Our filesystem layout will not work if 1M regions (Matteo/Stack)
+ Current state of the offheaping of read path and alternate KeyValue
implementation (Anoop/Ram)
+ Append rejigger (Elliott)
+ A Pv2-based Assign (Matteo/Steven)
+ Splitting meta/1M regions
+ The revived Backup (Vladimir)
+ Time (Enis)
+ The overloaded SequenceId (Stack)
+ Upstreaming IT testing (Dima/Sean)
+ hbase-2.0.0
   
I put names by folks I know could talk to the topic. If you want to
  take
over a topic or put your name by one, just say.  Suggest that
  discussion
lead off with a 5-10minute on current state of
thought/design/implementation.
   
What do others think?
   
What date would suit folks?
   
Anyone want to host?
   
Thanks,
Matteo and St.Ack
   
  
 



 --
 Best regards,

    - Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)





  

Re: DISCUSSION: lets do a developer workshop on near-term work

2015-07-15 Thread lars hofhansl
Works for me. I'll be back in the Bay Area the week of August 9th.
We have done a _lot_ of work on backups as well - ours are more complicated as 
we wanted fast per-tenant restores, so data is grouped by tenant. Would like 
to sync up on that (hopefully some of the folks who wrote most of the code will 
be in town, I'll check).

Also interested in the Time and offheap parts (although you folks usually 
do not like what I think about the offheap efforts :) ).
Would like to add the following topics:


- Timestamp Resolution. Or making space for more bits in the timestamps 
(happy to cover that, unless it's part of the Time topic)


- Replication. We found that replication cannot keep up with high write 
loads, due to the fact that replication is strictly single threaded per 
regionserver (even though we have multiple region servers on the sink side)


- Spark integration (Ted Malaska?)


OK... Out now to make a bullshit hat.

-- Lars


From: Sean Busbey bus...@cloudera.com
To: dev dev@hbase.apache.org 
Sent: Tuesday, July 14, 2015 7:11 PM
Subject: Re: DISCUSSION: lets do a developer workshop on near-term work


I'm planning to be in the Bay area the week of the 24th of August.

-- 
Sean



On Jul 14, 2015 7:53 PM, Andrew Purtell apurt...@apache.org wrote:

 I can be up in your area in August.

 On Tue, Jul 14, 2015 at 5:31 PM, Stack st...@duboce.net wrote:

  On Tue, Jul 14, 2015 at 3:39 PM, Enis Söztutar enis@gmail.com
 wrote:
 
   Sounds good. It has been a while we did the talk-aton.
  
   I'll be off starting 25 of July, so I prefer something next week if
   possible.
  
   You ever coming back? If so, when? I'm back on 10th of August (Mikhail
 on
  the 20th).
  St.Ack
 
 
 
 
   Enis
  
   On Tue, Jul 14, 2015 at 3:18 PM, Stack st...@duboce.net wrote:
  
Matteo and I were thinking it time devs got together for a pow-wow.
  There
is a bunch of stuff in flight at the moment (see below list) and it
  would
be good to meet and whiteboard, surface goodo ideas that have gone
   dormant
in JIRA, or revisit designs/proposals out in JIRA-attached google doc
   that
need socializing.
   
You can only come if you are wearing your bullshit hat.
   
Topics we'd go over could include:
   
+ Our filesystem layout will not work if 1M regions (Matteo/Stack)
+ Current state of the offheaping of read path and alternate KeyValue
implementation (Anoop/Ram)
+ Append rejigger (Elliott)
+ A Pv2-based Assign (Matteo/Steven)
+ Splitting meta/1M regions
+ The revived Backup (Vladimir)
+ Time (Enis)
+ The overloaded SequenceId (Stack)
+ Upstreaming IT testing (Dima/Sean)
+ hbase-2.0.0
   
I put names by folks I know could talk to the topic. If you want to
  take
over a topic or put your name by one, just say.  Suggest that
  discussion
lead off with a 5-10minute on current state of
thought/design/implementation.
   
What do others think?
   
What date would suit folks?
   
Anyone want to host?
   
Thanks,
Matteo and St.Ack
   
  
 



 --
 Best regards,

- Andy

 Problems worthy of attack prove their worth by hitting back. - Piet Hein
 (via Tom White)



[jira] [Resolved] (HBASE-11482) Optimize HBase TableInput/OutputFormats for exposing tables and snapshots as Spark RDDs

2015-07-15 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-11482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-11482.
---
   Resolution: Duplicate
Fix Version/s: (was: 2.0.0)

Closing as dupe of HBASE-13992.

 Optimize HBase TableInput/OutputFormats for exposing tables and snapshots as 
 Spark RDDs
 ---

 Key: HBASE-11482
 URL: https://issues.apache.org/jira/browse/HBASE-11482
 Project: HBase
  Issue Type: New Feature
  Components: mapreduce, spark
Reporter: Andrew Purtell
Assignee: Ted Malaska

 A core concept of Apache Spark is the resilient distributed dataset (RDD), a 
 fault-tolerant collection of elements that can be operated on in parallel. 
 One can create a RDDs referencing a dataset in any external storage system 
 offering a Hadoop InputFormat, like HBase's TableInputFormat and 
 TableSnapshotInputFormat. 
 Ensure the integration is reasonable and provides good performance. 
 Add the ability to save RDDs back to HBase with a {{saveAsHBaseTable}} 
 action, implicitly creating necessary schema on demand.
 Add support for {{filter}} transformations that push predicates down to the 
 server as HBase filters. 
 Consider supporting conversions between Scala and Java types and HBase data 
 using the HBase types library.
 Consider an option to lazily and automatically produce a snapshot only when 
 needed, in a coordinated way. (Concurrently executing workers may want to 
 materialize a table snapshot RDD at the same time.)





[jira] [Resolved] (HBASE-13329) ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray

2015-07-05 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-13329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-13329.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 2.0, 1.3, 1.2, 1.1, and 1.0. (0.98 does not have this issue)

 ArrayIndexOutOfBoundsException in CellComparator#getMinimumMidpointArray
 

 Key: HBASE-13329
 URL: https://issues.apache.org/jira/browse/HBASE-13329
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Affects Versions: 1.0.1
 Environment: linux-debian-jessie
 ec2 - t2.micro instances
Reporter: Ruben Aguiar
Assignee: Lars Hofhansl
Priority: Critical
 Attachments: 13329-asserts.patch, 13329-v1.patch, 13329.txt, 
 HBASE-13329.test.00.branch-1.1.patch


 While trying to benchmark my opentsdb cluster, I've created a script that 
 sends to hbase always the same value (in this case 1). After a few minutes, 
 the whole region server crashes and the region itself becomes impossible to 
 open again (cannot assign or unassign). After some investigation, what I saw 
 on the logs is that when a Memstore flush is called on a large region (128mb) 
 the process errors, killing the regionserver. On restart, replaying the edits 
 generates the same error, making the region unavailable. Tried to manually 
 unassign, assign or close_region. That didn't work because the code that 
 reads/replays it crashes.
 From my investigation this seems to be an overflow issue. The logs show that 
 the function getMinimumMidpointArray tried to access index -32743 of an 
 array, extremely close to the minimum short value in Java. Upon investigation 
 of the source code, it seems a short index is used, incremented as long as 
 the two vectors are equal, probably making it overflow on large vectors with 
 equal data. Changing it to an int should solve the problem.
 Here follows the hadoop logs of when the regionserver went down. Any help is 
 appreciated. Any other information you need please do tell me:
 2015-03-24 18:00:56,187 INFO  [regionserver//10.2.0.73:16020.logRoller] 
 wal.FSHLog: Rolled WAL 
 /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220018516
  with entries=143, filesize=134.70 MB; new WAL 
 /hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427220056140
 2015-03-24 18:00:56,188 INFO  [regionserver//10.2.0.73:16020.logRoller] 
 wal.FSHLog: Archiving 
 hdfs://10.2.0.74:8020/hbase/WALs/10.2.0.73,16020,1427216382590/10.2.0.73%2C16020%2C1427216382590.default.1427219987709
  to 
 hdfs://10.2.0.74:8020/hbase/oldWALs/10.2.0.73%2C16020%2C1427216382590.default.1427219987709
 2015-03-24 18:04:35,722 INFO  [MemStoreFlusher.0] regionserver.HRegion: 
 Started memstore flush for 
 tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2., current region 
 memstore size 128.04 MB
 2015-03-24 18:04:36,154 FATAL [MemStoreFlusher.0] regionserver.HRegionServer: 
 ABORTING region server 10.2.0.73,16020,1427216382590: Replay of WAL required. 
 Forcing server shutdown
 org.apache.hadoop.hbase.DroppedSnapshotException: region: 
 tsdb,,1427133969325.52bc1994da0fea97563a4a656a58bec2.
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1999)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1770)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1702)
   at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:445)
   at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:407)
   at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:69)
   at 
 org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:225)
   at java.lang.Thread.run(Thread.java:745)
 Caused by: java.lang.ArrayIndexOutOfBoundsException: -32743
   at 
 org.apache.hadoop.hbase.CellComparator.getMinimumMidpointArray(CellComparator.java:478)
   at 
 org.apache.hadoop.hbase.CellComparator.getMidpoint(CellComparator.java:448)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileWriterV2.finishBlock(HFileWriterV2.java:165)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileWriterV2.checkBlockBoundary(HFileWriterV2.java:146)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileWriterV2.append(HFileWriterV2.java:263)
   at 
 org.apache.hadoop.hbase.io.hfile.HFileWriterV3.append(HFileWriterV3.java:87)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFile$Writer.append(StoreFile.java:932)
   at 
 org.apache.hadoop.hbase.regionserver.StoreFlusher.performFlush(StoreFlusher.java:121
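The overflow the reporter describes can be reproduced in isolation: a `short` index incremented past `Short.MAX_VALUE` (32767) wraps to a large negative value, which then produces the negative-index ArrayIndexOutOfBoundsException seen in the stack trace. A hedged sketch (not the actual CellComparator code; the loop guard here exists only so the demo terminates instead of throwing):

```java
public class ShortIndexOverflow {
    // Returns the index of the first differing byte, using a short
    // counter as the pre-fix code effectively did. For identical
    // prefixes longer than 32767 bytes, the counter wraps negative.
    static short commonPrefixShort(byte[] left, byte[] right) {
        short i = 0;
        int n = Math.min(left.length, right.length);
        while (i >= 0 && i < n && left[i] == right[i]) {
            i++; // silently wraps from 32767 to -32768
        }
        return i;
    }

    public static void main(String[] args) {
        byte[] a = new byte[40000];
        byte[] b = new byte[40000]; // identical contents (all zeros)
        short idx = commonPrefixShort(a, b);
        System.out.println(idx); // -32768: using it as an array index throws
    }
}
```

Widening the counter to `int` removes the wrap entirely for any realistic key length, which matches the fix the reporter proposes.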

Re: Backup/Restore (HBASE-7192) design doc

2015-07-03 Thread lars hofhansl
Lemme have a look. Very interested in this. Did you see the snapshot bug I just 
recently fixed where each snapshot would leave a Zookeeper watcher around for 
each region server? Pretty bad, and nobody noticed.
-- Lars
  From: Vladimir Rodionov vladrodio...@gmail.com
 To: hbase-...@hadoop.apache.org 
 Sent: Thursday, July 2, 2015 1:05 PM
 Subject: Backup/Restore (HBASE-7192) design doc
   
Hi, folks

Kindly soliciting feedback on a latest design doc:

https://issues.apache.org/jira/browse/HBASE-7912

-Vlad


   

[jira] [Resolved] (HBASE-12765) SplitTransaction creates too many threads (potentially)

2015-07-03 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-12765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-12765.
---
Resolution: Invalid

 SplitTransaction creates too many threads (potentially)
 ---

 Key: HBASE-12765
 URL: https://issues.apache.org/jira/browse/HBASE-12765
 Project: HBase
  Issue Type: Brainstorming
Reporter: Lars Hofhansl
 Attachments: 12765.txt


 In splitStoreFiles(...) we create a new thread pool with as many threads as 
 there are files to split.
 We should be able to do better. During times of very heavy write loads there 
 might be a lot of files to split and multiple splits might be going on at the 
 same time on the same region server.




