Re: release of HBase 1.3.1?

2017-04-03 Thread Mikhail Antonov
Thanks Dima! That definitely would be handy.

-Mikhail

On Mon, Apr 3, 2017 at 10:55 PM, Dima Spivak  wrote:

> I can help with running the API compatibility tooling. It's the least I can
> do since I no longer have access to the computing resources I used to rely
> upon for testing releases. Just ping me when you need a hand, Mikhail.
>
> On Mon, Apr 3, 2017 at 10:35 PM Mikhail Antonov 
> wrote:
>
> > Hi,
> >
> > I've been planning to cut an RC for 1.3.1 for some time; apologies for
> > the delay here. I'm going to go over outstanding JIRAs tomorrow;
> > I think there are still a few issues waiting for backports.
> >
> > Andrew - I appreciate your offer! I have started preparations for the
> > 1.3.1 release, but any help with triaging changes, API compatibility
> > tests, and general release testing would definitely be helpful.
> >
> > Thanks!
> > -Mikhail
> >
> > On Mon, Apr 3, 2017 at 9:52 PM, Andrew Purtell  >
> > wrote:
> >
> > > I'd be happy to RM 1.3.1 unless someone already has it waiting in the
> > > wings.
> > >
> > >
> > > > On Apr 3, 2017, at 5:43 PM, James Taylor  wrote:
> > > >
> > > > Hello,
> > > > We'd like to start supporting releases of Phoenix that work with the
> > > > HBase 1.3 branch, but there's a committed fix on which we rely
> > > > (HBASE-17587) for Phoenix to function correctly. Is there a time
> > > > frame for an HBase 1.3.1 release?
> > > > Thanks,
> > > > James
> > >
> >
> >
> >
> > --
> > Thanks,
> > Michael Antonov
> >
> --
> -Dima
>



-- 
Thanks,
Michael Antonov


Re: release of HBase 1.3.1?

2017-04-03 Thread Dima Spivak
I can help with running the API compatibility tooling. It's the least I can
do since I no longer have access to the computing resources I used to rely
upon for testing releases. Just ping me when you need a hand, Mikhail.

On Mon, Apr 3, 2017 at 10:35 PM Mikhail Antonov 
wrote:

> Hi,
>
> I've been planning to cut an RC for 1.3.1 for some time; apologies for the
> delay here. I'm going to go over outstanding JIRAs tomorrow;
> I think there are still a few issues waiting for backports.
>
> Andrew - I appreciate your offer! I have started preparations for the 1.3.1
> release, but any help with triaging changes, API compatibility tests,
> and general release testing would definitely be helpful.
>
> Thanks!
> -Mikhail
>
> On Mon, Apr 3, 2017 at 9:52 PM, Andrew Purtell 
> wrote:
>
> > I'd be happy to RM 1.3.1 unless someone already has it waiting in the
> > wings.
> >
> >
> > > On Apr 3, 2017, at 5:43 PM, James Taylor  wrote:
> > >
> > > Hello,
> > > We'd like to start supporting releases of Phoenix that work with the
> > > HBase 1.3 branch, but there's a committed fix on which we rely
> > > (HBASE-17587) for Phoenix to function correctly. Is there a time frame
> > > for an HBase 1.3.1 release?
> > > Thanks,
> > > James
> >
>
>
>
> --
> Thanks,
> Michael Antonov
>
-- 
-Dima


[jira] [Resolved] (HBASE-17868) Backport HBASE-10205 to branch-1.3

2017-04-03 Thread ramkrishna.s.vasudevan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan resolved HBASE-17868.

Resolution: Duplicate

Dup of HBASE-15691.

> Backport HBASE-10205 to branch-1.3
> --
>
> Key: HBASE-17868
> URL: https://issues.apache.org/jira/browse/HBASE-17868
> Project: HBase
>  Issue Type: Bug
>  Components: BucketCache
>Affects Versions: 1.3.0
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 1.3.1
>
>
> I got a similar ConcurrentModificationException with hbase-1.3.0 while
> working with the bucket cache. On verifying, it seems the fix has not been
> applied to hbase-1.3.0.
> We need to backport it to hbase-1.3 and to other branches wherever it was
> not applied.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: release of HBase 1.3.1?

2017-04-03 Thread Mikhail Antonov
Hi,

I've been planning to cut an RC for 1.3.1 for some time; apologies for the
delay here. I'm going to go over outstanding JIRAs tomorrow;
I think there are still a few issues waiting for backports.

Andrew - I appreciate your offer! I have started preparations for the 1.3.1
release, but any help with triaging changes, API compatibility tests,
and general release testing would definitely be helpful.

Thanks!
-Mikhail

On Mon, Apr 3, 2017 at 9:52 PM, Andrew Purtell 
wrote:

> I'd be happy to RM 1.3.1 unless someone already has it waiting in the
> wings.
>
>
> > On Apr 3, 2017, at 5:43 PM, James Taylor  wrote:
> >
> > Hello,
> > We'd like to start supporting releases of Phoenix that work with the
> > HBase 1.3 branch, but there's a committed fix on which we rely
> > (HBASE-17587) for Phoenix to function correctly. Is there a time frame
> > for an HBase 1.3.1 release?
> > Thanks,
> > James
>



-- 
Thanks,
Michael Antonov


Re: release of HBase 1.3.1?

2017-04-03 Thread Andrew Purtell
I'd be happy to RM 1.3.1 unless someone already has it waiting in the wings. 


> On Apr 3, 2017, at 5:43 PM, James Taylor  wrote:
> 
> Hello,
> We'd like to start supporting releases of Phoenix that work with the HBase
> 1.3 branch, but there's a committed fix on which we rely (HBASE-17587) for
> Phoenix to function correctly. Is there a time frame for an HBase 1.3.1
> release?
> Thanks,
> James


[jira] [Created] (HBASE-17871) scan#setBatch(int) call leads wrong result of VerifyReplication

2017-04-03 Thread Tomu Tsuruhara (JIRA)
Tomu Tsuruhara created HBASE-17871:
--

 Summary: scan#setBatch(int) call leads wrong result of 
VerifyReplication
 Key: HBASE-17871
 URL: https://issues.apache.org/jira/browse/HBASE-17871
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.0.0, 1.4.0
Reporter: Tomu Tsuruhara
Assignee: Tomu Tsuruhara
Priority: Minor


The VerifyReplication tool printed weird logs:

{noformat}
2017-04-03 23:30:50,252 ERROR [main] 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: 
CONTENT_DIFFERENT_ROWS, rowkey=a100193
2017-04-03 23:30:50,280 ERROR [main] 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: 
ONLY_IN_PEER_TABLE_ROWS, rowkey=a100193
2017-04-03 23:30:50,387 ERROR [main] 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: 
CONTENT_DIFFERENT_ROWS, rowkey=a100385
2017-04-03 23:30:50,414 ERROR [main] 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: 
ONLY_IN_PEER_TABLE_ROWS, rowkey=a100385
2017-04-03 23:30:50,480 ERROR [main] 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: 
CONTENT_DIFFERENT_ROWS, rowkey=a100532
2017-04-03 23:30:50,508 ERROR [main] 
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: 
ONLY_IN_PEER_TABLE_ROWS, rowkey=a100532
{noformat}

Here, each bad row was marked as both {{CONTENT_DIFFERENT_ROWS}} and
{{ONLY_IN_PEER_TABLE_ROWS}}.
This should never happen, so I took a look at the code and found the
scan.setBatch call.

{code}
@Override
public void map(ImmutableBytesWritable row, final Result value,
    Context context) throws IOException {
  if (replicatedScanner == null) {
    ...
    final Scan scan = new Scan();
    scan.setBatch(batch);
{code}

As stated in HBASE-16376, a {{scan#setBatch(int)}} call implicitly allows scan
results to be partial.

Since {{VerifyReplication}} assumes each {{scanner.next()}} call returns an
entire row, partial results break the compare logic.

We should avoid the setBatch call here.
Thanks to RPC chunking (explained in this blog:
https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1),
I think that is safe and acceptable.
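
A minimal sketch of that shape of fix (an illustration only, not the actual
patch; the table name and the {{compareRow}} helper are hypothetical): drop
{{setBatch}} so every {{Result}} spans a whole row, and rely on scanner
caching plus RPC chunking to bound memory.

{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class WholeRowScanSketch {
  // Hypothetical stand-in for VerifyReplication's per-row comparison.
  static void compareRow(Result row) { /* ... */ }

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("source_table"))) {
      Scan scan = new Scan();
      // scan.setBatch(batch);  // avoided: implicitly allows partial Results
      scan.setCaching(100);     // rows per RPC; chunking bounds response size
      try (ResultScanner scanner = table.getScanner(scan)) {
        for (Result row : scanner) {
          compareRow(row);      // each Result now spans an entire row
        }
      }
    }
  }
}
{code}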




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: release of HBase 1.3.1?

2017-04-03 Thread York, Zach
+1

I meant to send this a while ago.

On 4/3/17, 5:43 PM, "James Taylor"  wrote:

Hello,
We'd like to start supporting releases of Phoenix that work with the HBase
1.3 branch, but there's a committed fix on which we rely (HBASE-17587) for
Phoenix to function correctly. Is there a time frame for an HBase 1.3.1
release?
Thanks,
James




release of HBase 1.3.1?

2017-04-03 Thread James Taylor
Hello,
We'd like to start supporting releases of Phoenix that work with the HBase
1.3 branch, but there's a committed fix on which we rely (HBASE-17587) for
Phoenix to function correctly. Is there a time frame for an HBase 1.3.1
release?
Thanks,
James


[jira] [Created] (HBASE-17870) Backport HBASE-12770 to branch-1.3

2017-04-03 Thread Ashu Pachauri (JIRA)
Ashu Pachauri created HBASE-17870:
-

 Summary: Backport HBASE-12770 to branch-1.3
 Key: HBASE-17870
 URL: https://issues.apache.org/jira/browse/HBASE-17870
 Project: HBase
  Issue Type: Improvement
  Components: Replication
Reporter: Ashu Pachauri
Assignee: Ashu Pachauri


Based on the discussion on HBASE-12770, let's backport it to branch-1.3. This,
combined with the ZooKeeper transport limit, breaks replication quite often in
large clusters.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Comments on HBase Architecture Document

2017-04-03 Thread 杨苏立 Yang Su Li
Hi HBase Developers,

The previous email I sent seemed to spur more conversation on the durability
of HBase rather than on its overall architecture, so I am sending another
email to ask for comments on our architecture document
(http://pages.cs.wisc.edu/~suli/hbase.pdf).
We are doing some research on scheduling in storage systems, including
HBase. We want to make sure that we are making reasonable assumptions
about the HBase architecture.

We did our best reading the code, and used some runtime tools to understand
the internal structure of HBase, but it would be best if some HBase
developers could confirm that our understanding is correct (or point out
where it is wrong or inaccurate).

We drew up a document that describes the HBase workflow
(http://pages.cs.wisc.edu/~suli/hbase.pdf). It emphasizes how different
threads in HBase interact with each other, as that is what we are most
interested in.

We are wondering if you could take a look at this document and let us know
your thoughts. We really appreciate your help.

Thanks a lot!


Suli

-- 
Suli Yang

Department of Physics
University of Wisconsin Madison

4257 Chamberlin Hall
Madison WI 53703


Re: hbase does not seem to handle mixed workloads well

2017-04-03 Thread Stack
On Fri, Mar 31, 2017 at 7:29 PM, 杨苏立 Yang Su Li  wrote:

> Hi,
>
> We found that when there is a mix of CPU-intensive and I/O intensive
> workload, HBase seems to slow everything down to the disk throughput level.
>
> This is shown in the performance graph at
> http://pages.cs.wisc.edu/~suli/blocking-orig.pdf : both client-1 and
> client-2 are issuing 1KB Gets. From second 0 , both repeatedly access a
> small set of data that is cachable and both get high throughput (~45k
> ops/s). At second 60, client-1 switch to an I/O intensive workload and
> begins to randomly access a large set of data (does not fit in cache).
> *Both* client-1 and client-2's throughput drops to ~0.5K ops/s.
>
> Is this acceptable behavior for HBase or is it considered a bug or
> performance drawback?
> I can find an old JIRA entry about similar problems (
> https://issues.apache.org/jira/browse/HBASE-8836), but that was never
> resolved.
>
>

Fairness is an old, hard, full-stack problem [1]. You want the HBase client
to characterize its read pattern and pass it down through HDFS to the OS so
it might influence the disk scheduler? We do little in this regard.

Out of interest, what is client-1 doing when it switches to the "I/O
intensive workload"? It seems to be soaking up all I/Os. Is it blowing the
cache too?

(On HBASE-8836: at the end it refers to the RPC scheduler, which lets you
divide the requests at the front door by read/write/scan.)
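
For anyone wanting to experiment with that split, a hedged sketch of the
relevant knobs (these are server-side hbase-site.xml properties; the Java
form below is only to show the keys and types, and the ratio values are
examples, not recommendations):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CallQueueSplitSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Fraction of RegionServer call queues reserved for reads; the rest
    // serve writes. 0 (the default) means one shared set of queues.
    conf.setFloat("hbase.ipc.server.callqueue.read.ratio", 0.5f);
    // Of the read queues, the fraction dedicated to long scans, so short
    // gets don't wait behind them.
    conf.setFloat("hbase.ipc.server.callqueue.scan.ratio", 0.5f);
    System.out.println(conf.getFloat("hbase.ipc.server.callqueue.read.ratio", 0f));
  }
}
{code}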

Thanks,

St.Ack
1. https://www.slideshare.net/cloudera/ecosystem-session-7b


> Thanks.
>
> Suli
>
> --
> Suli Yang
>
> Department of Physics
> University of Wisconsin Madison
>
> 4257 Chamberlin Hall
> Madison WI 53703
>


[jira] [Created] (HBASE-17869) UnsafeAvailChecker wrongly returns false on ppc

2017-04-03 Thread Jerry He (JIRA)
Jerry He created HBASE-17869:


 Summary: UnsafeAvailChecker wrongly returns false on ppc
 Key: HBASE-17869
 URL: https://issues.apache.org/jira/browse/HBASE-17869
 Project: HBase
  Issue Type: Bug
Affects Versions: 1.2.4
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor


On the ppc64 arch, java.nio.Bits.unaligned() wrongly returns false due to a
JDK bug:
https://bugs.openjdk.java.net/browse/JDK-8165231
This causes some problems for HBase, e.g., the FuzzyRowFilter test fails.
Fix it by providing a hard-coded workaround for the JDK bug.
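
A minimal sketch of one possible hard-coded workaround (an assumption about
the shape of the fix, not the committed patch): trust os.arch on the
architectures where JDK-8165231 makes the reflective check lie.

{code}
import java.lang.reflect.Method;

public final class UnsafeAvailCheckerSketch {
  // Assumed workaround shape: short-circuit on arches where
  // java.nio.Bits.unaligned() is known to report the wrong answer.
  static boolean detectUnaligned() {
    String arch = System.getProperty("os.arch", "");
    if (arch.equals("ppc64") || arch.equals("ppc64le")) {
      return true; // unaligned access works here; the JDK just reports it wrong
    }
    try {
      Class<?> bits = Class.forName("java.nio.Bits");
      Method unaligned = bits.getDeclaredMethod("unaligned");
      unaligned.setAccessible(true);
      return (Boolean) unaligned.invoke(null);
    } catch (Exception e) {
      return false; // be conservative if the reflective check fails
    }
  }

  public static void main(String[] args) {
    System.out.println("unaligned access supported: " + detectUnaligned());
  }
}
{code}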



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Re: How threads interact with each other in HBase

2017-04-03 Thread Josh Elser
Yes, you are correct that there is an edge condition here when there is
an abrupt power failure on a node. HDFS guards against most of this, as
there are multiple copies of your data spread across racks. However, if
you have an abrupt power failure across multiple racks (or your entire
cluster), yes, you would likely lose some data. Having some form of
redundant power supply is a common deployment choice that further
mitigates this risk. If this is not documented clearly enough, patches
are welcome to improve it :)


IMO, all of this is an implementation detail, though, as I believe you 
already understand. It does not change the fact that 
architecturally/academically, HBase is a consistent system.


杨苏立 Yang Su Li wrote:

I understand why HBase by default does not use hsync -- it does come with a
big performance cost (though for FSYNC_WAL, which is not the default option,
you should probably do it because the documentation explicitly promises
it).

I just want to make sure my description of HBase is accurate, including
the durability aspect.
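
For concreteness, here is a hedged sketch of how a client requests
per-mutation durability (the table and column names are made up); whether
FSYNC_WAL ultimately reaches an hsync() on HDFS is exactly the behavior in
question.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class FsyncWalPutSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("demo_table"))) {
      Put put = new Put(Bytes.toBytes("row1"));
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      // Request the strongest WAL durability the client API offers; the
      // debate above is whether this actually forces data to disk.
      put.setDurability(Durability.FSYNC_WAL);
      table.put(put);
    }
  }
}
{code}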

On Sun, Apr 2, 2017 at 12:19 PM, Ted Yu  wrote:


Suli:
Have you looked at HBASE-5954 ?

It gives some background on why hbase code is formulated the way it
currently is.

Cheers

On Sun, Apr 2, 2017 at 9:36 AM, 杨苏立 Yang Su Li  wrote:


Doesn't your second paragraph just prove my point? -- If data is not
persisted to disk, then it is not durable. That is the definition of
durability.

If you want the data to be durable, then you need to call hsync() instead
of hflush(), and that would be the correct behavior if you use the FSYNC_WAL
flag (per the HBase documentation).

However, HBase does not do that.

Suli

On Sun, Apr 2, 2017 at 11:26 AM, Josh Elser

wrote:

No, that's not correct. HBase would, by definition, not be a
consistent database if a write was not durable when a client sees a
successful write.

The point that I will concede to you is that the hflush call may, in
extenuating circumstances, not be completely durable. For example,
hflush does not actually force the data to disk. If an abrupt power
failure happens before this data is pushed to disk, HBase may think
that data was durable when it actually wasn't (at the HDFS level).
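
To make that distinction concrete, a small sketch against the HDFS client
API (the path and payload are made up):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FlushVsSyncSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(new Path("/tmp/wal-demo"))) {
      out.write("edit-1".getBytes("UTF-8"));
      out.hflush(); // visible to new readers; may sit in datanode memory,
                    // not forced to disk
      out.hsync();  // additionally asks each datanode to fsync to local disk
    }
  }
}
{code}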

On Thu, Mar 30, 2017 at 4:26 PM, 杨苏立 Yang Su Li wrote:

Also, please correct me if I am wrong, but I don't think a put is durable
when an RPC returns to the client. Just its corresponding WAL entry is
pushed to the memory of all three data nodes, so it has a low probability
of being lost. But nothing is persisted at this point.

And this is true no matter whether you use the SYNC_WAL or FSYNC_WAL flag.

On Tue, Mar 28, 2017 at 12:11 PM, Josh Elser wrote:

1.1 -> 2: don't forget about the block cache, which can invalidate the
need for any HDFS read.

I think you're over-simplifying the write-path quite a bit. I'm not sure
what you mean by an 'asynchronous write', but that doesn't exist at the
HBase RPC layer, as that would invalidate the consistency guarantees (if
an RPC returns to the client that data was "put", then it is durable).

Going off of memory (sorry in advance if I misstate something): the
general way that data is written to the WAL is a "group commit". You have
many threads all trying to append data to the WAL -- performance would be
terrible if you serially applied all of these writes. Instead, many
writes can be accepted and the caller receives a Future. The caller must
wait for the Future to complete. What's happening behind the scenes is
that the writes are being bundled together to reduce the number of syncs
to the WAL ("grouping" the writes together). When one caller's Future
would complete, what really happened is that the write/sync which
included the caller's update was committed (along with others). All of
this is happening inside the RS's implementation of accepting an update.

https://github.com/apache/hbase/blob/55d6dcaf877cc5223e679736eb613173229c18be/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L74-L106
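
A toy sketch of that group-commit pattern (the names here are invented for
illustration and are not FSHLog's actual internals): handler threads enqueue
their edit and receive a Future, while a single sync thread drains the
queue, performs one sync for the whole batch, and then completes every
batched Future.

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class GroupCommitSketch {
  private final BlockingQueue<CompletableFuture<Void>> pending = new LinkedBlockingQueue<>();

  // Called by many handler threads: register the edit, get a Future to wait on.
  public CompletableFuture<Void> append(byte[] edit) {
    CompletableFuture<Void> f = new CompletableFuture<>();
    // (a real WAL would also stage 'edit' for writing; elided here)
    pending.add(f);
    return f;
  }

  // Single syncer thread: drain whatever accumulated, one sync() per batch.
  public void syncLoop() throws InterruptedException {
    while (true) {
      List<CompletableFuture<Void>> batch = new ArrayList<>();
      batch.add(pending.take());   // block until at least one caller is waiting
      pending.drainTo(batch);      // grab everything queued in the meantime
      sync();                      // one sync covers the whole batch
      for (CompletableFuture<Void> f : batch) {
        f.complete(null);          // every batched caller's write is now "synced"
      }
    }
  }

  private void sync() { /* stand-in for hflush/hsync on the WAL file */ }

  public static void main(String[] args) throws Exception {
    GroupCommitSketch wal = new GroupCommitSketch();
    Thread syncer = new Thread(() -> {
      try { wal.syncLoop(); } catch (InterruptedException ignored) { }
    });
    syncer.setDaemon(true);
    syncer.start();
    wal.append("edit".getBytes("UTF-8")).get(); // blocks until its batch is synced
    System.out.println("edit durable (in this toy model)");
  }
}
{code}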


杨苏立 Yang Su Li wrote:

The attachment can be found at the following URL:
http://pages.cs.wisc.edu/~suli/hbase.pdf

Sorry for the inconvenience...

On Mon, Mar 27, 2017 at 8:25 PM, Ted Yu wrote:

Again, attachment didn't come thru.

Is it possible to formulate as google doc?

Thanks

On Mon, Mar 27, 2017 at 6:19 PM, 杨苏立 Yang Su Li <yangs...@gmail.com> wrote:

Hi,

I am a graduate student working on scheduling on storage systems, and we
are interested in how different threads in HBase interact with each other
and how it might affect scheduling.

I have written down my understanding on how HBase/HDFS works based on its
current thread architecture (attached). I am wondering if the developers
of HBase could take a look at it and let me know if anything is incorrect
or inaccurate, or if I have missed 

[jira] [Created] (HBASE-17868) Backport HBASE-10205 to branch-1.3

2017-04-03 Thread ramkrishna.s.vasudevan (JIRA)
ramkrishna.s.vasudevan created HBASE-17868:
--

 Summary: Backport HBASE-10205 to branch-1.3
 Key: HBASE-17868
 URL: https://issues.apache.org/jira/browse/HBASE-17868
 Project: HBase
  Issue Type: Bug
  Components: BucketCache
Affects Versions: 1.3.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 1.3.1


I got a similar ConcurrentModificationException with hbase-1.3.0 while
working with the bucket cache. On verifying, it seems the fix has not been
applied to hbase-1.3.0.
We need to backport it to hbase-1.3 and to other branches wherever it was
not applied.
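
For background, the exception itself is the classic one: one thread
iterating a shared collection while another mutates it. A toy reproduction
(illustrative only; this is not BucketCache code):

{code}
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class CmeToy {
  public static void main(String[] args) throws InterruptedException {
    Map<Integer, Integer> shared = new HashMap<>();
    for (int i = 0; i < 100000; i++) {
      shared.put(i, i);
    }
    Thread mutator = new Thread(() -> shared.remove(42));
    mutator.start();
    try {
      long sum = 0;
      for (int v : shared.values()) { // may throw if the mutator races the iterator
        sum += v;
      }
      System.out.println(sum);
    } catch (ConcurrentModificationException e) {
      System.out.println("ConcurrentModificationException, as in the report");
    }
    mutator.join();
  }
}
{code}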



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)