Re: release of HBase 1.3.1?
Thanks Dima! That definitely would be handy.

-Mikhail

On Mon, Apr 3, 2017 at 10:55 PM, Dima Spivak wrote:
> I can help with running the API compatibility tooling. It's the least I can
> do since I no longer have access to the computing resources I used to rely
> upon for testing releases. Just ping me when you need a hand, Mikhail.
>
> On Mon, Apr 3, 2017 at 10:35 PM Mikhail Antonov wrote:
> > Hi,
> >
> > I've been planning to cut an RC for 1.3.1 for some time; apologies for
> > the delay here. I'm going to go over outstanding jiras tomorrow --
> > I think there are still a few issues waiting for backports.
> >
> > Andrew - I appreciate your offer! I have started preparations for the
> > 1.3.1 release, but any help with triaging changes, API compatibility
> > tests and general release testing would definitely be helpful.
> >
> > Thanks!
> > -Mikhail
> >
> > On Mon, Apr 3, 2017 at 9:52 PM, Andrew Purtell wrote:
> > > I'd be happy to RM 1.3.1 unless someone already has it waiting in the
> > > wings.
> > >
> > > > On Apr 3, 2017, at 5:43 PM, James Taylor wrote:
> > > >
> > > > Hello,
> > > > We'd like to start supporting releases of Phoenix that work with the
> > > > HBase 1.3 branch, but there's a committed fix on which we rely
> > > > (HBASE-17587) for Phoenix to function correctly. Is there a time
> > > > frame for an HBase 1.3.1 release?
> > > > Thanks,
> > > > James
>
> --
> -Dima

--
Thanks,
Michael Antonov
Re: release of HBase 1.3.1?
I can help with running the API compatibility tooling. It's the least I can
do since I no longer have access to the computing resources I used to rely
upon for testing releases. Just ping me when you need a hand, Mikhail.

On Mon, Apr 3, 2017 at 10:35 PM Mikhail Antonov wrote:
> Hi,
>
> I've been planning to cut an RC for 1.3.1 for some time; apologies for the
> delay here. I'm going to go over outstanding jiras tomorrow --
> I think there are still a few issues waiting for backports.
>
> Andrew - I appreciate your offer! I have started preparations for the
> 1.3.1 release, but any help with triaging changes, API compatibility tests
> and general release testing would definitely be helpful.
>
> Thanks!
> -Mikhail
>
> On Mon, Apr 3, 2017 at 9:52 PM, Andrew Purtell wrote:
> > I'd be happy to RM 1.3.1 unless someone already has it waiting in the
> > wings.
> >
> > > On Apr 3, 2017, at 5:43 PM, James Taylor wrote:
> > >
> > > Hello,
> > > We'd like to start supporting releases of Phoenix that work with the
> > > HBase 1.3 branch, but there's a committed fix on which we rely
> > > (HBASE-17587) for Phoenix to function correctly. Is there a time frame
> > > for an HBase 1.3.1 release?
> > > Thanks,
> > > James
>
> --
> Thanks,
> Michael Antonov

--
-Dima
[jira] [Resolved] (HBASE-17868) Backport HBASE-10205 to branch-1.3
[ https://issues.apache.org/jira/browse/HBASE-17868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan resolved HBASE-17868.
--------------------------------------------
Resolution: Duplicate

Duplicate of HBASE-15691.

> Backport HBASE-10205 to branch-1.3
> ----------------------------------
>
> Key: HBASE-17868
> URL: https://issues.apache.org/jira/browse/HBASE-17868
> Project: HBase
> Issue Type: Bug
> Components: BucketCache
> Affects Versions: 1.3.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 1.3.1
>
> I got a similar ConcurrentModificationException with hbase-1.3.0 while
> working with the bucket cache. On verifying, it seems the fix has not been
> added to hbase-1.3.0. We need to backport it to hbase-1.3 and to other
> branches wherever it was not applied.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
Re: release of HBase 1.3.1?
Hi,

I've been planning to cut an RC for 1.3.1 for some time; apologies for the
delay here. I'm going to go over outstanding jiras tomorrow -- I think there
are still a few issues waiting for backports.

Andrew - I appreciate your offer! I have started preparations for the 1.3.1
release, but any help with triaging changes, API compatibility tests and
general release testing would definitely be helpful.

Thanks!
-Mikhail

On Mon, Apr 3, 2017 at 9:52 PM, Andrew Purtell wrote:
> I'd be happy to RM 1.3.1 unless someone already has it waiting in the
> wings.
>
> > On Apr 3, 2017, at 5:43 PM, James Taylor wrote:
> >
> > Hello,
> > We'd like to start supporting releases of Phoenix that work with the
> > HBase 1.3 branch, but there's a committed fix on which we rely
> > (HBASE-17587) for Phoenix to function correctly. Is there a time frame
> > for an HBase 1.3.1 release?
> > Thanks,
> > James

--
Thanks,
Michael Antonov
Re: release of HBase 1.3.1?
I'd be happy to RM 1.3.1 unless someone already has it waiting in the wings.

> On Apr 3, 2017, at 5:43 PM, James Taylor wrote:
>
> Hello,
> We'd like to start supporting releases of Phoenix that work with the HBase
> 1.3 branch, but there's a committed fix on which we rely (HBASE-17587) for
> Phoenix to function correctly. Is there a time frame for an HBase 1.3.1
> release?
> Thanks,
> James
[jira] [Created] (HBASE-17871) scan#setBatch(int) call leads to wrong result of VerifyReplication
Tomu Tsuruhara created HBASE-17871:
------------------------------------
Summary: scan#setBatch(int) call leads to wrong result of VerifyReplication
Key: HBASE-17871
URL: https://issues.apache.org/jira/browse/HBASE-17871
Project: HBase
Issue Type: Bug
Affects Versions: 2.0.0, 1.4.0
Reporter: Tomu Tsuruhara
Assignee: Tomu Tsuruhara
Priority: Minor

The VerifyReplication tool printed weird logs:

{noformat}
2017-04-03 23:30:50,252 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a100193
2017-04-03 23:30:50,280 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a100193
2017-04-03 23:30:50,387 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a100385
2017-04-03 23:30:50,414 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a100385
2017-04-03 23:30:50,480 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: CONTENT_DIFFERENT_ROWS, rowkey=a100532
2017-04-03 23:30:50,508 ERROR [main] org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication: ONLY_IN_PEER_TABLE_ROWS, rowkey=a100532
{noformat}

Here, each bad row was marked as both {{CONTENT_DIFFERENT_ROWS}} and {{ONLY_IN_PEER_TABLE_ROWS}}. This should never happen, so I took a look at the code and found the scan.setBatch call:

{code}
  @Override
  public void map(ImmutableBytesWritable row, final Result value, Context context) throws IOException {
    if (replicatedScanner == null) {
      ...
      final Scan scan = new Scan();
      scan.setBatch(batch);
{code}

As stated in HBASE-16376, a {{scan#setBatch(int)}} call implicitly allows scan results to be partial. Since {{VerifyReplication}} assumes each {{scanner.next()}} call returns an entire row, partial results break the compare logic. We should avoid the setBatch call here. Thanks to RPC chunking (explained in this blog post: https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1), I think that is safe and acceptable.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
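For illustration, here is a minimal sketch of the avoidance described above, assuming the standard HBase 1.x client API ({{compareRows}} and its arguments are hypothetical helpers for this sketch, not part of the actual patch):

{code}
import java.io.IOException;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;

public class RowAtATimeScan {
  // Hypothetical helper: compare rows one at a time, as VerifyReplication does.
  static void compareRows(Table table) throws IOException {
    Scan scan = new Scan();
    // scan.setBatch(100);  // avoided: setBatch() implicitly allows a single
    //                      // wide row to come back as several partial Results,
    //                      // so "one Result == one row" no longer holds
    scan.setCaching(1000);  // rely on RPC chunking to bound response sizes instead
    try (ResultScanner scanner = table.getScanner(scan)) {
      for (Result result : scanner) {
        // each Result here is a complete row, so row-level comparison is safe
      }
    }
  }
}
{code}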
Re: release of HBase 1.3.1?
+1

I meant to send this a while ago.

On 4/3/17, 5:43 PM, "James Taylor" wrote:

    Hello,
    We'd like to start supporting releases of Phoenix that work with the
    HBase 1.3 branch, but there's a committed fix on which we rely
    (HBASE-17587) for Phoenix to function correctly. Is there a time frame
    for an HBase 1.3.1 release?
    Thanks,
    James
release of HBase 1.3.1?
Hello,
We'd like to start supporting releases of Phoenix that work with the HBase
1.3 branch, but there's a committed fix on which we rely (HBASE-17587) for
Phoenix to function correctly. Is there a time frame for an HBase 1.3.1
release?
Thanks,
James
[jira] [Created] (HBASE-17870) Backport HBASE-12770 to branch-1.3
Ashu Pachauri created HBASE-17870:
-----------------------------------
Summary: Backport HBASE-12770 to branch-1.3
Key: HBASE-17870
URL: https://issues.apache.org/jira/browse/HBASE-17870
Project: HBase
Issue Type: Improvement
Components: Replication
Reporter: Ashu Pachauri
Assignee: Ashu Pachauri

Based on the discussion on HBASE-12770, let's backport it to branch-1.3. The issue, combined with the zookeeper transport limit, breaks replication quite often in large clusters.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
Comments on HBase Architecture Document
Hi HBase Developers,

The previous email I sent seemed to spur more conversation on the durability
of HBase than on its overall architecture, so I thought I would send another
email to ask for comments on our architecture document
(http://pages.cs.wisc.edu/~suli/hbase.pdf).

We are doing some research on scheduling in storage systems, including HBase.
We want to make sure that we are making reasonable assumptions about the
HBase architecture. We did our best reading the code, and used some runtime
tools to understand the internal structure of HBase, but it would be best if
some developers of HBase could confirm that our understanding is correct (or
point out where it is wrong or inaccurate).

We put together a document describing the HBase work-flow
(http://pages.cs.wisc.edu/~suli/hbase.pdf). It emphasizes how different
threads in HBase interact with each other, as that's what we are most
interested in. We are wondering if you could take a look at this document and
let us know your thoughts.

We really appreciate your help. Thanks a lot!

Suli

--
Suli Yang

Department of Physics
University of Wisconsin Madison

4257 Chamberlin Hall
Madison WI 53703
Re: hbase does not seem to handle mixed workloads well
On Fri, Mar 31, 2017 at 7:29 PM, 杨苏立 Yang Su Li wrote:
> Hi,
>
> We found that when there is a mix of CPU-intensive and I/O-intensive
> workloads, HBase seems to slow everything down to the disk throughput
> level.
>
> This is shown in the performance graph at
> http://pages.cs.wisc.edu/~suli/blocking-orig.pdf : both client-1 and
> client-2 are issuing 1KB Gets. From second 0, both repeatedly access a
> small set of data that is cacheable, and both get high throughput (~45k
> ops/s). At second 60, client-1 switches to an I/O-intensive workload and
> begins to randomly access a large set of data (which does not fit in
> cache). *Both* client-1's and client-2's throughput drops to ~0.5k ops/s.
>
> Is this acceptable behavior for HBase, or is it considered a bug or a
> performance drawback? I can find an old JIRA entry about similar problems
> (https://issues.apache.org/jira/browse/HBASE-8836), but that was never
> resolved.

Fairness is an old, hard, full-stack problem [1]. You want the hbase client
to characterize its read pattern and pass it down through hdfs to the os so
it might influence the disk scheduler? We do little in this regard.

What is client-1 doing, out of interest, when it switches to the
"I/O-intensive workload"? It seems to be soaking up all I/Os. Is it blowing
the cache too?

(On HBASE-8836: at the end it refers to the scheduler which allows you to
divide the requests at the front door by read/write/scan.)

Thanks,
St.Ack

1. https://www.slideshare.net/cloudera/ecosystem-session-7b

> Thanks.
>
> Suli
>
> --
> Suli Yang
>
> Department of Physics
> University of Wisconsin Madison
>
> 4257 Chamberlin Hall
> Madison WI 53703
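(For reference, the read/write/scan split St.Ack mentions at the end is
configured through the RPC call-queue settings. A minimal hbase-site.xml
sketch, assuming HBase 1.x property names; the ratio values are purely
illustrative, not recommendations:

<property>
  <name>hbase.ipc.server.callqueue.handler.factor</name>
  <value>0.1</value> <!-- number of call queues as a fraction of handler count -->
</property>
<property>
  <name>hbase.ipc.server.callqueue.read.ratio</name>
  <value>0.5</value> <!-- fraction of queues dedicated to reads vs. writes -->
</property>
<property>
  <name>hbase.ipc.server.callqueue.scan.ratio</name>
  <value>0.5</value> <!-- fraction of the read queues reserved for scans -->
</property>

Settings along these lines can keep long scans from monopolizing all RPC
handlers, though they do not address the cache or disk contention discussed
above.)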
[jira] [Created] (HBASE-17869) UnsafeAvailChecker wrongly returns false on ppc
Jerry He created HBASE-17869:
------------------------------
Summary: UnsafeAvailChecker wrongly returns false on ppc
Key: HBASE-17869
URL: https://issues.apache.org/jira/browse/HBASE-17869
Project: HBase
Issue Type: Bug
Affects Versions: 1.2.4
Reporter: Jerry He
Assignee: Jerry He
Priority: Minor

On the ppc64 arch, java.nio.Bits.unaligned() wrongly returns false due to a JDK bug: https://bugs.openjdk.java.net/browse/JDK-8165231

This causes problems for HBase; e.g., the FuzzyRowFilter test fails. Fix it by providing a hard-coded workaround for the JDK bug.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
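To make the shape of such a workaround concrete, a rough sketch follows (the class name and structure are illustrative assumptions, not the actual HBASE-17869 patch):

{code}
public final class UnalignedCheck {
  private UnalignedCheck() {}

  /** Whether unaligned memory access is usable on this platform. */
  public static boolean unaligned() {
    String arch = System.getProperty("os.arch", "");
    if (arch.equals("ppc64") || arch.equals("ppc64le")) {
      // JDK-8165231: java.nio.Bits.unaligned() wrongly reports false on
      // ppc64 even though the platform handles unaligned access fine,
      // so hard-code the answer here.
      return true;
    }
    try {
      // On other platforms, fall back to asking the JDK via reflection.
      java.lang.reflect.Method m =
          Class.forName("java.nio.Bits").getDeclaredMethod("unaligned");
      m.setAccessible(true);
      return (Boolean) m.invoke(null);
    } catch (Exception e) {
      return false; // be conservative if the check itself fails
    }
  }
}
{code}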
Re: How threads interact with each other in HBase
Yes, you are correct that there is an edge condition here when there is an
abrupt power failure on a node. HDFS guards against most of this, as there
are multiple copies of your data spread across racks. However, if you have an
abrupt power failure across multiple racks (or your entire hardware), yes,
you would likely lose some data. Having some form of redundant power supply
is a common deployment choice that further mitigates this risk. If this is
not documented clearly enough, patches are welcome to improve this :)

IMO, all of this is an implementation detail, though, as I believe you
already understand. It does not change the fact that
architecturally/academically, HBase is a consistent system.

杨苏立 Yang Su Li wrote:

I understand why HBase by default does not use hsync -- it does come with a
big performance cost (though for FSYNC_WAL, which is not the default option,
you should probably do it, because the documentation explicitly promised it).
I just want to make sure my description of HBase is accurate, including the
durability aspect.

On Sun, Apr 2, 2017 at 12:19 PM, Ted Yu wrote:

Suli:
Have you looked at HBASE-5954? It gives some background on why the hbase code
is formulated the way it currently is.

Cheers

On Sun, Apr 2, 2017 at 9:36 AM, 杨苏立 Yang Su Li wrote:

Doesn't your second paragraph just prove my point? -- If data is not
persisted to disk, then it is not durable. That is the definition of
durability. If you want the data to be durable, then you need to call hsync()
instead of hflush(), and that would be the correct behavior if you use the
FSYNC_WAL flag (per the HBase documentation). However, HBase does not do
that.

Suli

On Sun, Apr 2, 2017 at 11:26 AM, Josh Elser wrote:

No, that's not correct. HBase would, by definition, not be a consistent
database if a write was not durable when a client sees a successful write.

The point that I will concede to you is that the hflush call may, in
extenuating circumstances, not be completely durable. For example, hflush
does not actually force the data to disk. If an abrupt power failure happens
before this data is pushed to disk, HBase may think that data was durable
when it actually wasn't (at the HDFS level).

On Thu, Mar 30, 2017 at 4:26 PM, 杨苏立 Yang Su Li wrote:

Also, please correct me if I am wrong, but I don't think a put is durable
when an RPC returns to the client. Just its corresponding WAL entry is pushed
to the memory of all three data nodes, so it has a low probability of being
lost. But nothing is persisted at this point. And this is true no matter
whether you use the SYNC_WAL or FSYNC_WAL flag.

On Tue, Mar 28, 2017 at 12:11 PM, Josh Elser wrote:

1.1 -> 2: don't forget about the block cache, which can invalidate the need
for any HDFS read.

I think you're over-simplifying the write path quite a bit. I'm not sure what
you mean by an 'asynchronous write', but that doesn't exist at the HBase RPC
layer, as that would invalidate the consistency guarantees (if an RPC returns
to the client that data was "put", then it is durable).

Going off of memory (sorry in advance if I misstate something): the general
way that data is written to the WAL is a "group commit". You have many
threads all trying to append data to the WAL -- performance would be terrible
if you serially applied all of these writes. Instead, many writes can be
accepted, and the caller receives a Future. The caller must wait for the
Future to complete. What's happening behind the scenes is that the writes are
being bundled together to reduce the number of syncs to the WAL ("grouping"
the writes together).
When one caller's Future completes, what really happened is that the
write/sync that included the caller's update was committed (along with
others). All of this is happening inside the RS's implementation of accepting
an update.

https://github.com/apache/hbase/blob/55d6dcaf877cc5223e679736eb613173229c18be/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/FSHLog.java#L74-L106

杨苏立 Yang Su Li wrote:

The attachment can be found at the following URL:
http://pages.cs.wisc.edu/~suli/hbase.pdf

Sorry for the inconvenience...

On Mon, Mar 27, 2017 at 8:25 PM, Ted Yu wrote:

Again, the attachment didn't come thru. Is it possible to formulate it as a
google doc?

Thanks

On Mon, Mar 27, 2017 at 6:19 PM, 杨苏立 Yang Su Li <yangs...@gmail.com> wrote:

Hi,

I am a graduate student working on scheduling in storage systems, and we are
interested in how different threads in HBase interact with each other and how
it might affect scheduling. I have written down my understanding of how
HBase/HDFS works based on its current thread architecture (attached). I am
wondering if the developers of HBase could take a look at it and let me know
if anything is incorrect or inaccurate, or if I have missed
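Since the hflush/hsync distinction is the crux of the durability debate
above, here is a small illustrative sketch of the two HDFS calls (a toy
program against the public FSDataOutputStream API, not HBase code; the path
is made up):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FlushVsSync {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(new Path("/tmp/wal-demo"))) {
      out.write("edit".getBytes("UTF-8"));
      // hflush: the bytes reach every DataNode in the pipeline and become
      // visible to new readers, but may still sit in OS buffers -- an
      // abrupt power failure on all replicas at this point can lose them.
      out.hflush();
      // hsync: additionally asks each DataNode to fsync the bytes to its
      // disks -- the stronger guarantee being argued for above.
      out.hsync();
    }
  }
}

And here is a toy rendering of the group-commit pattern Josh describes (the
real code is the FSHLog link above; this only shows the shape of it): many
handler threads append and block on a Future, while a single syncer thread
batches whatever is queued and completes all of those Futures with one sync.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

class GroupCommitLog {
  private static class Pending {
    final byte[] edit;
    final CompletableFuture<Void> done = new CompletableFuture<>();
    Pending(byte[] edit) { this.edit = edit; }
  }

  private final BlockingQueue<Pending> queue = new LinkedBlockingQueue<>();

  // Called by many handler threads; each caller waits on the returned Future.
  CompletableFuture<Void> append(byte[] edit) {
    Pending p = new Pending(edit);
    queue.add(p);
    return p.done;
  }

  // Run by one background thread: drain the queue, write everything,
  // sync once, then complete every caller's Future in the batch.
  void syncLoop() throws InterruptedException {
    while (true) {
      List<Pending> batch = new ArrayList<>();
      batch.add(queue.take());  // block until at least one edit arrives
      queue.drainTo(batch);     // then grab everything else already queued
      for (Pending p : batch) {
        // append p.edit to the log file here
      }
      // one sync call covers the whole batch ("group commit")
      for (Pending p : batch) {
        p.done.complete(null);
      }
    }
  }
}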
[jira] [Created] (HBASE-17868) Backport HBASE-10205 to branch-1.3
ramkrishna.s.vasudevan created HBASE-17868:
--------------------------------------------
Summary: Backport HBASE-10205 to branch-1.3
Key: HBASE-17868
URL: https://issues.apache.org/jira/browse/HBASE-17868
Project: HBase
Issue Type: Bug
Components: BucketCache
Affects Versions: 1.3.0
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
Fix For: 1.3.1

I got a similar ConcurrentModificationException with hbase-1.3.0 while working with the bucket cache. On verifying, it seems the fix has not been added to hbase-1.3.0. We need to backport it to hbase-1.3 and to other branches wherever it was not applied.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
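As a generic illustration of the failure mode named here (a toy, not the actual BucketCache code): iterating a plain HashMap while another thread mutates it typically fails with ConcurrentModificationException:

{code}
import java.util.HashMap;
import java.util.Map;

public class CmeDemo {
  public static void main(String[] args) throws Exception {
    Map<Long, String> blocks = new HashMap<>();
    for (long i = 0; i < 100_000; i++) {
      blocks.put(i, "block-" + i);
    }

    // Mutate the map from another thread while the main thread iterates.
    Thread evictor = new Thread(() -> {
      for (long i = 0; i < 100_000; i++) {
        blocks.remove(i);
      }
    });
    evictor.start();

    try {
      long sum = 0;
      for (Map.Entry<Long, String> e : blocks.entrySet()) {
        sum += e.getKey();
      }
      System.out.println("finished cleanly, sum=" + sum);
    } catch (java.util.ConcurrentModificationException cme) {
      System.out.println("ConcurrentModificationException, as described above");
    }
    evictor.join();
  }
}
{code}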