[jira] [Resolved] (HBASE-20522) Maven artifacts for HBase 2 are not available at maven central
[ https://issues.apache.org/jira/browse/HBASE-20522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismaël Mejía resolved HBASE-20522.
----------------------------------
Resolution: Fixed

The deps are available, I confirm it is fixed now. Thanks [~stack]

> Maven artifacts for HBase 2 are not available at maven central
> --
>
> Key: HBASE-20522
> URL: https://issues.apache.org/jira/browse/HBASE-20522
> Project: HBase
> Issue Type: Bug
> Components: build
> Affects Versions: 2.0.0
> Reporter: Ismaël Mejía
> Priority: Major
[jira] [Created] (HBASE-20530) Composition of backup directory incorrectly contains namespace when restoring
Ted Yu created HBASE-20530:
---------------------------
Summary: Composition of backup directory incorrectly contains namespace when restoring
Key: HBASE-20530
URL: https://issues.apache.org/jira/browse/HBASE-20530
Project: HBase
Issue Type: Bug
Reporter: Ted Yu

Here is a partial listing of the output from an incremental backup:
{code}
5306 2018-05-04 02:38 hdfs://mycluster/user/hbase/backup_loc/backup_1525401467793/table_almphxih4u/cf1/5648501da7194783947bbf07b172f07e
{code}
When restoring, here is what HBackupFileSystem.getTableBackupDir returns:
{code}
fileBackupDir=hdfs://mycluster/user/hbase/backup_loc/backup_1525401467793/default/table_almphxih4u
{code}
You can see that the namespace ("default") gets into the path, so the proper hfile cannot be found.
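A minimal sketch of the mismatch described above, using hypothetical helpers (not the actual backup API): the backup writes files under <root>/<backupId>/<table>, while the restore-side lookup composes <root>/<backupId>/<namespace>/<table>.

{code}
import org.apache.hadoop.fs.Path;

// Hypothetical illustration of the path mismatch described in HBASE-20530.
// These helpers are NOT the real backup API; they only mirror the two layouts above.
public class BackupPathMismatch {

  // Layout actually produced by the incremental backup: <root>/<backupId>/<table>/<cf>/...
  static Path backupLayout(String root, String backupId, String table) {
    return new Path(root, backupId + "/" + table);
  }

  // Layout the restore side looks for: <root>/<backupId>/<namespace>/<table>
  static Path restoreLookup(String root, String backupId, String namespace, String table) {
    return new Path(root, backupId + "/" + namespace + "/" + table);
  }

  public static void main(String[] args) {
    String root = "hdfs://mycluster/user/hbase/backup_loc";
    String backupId = "backup_1525401467793";
    String table = "table_almphxih4u";

    System.out.println(backupLayout(root, backupId, table));
    // hdfs://mycluster/user/hbase/backup_loc/backup_1525401467793/table_almphxih4u
    System.out.println(restoreLookup(root, backupId, "default", table));
    // hdfs://mycluster/user/hbase/backup_loc/backup_1525401467793/default/table_almphxih4u
  }
}
{code}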
[jira] [Created] (HBASE-20529) Make sure that there are no remote wals when transit cluster from DA to A
Guanghao Zhang created HBASE-20529:
-----------------------------------
Summary: Make sure that there are no remote wals when transit cluster from DA to A
Key: HBASE-20529
URL: https://issues.apache.org/jira/browse/HBASE-20529
Project: HBase
Issue Type: Sub-task
Components: Replication
Reporter: Guanghao Zhang

Consider two clusters in A and S state, and then we transit A to DA. Later we want to transit DA back to A; since the remote cluster is in S, we should be able to do it. But there may still be some remote wals on HDFS for the cluster in S state, so we need to wait until the remote wals are removed before transiting the cluster in DA state to A. We need to add a check for this.
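A minimal sketch of the kind of check being asked for, using the stock Hadoop FileSystem API; the remote WAL directory layout and where this check would be wired in are assumptions, not the actual replication code.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: verify the remote WAL directory for a peer is empty before
// allowing the DA -> A transition. Directory layout is an assumption.
public class RemoteWalCheck {

  static void checkNoRemoteWals(Configuration conf, Path remoteWalDir) throws IOException {
    FileSystem fs = remoteWalDir.getFileSystem(conf);
    if (fs.exists(remoteWalDir) && fs.listStatus(remoteWalDir).length > 0) {
      throw new IOException("Remote WALs still present under " + remoteWalDir
          + "; wait for them to be removed before transiting from DA to A");
    }
  }
}
{code}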
[jira] [Resolved] (HBASE-20510) Add a downloads page to hbase.apache.org to tie mirrored artifacts to their hash and signature
[ https://issues.apache.org/jira/browse/HBASE-20510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-20510.
---------------------------
Resolution: Fixed
Assignee: stack
Fix Version/s: 3.0.0

Resolving for now. Can open a new issue to prettify.

> Add a downloads page to hbase.apache.org to tie mirrored artifacts to their hash and signature
> --
>
> Key: HBASE-20510
> URL: https://issues.apache.org/jira/browse/HBASE-20510
> Project: HBase
> Issue Type: Task
> Components: website
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 3.0.0
>
> Attachments: 0001-HBASE-20510-Add-a-downloads-page-to-hbase.apache.org.patch, Screen Shot 2018-04-30 at 4.50.45 PM.png
>
> I tried to push to announce for 2.0.0 but it got blocked because mirrors don't have signatures and hashes (I only just noticed; it's probably been this way for ages). Signatures and hashes are hosted on www.apache.org only.
> So, a connection needs to be made between the hashes and signatures hosted on apache.org and the associated artifacts gotten from mirrors. Projects usually do this on a 'download' page. Add one with 2.0.0 as the first entry.
> [~misty] FYI
Re: Changes to website re: hbasecon
Looks great. Thanks boys.
S

On Thu, May 3, 2018 at 3:11 PM, Josh Elser wrote:
> ClayB was telling me last week that he had folks having a hard time
> getting to the "current" HBaseCon 2018 page on our site via Google
> (apparently they kept sending people to the previous site
> hbase.apache.org/www.hbasecon.com)
>
> I just pushed a couple of changes:
>
> 1. JS redirect of h.a.o/www.hbasecon.com to h.a.o/hbasecon-2018
> 2. A tabular archive page at h.a.o/hbasecon-archives.html that Clay wrote
> for us (super nice!)
> 3. Updated hbasecon-2018 with a pointer to this archive page
>
> Please holler if this offends you greatly :)
Changes to website re: hbasecon
ClayB was telling me last week that he had folks having a hard time getting to the "current" HBaseCon 2018 page on our site via Google (apparently they kept sending people to the previous site hbase.apache.org/www.hbasecon.com)

I just pushed a couple of changes:

1. JS redirect of h.a.o/www.hbasecon.com to h.a.o/hbasecon-2018
2. A tabular archive page at h.a.o/hbasecon-archives.html that Clay wrote for us (super nice!)
3. Updated hbasecon-2018 with a pointer to this archive page

Please holler if this offends you greatly :)
[jira] [Resolved] (HBASE-20498) [AMv2] Stuck in UnexpectedStateException
[ https://issues.apache.org/jira/browse/HBASE-20498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-20498.
---------------------------
Resolution: Invalid

Resolving. Logs are gone and my statement of '... we go assign regions' is missing vital detail... Will keep an eye out for this going forward but killing this JIRA.

> [AMv2] Stuck in UnexpectedStateException
>
> Key: HBASE-20498
> URL: https://issues.apache.org/jira/browse/HBASE-20498
> Project: HBase
> Issue Type: Sub-task
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 2.0.1
>
> Attachments: HBASE-20498.branch-2.001.patch
>
> Here is how we get into a stuck scenario in hbase-2.0.0RC2.
> * Assign a region. It is moved to OPENING state then RPCs the RS.
> * RS opens the region. Tells Master.
> * Master tries to complete the assign by updating hbase:meta.
> * hbase:meta is hosed because I'd deployed a bad patch that blocked hbase:meta updates.
> * Master is stuck retrying RPCs to the RS hosting hbase:meta; we want to update our new OPEN state in hbase:meta.
> * I kill Master because I want to fix the broken patch.
> * On restart, a script sets the table to be DISABLED.
> * As part of startup, we go to assign regions.
> * We skip assigning regions because the table is DISABLED; i.e. we skip the replay of the unfinished assign.
> * The region is now a free agent; no lock is held, so the queued unassign that is part of the disable table can run.
> * It fails because the region is in OPENING state; an UnexpectedStateException is thrown.
> We loop complaining the above. Resolution requires finishing the previous assign first; then we can disable.
> Let me try and write a test to manufacture this state.
[jira] [Resolved] (HBASE-20507) Do not need to call recoverLease on the broken file when we fail to create a wal writer
[ https://issues.apache.org/jira/browse/HBASE-20507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-20507.
---------------------------
Resolution: Fixed
Fix Version/s: 3.0.0

Re-resolving.

> Do not need to call recoverLease on the broken file when we fail to create a wal writer
> ---
>
> Key: HBASE-20507
> URL: https://issues.apache.org/jira/browse/HBASE-20507
> Project: HBase
> Issue Type: Improvement
> Components: wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: 20507.addendum.patch, HBASE-20507.patch
>
> I tried locally with a UT: if we overwrite a file which is currently being written, the old file will be completed and then deleted. If you call close on the previous file, a no-lease exception will be thrown, which means that the file has already been completed.
> So we do not need to close a file if it will be overwritten immediately, since recoverLease may take a very long time...
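A minimal sketch of the scenario the issue describes, using only the stock Hadoop FileSystem API; the WAL path and the surrounding failure handling are assumptions, not the actual HBase WAL code.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch: a WAL-writer creation fails, and retrying on the same path with
// overwrite=true finalizes and replaces the half-written file, so per the
// issue no recoverLease (or close) on the old file is needed.
public class WalOverwriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path wal = new Path("/hbase/WALs/example-wal");   // assumed path

    // First attempt: file is created but writer initialization fails, leaving it open.
    FSDataOutputStream first = fs.create(wal, true);

    // Retry with overwrite=true: per the issue, this completes and replaces the old
    // file on HDFS; calling first.close() afterwards would throw a no-lease exception,
    // and recoverLease on the old file would only add delay.
    FSDataOutputStream second = fs.create(wal, true);
    second.close();
  }
}
{code}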
[jira] [Reopened] (HBASE-20507) Do not need to call recoverLease on the broken file when we fail to create a wal writer
[ https://issues.apache.org/jira/browse/HBASE-20507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reopened HBASE-20507:
---------------------------

Reopening to add an addendum (nightlies have been broken since this went in).

> Do not need to call recoverLease on the broken file when we fail to create a wal writer
> ---
>
> Key: HBASE-20507
> URL: https://issues.apache.org/jira/browse/HBASE-20507
> Project: HBase
> Issue Type: Improvement
> Components: wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 2.1.0, 2.0.1
>
> Attachments: 20507.addendum.patch, HBASE-20507.patch
>
> I tried locally with a UT: if we overwrite a file which is currently being written, the old file will be completed and then deleted. If you call close on the previous file, a no-lease exception will be thrown, which means that the file has already been completed.
> So we do not need to close a file if it will be overwritten immediately, since recoverLease may take a very long time...
Re: Fwd:
Wow, this explanation is really detailed. That helps me a lot! I totally understand the read process now. Thanks a million.

Thanks,
Alex

2018-05-02 22:33 GMT-07:00 ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>:

> Regarding the read flow, this is what happens:
>
> 1) Create a region level scanner.
> 2) The region level scanner can comprise more than one store scanner (each store scanner works on one column family).
> 3) Every store scanner will comprise a memstore scanner and a set of hfile scanners (based on the number of store files).
> 4) The scan tries to read data in lexicographical order.
>
> For example, for simplicity say you have row1 to row5 and there is only one column family 'f1' and one column 'c1'. Assume row1 was already written and it was flushed to a store file. Row2 to row5 are in the memstore.
> When the scanner starts it will form a heap with all these memstore scanners and store file (hfile) scanners. Internally, since row1 is smaller lexicographically, row1 from the store file is retrieved first. This row1, the first time, will be read from HDFS (and not from the block cache). The remaining rows are fetched from the memstore scanners. There is no block cache concept at the memstore level; the memstore is just a simple key-value map.
>
> When the same scan is issued the next time, we go through the above steps, but to fetch row1 the store file scanner that has row1 fetches the block that has row1 from the block cache (instead of HDFS) and returns the value from the block cache, and the remaining rows are again fetched from the memstore scanners from the underlying memstore.
>
> Hope this helps.
>
> Regards
> Ram
>
> On Thu, May 3, 2018 at 9:17 AM, Xi Yang wrote:
>
> > Hi Tim,
> >
> > Thanks for confirming that question. It confused me for a long time. Really appreciate it.
> >
> > About another question, I still don't know whether Model A or Model B is correct. Still confused.
> >
> > Thanks,
> > Alex
> >
> > 2018-05-02 13:53 GMT-07:00 Tim Robertson :
> >
> > > Thanks Alex,
> > >
> > > Yes, looking at that code I believe you are correct - the memStore scanner is appended after the block scanners.
> > > The block scanners may or may not see hits in the block cache when they read. If they don't get a hit, they'll open the block from the underlying HFile(s).
> > >
> > > On Wed, May 2, 2018 at 10:41 PM, Xi Yang wrote:
> > >
> > > > Hi Tim,
> > > >
> > > > Thank you for the detailed explanation. Yes, that really helps me! I really appreciate it!
> > > > But I am still confused about the sequence:
> > > >
> > > > I've read this code in HStore.getScanners:
> > > >
> > > > // TODO this used to get the store files in descending order,
> > > > // but now we get them in ascending order, which I think is
> > > > // actually more correct, since memstore get put at the end.
> > > > List<StoreFileScanner> sfScanners = StoreFileScanner.getScannersForStoreFiles(storeFilesToScan,
> > > >     cacheBlocks, usePread, isCompaction, false, matcher, readPt);
> > > > List<KeyValueScanner> scanners = new ArrayList<>(sfScanners.size() + 1);
> > > > scanners.addAll(sfScanners);
> > > > // Then the memstore scanners
> > > > scanners.addAll(memStoreScanners);
> > > >
> > > > Does it mean that this step:
> > > >
> > > > 2) It looks in the memstore to see if there are any writes still in memory ready to flush down to the HFiles that need to be merged with the data read in 1)
> > > >
> > > > comes after the following step?
> > > >
> > > > c) the data is read from the opened block
> > > >
> > > > Here are explanations of the images I drew before, so that we don't need the images:
> > > >
> > > > When a read request comes in:
> > > >
> > > > Model A
> > > >
> > > >    1. Get scanners (including StoreScanner and MemStoreScanner). MemStoreScanner is the last one.
> > > >    2. Begin with the first StoreScanner.
> > > >    3. Try to get the block from the BlockCache of the StoreScanner.
> > > >    4. Try to get the block from the HFile of the StoreScanner.
> > > >    5. Go to the next StoreScanner.
> > > >    6. Loop #2 - #5 until all StoreScanners have been used.
> > > >    7. Try to get the block from the memstore.
> > > >
> > > > Model B
> > > >
> > > >    1. Try to get the block from the BlockCache; if that fails, go to #2.
> > > >    2. Get scanners (including StoreScanner and MemStoreScanner). MemStoreScanner is the last one.
> > > >    3. Begin with the first StoreScanner.
> > > >    4. Try to get the block from the HFile of the StoreScanner.
> > > >    5. Go to the next StoreScanner.
> > > >    6. Loop #4 - #5 until all StoreScanners have been used.
> > > >    7. Try to get the block from the memstore.
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > 2018-05-02 1:04 GMT-07:00 Tim Robertson :
> > > >
> > > > > Hi Alex,
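To make the merge order in the thread above concrete, here is a small, self-contained toy sketch (plain Java, not the actual HBase scanner classes) showing why it does not matter that the memstore scanner is appended last: the heap always yields the lexicographically smallest current row across all scanners.

{code}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model of the scanner heap: each "scanner" is just a sorted iterator of row keys.
// The heap picks the smallest current key across all scanners, so whether the
// memstore scanner is added first or last does not change the scan order.
public class ScannerHeapSketch {

  static Iterator<String> scanner(String... rows) {
    return List.of(rows).iterator();
  }

  public static void main(String[] args) {
    List<Iterator<String>> scanners = new ArrayList<>();
    scanners.add(scanner("row1"));                         // store file (hfile) scanner
    scanners.add(scanner("row2", "row3", "row4", "row5")); // memstore scanner, added last

    // Heap entries: (current row, scanner it came from), ordered by row key.
    PriorityQueue<Object[]> heap =
        new PriorityQueue<>((a, b) -> ((String) a[0]).compareTo((String) b[0]));
    for (Iterator<String> s : scanners) {
      if (s.hasNext()) {
        heap.add(new Object[] { s.next(), s });
      }
    }

    while (!heap.isEmpty()) {
      Object[] top = heap.poll();
      System.out.println(top[0]);   // rows come out in order: row1 .. row5
      @SuppressWarnings("unchecked")
      Iterator<String> s = (Iterator<String>) top[1];
      if (s.hasNext()) {
        heap.add(new Object[] { s.next(), s });
      }
    }
  }
}
{code}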
[jira] [Created] (HBASE-20528) Revise collections copying from iteration to built-in function
Hua-Yi Ho created HBASE-20528:
------------------------------
Summary: Revise collections copying from iteration to built-in function
Key: HBASE-20528
URL: https://issues.apache.org/jira/browse/HBASE-20528
Project: HBase
Issue Type: Improvement
Reporter: Hua-Yi Ho

Some collection code in StochasticLoadBalancer.java, AbstractHBaseTool.java, HFileInputFormat.java, Result.java, and WalPlayer.java uses iteration to copy all the data in a collection. These loops can be replaced by Collections.addAll and Arrays.copyOf.
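A generic before/after illustration of the replacement being suggested (standard JDK APIs only; this is not the actual code in the files listed above):

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Element-by-element copy loops versus the built-in bulk operations.
public class CopyExamples {
  public static void main(String[] args) {
    String[] src = { "a", "b", "c" };

    // Before: copy with explicit loops.
    List<String> byLoop = new ArrayList<>();
    for (String s : src) {
      byLoop.add(s);
    }
    String[] arrayByLoop = new String[src.length];
    for (int i = 0; i < src.length; i++) {
      arrayByLoop[i] = src[i];
    }

    // After: built-in bulk copies.
    List<String> byAddAll = new ArrayList<>();
    Collections.addAll(byAddAll, src);
    String[] arrayByCopyOf = Arrays.copyOf(src, src.length);

    System.out.println(byLoop.equals(byAddAll));                   // true
    System.out.println(Arrays.equals(arrayByLoop, arrayByCopyOf)); // true
  }
}
{code}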
[jira] [Created] (HBASE-20527) Remove unused code in MetaTableAccessor.java
Mingdao Yang created HBASE-20527:
---------------------------------
Summary: Remove unused code in MetaTableAccessor.java
Key: HBASE-20527
URL: https://issues.apache.org/jira/browse/HBASE-20527
Project: HBase
Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Mingdao Yang

META_REGION_PREFIX isn't used. I'll clean it up.
[DISCUSS] Effective HBase in the Cloud
Hi,

I'm pleased to finally be able to share this design document with you all. It's the result of internal review from half a dozen or so from within our community (Enis, Devaraj, Artem, and Clay easily come to mind) after multiple months of review and iteration.

Abstract: Infrastructure as a service (IaaS) via public cloud infrastructure offerings (Cloud IaaS) has grown dramatically in popularity through services like Amazon EC2, Google Compute Engine, and Microsoft Azure Compute. Across Apache HBase users, the majority of new system architectures include some form of Cloud IaaS as a means to increase the capabilities and/or decrease the cost of operation of their system. However, deploying HBase on these platforms comes with difficulties, as HBase has a non-optional dependency on Apache Hadoop HDFS to guarantee the durability of data written to HBase. This document outlines a proposal to remove HBase’s dependency on HDFS by replacing the current Write-Ahead-Log (WAL) implementation using Apache Ratis (incubating). It covers why the HDFS dependency is a problem on Cloud IaaS, how Ratis can be used to replace HDFS-based WALs, and a high-level development plan to effectively implement the replacement of this extremely critical HBase internal component without becoming tied to a single Cloud IaaS offering.

The document is available on Google Docs[1] and there is also a PDF available [2] of the current version. I'm happy to assist those who do not want to use the copy on a Google service (e.g. transcribe mailing-list chatter onto the Doc).

Thanks to some of the same folks who helped with this document, I also have a fairly in-depth analysis of what we think the required work will entail. For the HBase-specific changes, I'd like to avoid the pitfall we commonly face and work towards frequent merges into master that do not destabilize the build (keep things "Green") to avoid stalling our forward momentum after 2.0. If people are curious/interested, I'm happy to delve some more into how I think we can implement this.

- Josh

[1] https://docs.google.com/document/d/1Su5py_T5Ytfh9RoTTX2s20KbSJwBHVxbO7ge5ORqbCk/edit#
[2] https://home.apache.org/~elserj/Effective%20HBase%20in%20the%20Cloud.pdf
Re: Looking forward to releasing 1.2.7
Hi!

Thanks for bringing this up. I'm still the RM for 1.2 and I've been anxious to get the release cadence going again for almost 3 months now.

Unfortunately, the nightly tests are still failing:

https://builds.apache.org/view/H-L/view/HBase/job/HBase%20Nightly/job/branch-1.2/

For previous releases in this line, I relied on the nightly tests on ASF infra passing as a health check for the code branch. I've been tied up in HBase 2.0 and $dayjob work, so I haven't managed to chase down the remaining failures. Maybe it's time I find a different way to assure myself.

Andrew P, I think for the 1.4 release line you do unit test runs on some non-ASF system. I am 100% certain I could get a VM somewhere to do the same for 1.2 RCs. Do you know off-hand what size I need to grab as far as cpu / mem is concerned?

On Thu, May 3, 2018 at 7:31 AM, Kang Minwoo wrote:
> Hello!
>
> I am looking forward to releasing HBase 1.2.7.
> Do you know when 1.2.7 will be released?
>
> Best regards,
> Minwoo Kang
Looking forward to releasing 1.2.7
Hello!

I am looking forward to releasing HBase 1.2.7.
Do you know when 1.2.7 will be released?

Best regards,
Minwoo Kang
[jira] [Created] (HBASE-20526) multithreads bulkload performance
Key Hutu created HBASE-20526:
-----------------------------
Summary: multithreads bulkload performance
Key: HBASE-20526
URL: https://issues.apache.org/jira/browse/HBASE-20526
Project: HBase
Issue Type: Improvement
Components: mapreduce
Affects Versions: 2.0.0
Reporter: Key Hutu
Assignee: Key Hutu

When doing a bulkload, the interaction with zookeeper to get the region key ranges can cost a lot of time. In a multithreaded environment, this can take 5 minutes or more.
From the executor log, contents like 'Reading reply sessionid:0x262fb37f4a07080, packet:: clientPath:null server ...' appear many times.
It would be useful to provide a new method for bulkload that caches the key ranges outside.
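A minimal sketch of the caching idea (hypothetical structure and names; the real bulkload code path is not shown): fetch the region boundaries once per table and let all loader threads reuse the cached copy instead of querying the cluster from each thread.

{code}
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Sketch: cache region start keys per table so concurrent bulkload threads do not
// each re-fetch them (hypothetical class, not the actual HBase bulkload code).
public class RegionKeyCache {
  private final ConcurrentHashMap<String, List<byte[]>> cache = new ConcurrentHashMap<>();

  // 'fetcher' would wrap the real lookup (e.g. via the cluster connection); it runs
  // at most once per table, and every other thread reuses the cached result.
  public List<byte[]> getStartKeys(String table, Supplier<List<byte[]>> fetcher) {
    return cache.computeIfAbsent(table, t -> fetcher.get());
  }
}
{code}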
[jira] [Created] (HBASE-20525) Refactoring the code of read path
Duo Zhang created HBASE-20525:
------------------------------
Summary: Refactoring the code of read path
Key: HBASE-20525
URL: https://issues.apache.org/jira/browse/HBASE-20525
Project: HBase
Issue Type: Umbrella
Components: scan
Reporter: Duo Zhang
Fix For: 3.0.0

The known problems of the current implementation:

1. 'Seek or skip' should be decided at the StoreFileScanner level, not StoreScanner.
2. As we now support creating multiple StoreFileReader instances for a single HFile, we do not need to load the file info and other meta info every time we create a new StoreFileReader instance.
3. 'Pread or stream' should be decided at the StoreFileScanner level, not StoreScanner.
4. Make sure that we can return at any point during a scan; at least when filterRowKey applies we can not stop until we reach the next row, no matter how many cells we need to skip...
5. We do byte comparisons everywhere we need to know whether there is a row change, a family change, a qualifier change, etc. This is a performance killer (see the sketch below).

And the most important thing is that the code is way too complicated now and has become out of control... This should be done before our 3.0.0 release.
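As a concrete illustration of point 5, a simplified toy (not the actual HBase scan code) of the per-cell cost: a full byte-by-byte row comparison happens on every cell just to notice a row boundary.

{code}
import java.util.Arrays;
import java.util.List;

// Toy illustration only: detecting a row change by comparing row bytes on every
// cell. With wide rows, this repeated comparison dominates the scan cost.
public class RowChangeSketch {

  record Cell(byte[] row) {}

  static int countRows(List<Cell> cells) {
    int rows = 0;
    byte[] currentRow = null;
    for (Cell c : cells) {
      // Byte-by-byte comparison on every single cell just to spot a row boundary.
      if (currentRow == null || !Arrays.equals(currentRow, c.row())) {
        rows++;
        currentRow = c.row();
      }
    }
    return rows;
  }
}
{code}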
[jira] [Created] (HBASE-20524) Need to clear metrics when ReplicationSourceManager refresh replication sources
Guanghao Zhang created HBASE-20524:
-----------------------------------
Summary: Need to clear metrics when ReplicationSourceManager refresh replication sources
Key: HBASE-20524
URL: https://issues.apache.org/jira/browse/HBASE-20524
Project: HBase
Issue Type: Bug
Reporter: Guanghao Zhang
Assignee: Guanghao Zhang

When ReplicationSourceManager refreshes replication sources, it will close the old source first, then start up a new source. The new source will use new metrics, but the metrics for the old source are not cleared.
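A minimal sketch of the fix being described (hypothetical interfaces, not the real ReplicationSourceManager classes): when the old source is closed during a refresh, its metrics should be cleared before the replacement source registers new ones.

{code}
// Sketch of the refresh flow described above; Source, SourceMetrics and
// SourceFactory are hypothetical stand-ins, not the real replication classes.
public class RefreshSketch {

  interface SourceMetrics { void clear(); }

  interface Source {
    void terminate();
    SourceMetrics metrics();
  }

  interface SourceFactory { Source create(String peerId); }

  // Close the old source AND clear its metrics before starting the new one,
  // otherwise the old gauges/counters linger alongside the new ones.
  static Source refresh(Source old, SourceFactory factory, String peerId) {
    old.terminate();
    old.metrics().clear();   // the step the issue says was forgotten
    return factory.create(peerId);
  }
}
{code}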