[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase
[ https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027563#comment-13027563 ] dhruba borthakur commented on HBASE-3833: - another option would be to maintain the includes-list and excludes-list in zk itself. Instead of local files on the master, they could be zk nodes that contain a list of machine-names. what do people think about that? ability to support includes/excludes list in Hbase -- Key: HBASE-3833 URL: https://issues.apache.org/jira/browse/HBASE-3833 Project: HBase Issue Type: Improvement Components: client, regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur An HBase cluster currently does not have the ability to specify that the master should accept regionservers only from a specified list. This helps prevent administrative errors where the same machine could be included in two clusters. It also allows the administrator to easily remove un-ssh-able machines from the cluster. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
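The admission rule requested in the description above can be sketched as a small check the master would run when a regionserver reports in. This is only an illustration: the class and method names are hypothetical, and the semantics (an excluded host is always rejected, an empty includes set means no restriction) are assumptions modeled loosely on Hadoop-style host lists, not something the issue specifies.

```java
import java.util.Set;

/** Hypothetical admission check; names and semantics are assumptions, not HBase API. */
class RegionServerAdmissionSketch {
    /**
     * Returns true if the master should accept a regionserver on the given host.
     * Assumed rules: an excluded host is always rejected, and an empty
     * includes set means "no restriction".
     */
    static boolean accepts(String host, Set<String> includes, Set<String> excludes) {
        if (excludes.contains(host)) {
            return false; // exclusion wins, even if the host is also included
        }
        return includes.isEmpty() || includes.contains(host);
    }
}
```

Whether the two sets come from local files on the master or from zk nodes does not change the check itself, only where the sets are loaded from.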
[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase
[ https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027726#comment-13027726 ] Jean-Daniel Cryans commented on HBASE-3833: --- @Vishal, currently doing stop regionserver is probably not what you want to do in a cluster serving live requests (it closes all connections then closes the regions, meaning regions unavailable for a while). You'd want something more like bin/graceful_stop.sh @Dhruba, I think it would be cool to have that in ZK, but I also like having the same semantics as Hadoop. So I'm 0. ability to support includes/excludes list in Hbase -- Key: HBASE-3833 URL: https://issues.apache.org/jira/browse/HBASE-3833 Project: HBase Issue Type: Improvement Components: client, regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur An HBase cluster currently does not have the ability to specify that the master should accept regionservers only from a specified list. This helps prevent administrative errors where the same machine could be included in two clusters. It also allows the administrator to easily remove un-ssh-able machines from the cluster.
[jira] [Commented] (HBASE-3842) Refactor Coprocessor Compaction API
[ https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027746#comment-13027746 ] stack commented on HBASE-3842: -- +1 In postCompactSelection, do you need to know if you have all files or not (so you can figure if it's a major or not)? Also, don't you want to pass the memstore to the preCompaction so we can implement the flushing compaction where we weave a flush into a compaction result so we don't always create a new file on flush? Refactor Coprocessor Compaction API --- Key: HBASE-3842 URL: https://issues.apache.org/jira/browse/HBASE-3842 Project: HBase Issue Type: Improvement Components: coprocessors, regionserver Affects Versions: 0.92.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Labels: compaction Fix For: 0.92.0 After HBASE-3797, the compaction logic flow has been significantly altered. Because of this, the current compaction coprocessor API is insufficient for gaining full insight into compaction requests/results. Refactor coprocessor API after HBASE-3797 is committed to be more extensible and increase visibility.
[jira] [Updated] (HBASE-3843) splitLogWorker starts too early
[ https://issues.apache.org/jira/browse/HBASE-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3843: - Resolution: Fixed Fix Version/s: 0.92.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Thanks Prakash. That makes sense. Committed to TRUNK. splitLogWorker starts too early --- Key: HBASE-3843 URL: https://issues.apache.org/jira/browse/HBASE-3843 Project: HBase Issue Type: Bug Reporter: Prakash Khemani Assignee: Prakash Khemani Fix For: 0.92.0 Attachments: 0001-HBASE-3843-start-splitLogWorker-later-at-region-serv.patch splitlogworker should be started in startServiceThreads() instead of in initializeZookeeper(). This will ensure that the region server accepts split-logging tasks only after it has successfully done reportForDuty() to the master.
[jira] [Resolved] (HBASE-3670) Fix error handling in get(List<Get> gets)
[ https://issues.apache.org/jira/browse/HBASE-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3670. -- Resolution: Fixed Hadoop Flags: [Reviewed] Thank you for the patch Harsh. Applied to TRUNK. Fix error handling in get(List<Get> gets) - Key: HBASE-3670 URL: https://issues.apache.org/jira/browse/HBASE-3670 Project: HBase Issue Type: Bug Components: client Affects Versions: 0.92.0 Reporter: Lars George Assignee: Harsh J Chouraria Fix For: 0.92.0 Attachments: HBASE-3670.r1.diff See HBASE-3634 for details. The get(List<Get> gets) call needs to catch (or rather use a try/finally) the exception thrown by batch() and copy the Result instances over and return them. If that is not intended then we need to fix the JavaDoc in HTableInterface to reflect the new behavior. In general it seems to make sense to check the various methods (list based put, get, delete compared to batch) and agree on the correct behavior.
[jira] [Commented] (HBASE-3841) HTable and HTableInterface docs are inconsistent with one another
[ https://issues.apache.org/jira/browse/HBASE-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027749#comment-13027749 ] stack commented on HBASE-3841: -- I agree. HTable and HTableInterface docs are inconsistent with one another - Key: HBASE-3841 URL: https://issues.apache.org/jira/browse/HBASE-3841 Project: HBase Issue Type: Improvement Components: documentation Affects Versions: 0.90.0 Reporter: Harsh J Chouraria Assignee: Harsh J Chouraria Priority: Trivial Labels: documentation Fix For: 0.92.0 Both HTI and HT carry javadocs in their methods. Ideally only HTI should carry them now (HBASE-3670), and where it doesn't matter, docs must be stripped out of HT's class (inherited instead). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3838) RegionCoprocesorHost.preWALRestore throws npe in case there is no RegionObserver registered.
[ https://issues.apache.org/jira/browse/HBASE-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3838. -- Resolution: Fixed Fix Version/s: 0.92.0 Assignee: Himanshu Vashishtha Hadoop Flags: [Reviewed] Committed to TRUNK. Thank you for the patch Himanshu. RegionCoprocesorHost.preWALRestore throws npe in case there is no RegionObserver registered. Key: HBASE-3838 URL: https://issues.apache.org/jira/browse/HBASE-3838 Project: HBase Issue Type: Bug Reporter: Himanshu Vashishtha Assignee: Himanshu Vashishtha Priority: Minor Fix For: 0.92.0 Attachments: patch.txt It seems the check to bypass the Observers chain is at wrong place in case of pre/post WALRestore. It should be inside the if statement that checks whether the CP is instance of RegionObserver or not. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
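The misplaced check described above can be illustrated with a toy model. Everything here is hypothetical (the nested interfaces, the Context class, and both methods are stand-ins written for this sketch, not the RegionCoprocessorHost code); it only shows why reading the bypass flag outside the instanceof block fails when no RegionObserver is registered, and why moving it inside avoids that.

```java
import java.util.List;

/** Toy model of the pre/post WALRestore hook loop; not the HBase classes. */
class WalRestoreHookSketch {
    interface Coprocessor {}

    interface RegionObserver extends Coprocessor {
        void preWALRestore(Context ctx);
    }

    /** Stand-in for the observer context that carries the bypass flag. */
    static final class Context {
        boolean bypass;
    }

    /** Check placed outside the instanceof block: ctx may still be null. */
    static boolean buggy(List<Coprocessor> coprocessors) {
        boolean bypass = false;
        Context ctx = null;
        for (Coprocessor cp : coprocessors) {
            if (cp instanceof RegionObserver) {
                ctx = new Context();
                ((RegionObserver) cp).preWALRestore(ctx);
            }
            bypass |= ctx.bypass; // NullPointerException if no observer has run yet
        }
        return bypass;
    }

    /** Check moved inside the instanceof block, as the report suggests. */
    static boolean fixed(List<Coprocessor> coprocessors) {
        boolean bypass = false;
        for (Coprocessor cp : coprocessors) {
            if (cp instanceof RegionObserver) {
                Context ctx = new Context();
                ((RegionObserver) cp).preWALRestore(ctx);
                bypass |= ctx.bypass;
            }
        }
        return bypass;
    }
}
```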
[jira] [Commented] (HBASE-3839) Expose in-progress tasks on web UIs
[ https://issues.apache.org/jira/browse/HBASE-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027753#comment-13027753 ] stack commented on HBASE-3839: -- That's a pretty picture Expose in-progress tasks on web UIs --- Key: HBASE-3839 URL: https://issues.apache.org/jira/browse/HBASE-3839 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.92.0 Attachments: tasks.png HBASE-3836 adds a TaskMonitor class which collects info about what's going on inside processes. This ticket is to expose the task monitor info on the web UIs.
[jira] [Created] (HBASE-3844) Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip)
Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip) Key: HBASE-3844 URL: https://issues.apache.org/jira/browse/HBASE-3844 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Book.xml - in the FAQ it had a link to a Frequently Seen Errors wiki page. This page is labeled as defunct and doesn't even have anything useful on it anyway. Removed the link to that page. Performance.xml - added tip in Performance under client about attribute selection. This is one of those obvious but not so obvious topics, if you only need 3 attributes don't select the entire column family. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3844) Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip)
[ https://issues.apache.org/jira/browse/HBASE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3844: - Attachment: performance_HBASE_3844.xml.patch book_HBASE_3844.xml.patch Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip) Key: HBASE-3844 URL: https://issues.apache.org/jira/browse/HBASE-3844 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_HBASE_3844.xml.patch, performance_HBASE_3844.xml.patch Book.xml - in the FAQ it had a link to a Frequently Seen Errors wiki page. This page is labeled as defunct and doesn't even have anything useful on it anyway. Removed the link to that page. Performance.xml - added tip in Performance under client about attribute selection. This is one of those obvious but not so obvious topics, if you only need 3 attributes don't select the entire column family. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3844) Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip)
[ https://issues.apache.org/jira/browse/HBASE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doug Meil updated HBASE-3844: - Status: Patch Available (was: Open) Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip) Key: HBASE-3844 URL: https://issues.apache.org/jira/browse/HBASE-3844 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Attachments: book_HBASE_3844.xml.patch, performance_HBASE_3844.xml.patch Book.xml - in the FAQ it had a link to a Frequently Seen Errors wiki page. This page is labeled as defunct and doesn't even have anything useful on it anyway. Removed the link to that page. Performance.xml - added tip in Performance under client about attribute selection. This is one of those obvious but not so obvious topics, if you only need 3 attributes don't select the entire column family. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3839) Expose in-progress tasks on web UIs
[ https://issues.apache.org/jira/browse/HBASE-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027757#comment-13027757 ] Jean-Daniel Cryans commented on HBASE-3839: --- Todd you are a true artist. Expose in-progress tasks on web UIs --- Key: HBASE-3839 URL: https://issues.apache.org/jira/browse/HBASE-3839 Project: HBase Issue Type: New Feature Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.92.0 Attachments: tasks.png HBASE-3836 adds a TaskMonitor class which collects info about what's going on inside processes. This ticket is to expose the task monitor info on the web UIs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3836) Add facility to track currently progressing actions/workflows
[ https://issues.apache.org/jira/browse/HBASE-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027758#comment-13027758 ] stack commented on HBASE-3836: -- +1 looks good to me Todd. Will help. In another world, if we were starting over, all hbase 'components' would have state and even history; a visitor could iterate all registered components and then dump the output to UI or to monitoring tools via JMX, etc. dependent on how system was configured. I can do that in my next life. Add facility to track currently progressing actions/workflows - Key: HBASE-3836 URL: https://issues.apache.org/jira/browse/HBASE-3836 Project: HBase Issue Type: New Feature Components: master, regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.92.0 Attachments: hbase-3836.txt A lot of troubleshooting involves answering the question well, what is your server doing right now? Today, that involves some combination of interpreting jstack output and/or trudging through logs. Problems with these methods are: (a) users may not have direct ssh access to regionserver machines in production environments, (b) logs are very verbose, so hard to separate what's still going on vs stuff that might have completed, and (c) interpreting jstack requires a pretty good knowledge of the codebase plus diving into source code. I'd like to add a singleton (for now) which takes care of tracking any major actions going on in the region server and master. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3844) Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip)
[ https://issues.apache.org/jira/browse/HBASE-3844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3844: - Resolution: Fixed Fix Version/s: 0.92.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Thank you for the patch Doug. Applied to TRUNK. Book.xml (removing link to defunct wiki) and Performance.xml (adding client tip) Key: HBASE-3844 URL: https://issues.apache.org/jira/browse/HBASE-3844 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Fix For: 0.92.0 Attachments: book_HBASE_3844.xml.patch, performance_HBASE_3844.xml.patch Book.xml - in the FAQ it had a link to a Frequently Seen Errors wiki page. This page is labeled as defunct and doesn't even have anything useful on it anyway. Removed the link to that page. Performance.xml - added tip in Performance under client about attribute selection. This is one of those obvious but not so obvious topics, if you only need 3 attributes don't select the entire column family. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3835) Switch web pages to Jamon template engine instead of JSP
[ https://issues.apache.org/jira/browse/HBASE-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027762#comment-13027762 ] stack commented on HBASE-3835: -- Looks good to me (I like the test). Will I bring over the other pages to use jamon? Switch web pages to Jamon template engine instead of JSP Key: HBASE-3835 URL: https://issues.apache.org/jira/browse/HBASE-3835 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.92.0 Attachments: hbase-3835.txt Jamon (http://jamon.org) is a template engine that I think is preferable to JSP. You can read an interview with some comparisons vs JSP here: http://www.artima.com/lejava/articles/jamon.html In particular, I think it will give us the following advantages: - Since we'll have a servlet in front of each template, it will encourage us to write less inline Java code and do more code in the servlets. - Makes proper unit testing easier since you can trivially render a template and pass in mock arguments without having to start a whole HTTP stack - Static typing of template arguments makes it easier to know at compile-time if you've made a mistake. Thoughts? I converted the Master UI yesterday and only took a couple hours. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3825) performance.xml - adding a few common configuration changes in the 'config' sub-section
[ https://issues.apache.org/jira/browse/HBASE-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-3825: - Resolution: Fixed Fix Version/s: 0.92.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to TRUNK. Thanks for the patch Doug. performance.xml - adding a few common configuration changes in the 'config' sub-section --- Key: HBASE-3825 URL: https://issues.apache.org/jira/browse/HBASE-3825 Project: HBase Issue Type: Improvement Reporter: Doug Meil Assignee: Doug Meil Priority: Minor Fix For: 0.92.0 Attachments: performance_HBASE_3825.xml.patch There are a few common configurations that people make to adjust the RegionServer memory behavior (% of block cache, upperLimit, etc.). Adding a few entries under the 'configurations' sub-section in performance to reference the configuration section for each item.
[jira] [Commented] (HBASE-3835) Switch web pages to Jamon template engine instead of JSP
[ https://issues.apache.org/jira/browse/HBASE-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027778#comment-13027778 ] Todd Lipcon commented on HBASE-3835: Yea, I was planning on moving some more pages over if people seemed to like this choice. Let me ping the dev list and make sure a quorum of committers is on board, since it's something we will live with for a while. Switch web pages to Jamon template engine instead of JSP Key: HBASE-3835 URL: https://issues.apache.org/jira/browse/HBASE-3835 Project: HBase Issue Type: Improvement Components: master, regionserver Affects Versions: 0.92.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.92.0 Attachments: hbase-3835.txt Jamon (http://jamon.org) is a template engine that I think is preferable to JSP. You can read an interview with some comparisons vs JSP here: http://www.artima.com/lejava/articles/jamon.html In particular, I think it will give us the following advantages: - Since we'll have a servlet in front of each template, it will encourage us to write less inline Java code and do more code in the servlets. - Makes proper unit testing easier since you can trivially render a template and pass in mock arguments without having to start a whole HTTP stack - Static typing of template arguments makes it easier to know at compile-time if you've made a mistake. Thoughts? I converted the Master UI yesterday and only took a couple hours. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase
[ https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027784#comment-13027784 ] Todd Lipcon commented on HBASE-3833: If going for same semantics as Hadoop, we should try to match the semantics of HDFS-1547 which adds a new decom file, rather than include/exclude. Most ops people I've talked to find the include/exclude files super confusing (what does it mean to be both included and excluded, for example?) ability to support includes/excludes list in Hbase -- Key: HBASE-3833 URL: https://issues.apache.org/jira/browse/HBASE-3833 Project: HBase Issue Type: Improvement Components: client, regionserver Reporter: dhruba borthakur Assignee: dhruba borthakur An HBase cluster currently does not have the ability to specify that the master should accept regionservers only from a specified list. This helps prevent administrative errors where the same machine could be included in two clusters. It also allows the administrator to easily remove un-ssh-able machines from the cluster.
[jira] [Commented] (HBASE-3842) Refactor Coprocessor Compaction API
[ https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027792#comment-13027792 ] Nicolas Spiegelberg commented on HBASE-3842: @stack: That's a good point about major compaction. Maybe I should have the input param be CompactionRequest, which contains the Store, File list, and isMajor. Would there be a problem with giving the user direct access to the Store? I was a little worried about giving that unless someone else concurred. Maybe make a separate ICompactionRequest IStore API for coprocessor contracts? If I passed the CompactionRequest object - which contains a Store - to the user, then the coprocessor client could access the MemStore through some Store API? Refactor Coprocessor Compaction API --- Key: HBASE-3842 URL: https://issues.apache.org/jira/browse/HBASE-3842 Project: HBase Issue Type: Improvement Components: coprocessors, regionserver Affects Versions: 0.92.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Labels: compaction Fix For: 0.92.0 After HBASE-3797, the compaction logic flow has been significantly altered. Because of this, the current compaction coprocessor API is insufficient for gaining full insight into compaction requests/results. Refactor coprocessor API after HBASE-3797 is committed to be more extensible and increase visibility. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3842) Refactor Coprocessor Compaction API
[ https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027794#comment-13027794 ] Andrew Purtell commented on HBASE-3842: --- bq. If I passed the CompactionRequest object - which contains a Store - to the user, then the coprocessor client could access the MemStore through some Store API? +1 passing Store instead of merely the byte[] store name Refactor Coprocessor Compaction API --- Key: HBASE-3842 URL: https://issues.apache.org/jira/browse/HBASE-3842 Project: HBase Issue Type: Improvement Components: coprocessors, regionserver Affects Versions: 0.92.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Labels: compaction Fix For: 0.92.0 After HBASE-3797, the compaction logic flow has been significantly altered. Because of this, the current compaction coprocessor API is insufficient for gaining full insight into compaction requests/results. Refactor coprocessor API after HBASE-3797 is committed to be more extensible and increase visibility. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3842) Refactor Coprocessor Compaction API
[ https://issues.apache.org/jira/browse/HBASE-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027795#comment-13027795 ] stack commented on HBASE-3842: -- I like passing CompactionRequest notion (and it gives you access to Store if you need it). You could do an Interface to narrow what CPs can access. There may be concurrency/perf reasons for not letting CPs have direct access. I have no strong opinion on it N. Refactor Coprocessor Compaction API --- Key: HBASE-3842 URL: https://issues.apache.org/jira/browse/HBASE-3842 Project: HBase Issue Type: Improvement Components: coprocessors, regionserver Affects Versions: 0.92.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Labels: compaction Fix For: 0.92.0 After HBASE-3797, the compaction logic flow has been significantly altered. Because of this, the current compaction coprocessor API is insufficient for gaining full insight into compaction requests/results. Refactor coprocessor API after HBASE-3797 is committed to be more extensible and increase visibility. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
data loss because lastSeqWritten can miss memstore edits

Key: HBASE-3845
URL: https://issues.apache.org/jira/browse/HBASE-3845
Project: HBase
Issue Type: Bug
Reporter: Prakash Khemani

(I don't have a test case to prove this yet but I have run it by Dhruba and Kannan internally and wanted to put this up for some feedback.)

In this discussion let us assume that the region has only one column family. That way I can use region/memstore interchangeably.

After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for a region that is not the earliest log-sequence-id for that region's memstore.

HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure that we only keep track of the earliest log-sequence-number that is present in the memstore. Every time the memstore is flushed we remove the region's entry in lastSequenceWritten and wait for the next append to populate this entry again. This is where the problem happens.

step 1: flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock().
step 2: as soon as the updatesLock.writeLock() is released new entries will be added into the memstore.
step 3: wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten.
step 4: the next append will create a new entry for the region in lastSeqWritten(). But this will be the log seq id of the current append. All the edits that were added in step 2 are missing.

As a temporary measure, instead of removing the region's entry in step 3 I will replace it with the log-seq-id of the region-flush-event.
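Steps 1 to 4 above can be replayed in a small single-threaded model. This is a sketch under stated assumptions, not the HLog code: the class and method names are hypothetical, one counter stands in for the log sequence numbers, and the flush event is assumed to take its sequence id at the moment of the snapshot in step 1. It compares removing the region's entry in step 3 with the temporary measure of replacing it by the flush event's id.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy model of the lastSeqWritten bookkeeping; not HBase code. */
class LastSeqWrittenSketch {
    // region name -> earliest log sequence id believed to be unflushed
    final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    long nextSeqId = 1;

    /** Models HLog.append(): only the first append after a removal records its id. */
    long append(String region) {
        long seqId = nextSeqId++;
        lastSeqWritten.putIfAbsent(region, seqId);
        return seqId;
    }

    /** Models the behavior described in step 3: drop the region's entry. */
    void completeCacheFlushByRemoving(String region) {
        lastSeqWritten.remove(region);
    }

    /** Models the proposed stopgap: keep an entry, set to the flush event's id. */
    void completeCacheFlushByReplacing(String region, long flushSeqId) {
        lastSeqWritten.put(region, flushSeqId);
    }

    /** Replays steps 1-4 and returns {earliest unflushed edit id, tracked id}. */
    static long[] run(boolean replaceInsteadOfRemove) {
        LastSeqWrittenSketch wal = new LastSeqWrittenSketch();
        String region = "r1";
        wal.append(region);                          // edit 1, ends up in the snapshot
        long flushSeqId = wal.nextSeqId++;           // step 1: snapshot taken, flush event gets id 2
        long earliestUnflushed = wal.append(region); // step 2: edit 3 enters the new memstore
        if (replaceInsteadOfRemove) {
            wal.completeCacheFlushByReplacing(region, flushSeqId); // step 3, stopgap
        } else {
            wal.completeCacheFlushByRemoving(region);              // step 3, as described
        }
        wal.append(region);                          // step 4: edit 4
        return new long[] { earliestUnflushed, wal.lastSeqWritten.get(region) };
    }
}
```

In this model, removal leaves the tracked id above the id of the step 2 edit, so that edit is no longer covered. Replacement leaves the tracked id at the flush event's id, which is below every edit still in the memstore.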
[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027813#comment-13027813 ] stack commented on HBASE-3777: -- Just FYI, we have cluster uuid as of HBASE-3677. Redefine Identity Of HBase Configuration Key: HBASE-3777 URL: https://issues.apache.org/jira/browse/HBASE-3777 Project: HBase Issue Type: Improvement Components: client, ipc Affects Versions: 0.90.2 Reporter: Karthick Sankarachary Assignee: Karthick Sankarachary Priority: Minor Fix For: 0.92.0 Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot? Here, I'd like to play devil's advocate and propose that we deep-compare {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is concerned that a single {{HConnection}} is insufficient for sharing amongst clients, to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being uniquely identifiable. Note that sharing connections makes clean up of {{HConnection}} instances a little awkward, unless of course, you apply the change described in HBASE-3766. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
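The deep-compare proposal in the description above amounts to keying the connection cache on the configuration's contents instead of on the instance. A minimal sketch of that idea, using plain maps as a stand-in for configuration objects; the class names are hypothetical and this is not the HConnectionManager implementation:

```java
import java.util.HashMap;
import java.util.Map;

/** Toy connection cache keyed by configuration contents; not HBase code. */
class ConnectionCacheSketch {
    /** Stand-in for a connection object; hypothetical. */
    static final class Connection {}

    // Map.equals() and Map.hashCode() compare entries, so two configurations
    // with the same properties resolve to the same key.
    private final Map<Map<String, String>, Connection> connections = new HashMap<>();

    synchronized Connection getConnection(Map<String, String> conf) {
        // Copy the properties so later changes to conf cannot corrupt the key.
        Map<String, String> key = new HashMap<>(conf);
        return connections.computeIfAbsent(key, k -> new Connection());
    }
}
```

With this keying, a configuration and an unmodified copy of it share one connection, while a copy with any differing property gets its own.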
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027816#comment-13027816 ] stack commented on HBASE-3845: -- The scenario you describe seems plausible Prakash. Let me up the priority of this issue. data loss because lastSeqWritten can miss memstore edits Key: HBASE-3845 URL: https://issues.apache.org/jira/browse/HBASE-3845 Project: HBase Issue Type: Bug Affects Versions: 0.90.3 Reporter: Prakash Khemani (I don't have a test case to prove this yet but I have run it by Dhruba and Kannan internally and wanted to put this up for some feedback.) In this discussion let us assume that the region has only one column family. That way I can use region/memstore interchangeably. After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for a region that is not the earliest log-sequence-id for that region's memstore. HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure that we only keep track of the earliest log-sequence-number that is present in the memstore. Every time the memstore is flushed we remove the region's entry in lastSequenceWritten and wait for the next append to populate this entry again. This is where the problem happens. step 1: flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock(). step 2 : as soon as the updatesLock.writeLock() is released new entries will be added into the memstore. step 3 : wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten. step 4: the next append will create a new entry for the region in lastSeqWritten(). But this will be the log seq id of the current append. All the edits that were added in step 2 are missing. == as a temporary measure, instead of removing the region's entry in step 3 I will replace it with the log-seq-id of the region-flush-event. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3845:
    Priority: Critical (was: Major)
    Affects Version/s: 0.90.3

Filed against 0.90.3 and made critical. Any test to demo behavior P?

data loss because lastSeqWritten can miss memstore edits
    Key: HBASE-3845
    URL: https://issues.apache.org/jira/browse/HBASE-3845
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.90.3
    Reporter: Prakash Khemani
    Priority: Critical

(I don't have a test case to prove this yet but I have run it by Dhruba and Kannan internally and wanted to put this up for some feedback.)

In this discussion let us assume that the region has only one column family. That way I can use region/memstore interchangeably.

After a memstore flush it is possible for lastSeqWritten to have a log-sequence-id for a region that is not the earliest log-sequence-id for that region's memstore.

HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure that we only keep track of the earliest log-sequence-number that is present in the memstore. Every time the memstore is flushed we remove the region's entry in lastSeqWritten and wait for the next append to populate this entry again. This is where the problem happens.

step 1: flusher.prepare() snapshots the memstore under HRegion.updatesLock.writeLock().
step 2: as soon as the updatesLock.writeLock() is released new entries will be added into the memstore.
step 3: wal.completeCacheFlush() is called. This method removes the region's entry from lastSeqWritten.
step 4: the next append will create a new entry for the region in lastSeqWritten. But this will be the log seq id of the current append. All the edits that were added in step 2 are missing.

==

As a temporary measure, instead of removing the region's entry in step 3 I will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
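The four steps in the HBASE-3845 description can be replayed in a small single-threaded model. The sketch below is not HBase code: the class LastSeqWrittenSketch, its methods, the region name and the sequence numbers are all hypothetical. It also assumes that the flush event takes its sequence id at snapshot time (step 1), which the description does not state. Under those assumptions, removing the entry leaves the map pointing past the edits added in step 2, while replacing it with the flush id keeps a conservative lower bound.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Toy model of the lastSeqWritten bookkeeping described above; not HBase code. */
class LastSeqWrittenSketch {
    static final String REGION = "r1"; // hypothetical region name

    // region -> earliest log sequence id believed to still be in the memstore
    private final ConcurrentMap<String, Long> lastSeqWritten = new ConcurrentHashMap<>();
    private long nextSeqId = 1;

    /** Models an append: take a new sequence id, record it only if no entry exists. */
    long append() {
        long seqId = nextSeqId++;
        lastSeqWritten.putIfAbsent(REGION, seqId);
        return seqId;
    }

    /**
     * Replays steps 1-4 and returns {tracked seq id, earliest unflushed seq id}.
     * replaceOnFlush=false models removing the entry in step 3;
     * replaceOnFlush=true models the proposed temporary measure.
     */
    static long[] runScenario(boolean replaceOnFlush) {
        LastSeqWrittenSketch log = new LastSeqWrittenSketch();
        log.append();                           // seq 1: creates the entry
        log.append();                           // seq 2: entry stays at 1
        long flushSeqId = log.nextSeqId++;      // step 1: snapshot, flush id 3 (assumed)
        long earliestUnflushed = log.append();  // step 2: seq 4, putIfAbsent is a no-op
        log.append();                           // step 2: seq 5
        if (replaceOnFlush) {
            log.lastSeqWritten.put(REGION, flushSeqId); // step 3, proposed measure
        } else {
            log.lastSeqWritten.remove(REGION);          // step 3, as described
        }
        log.append();                           // step 4: seq 6
        return new long[] { log.lastSeqWritten.get(REGION), earliestUnflushed };
    }

    public static void main(String[] args) {
        long[] removed = runScenario(false);
        long[] replaced = runScenario(true);
        System.out.println("remove:  tracked=" + removed[0] + ", earliest unflushed=" + removed[1]);
        System.out.println("replace: tracked=" + replaced[0] + ", earliest unflushed=" + replaced[1]);
    }
}
```

In the "remove" run the tracked id is larger than the earliest unflushed edit, which is the condition the issue title calls out as a data-loss risk. In the "replace" run the tracked id is smaller than every unflushed edit.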
[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits
[ https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027823#comment-13027823 ] dhruba borthakur commented on HBASE-3845: - Good finding Prakash!
[jira] [Commented] (HBASE-3833) ability to support includes/excludes list in Hbase
[ https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027828#comment-13027828 ] dhruba borthakur commented on HBASE-3833: - +1 to Todd's idea of similarity with HDFS-1547
[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027832#comment-13027832 ]

stack commented on HBASE-3777:

I took a look at the posted patch. I'm thinking we should commit it as is (Any objections? I can address Ted Yu's last comment on commit). Unfortunately, it won't do for 0.90.x since it's now polluted with TRUNKisms -- i.e. ServerName -- but that's probably ok since this is a big change. Let me try this patch out on a cluster in the meantime to make sure it basically works.

Redefine Identity Of HBase Configuration
    Key: HBASE-3777
    URL: https://issues.apache.org/jira/browse/HBASE-3777
    Project: HBase
    Issue Type: Improvement
    Components: client, ipc
    Affects Versions: 0.90.2
    Reporter: Karthick Sankarachary
    Assignee: Karthick Sankarachary
    Priority: Minor
    Fix For: 0.92.0
    Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration and a connection instance works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we deep-compare {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is concerned that a single {{HConnection}} is insufficient for sharing amongst clients, to quote the javadoc, then one should be able to mark a given {{HBaseConfiguration}} instance as being uniquely identifiable.

Note that sharing connections makes clean up of {{HConnection}} instances a little awkward, unless of course, you apply the change described in HBASE-3766.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
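The difference between keying connections by configuration instance and keying them by configuration contents, as proposed in HBASE-3777, can be sketched with plain collections. This is only an illustration under stated assumptions: a Map<String, String> stands in for a configuration, Conn stands in for a connection, and the class and method names are hypothetical. It is not the attached patch and does not use any HBase API.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy connection cache; a Map of properties stands in for a configuration. */
class ConnectionCacheSketch {
    /** Hypothetical stand-in for a connection object. */
    static final class Conn { }

    // Keyed by reference: a copy of a configuration gets its own connection.
    private final Map<Map<String, String>, Conn> byIdentity = new IdentityHashMap<>();

    // Keyed by contents: configurations with equal properties share a connection.
    private final Map<Map<String, String>, Conn> byContents = new HashMap<>();

    /** One connection per configuration instance. */
    Conn getByIdentity(Map<String, String> conf) {
        return byIdentity.computeIfAbsent(conf, c -> new Conn());
    }

    /** One connection per distinct set of properties (the "deep-compare" idea). */
    Conn getByContents(Map<String, String> conf) {
        // Snapshot the properties so later changes to conf cannot alter the key.
        return byContents.computeIfAbsent(new TreeMap<>(conf), c -> new Conn());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "zk1.example.com"); // example value
        Map<String, String> copy = new HashMap<>(conf);

        ConnectionCacheSketch cache = new ConnectionCacheSketch();
        System.out.println("identity key shares connection: "
                + (cache.getByIdentity(conf) == cache.getByIdentity(copy)));
        System.out.println("contents key shares connection: "
                + (cache.getByContents(conf) == cache.getByContents(copy)));
    }
}
```

A contents-keyed cache like this would also need some way to opt out of sharing, which is what the description means by marking a configuration as uniquely identifiable; that part is left out of the sketch.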
[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027839#comment-13027839 ] Karthick Sankarachary commented on HBASE-3777: -- bq. I took a look at the posted patch. I'm thinking we should commit it as is (Any objections? I can address Ted Yu's last comment on commit) Just an update, I ran the test today after rebasing it (yet again), and this time there were no failures period. I'll update the patch on the review board, so you don't have to rebase it.
[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027844#comment-13027844 ] jirapos...@reviews.apache.org commented on HBASE-3777: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/643/ --- (Updated 2011-05-02 20:59:23.844076) Review request for hbase and Ted Yu. Changes --- I ran the test today after rebasing it (yet again), and this time there were no failures period. Summary --- Judging from the javadoc in HConnectionManager, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the fact that there is a one-to-one mapping between a configuration and connection instance, kind of works against that goal. Specifically, when you create HTable instances using a given Configuration instance and a copy thereof, we end up with two distinct HConnection instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot? Here, I'd like to play devil's advocate and propose that we deep-compare HBaseConfiguration instances, so that multiple HBaseConfiguration instances that have the same properties map to the same HConnection instance. In case one is concerned that a single HConnection is insufficient for sharing amongst clients, to quote the javadoc, then one should be able to mark a given HBaseConfiguration instance as being uniquely identifiable. This addresses bug HBASE-3777. 
https://issues.apache.org/jira/browse/HBASE-3777 Diffs (updated) - src/main/java/org/apache/hadoop/hbase/HConstants.java 0911375 src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java feed777 src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java fa7448b src/main/java/org/apache/hadoop/hbase/client/HConnection.java 1beedaf src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java ded51c8 src/main/java/org/apache/hadoop/hbase/client/HTable.java 46bac9f src/main/java/org/apache/hadoop/hbase/client/HTablePool.java 88827a8 src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 26d0b31 src/main/java/org/apache/hadoop/hbase/client/replication/ReplicationAdmin.java d76e333 src/main/java/org/apache/hadoop/hbase/mapred/TableOutputFormat.java 80284bb src/main/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java 05d9395 src/main/java/org/apache/hadoop/hbase/mapreduce/replication/VerifyReplication.java ed88bfa src/main/java/org/apache/hadoop/hbase/master/HMaster.java f526411 src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 834c456 src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 421f275 src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java 133da33 src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java fc71f03 src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 1c1d94b src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 39f3af2 src/main/java/org/apache/hadoop/hbase/util/HMerge.java c447287 src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java a4def94 src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java dc471c4 src/test/java/org/apache/hadoop/hbase/TestRegionRebalancing.java 75613b8 src/test/java/org/apache/hadoop/hbase/catalog/TestCatalogTracker.java ae333bb src/test/java/org/apache/hadoop/hbase/catalog/TestMetaReaderEditor.java 18e647e src/test/java/org/apache/hadoop/hbase/client/TestHCM.java 
5d71d75 src/test/java/org/apache/hadoop/hbase/mapreduce/TestTableMapReduce.java 624f4a8 src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java 752e12b src/test/java/org/apache/hadoop/hbase/util/TestMergeTable.java 8992dbb Diff: https://reviews.apache.org/r/643/diff Testing --- mvn test Thanks, Karthick
[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027860#comment-13027860 ] jirapos...@reviews.apache.org commented on HBASE-3777: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/643/ --- (Updated 2011-05-02 21:34:35.203784) Review request for hbase and Ted Yu. Changes --- As Ted suggested, added a log statement for the case where connectSucceeded is false.
[jira] [Created] (HBASE-3846) Set RIT timeout higher
Set RIT timeout higher -- Key: HBASE-3846 URL: https://issues.apache.org/jira/browse/HBASE-3846 Project: HBase Issue Type: Task Affects Versions: 0.90.2 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Critical Fix For: 0.90.3 As I was saying in HBASE-3669, it is really easy with the current RIT timeout to end up in situations where regions are doubly assigned, not assigned at all, or assigned without the master knowing about it. As a band-aid, we should set hbase.master.assignment.timeoutmonitor.timeout to what the ZK session timeout is. We had to do that on one of our clusters to be able to start it, else the master kept racing with itself. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
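The band-aid in HBASE-3846 amounts to copying one setting onto another. The sketch below shows that on a plain Map standing in for the cluster configuration. The class and method names are hypothetical, the key zookeeper.session.timeout is an assumption about how the ZK session timeout is configured (the issue names only the RIT key), and the millisecond values are examples, not defaults taken from the issue.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative helper; a plain Map stands in for the cluster configuration. */
class RitTimeoutSketch {
    // Key named in the issue text.
    static final String RIT_TIMEOUT_KEY = "hbase.master.assignment.timeoutmonitor.timeout";
    // Assumed name of the ZooKeeper session timeout setting.
    static final String ZK_SESSION_TIMEOUT_KEY = "zookeeper.session.timeout";

    /**
     * Returns a copy of conf in which the RIT timeout equals the ZK session timeout.
     * Fails if the session timeout is not set, rather than guessing a default.
     */
    static Map<String, String> alignRitTimeout(Map<String, String> conf) {
        String zkTimeout = conf.get(ZK_SESSION_TIMEOUT_KEY);
        if (zkTimeout == null) {
            throw new IllegalArgumentException(ZK_SESSION_TIMEOUT_KEY + " is not set");
        }
        Map<String, String> aligned = new HashMap<>(conf);
        aligned.put(RIT_TIMEOUT_KEY, zkTimeout);
        return aligned;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(ZK_SESSION_TIMEOUT_KEY, "180000"); // example value, milliseconds
        conf.put(RIT_TIMEOUT_KEY, "30000");         // example starting value
        System.out.println(RIT_TIMEOUT_KEY + "=" + alignRitTimeout(conf).get(RIT_TIMEOUT_KEY));
    }
}
```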
[jira] [Updated] (HBASE-3846) Set RIT timeout higher
[ https://issues.apache.org/jira/browse/HBASE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-3846: -- Attachment: HBASE-3846.patch This patch is what I'm suggesting.
[jira] [Updated] (HBASE-3669) Region in PENDING_OPEN keeps being bounced between RS and master
[ https://issues.apache.org/jira/browse/HBASE-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans updated HBASE-3669:
    Fix Version/s: (was: 0.90.3)

Fixing anything here would require some heavy-handed rework, so I opened HBASE-3846 to address upping the timeout and I'm moving this to 0.92.

Region in PENDING_OPEN keeps being bounced between RS and master
    Key: HBASE-3669
    URL: https://issues.apache.org/jira/browse/HBASE-3669
    Project: HBase
    Issue Type: Bug
    Affects Versions: 0.90.1
    Reporter: Jean-Daniel Cryans
    Priority: Critical
    Fix For: 0.92.0
    Attachments: HBASE-3669-debug-v1.patch

After going crazy killing region servers after HBASE-3668, most of the cluster recovered except for 3 regions that kept being refused by the region servers. On the master I would see:

{code}
2011-03-17 22:23:14,828 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21. state=PENDING_OPEN, ts=1300400554826
2011-03-17 22:23:14,828 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_OPEN for too long, reassigning region=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.
2011-03-17 22:23:14,828 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; was=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21. state=PENDING_OPEN, ts=1300400554826
2011-03-17 22:23:14,828 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan was found (or we are ignoring an existing plan) for supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21. so generated a random one; hri=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21., src=, dest=sv2borg171,60020,1300399357135; 17 (online=17, exclude=null) available servers
2011-03-17 22:23:14,828 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Assigning region supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21. to sv2borg171,60020,1300399357135
{code}

Then on the region server:

{code}
2011-03-17 22:23:14,829 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x22d627c142707d2 Attempting to transition node f11849557c64c4efdbe0498f3fe97a21 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2011-03-17 22:23:14,832 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: regionserver:60020-0x22d627c142707d2 Retrieved 166 byte(s) of data from znode /hbase/unassigned/f11849557c64c4efdbe0498f3fe97a21; data=region=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21., server=sv2borg180,60020,1300384550966, state=RS_ZK_REGION_OPENING
2011-03-17 22:23:14,832 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x22d627c142707d2 Attempt to transition the unassigned node for f11849557c64c4efdbe0498f3fe97a21 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING failed, the node existed but was in the state RS_ZK_REGION_OPENING
2011-03-17 22:23:14,832 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed transition from OFFLINE to OPENING for region=f11849557c64c4efdbe0498f3fe97a21
{code}

I'm not sure I fully understand what was going on... the master was supposed to OFFLINE the znode but then that's not what the region server was seeing? In any case, I was able to recover by doing a force unassign for each region and then assign.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027914#comment-13027914 ] jirapos...@reviews.apache.org commented on HBASE-3777: -- --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/643/#review632 --- Ship it! I think a patch for 0.90 should be produced separately. We have informed hbase users of this change. They would expect to benefit from it in 0.90 - Ted
[jira] [Commented] (HBASE-3846) Set RIT timeout higher
[ https://issues.apache.org/jira/browse/HBASE-3846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13027919#comment-13027919 ] Andrew Purtell commented on HBASE-3846: --- +1
[jira] [Created] (HBASE-3847) Turn off DEBUG logging of RPCs in WriteableRPCEngine on TRUNK
Turn off DEBUG logging of RPCs in WriteableRPCEngine on TRUNK - Key: HBASE-3847 URL: https://issues.apache.org/jira/browse/HBASE-3847 Project: HBase Issue Type: Bug Reporter: stack -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-3847) Turn off DEBUG logging of RPCs in WriteableRPCEngine on TRUNK
[ https://issues.apache.org/jira/browse/HBASE-3847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-3847. -- Resolution: Fixed Fix Version/s: 0.92.0 Assignee: stack
[jira] [Created] (HBASE-3848) request is always zero in WebUI for region server
request is always zero in WebUI for region server - Key: HBASE-3848 URL: https://issues.apache.org/jira/browse/HBASE-3848 Project: HBase Issue Type: Bug Components: regionserver Affects Versions: 0.90.2 Reporter: gaojinchao Priority: Minor

The request metric is always zero in the WebUI for the region server:

Metrics request=0.0, regions=36, stores=36, storefiles=148, storefileIndexSize=29, memstoreSize=253, compactionQueueSize=24, flushQueueSize=0, usedHeap=655, maxHeap=8175, blockCacheSize=14230920, blockCacheFree=1700269560, blockCacheCount=21, blockCacheHitCount=2887, blockCacheMissCount=204829, blockCacheEvictedCount=0, blockCacheHitRatio=1, blockCacheHitCachingRatio=99

The requests metric is not zero in the WebUI for the HMaster:

requests=15000, regions=35, usedHeap=513, maxHeap=8175

Is there any difference between these metrics? How do I use them? Thanks.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-3848) request is always zero in WebUI for region server
[ https://issues.apache.org/jira/browse/HBASE-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] gaojinchao updated HBASE-3848: -- Attachment: RegionseverMetric_TrunkPath.patch RegionseverMetric0.90Path
[jira] [Updated] (HBASE-3777) Redefine Identity Of HBase Configuration
[ https://issues.apache.org/jira/browse/HBASE-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-3777:
------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)

Committed to TRUNK after trying it with 500 ycsb clients (with the patch it comes up and runs; pre-patch it fails). Thank you for your persistence, Karthick (and thanks to the reviewers).

Redefine Identity Of HBase Configuration
----------------------------------------
Key: HBASE-3777
URL: https://issues.apache.org/jira/browse/HBASE-3777
Project: HBase
Issue Type: Improvement
Components: client, ipc
Affects Versions: 0.90.2
Reporter: Karthick Sankarachary
Assignee: Karthick Sankarachary
Priority: Minor
Fix For: 0.92.0
Attachments: 3777-TOF.patch, HBASE-3777-V2.patch, HBASE-3777-V3.patch, HBASE-3777-V4.patch, HBASE-3777-V6.patch, HBASE-3777.patch

Judging from the javadoc in {{HConnectionManager}}, sharing connections across multiple clients going to the same cluster is supposedly a good thing. However, the one-to-one mapping between a configuration instance and a connection instance works against that goal. Specifically, when you create {{HTable}} instances using a given {{Configuration}} instance and a copy thereof, we end up with two distinct {{HConnection}} instances under the covers. Is this really expected behavior, especially given that the configuration instance gets cloned a lot?

Here, I'd like to play devil's advocate and propose that we deep-compare {{HBaseConfiguration}} instances, so that multiple {{HBaseConfiguration}} instances that have the same properties map to the same {{HConnection}} instance. In case one is concerned that a single {{HConnection}} is insufficient for sharing amongst clients, to quote the javadoc, one should be able to mark a given {{HBaseConfiguration}} instance as being uniquely identifiable.

Note that sharing connections makes cleanup of {{HConnection}} instances a little awkward, unless, of course, you apply the change described in HBASE-3766.
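The deep-compare proposal in HBASE-3777 can be illustrated with a small, self-contained sketch. It is not the committed patch: {{ConnectionCacheSketch}} and its nested {{Connection}} class are hypothetical stand-ins for {{HConnectionManager}} and {{HConnection}}, and a plain {{Map<String, String>}} stands in for {{HBaseConfiguration}}. The only point is that keying the cache on the property contents, rather than on object identity, lets a configuration and its copy share one connection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these types are HBase classes.
class ConnectionCacheSketch {

    /** Stand-in for HConnection (assumption, for illustration only). */
    static final class Connection {
        final int id;
        Connection(int id) { this.id = id; }
    }

    // The key is a copy of the properties. Map.equals/hashCode compare the
    // entries, so two configurations with equal properties hit the same slot.
    private final Map<Map<String, String>, Connection> cache = new HashMap<>();
    private int nextId = 0;

    synchronized Connection getConnection(Map<String, String> conf) {
        // Defensive copy: later changes to conf must not corrupt the cache key.
        Map<String, String> key = new TreeMap<>(conf);
        return cache.computeIfAbsent(key, k -> new Connection(nextId++));
    }

    public static void main(String[] args) {
        ConnectionCacheSketch manager = new ConnectionCacheSketch();

        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.zookeeper.quorum", "zk1.example.com");
        Map<String, String> copy = new HashMap<>(conf);

        // A configuration and an unmodified copy share one connection.
        System.out.println(manager.getConnection(conf) == manager.getConnection(copy));

        // Once the copy diverges, it gets its own connection.
        copy.put("hbase.client.retries.number", "3");
        System.out.println(manager.getConnection(conf) == manager.getConnection(copy));
    }
}
```

The "uniquely identifiable" escape hatch mentioned in the description is left out of the sketch; one way to model it would be to add a distinguishing property to the configuration so that its key never matches another.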
[jira] [Updated] (HBASE-3833) ability to support includes/excludes list in Hbase
[ https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vishal Kathuria updated HBASE-3833:
----------------------------------
Attachment: excl-patch.txt

Patch for the fix.

ability to support includes/excludes list in Hbase
--------------------------------------------------
Key: HBASE-3833
URL: https://issues.apache.org/jira/browse/HBASE-3833
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Attachments: excl-patch.txt

An HBase cluster currently does not have the ability to specify that the master should accept regionservers only from a specified list. Such a list helps prevent administrative errors where the same machine could be included in two clusters. It also allows the administrator to easily remove un-ssh-able machines from the cluster.
[jira] [Updated] (HBASE-3833) ability to support includes/excludes list in Hbase
[ https://issues.apache.org/jira/browse/HBASE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vishal Kathuria updated HBASE-3833:
----------------------------------
Affects Version/s: 0.90.2
Release Note: This change adds support for includes and excludes lists in HBase. To use it, specify the following properties in the hbase configuration file:
    hbase.hosts - for the includes file
    hbase.hosts.exclude - for the excludes file
Both files are optional. If the includes file is missing, any node is allowed (except the ones in the excludes file). If the includes file is present, a node has to be present in the includes file and absent from the excludes file to join successfully.
These files can be changed while the server is running. After changing them, issue the 'refresh_nodes' command from the hbase console so that the master picks up the new config. If an online region server is added to the excludes file, 'refresh_nodes' will kick it out with a specific exception. When the region server gets that exception, it shuts itself down.
Status: Patch Available (was: Open)

This change also includes unit tests, which cover the key scenario of adding an online region server to the excludes file and then refreshing the excludes file.

ability to support includes/excludes list in Hbase
--------------------------------------------------
Key: HBASE-3833
URL: https://issues.apache.org/jira/browse/HBASE-3833
Project: HBase
Issue Type: Improvement
Components: client, regionserver
Affects Versions: 0.90.2
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Attachments: excl-patch.txt
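The admission rule stated in the HBASE-3833 release note can be written down as a short sketch. This is not code from excl-patch.txt: the class and method names are hypothetical, and a null set is assumed here to model a missing file. How the patch treats an includes file that exists but is empty is not stated in the release note, so the sketch simply treats it as a list that admits nobody.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the rule in the release note; not the patch itself.
class HostAdmissionSketch {

    /**
     * Decides whether a region server host may join.
     *
     * @param includes hosts from the hbase.hosts file, or null if the file is missing
     * @param excludes hosts from the hbase.hosts.exclude file, or null if the file is missing
     */
    static boolean mayJoin(String host, Set<String> includes, Set<String> excludes) {
        // The excludes list always wins.
        if (excludes != null && excludes.contains(host)) {
            return false;
        }
        // With no includes file any remaining host is allowed;
        // otherwise the host must be listed.
        return includes == null || includes.contains(host);
    }

    public static void main(String[] args) {
        Set<String> includes = new HashSet<>(Arrays.asList("rs1", "rs2"));
        Set<String> excludes = new HashSet<>(Arrays.asList("rs2"));

        System.out.println(mayJoin("rs1", includes, excludes)); // included, not excluded
        System.out.println(mayJoin("rs2", includes, excludes)); // included but excluded
        System.out.println(mayJoin("rs3", includes, excludes)); // not in the includes list
        System.out.println(mayJoin("rs3", null, excludes));     // no includes file
    }
}
```

On a 'refresh_nodes', the master would presumably re-read both files and apply the same check to every online region server, rejecting those for which it now returns false.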
[jira] [Created] (HBASE-3849) Fix master ui; hbase-1502 broke requests/second
Fix master ui; hbase-1502 broke requests/second
-----------------------------------------------
Key: HBASE-3849
URL: https://issues.apache.org/jira/browse/HBASE-3849
Project: HBase
Issue Type: Bug
Reporter: stack
[jira] [Resolved] (HBASE-3849) Fix master ui; hbase-1502 broke requests/second
[ https://issues.apache.org/jira/browse/HBASE-3849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3849.
--------------------------
Resolution: Fixed
Fix Version/s: 0.92.0
Assignee: stack

Fix master ui; hbase-1502 broke requests/second
-----------------------------------------------
Key: HBASE-3849
URL: https://issues.apache.org/jira/browse/HBASE-3849
Project: HBase
Issue Type: Bug
Reporter: stack
Assignee: stack
Fix For: 0.92.0