[jira] [Commented] (HBASE-5844) Delete the region server's znode after a region server crash
[ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258422#comment-13258422 ] Jean-Daniel Cryans commented on HBASE-5844:
---
It is deleted automatically, it's an ephemeral znode, so it takes zk.session.timeout time to see it disappear.

Delete the region server's znode after a region server crash
Key: HBASE-5844
URL: https://issues.apache.org/jira/browse/HBASE-5844
Project: HBase
Issue Type: Improvement
Components: regionserver, scripts
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal
Attachments: 5844.v1.patch

Today, if the region server crashes, its znode is not deleted in ZooKeeper, so the recovery process will stop only after a timeout, usually 30s. By deleting the znode in the start script, we remove this delay and the recovery starts immediately.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
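The interplay between the session timeout and an explicit delete can be sketched with a toy model (this is an illustrative simulation, not the actual HBase or ZooKeeper code; the class `FakeZk` and its methods are made up for this example):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of ZooKeeper ephemeral znodes: a node owned by a crashed session
// disappears on its own only once the session timeout has elapsed.
class FakeZk {
    static final long SESSION_TIMEOUT_MS = 30_000;
    // znode path -> time at which it would expire after the owner crashed
    final Map<String, Long> expiry = new HashMap<>();

    void registerEphemeral(String path, long crashTimeMs) {
        expiry.put(path, crashTimeMs + SESSION_TIMEOUT_MS);
    }

    // What the patched start script does: remove the stale znode right away.
    void delete(String path) {
        expiry.remove(path);
    }

    boolean exists(String path, long nowMs) {
        Long e = expiry.get(path);
        return e != null && nowMs < e;
    }
}

public class ZnodeDemo {
    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        zk.registerEphemeral("/hbase/rs/server1", 0); // server crashes at t=0

        // Without the patch: recovery still sees the znode 10s after the crash.
        System.out.println(zk.exists("/hbase/rs/server1", 10_000)); // true

        // With the patch: the start script deletes it, so recovery is not
        // gated on the remaining session timeout.
        zk.delete("/hbase/rs/server1");
        System.out.println(zk.exists("/hbase/rs/server1", 10_000)); // false
    }
}
```

The point of the patch is exactly the `delete` call: the znode would vanish on its own, but only after `zk.session.timeout`, and the start script knows the old process is dead before ZooKeeper does.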
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255784#comment-13255784 ] Jean-Daniel Cryans commented on HBASE-5782:
---
bq. The pendingWrites are appended strictly in order, so there is a very short race that might lead to sync being issued multiple times when only one was necessary (it seems the same race condition existed before).

A short race is better than what we currently do in 0.92 (and before), where everything syncs everything.

Edits can be appended out of seqid order since HBASE-4487
Key: HBASE-5782
URL: https://issues.apache.org/jira/browse/HBASE-5782
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 0.94.0
Reporter: Gopinathan A
Assignee: ramkrishna.s.vasudevan
Priority: Blocker
Fix For: 0.94.0
Attachments: 5782-lars-v2.txt, 5782-sketch.txt, 5782.txt, 5782.unfinished-stack.txt, HBASE-5782.patch, hbase-5782.txt

Create a table with 1000 splits. After the region assignment, kill the region server which contains the META table. A few regions are missing after the log splitting and region assignment; the HBCK report shows that multiple region holes were created. The same scenario was verified multiple times in 0.92.1 with no issues.
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13256061#comment-13256061 ] Jean-Daniel Cryans commented on HBASE-5782:
---
For my part I tested on a cluster with 1 KB values and huge batches, and I don't see any improvement with Lars' patch. Todd's peaks 20% higher.
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254805#comment-13254805 ] Jean-Daniel Cryans commented on HBASE-5782:
---
bq. Based on one of our requirements I was thinking that the log syncer thread should be configurable, i.e. whether to use it or not.

As long as it's enabled by default I'm good with that.
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254892#comment-13254892 ] Jean-Daniel Cryans commented on HBASE-5782:
---
Looking more into this, I think HBASE-4487 is the real issue. I think I can also prove that you can get the issue even with a disabled {{LogSyncer}}:

- t1 does {{appendNoSync}} of k1
- t1 runs {{syncer}} up to {{getPendingWrites}}
- t2 does {{appendNoSync}} of k2
- t2 runs {{syncer}} up to the end

In the log you'd see k2 then k1, so what's really wrong to me is this:

{code}
// Done in parallel for all writer threads, thanks to HDFS-895
List<Entry> pending = logSyncerThread.getPendingWrites();
{code}

Although accessing the pending writes is done under synchronization, you can apply them in any order. Furthermore, {{logSyncerThread.hlogFlush}} can also append entries to the WAL in any order. For example, if both t1 and t2 have multiple edits, they could end up intermingled in the WAL simply by doing {{hlogFlush}} at the same time.

If {{LogSyncer}} was really an issue then {{HRegion.put}} and {{HRegion.delete}} would need to be disabled too, since they don't use {{appendNoSync}} and just sync everything :)

How this used to work is that threads could only append to the WAL under the {{updateLock}}, and that was done at the same time as the {{doWrite}} which creates the key. The call to sync could be done by any number of threads at the same time.

If this is right, then we should pull back HBASE-4487 or add more locks. We should also change this Jira's title once we get a better understanding of the problem, because it's not a region assignment problem.
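The t1/t2 interleaving above can be reproduced deterministically in a toy model. This is an illustrative sketch, not HBase code; the method names mirror the ones in the comment, but the classes are made up, and the "threads" are simulated by calling the steps in the racy order:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the HBASE-4487 pending-writes design: appends go into a
// pending list in seqid order, but each syncing thread drains and flushes
// its own snapshot, so the WAL can receive entries out of seqid order.
public class OutOfOrderDemo {
    static final List<Long> pending = new ArrayList<>();
    static final List<Long> wal = new ArrayList<>(); // what hits the log file

    static void appendNoSync(long seqid) { pending.add(seqid); }

    // Draining the pending writes is itself synchronized in the real code...
    static List<Long> getPendingWrites() {
        List<Long> snapshot = new ArrayList<>(pending);
        pending.clear();
        return snapshot;
    }

    // ...but the actual write to the WAL happens outside any ordering guard.
    static void hlogFlush(List<Long> batch) { wal.addAll(batch); }

    public static void main(String[] args) {
        appendNoSync(1);                         // t1 appends k1 (seqid 1)
        List<Long> t1Batch = getPendingWrites(); // t1 pauses after the drain
        appendNoSync(2);                         // t2 appends k2 (seqid 2)
        List<Long> t2Batch = getPendingWrites();
        hlogFlush(t2Batch);                      // t2 runs to completion first
        hlogFlush(t1Batch);                      // t1 resumes
        System.out.println(wal);                 // [2, 1] -- out of seqid order
    }
}
```

Every individual step is safe on its own; it is the gap between draining the snapshot and flushing it that lets a later thread's batch reach the log first.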
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13254932#comment-13254932 ] Jean-Daniel Cryans commented on HBASE-5782:
---
bq. In order to get 0.94.0 out the door, can we pull back HBASE-4487 in 0.94 and pursue the locking approach in trunk (or a separate branch)?

+1, we might want to review HBASE-4282 too as it seems to do something similar with the transaction ids.

bq. So the problem is that logSyncerThread keeps the edits in order but the syncer then applies the pending batches potentially out of order?

It's sad that the pending edits live in {{LogSyncer}}; that thread is really just supposed to call sync... but yeah, they are added there in order and then it's a free-for-all in {{syncer}}. Adding a sync there could solve the issue, but in the end what it does is move the lock from appending (pre-HBASE-4487) to syncing, plus a _ton_ of new complexity in HLog. I'd prefer a solution that doesn't add a lock to patch something that's broken.
[jira] [Commented] (HBASE-5782) Not all the regions are getting assigned after the log splitting.
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255130#comment-13255130 ] Jean-Daniel Cryans commented on HBASE-5782:
---
bq. Don't hate me, just throwing this out there.

LOL. My current concerns:

- Performance, and I'm not the only one concerned.
- OOME: what happens if you enable deferred log flush and HDFS is slowing down? That's actually an issue with HBASE-4487, since the {{LinkedList}} of pending writes is unbounded, whereas before simply appending to the file would slow you down.
- This patch makes it so that threads can sync data from threads that came in later. You need to check {{txid <= this.syncedTillHere}} again once you are past the {{synchronized(syncLock)}} and return if it was taken care of while you were blocking.
- Deadlocks, you never know when adding new locks :)
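The re-check asked for in the third bullet is the classic "test again after acquiring the lock" pattern. A minimal sketch, assuming the semantics implied by the comment ({{syncedTillHere}} and {{syncLock}} are borrowed from it; {{doSync}} is a hypothetical stand-in for the expensive physical sync):

```java
// Sketch of syncing up to a transaction id, where a thread that blocked on
// the lock may find its txid already covered by another thread's sync.
public class SyncDemo {
    static final Object syncLock = new Object();
    static long syncedTillHere = 0;
    static int physicalSyncs = 0; // counts actual (expensive) sync calls

    static void doSync() { physicalSyncs++; } // hypothetical stand-in

    static void syncUpTo(long txid) {
        if (txid <= syncedTillHere) return; // fast path: already durable
        synchronized (syncLock) {
            // Re-check: another thread may have synced past txid while we
            // were blocked waiting for the lock.
            if (txid <= syncedTillHere) return;
            doSync();
            syncedTillHere = Math.max(syncedTillHere, txid);
        }
    }

    public static void main(String[] args) {
        syncUpTo(5);  // performs one physical sync
        syncUpTo(3);  // a no-op: txid 3 was covered by the sync up to 5
        System.out.println(physicalSyncs); // 1
    }
}
```

Without the second check inside the {{synchronized}} block, the thread asking for txid 3 would issue a redundant physical sync even though its edits were already durable.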
[jira] [Commented] (HBASE-5782) Edits can be appended out of seqid order since HBASE-4487
[ https://issues.apache.org/jira/browse/HBASE-5782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255172#comment-13255172 ] Jean-Daniel Cryans commented on HBASE-5782:
---
bq. It seems even without HLog.appendNoSync() it is possible that one thread flushes an entire batch of pending writes before a thread that started earlier can get to it.

We didn't have pending writes before, it was inside the sequence file writer, and we appended under lock. Managing those pending writes is what's giving us trouble.
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255203#comment-13255203 ] Jean-Daniel Cryans commented on HBASE-5778:
---
bq. The files need to be read from the beginning to build up the dictionary.

Aren't the dictionary entries spread out in the log? If so, it should be possible to slowly build it up as we tail the log (that's another feature that's broken, tailing). Then if you replay a WAL from another region server, for the first log you'd read from the beginning in order to build up the dict, then when you hit the offset that's in ZK you start shipping.

Turn on WAL compression by default
Key: HBASE-5778
URL: https://issues.apache.org/jira/browse/HBASE-5778
Project: HBase
Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Lars Hofhansl
Priority: Blocker
Fix For: 0.96.0, 0.94.1
Attachments: 5778-addendum.txt, 5778.addendum, HBASE-5778.patch

I ran some tests to verify whether WAL compression should be turned on by default. For a use case where it's not very useful (values two orders of magnitude bigger than the keys), the insert time wasn't different and the CPU usage was 15% higher (150% CPU usage vs 130% when not compressing the WAL). When values are smaller than the keys, I saw a 38% improvement in the insert run time, and CPU usage was 33% higher (600% CPU usage vs 450%). I'm not sure WAL compression accounts for all the additional CPU usage; it might just be that we're able to insert faster and we spend more time in the MemStore per second (because our MemStores are bad when they contain tens of thousands of values). Those are two extremes, but it shows that for the price of some CPU we can save a lot. My machines have 2 quads with HT, so I still had a lot of idle CPUs.
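The "dictionary entries spread out in the log" idea can be illustrated with a toy codec (a sketch under assumed semantics, not the actual HBase WAL compression format): the first occurrence of a value is written as a literal and implicitly assigned the next dictionary index, and later occurrences are written as index references. A reader that starts mid-stream has never seen the earlier literals, which is why replay must scan from the beginning to rebuild the dictionary before it reaches the offset stored in ZK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy dictionary codec: the encoded stream is a list of tokens, either
// "L:<value>" (a literal, first sighting, gets the next dict index) or
// "R:<index>" (a reference to a previously seen literal).
public class DictDemo {
    static List<String> encode(List<String> values) {
        Map<String, Integer> dict = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (String v : values) {
            Integer idx = dict.get(v);
            if (idx == null) {
                dict.put(v, dict.size()); // implicit index assignment
                out.add("L:" + v);
            } else {
                out.add("R:" + idx);
            }
        }
        return out;
    }

    // Decoding must process every token from position 0: a reference is
    // meaningless unless all earlier literals have been seen.
    static List<String> decode(List<String> tokens) {
        List<String> dict = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            if (t.startsWith("L:")) {
                String v = t.substring(2);
                dict.add(v);
                out.add(v);
            } else {
                out.add(dict.get(Integer.parseInt(t.substring(2))));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> values = List.of("region1", "cf", "region1", "cf");
        List<String> enc = encode(values);
        System.out.println(enc);         // [L:region1, L:cf, R:0, R:1]
        System.out.println(decode(enc)); // [region1, cf, region1, cf]
    }
}
```

Note how the dictionary is never written out as a separate structure: it exists only implicitly, in the order the literals appear, which is exactly what makes tailing or seeking into a compressed log awkward.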
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255225#comment-13255225 ] Jean-Daniel Cryans commented on HBASE-5778:
---
I think everything is fine then :)
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253529#comment-13253529 ] Jean-Daniel Cryans commented on HBASE-5778:
---
Sorry for all the trouble guys, I thought the feature was more tested than that :(
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253611#comment-13253611 ] Jean-Daniel Cryans commented on HBASE-5778:
---
I haven't had a look, but I'd guess that if we're reading files that are being written then we don't have access to the dict.
[jira] [Commented] (HBASE-2223) Handle 10min+ network partitions between clusters
[ https://issues.apache.org/jira/browse/HBASE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252602#comment-13252602 ] Jean-Daniel Cryans commented on HBASE-2223:
---
@Himanshu, It's been in the code base since 0.90.0, but there were a lot of other changes on top of it.

Handle 10min+ network partitions between clusters
Key: HBASE-2223
URL: https://issues.apache.org/jira/browse/HBASE-2223
Project: HBase
Issue Type: Sub-task
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
Fix For: 0.90.0
Attachments: HBASE-2223.patch

We need a nice way of handling long network partitions without impacting a master cluster (which pushes the data). Currently it will just retry over and over again. I think we could:

- Stop replication to a slave cluster if it didn't respond for more than 10 minutes
- Keep track of the duration of the partition
- When the slave cluster comes back, initiate a MR job like HBASE-2221

Maybe we want less than 10 minutes, maybe we want this to be all automatic or just the first 2 parts. Discuss.
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252729#comment-13252729 ] Jean-Daniel Cryans commented on HBASE-3443:
---
Correctness should always come first IMO.

ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
Key: HBASE-3443
URL: https://issues.apache.org/jira/browse/HBASE-3443
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.90.0, 0.90.1, 0.90.2, 0.90.3, 0.90.4, 0.90.5, 0.90.6, 0.92.0, 0.92.1
Reporter: Kannan Muthukkaruppan
Assignee: Lars Hofhansl
Priority: Critical
Labels: corruption
Fix For: 0.96.0
Attachments: 3443.txt

For incrementColumnValue(), HBASE-3082 adds an optimization to check the memstores first, and only if the value is not present in the memstore to check the store files. In the presence of deletes, the above optimization is not reliable: if the column is marked as deleted in the memstore, one should not look further into the store files. But currently, the code does so.

Sample test code outline:

{code}
admin.createTable(desc)
table = HTable.new(conf, tableName)
table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5)
admin.flush(tableName)
sleep(2)
del = Delete.new(Bytes.toBytes(row))
table.delete(del)
table.incrementColumnValue(Bytes.toBytes(row), cf1name, Bytes.toBytes(column), 5)
get = Get.new(Bytes.toBytes(row))
keyValues = table.get(get).raw()
keyValues.each do |keyValue|
  puts "Expect 5; Got Value=#{Bytes.toLong(keyValue.getValue())}"
end
{code}

The above prints:

{code}
Expect 5; Got Value=10
{code}
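The failure mode can be modeled with two maps standing in for the memstore and the store files (an illustrative sketch, not HBase code; `IcvDemo`, `buggyRead`, and `correctRead` are made-up names). The buggy memstore-first lookup conflates "not in memstore" with "deleted in memstore", so it falls through to the store files and resurrects the flushed value:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model: memstore and store files map a column to its value; a delete
// in the memstore is represented by a tombstone marker.
public class IcvDemo {
    static final Long TOMBSTONE = Long.MIN_VALUE;
    static Map<String, Long> memstore = new HashMap<>();
    static Map<String, Long> storeFiles = new HashMap<>();

    // Buggy HBASE-3082-style read: a tombstone is treated the same as an
    // absent value, so the lookup falls through to the store files.
    static long buggyRead(String col) {
        Long v = memstore.get(col);
        if (v != null && !v.equals(TOMBSTONE)) return v;
        return storeFiles.getOrDefault(col, 0L);
    }

    // Correct read: a tombstone in the memstore masks the store files.
    static long correctRead(String col) {
        Long v = memstore.get(col);
        if (v != null) return v.equals(TOMBSTONE) ? 0L : v;
        return storeFiles.getOrDefault(col, 0L);
    }

    public static void main(String[] args) {
        storeFiles.put("c", 5L);      // first ICV of 5, then flushed to disk
        memstore.put("c", TOMBSTONE); // row deleted; marker sits in memstore
        // The second ICV of 5 reads the current value and adds to it:
        System.out.println(buggyRead("c") + 5);   // 10 -- the reported bug
        System.out.println(correctRead("c") + 5); // 5  -- expected
    }
}
```

This mirrors the test outline in the description: increment, flush, delete, increment again; the correct answer is 5 because the delete should have zeroed the counter.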
[jira] [Commented] (HBASE-3443) ICV optimization to look in memstore first and then store files (HBASE-3082) does not work when deletes are in the mix
[ https://issues.apache.org/jira/browse/HBASE-3443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252730#comment-13252730 ] Jean-Daniel Cryans commented on HBASE-3443:
---
The release notes should mention this tho.
[jira] [Commented] (HBASE-5778) Turn on WAL compression by default
[ https://issues.apache.org/jira/browse/HBASE-5778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13252927#comment-13252927 ] Jean-Daniel Cryans commented on HBASE-5778:
---
It's not in there, do we want it since we turn it on? Or do we act like we always had it? :)
[jira] [Commented] (HBASE-5740) Compaction interruption may be due to balancing
[ https://issues.apache.org/jira/browse/HBASE-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13248751#comment-13248751 ] Jean-Daniel Cryans commented on HBASE-5740:
---
Or disabling, or online altering (open/close).

Compaction interruption may be due to balancing
Key: HBASE-5740
URL: https://issues.apache.org/jira/browse/HBASE-5740
Project: HBase
Issue Type: Bug
Affects Versions: 0.96.0
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Trivial
Fix For: 0.96.0
Attachments: hbase-5740.patch

Currently, the log shows "Aborting compaction of store LOG in region because user requested stop", but it is actually because of balancing. Currently, there is no way to figure out who closed the region, so it is better to change the message to say it is because of either the user or balancing.
[jira] [Commented] (HBASE-5716) Make HBASE-4608 easier to use
[ https://issues.apache.org/jira/browse/HBASE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247445#comment-13247445 ] Jean-Daniel Cryans commented on HBASE-5716:
---
bq. Was there a great correlation between size on disk and size of blog?

I'm not sure I understand the question.

Make HBASE-4608 easier to use
Key: HBASE-5716
URL: https://issues.apache.org/jira/browse/HBASE-5716
Project: HBase
Issue Type: Improvement
Reporter: Jean-Daniel Cryans
Assignee: Li Pi
Fix For: 0.96.0, 0.94.1

HBASE-4608 is a nice feature, but after playing with it for a while I think the following should be fixed to make it easier to use by someone who's not a dev:

- Add some signal that says the feature is turned on. Right now you can {{jstack | grep KeyValueCompression}} a couple of times, and if you get a hit you definitely know it's on, but otherwise a random user wouldn't know without going through the jira.
- Add documentation in the reference guide. At the minimum, add {{hbase.regionserver.wal.enablecompression}} in there with a small description. Better would be to add a section in {{Appendix B}} or something like that, and describe the functionality a bit and who it's useful for. For example, flush from your brain the knowledge of the patch and read the name of the configuration... now let's say you have a use case that involves writing easily compressible values. Any normal user would believe that this is a good tuning parameter for them, but it's just going to waste CPU cycles.
- Add some metrics like we have for HFiles, where you get a clue about the compression ratio.
[jira] [Commented] (HBASE-5715) Revert 'Instant schema alter' for now, HBASE-4213
[ https://issues.apache.org/jira/browse/HBASE-5715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247525#comment-13247525 ] Jean-Daniel Cryans commented on HBASE-5715:
---
+1 on the revert patch.

Revert 'Instant schema alter' for now, HBASE-4213
Key: HBASE-5715
URL: https://issues.apache.org/jira/browse/HBASE-5715
Project: HBase
Issue Type: Task
Reporter: stack
Attachments: revert.txt, revert.v2.txt, revert.v3.txt, revert.v4.txt

See this discussion: http://search-hadoop.com/m/NxCQh1KlSxR1/Pull+instant+schema+updating+out%253Fsubj=Pull+instant+schema+updating+out+

Pull out hbase-4213 for now. Can add it back later.
[jira] [Commented] (HBASE-5716) Make HBASE-4608 easier to use
[ https://issues.apache.org/jira/browse/HBASE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13247831#comment-13247831 ] Jean-Daniel Cryans commented on HBASE-5716: --- I'm asking the same question, I don't know. Make HBASE-4608 easier to use - Key: HBASE-5716 URL: https://issues.apache.org/jira/browse/HBASE-5716 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Li Pi Fix For: 0.96.0, 0.94.1 HBASE-4608 is a nice feature but after playing with it for a while I think the following should be fixed to make it easier to use by someone who's not a dev: - Add some signal that says that the feature is turned on. Right now you can {{jstack | grep KeyValueCompression}} a couple of times and if you get a hit you definitely know it's on, but otherwise the random user wouldn't know without going through the jira. - Add documentation in the reference guide. At the minimum add {{hbase.regionserver.wal.enablecompression}} in there with a small description. Better would be to add a section in {{Appendix B}} or something like that and describe the functionality a bit and who it's useful for. For example, flush from your brain the knowledge of the patch and read the name of the configuration... now let's say you have a use case that involves writing easily compressible values. Any normal user would believe that this is a good tuning parameter for them, but it's just going to waste CPU cycles. - Add some metrics like we have for HFiles where you get a clue about the compression ratio. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5359) Alter in the shell can be too quick and return before the table is altered
[ https://issues.apache.org/jira/browse/HBASE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246528#comment-13246528 ] Jean-Daniel Cryans commented on HBASE-5359: --- I'm giving this patch a spin on 0.94 Alter in the shell can be too quick and return before the table is altered -- Key: HBASE-5359 URL: https://issues.apache.org/jira/browse/HBASE-5359 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.96.0 Attachments: HBASE-5359.patch This seems to be a recent change in behavior but I'm still not sure where it's coming from. The shell is able to call HMaster.getAlterStatus before the TableEventHandler is able to call AM.setRegionsToReopen, so the returned status shows no pending regions. It means that the alter seems instantaneous although it's far from completed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5359) Alter in the shell can be too quick and return before the table is altered
[ https://issues.apache.org/jira/browse/HBASE-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246540#comment-13246540 ] Jean-Daniel Cryans commented on HBASE-5359: --- Looks like it works as expected, +1 Alter in the shell can be too quick and return before the table is altered -- Key: HBASE-5359 URL: https://issues.apache.org/jira/browse/HBASE-5359 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.96.0 Attachments: HBASE-5359.patch This seems to be a recent change in behavior but I'm still not sure where it's coming from. The shell is able to call HMaster.getAlterStatus before the TableEventHandler is able to call AM.setRegionsToReopen, so the returned status shows no pending regions. It means that the alter seems instantaneous although it's far from completed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5716) Make HBASE-4608 easier to use
[ https://issues.apache.org/jira/browse/HBASE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13246605#comment-13246605 ] Jean-Daniel Cryans commented on HBASE-5716: --- bq. Is there any situation in which using it isn't a good idea? The one I described. Make HBASE-4608 easier to use - Key: HBASE-5716 URL: https://issues.apache.org/jira/browse/HBASE-5716 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Fix For: 0.96.0, 0.94.1 HBASE-4608 is a nice feature but after playing with it for a while I think the following should be fixed to make it easier to use by someone who's not a dev: - Add some signal that says that the feature is turned on. Right now you can {{jstack | grep KeyValueCompression}} a couple of times and if you get a hit you definitely know it's on, but otherwise the random user wouldn't know without going through the jira. - Add documentation in the reference guide. At the minimum add {{hbase.regionserver.wal.enablecompression}} in there with a small description. Better would be to add a section in {{Appendix B}} or something like that and describe the functionality a bit and who it's useful for. For example, flush from your brain the knowledge of the patch and read the name of the configuration... now let's say you have a use case that involves writing easily compressible values. Any normal user would believe that this is a good tuning parameter for them, but it's just going to waste CPU cycles. - Add some metrics like we have for HFiles where you get a clue about the compression ratio. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5716) Make HBASE-4608 easier to use
[ https://issues.apache.org/jira/browse/HBASE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246699#comment-13246699 ] Jean-Daniel Cryans commented on HBASE-5716: --- One thing I just thought about: enabling this feature doesn't change the fact that we currently still roll on the total size of the file... meaning that you can pack a lot more data per HLog, which should have some impact on log replay time. Not sure if it's for better or worse, as you have to read/write more data, but it's compressed on the wire... Make HBASE-4608 easier to use - Key: HBASE-5716 URL: https://issues.apache.org/jira/browse/HBASE-5716 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Fix For: 0.96.0, 0.94.1 HBASE-4608 is a nice feature but after playing with it for a while I think the following should be fixed to make it easier to use by someone who's not a dev: - Add some signal that says that the feature is turned on. Right now you can {{jstack | grep KeyValueCompression}} a couple of times and if you get a hit you definitely know it's on, but otherwise the random user wouldn't know without going through the jira. - Add documentation in the reference guide. At the minimum add {{hbase.regionserver.wal.enablecompression}} in there with a small description. Better would be to add a section in {{Appendix B}} or something like that and describe the functionality a bit and who it's useful for. For example, flush from your brain the knowledge of the patch and read the name of the configuration... now let's say you have a use case that involves writing easily compressible values. Any normal user would believe that this is a good tuning parameter for them, but it's just going to waste CPU cycles. - Add some metrics like we have for HFiles where you get a clue about the compression ratio. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5716) Make HBASE-4608 easier to use
[ https://issues.apache.org/jira/browse/HBASE-5716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246736#comment-13246736 ] Jean-Daniel Cryans commented on HBASE-5716: --- bq. should we roll on number of edits or size of the file? Number of edits is rarely good; you always end up with someone with a degenerate case that has values MBs big, so we should stick to the file size. Rolling on the uncompressed size would be good because then we keep the same behavior, but that comes at the expense of more code to keep track of it. Rolling on the actual size with compression turned on could make the system behave differently, for better or worse; I'm not sure which. Make HBASE-4608 easier to use - Key: HBASE-5716 URL: https://issues.apache.org/jira/browse/HBASE-5716 Project: HBase Issue Type: Improvement Reporter: Jean-Daniel Cryans Assignee: Li Pi Fix For: 0.96.0, 0.94.1 HBASE-4608 is a nice feature but after playing with it for a while I think the following should be fixed to make it easier to use by someone who's not a dev: - Add some signal that says that the feature is turned on. Right now you can {{jstack | grep KeyValueCompression}} a couple of times and if you get a hit you definitely know it's on, but otherwise the random user wouldn't know without going through the jira. - Add documentation in the reference guide. At the minimum add {{hbase.regionserver.wal.enablecompression}} in there with a small description. Better would be to add a section in {{Appendix B}} or something like that and describe the functionality a bit and who it's useful for. For example, flush from your brain the knowledge of the patch and read the name of the configuration... now let's say you have a use case that involves writing easily compressible values. Any normal user would believe that this is a good tuning parameter for them, but it's just going to waste CPU cycles. 
- Add some metrics like we have for HFiles where you get a clue about the compression ratio. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
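The "roll on the uncompressed size" option discussed above can be sketched as a small accounting policy: track the raw edit bytes separately from whatever the compressed writer reports, and roll once the raw total crosses the threshold. This is a minimal illustration with hypothetical names, not the actual HLog code:

```java
// Sketch: decide log rolls on accumulated *uncompressed* bytes so that
// enabling WAL compression does not change rolling behavior.
// Hypothetical names; not the real HLog implementation.
public class UncompressedRollPolicy {
    private final long rollThresholdBytes;
    private long uncompressedBytes = 0;

    public UncompressedRollPolicy(long rollThresholdBytes) {
        this.rollThresholdBytes = rollThresholdBytes;
    }

    /** Record an appended edit's raw (pre-compression) size. */
    public void recordAppend(long rawEditSize) {
        uncompressedBytes += rawEditSize;
    }

    /** True once the raw bytes written reach the roll threshold. */
    public boolean shouldRoll() {
        return uncompressedBytes >= rollThresholdBytes;
    }

    /** Called after the log is rolled to start a fresh file. */
    public void reset() {
        uncompressedBytes = 0;
    }
}
```

The extra bookkeeping is exactly the cost the comment mentions: one more counter updated on every append, in exchange for roll behavior that is independent of the compression ratio.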
[jira] [Commented] (HBASE-5702) MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call
[ https://issues.apache.org/jira/browse/HBASE-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13245793#comment-13245793 ] Jean-Daniel Cryans commented on HBASE-5702: --- Yeah when I opened this jira it had a narrow scope but I think it could be much bigger. Basically if you try to instant alter a table you'll see there's about two or three (sorry I can't be more precise at the moment, I'm testing something else right now) tasks that are leaked. I can see some issues just looking at the code: MasterSchemaChangeTracker - processCompletedSchemaChanges: creates a task, sets the status, never closes it. It's a misuse of MonitoredTask I think, a task is normally something that's long running and you need to report progress. - processAlterStatus: creates a task but stops it right away, also creates one then kills it. - handleFailedOrExpiredSchemaChanges: creates a task, sets the status, never closes it. Also it has an extra white space. - createSchemaChangeNode: creates a task then closes/aborts it right away SchemaChangeTracker - handleSchemaChange: creates a task, sets the status, never closes it. - reportAndLogSchemaRefreshError: can create task and set a status but doesn't close it. There really should only be 1 task throughout the alter process. MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call Key: HBASE-5702 URL: https://issues.apache.org/jira/browse/HBASE-5702 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Assignee: Subbu M Iyer Priority: Critical Fix For: 0.94.0, 0.96.0 This bug is so easy to reproduce I'm wondering why it hasn't been reported yet. Stop any number of region servers on a 0.94/6 cluster and you'll see in the master interface one task per stopped region server saying the following: |Processing schema change exclusion for region server = sv4r27s44,62023,1333402175340|RUNNING (since 5sec ago)|No schema change in progress. 
Skipping exclusion for server = sv4r27s44,62023,1333402175340 (since 5sec ago)| It's gonna stay there until the master cleans it: bq. WARN org.apache.hadoop.hbase.monitoring.TaskMonitor: Status Processing schema change exclusion for region server = sv4r27s44,62023,1333402175340: status=No schema change in progress. Skipping exclusion for server = sv4r27s44,62023,1333402175340, state=RUNNING, startTime=1333404636419, completionTime=-1 appears to have been leaked It's not clear to me why it's using a MonitoredTask in the first place. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
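The leak pattern listed above (create a task, set a status, never close it) is avoidable by routing every task through one helper that guarantees completion or abort. A sketch with a hypothetical Task type, not the actual MonitoredTask API:

```java
// Sketch: guarantee a monitored task is always marked complete or
// aborted, so it can never be leaked in the task monitor.
// Hypothetical types; not the real MonitoredTask/TaskMonitor API.
public class TaskScope {
    public static class Task {
        private String state = "RUNNING";
        private String status = "";
        public void setStatus(String s) { status = s; }
        public void markComplete(String s) { status = s; state = "COMPLETE"; }
        public void abort(String s) { status = s; state = "ABORTED"; }
        public String getState() { return state; }
    }

    /** Run one unit of work under a task, guaranteeing cleanup. */
    public static Task runMonitored(Runnable work) {
        Task task = new Task();
        try {
            work.run();
            task.markComplete("done");
        } catch (RuntimeException e) {
            // Abort instead of leaving the task RUNNING forever.
            task.abort("failed: " + e.getMessage());
            throw e;
        }
        return task;
    }
}
```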
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244759#comment-13244759 ] Jean-Daniel Cryans commented on HBASE-3134: --- +1 [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134-v3.txt, 3134-v4.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5702) MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call
[ https://issues.apache.org/jira/browse/HBASE-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244801#comment-13244801 ] Jean-Daniel Cryans commented on HBASE-5702: --- Looking at the code it seems it's leaking in other places... and that message shouldn't even be there in the first place because hbase.instant.schema.alter.enabled isn't enabled on this cluster. MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call Key: HBASE-5702 URL: https://issues.apache.org/jira/browse/HBASE-5702 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Fix For: 0.94.0, 0.96.0 This bug is so easy to reproduce I'm wondering why it hasn't been reported yet. Stop any number of region servers on a 0.94/6 cluster and you'll see in the master interface one task per stopped region server saying the following: |Processing schema change exclusion for region server = sv4r27s44,62023,1333402175340|RUNNING (since 5sec ago)|No schema change in progress. Skipping exclusion for server = sv4r27s44,62023,1333402175340 (since 5sec ago)| It's gonna stay there until the master cleans it: bq. WARN org.apache.hadoop.hbase.monitoring.TaskMonitor: Status Processing schema change exclusion for region server = sv4r27s44,62023,1333402175340: status=No schema change in progress. Skipping exclusion for server = sv4r27s44,62023,1333402175340, state=RUNNING, startTime=1333404636419, completionTime=-1 appears to have been leaked It's not clear to me why it's using a MonitoredTask in the first place. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5702) MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call
[ https://issues.apache.org/jira/browse/HBASE-5702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13244816#comment-13244816 ] Jean-Daniel Cryans commented on HBASE-5702: --- Now that I've tested instant schema updates, there's many more issues with MonitoredTask than I originally thought :( MasterSchemaChangeTracker.excludeRegionServerForSchemaChanges leaks a MonitoredTask per call Key: HBASE-5702 URL: https://issues.apache.org/jira/browse/HBASE-5702 Project: HBase Issue Type: Bug Affects Versions: 0.94.0 Reporter: Jean-Daniel Cryans Fix For: 0.94.0, 0.96.0 This bug is so easy to reproduce I'm wondering why it hasn't been reported yet. Stop any number of region servers on a 0.94/6 cluster and you'll see in the master interface one task per stopped region server saying the following: |Processing schema change exclusion for region server = sv4r27s44,62023,1333402175340|RUNNING (since 5sec ago)|No schema change in progress. Skipping exclusion for server = sv4r27s44,62023,1333402175340 (since 5sec ago)| It's gonna stay there until the master cleans it: bq. WARN org.apache.hadoop.hbase.monitoring.TaskMonitor: Status Processing schema change exclusion for region server = sv4r27s44,62023,1333402175340: status=No schema change in progress. Skipping exclusion for server = sv4r27s44,62023,1333402175340, state=RUNNING, startTime=1333404636419, completionTime=-1 appears to have been leaked It's not clear to me why it's using a MonitoredTask in the first place. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239929#comment-13239929 ] Jean-Daniel Cryans commented on HBASE-3134: --- Lars, do you want to pull this into 0.94.0 or are you too far down the process? [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.1 Attachments: 3134-v2.txt, 3134-v3.txt, 3134-v4.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239951#comment-13239951 ] Jean-Daniel Cryans commented on HBASE-3134: --- No problem, next RC then :) [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.1 Attachments: 3134-v2.txt, 3134-v3.txt, 3134-v4.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5623) Race condition when rolling the HLog and hlogFlush
[ https://issues.apache.org/jira/browse/HBASE-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238802#comment-13238802 ] Jean-Daniel Cryans commented on HBASE-5623: --- Funny, I just saw that NPE for the first time in my testing. Race condition when rolling the HLog and hlogFlush -- Key: HBASE-5623 URL: https://issues.apache.org/jira/browse/HBASE-5623 Project: HBase Issue Type: Bug Components: wal Affects Versions: 0.94.0 Reporter: Enis Soztutar Assignee: Enis Soztutar Priority: Critical Fix For: 0.94.0 Attachments: 5623-suggestion.txt, 5623-v7.txt, 5623-v8.txt, 5623.txt, 5623v2.txt, HBASE-5623_v0.patch, HBASE-5623_v4.patch, HBASE-5623_v5.patch, HBASE-5623_v6-alt.patch, HBASE-5623_v6-alt.patch When doing a ycsb test with a large number of handlers (regionserver.handler.count=60), I get the following exceptions: {code} Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.io.SequenceFile$Writer.getLength(SequenceFile.java:1099) at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.getLength(SequenceFileLogWriter.java:314) at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1291) at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1388) at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192) at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985) at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400) at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351) at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:920) at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:152) at $Proxy1.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1691) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1689) at org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:214) {code} and {code} java.lang.NullPointerException at org.apache.hadoop.io.SequenceFile$Writer.checkAndWriteSync(SequenceFile.java:1026) at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1068) at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1035) at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter.append(SequenceFileLogWriter.java:279) at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.hlogFlush(HLog.java:1237) at org.apache.hadoop.hbase.regionserver.wal.HLog.syncer(HLog.java:1271) at org.apache.hadoop.hbase.regionserver.wal.HLog.sync(HLog.java:1391) at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchPut(HRegion.java:2192) at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1985) at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3400) at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:366) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1351) {code} It seems the root cause of the issue is that we open a new log writer and close the old one at HLog#rollWriter() holding the updateLock, but the other threads doing syncer() calls {code} logSyncerThread.hlogFlush(this.writer); {code} without holding the updateLock. 
LogSyncer only synchronizes against concurrent appends and flush(), but not on the passed writer, which can already be closed by rollWriter(). In this case, since SequenceFile#Writer.close() sets its out field to null, we get the NPE. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators:
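The race described above is an unsynchronized hand-off: rollWriter() closes and replaces the writer under the updateLock while hlogFlush() uses a stale reference without it. The shape of the fix is to guard both paths with the same lock; a minimal sketch with hypothetical names, not the actual HLog patch:

```java
// Sketch: rolling and flushing take the same lock, so a flush can
// never run against a writer that a roll has already closed.
// Hypothetical Writer type; not the real SequenceFileLogWriter.
public class SafeLog {
    public static class Writer {
        private boolean closed = false;
        private int appended = 0;
        public void append(String edit) {
            // Mirrors the NPE symptom: using a closed writer fails.
            if (closed) throw new IllegalStateException("writer closed");
            appended++;
        }
        public void close() { closed = true; }
        public int count() { return appended; }
    }

    private final Object writerLock = new Object();
    private Writer writer = new Writer();

    /** Close the old writer and install a fresh one, atomically. */
    public void rollWriter() {
        synchronized (writerLock) {
            writer.close();
            writer = new Writer();
        }
    }

    /** Flush an edit through whichever writer is current. */
    public void flush(String edit) {
        synchronized (writerLock) {
            writer.append(edit);  // always sees a live writer
        }
    }

    public int currentCount() {
        synchronized (writerLock) { return writer.count(); }
    }
}
```

The key point is that flush() re-reads the writer field under the lock instead of caching a reference across the roll.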
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238807#comment-13238807 ] Jean-Daniel Cryans commented on HBASE-3134: --- +1 on the patch you put up on review board Teruyoshi, can you attach it here and grant the license so that I can commit? Thanks! [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.1 Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation
[ https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238962#comment-13238962 ] Jean-Daniel Cryans commented on HBASE-4993: --- It seems this would work better; also, reviewing the 0.90/0.92 code, I think we should keep the new logic you introduced in this jira (with the fixed code). I opened HBASE-5639 and assigned it to you. Performance regression in minicluster creation -- Key: HBASE-4993 URL: https://issues.apache.org/jira/browse/HBASE-4993 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Fix For: 0.94.0 Attachments: 4993.patch, 4993.v3.patch Side effect of 4610: the mini cluster needs 4.5 seconds to start -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238973#comment-13238973 ] Jean-Daniel Cryans commented on HBASE-5639: --- Oh, I forgot to mention that I'm marking this as a blocker for 0.94.0 because right now if you start a sizable cluster you may end up with region servers that check in too late and miss the re-assignment of regions. The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: nkeywal Priority: Blocker Fix For: 0.94.0 See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: we stop waiting once 'hbase.master.wait.on.regionservers.mintostart' is reached AND no new region server has checked in for 'hbase.master.wait.on.regionservers.interval' time. And the code that verifies that: !(lastCountChange+interval < now && count >= minToStart) {quote} Nic: {quote} It seems that changing the code to (count < minToStart || lastCountChange+interval > now) would make the code work as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
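The corrected wait predicate quoted in the comment above can be written out directly; this is an illustrative sketch (hypothetical class name, not the actual ServerManager code):

```java
// Sketch of the corrected waitForRegionServers predicate discussed in
// HBASE-5639. Hypothetical names; not the real ServerManager code.
public class RegionServerWait {
    /**
     * Returns true while the master should keep waiting: either too few
     * region servers have checked in, or one checked in recently enough
     * (within `interval` ms) that more may still be on the way.
     */
    public static boolean shouldWait(int count, int minToStart,
                                     long lastCountChange, long interval, long now) {
        return count < minToStart || lastCountChange + interval > now;
    }
}
```

The three scenarios from the comment all evaluate to "keep waiting", and waiting stops only once enough servers have checked in and the quiet interval has elapsed.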
[jira] [Commented] (HBASE-5222) Stopping replication via the stop_replication command in hbase shell on a slave cluster isn't acknowledged in the replication sink
[ https://issues.apache.org/jira/browse/HBASE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236767#comment-13236767 ] Jean-Daniel Cryans commented on HBASE-5222: --- bq. So, let's see what JD says. Here he goes: bq. When you want to use replication, you ought to run these commands Not sure which commands you're talking about. In the specific case of {{stop_replication}}, it's a kill switch in the proper sense (quote from wikipedia): bq. a kill switch is designed and configured to a) completely abort the operation at all costs and b) be operable in a manner that is quick, simple (so that even a panicking user with impaired executive function can operate it), and, usually, c) be obvious even to an untrained operator or a bystander We hit on a) and b), the c) part might not be there yet. The issue here is that the command is respected on the master cluster (when ran there) but not on the slave cluster (when ran there). bq. If you stop replication on the master, the logs are no longer stored to be pushed down stream like they would with replication enabled. Yep. bq. The bug, however, causes the slave to keep accepting logs even while disabled although the other processes on slave cluster respect the disabled flag Since it's a kill switch, what's going to happen is the slave cluster is going to *drop the log edits*. This is not what you want, you want is HBASE-3134. bq. So, afaik, running commands on the slave cluster are futile as its the master cluster which does all the work. I think you understand the issue here reasonably well, and indeed most of the commands won't do anything on the slave cluster, except here the kill switch should stop all replication-related activity including applying incoming logs. 
Stopping replication via the stop_replication command in hbase shell on a slave cluster isn't acknowledged in the replication sink Key: HBASE-5222 URL: https://issues.apache.org/jira/browse/HBASE-5222 Project: HBase Issue Type: Bug Components: replication, shell Affects Versions: 0.90.4 Reporter: Josh Wymer After running stop_replication in the hbase shell on our slave cluster we saw replication continue for weeks. Turns out that the replication sink is missing a check to get the replication state and therefore continued to write.
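The missing check described above can be sketched with plain JDK types. Here `replicating` stands in for the ZooKeeper-backed replication state tracker, and all names are illustrative rather than the actual HBase sink code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the missing check in the replication sink: before
// applying incoming edits, consult the replication state. Names are
// illustrative, not the actual HBase code.
public class ReplicationSink {
    private final AtomicBoolean replicating; // stands in for the zk state tracker
    private int appliedEdits = 0;

    public ReplicationSink(AtomicBoolean replicating) {
        this.replicating = replicating;
    }

    // Called for each batch of edits shipped from the master cluster.
    public boolean replicateEntries(int editCount) {
        if (!replicating.get()) {
            return false; // kill switch engaged: refuse the batch
        }
        appliedEdits += editCount;
        return true;
    }

    public int getAppliedEdits() { return appliedEdits; }
}
```

With this guard, flipping the state to false (what stop_replication does on the slave) makes the sink reject further batches instead of silently writing for weeks.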
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236790#comment-13236790 ] Jean-Daniel Cryans commented on HBASE-5190: --- It's not exactly 10, it's num_handlers * 10 so by default 100. Is that better? Limit the IPC queue size based on calls' payload size - Key: HBASE-5190 URL: https://issues.apache.org/jira/browse/HBASE-5190 Project: HBase Issue Type: Improvement Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.94.0 Attachments: HBASE-5190-v2.patch, HBASE-5190-v3.patch, HBASE-5190.patch Currently we limit the number of calls in the IPC queue only on their count. It used to be really high and was dropped down recently to num_handlers * 10 (so 100 by default) because it was easy to OOME yourself when huge calls were being queued. It's still possible to hit this problem if you use really big values and/or a lot of handlers, so the idea is that we should take into account the payload size. I can see 3 solutions: - Do the accounting outside of the queue itself for all calls coming in and out, and when a call doesn't fit, throw a retryable exception. - Same accounting but instead block the call when it comes in until space is made available. - Add a new parameter for the maximum size (in bytes) of a Call and then set the size of the IPC queue (in terms of the number of items) so that it could only contain as many items as some predefined maximum size (in bytes) for the whole queue.
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237146#comment-13237146 ] Jean-Daniel Cryans commented on HBASE-5190: --- Yep, right after I finish running medium tests on trunk.
[jira] [Commented] (HBASE-4993) Performance regression in minicluster creation
[ https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13237196#comment-13237196 ] Jean-Daniel Cryans commented on HBASE-4993: --- @Nic I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: {code} - the 'hbase.master.wait.on.regionservers.mintostart' is reached AND no new region server has checked in for 'hbase.master.wait.on.regionservers.interval' time {code} And the code that verifies that: {code} !(lastCountChange+interval > now && count >= minToStart) {code} If you have 0 region servers that checked in and you are under the interval, you wait: not (true and false) = true. If you have 0 region servers but you are above the interval, you wait: not (false and false) = true. If you have 1 or more region servers that checked in and you are under the interval, you continue: not (true and true) = false. Here's an example: {noformat} 2012-03-23 21:45:22,002 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms. 
2012-03-23 21:45:22,882 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r27s44,62023,1332539122398 2012-03-23 21:45:22,883 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r29s44,62023,1332539122438 2012-03-23 21:45:22,883 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r25s44,62023,1332539122404 2012-03-23 21:45:22,885 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r6s38,62023,1332539122354 2012-03-23 21:45:22,885 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r8s38,62023,1332539122396 2012-03-23 21:45:22,886 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r5s38,62023,1332539122427 2012-03-23 21:45:22,886 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r28s44,62023,1332539122402 2012-03-23 21:45:22,887 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r31s44,62023,1332539122387 2012-03-23 21:45:22,887 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r30s44,62023,1332539122392 2012-03-23 21:45:22,906 INFO org.apache.hadoop.hbase.master.ServerManager: Finished waiting for region servers count to settle; checked in 9, slept for 904 ms, expecting minimum of 1, maximum of 2147483647, master is running. {noformat} As you can see we haven't waited a second and the master is proceeding. This is here not too bad because in the cluster I have 9 servers, but the first time I ran 0.94 it proceeded with only 1 server. This could be disastrous at scale, we really need to wait more than that here. In fact I think I preferred the old way of doing it. 
Performance regression in minicluster creation -- Key: HBASE-4993 URL: https://issues.apache.org/jira/browse/HBASE-4993 Project: HBase Issue Type: Bug Components: master Affects Versions: 0.94.0 Environment: all Reporter: nkeywal Assignee: nkeywal Fix For: 0.94.0 Attachments: 4993.patch, 4993.v3.patch Side effect of 4610: the mini cluster needs 4.5 seconds to start
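The broken and corrected wait predicates discussed in this thread can be written out as plain booleans. This is a sketch with illustrative names, not the actual ServerManager code; it just mirrors the truth tables worked through above:

```java
// Sketch of the waitForRegionServers predicate (true = keep waiting).
// Names and signatures are illustrative, not the actual HBase code.
public class WaitLogic {
    // Broken: stops waiting as soon as minToStart is reached, even while
    // servers are still checking in within the interval.
    static boolean brokenWait(long lastCountChange, long interval, long now,
                              int count, int minToStart) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    // Fixed: keep waiting while under minToStart OR while a server checked
    // in within the last 'interval' ms, so the count can settle.
    static boolean fixedWait(long lastCountChange, long interval, long now,
                             int count, int minToStart) {
        return count < minToStart || lastCountChange + interval > now;
    }
}
```

With one server checked in and the interval not yet elapsed, the broken predicate lets the master proceed while the fixed one keeps waiting, which is exactly the "proceeded with only 1 server" failure described above.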
[jira] [Commented] (HBASE-5222) Stopping replication via the stop_replication command in hbase shell on a slave cluster isn't acknowledged in the replication sink
[ https://issues.apache.org/jira/browse/HBASE-5222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235795#comment-13235795 ] Jean-Daniel Cryans commented on HBASE-5222: --- {{stop_replication}} is a kill switch that should normally kill everything that's related to replication. In this case, it's not stopping the region servers from accepting incoming replication traffic.
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235852#comment-13235852 ] Jean-Daniel Cryans commented on HBASE-5190: --- The current patch works, I've tested it extensively through massive imports. Current concerns: - I haven't done a performance comparison; is it going to slow down traffic because of the additional checks? Most of my testing was done so that I'm hitting the limit all the time, so it definitely slows down my throughput, but that's expected :) - The exception "Call queue already full" doesn't make it to the client; what happens is that it's printed server-side and the client gets an EOF. That's bad. - What default should we use? In my testing I saw that 100MB might be too small, but ideally that needs to scale with the amount of memory. I don't mind finishing this for 0.94 if there's demand/motivation for it.
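The byte-based accounting this issue proposes (the reject-with-retryable-error variant) can be sketched with JDK types only. Class and method names are made up for illustration; the real patch lives in HBase's ipc code:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of bounding a call queue by total payload bytes
// rather than by item count. Not the actual HBase ipc implementation.
public class BoundedCallQueue {
    private final LinkedBlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();
    private final AtomicLong queuedBytes = new AtomicLong();
    private final long maxQueuedBytes;

    public BoundedCallQueue(long maxQueuedBytes) {
        this.maxQueuedBytes = maxQueuedBytes;
    }

    // Accounting outside the queue: reject when the call doesn't fit,
    // so the caller can surface a retryable "queue is full" error.
    public boolean offer(byte[] call) {
        long newSize = queuedBytes.addAndGet(call.length);
        if (newSize > maxQueuedBytes) {
            queuedBytes.addAndGet(-call.length); // undo the accounting
            return false;
        }
        return queue.offer(call);
    }

    // Handlers drain the queue and release the bytes they consumed.
    public byte[] poll() {
        byte[] call = queue.poll();
        if (call != null) {
            queuedBytes.addAndGet(-call.length);
        }
        return call;
    }
}
```

Handlers releasing bytes as they drain calls is what frees room for new calls, which is why the accounting has to cover calls coming in and out.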
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235909#comment-13235909 ] Jean-Daniel Cryans commented on HBASE-5190: --- bq. 100MB is per RegionServer, right? Does seem a bit small. Maybe 1G? Might be a good default, though those with the default heap will definitely not get any help here. bq. I take it that maybe this is something to consider for 0.96. Agreed? The more I think about it, the more I want this in 0.94 because it can really give us a better understanding of those issues we see on the mailing list.
[jira] [Commented] (HBASE-4657) Improve the efficiency of our MR jobs with a few configurations
[ https://issues.apache.org/jira/browse/HBASE-4657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236004#comment-13236004 ] Jean-Daniel Cryans commented on HBASE-4657: --- +1 Improve the efficiency of our MR jobs with a few configurations --- Key: HBASE-4657 URL: https://issues.apache.org/jira/browse/HBASE-4657 Project: HBase Issue Type: Improvement Affects Versions: 0.90.4 Reporter: Jean-Daniel Cryans Assignee: Lars Hofhansl Fix For: 0.94.0 Attachments: 4657.txt This is low-hanging fruit: some of our MR jobs like RowCounter and CopyTable don't even call setCacheBlocks on the scan object, which out of the box completely screws up a running system. Another thing would be to disable speculative execution.
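Assuming the HBase 0.90-era client API, the two settings mentioned above would look roughly like this. `configureScan` is a made-up helper for illustration, not the attached 4657.txt patch, and the caching value is only an example:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.Scan;

public class MRJobSetup {
    // Sketch of the two settings discussed in HBASE-4657.
    static Scan configureScan(Configuration conf) {
        Scan scan = new Scan();
        scan.setCacheBlocks(false); // a full scan shouldn't churn the block cache
        scan.setCaching(500);       // illustrative value: fewer RPC round trips
        // Old-style Hadoop property for disabling map-side speculative execution.
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        return scan;
    }
}
```

Disabling speculative execution matters here because a speculative duplicate task re-reads the same regions and doubles the load on the region servers without producing extra output.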
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13236037#comment-13236037 ] Jean-Daniel Cryans commented on HBASE-5190: --- Instead of throwing the IOE, it's better if I put it in the try block just below in the code, as it already handles setting up a response. I just need to change the messaging there and refactor some bits.
[jira] [Commented] (HBASE-5586) [replication] NPE in ReplicationSource when creating a stream to an inexistent cluster
[ https://issues.apache.org/jira/browse/HBASE-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13234894#comment-13234894 ] Jean-Daniel Cryans commented on HBASE-5586: --- Yep I have a tested patch but no unit tests. I'm trying to get HBASE-3134 tested here first. [replication] NPE in ReplicationSource when creating a stream to an inexistent cluster -- Key: HBASE-5586 URL: https://issues.apache.org/jira/browse/HBASE-5586 Project: HBase Issue Type: Bug Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Fix For: 0.90.7, 0.92.2, 0.94.0 This is from 0.92.1-ish: {noformat} 2012-03-15 09:52:16,589 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationSource, currentPath=null java.lang.NullPointerException at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.chooseSinks(ReplicationSource.java:223) at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.connectToPeers(ReplicationSource.java:442) at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:246) {noformat} I wanted to add a replication stream to a cluster that didn't exist yet so that the logs would be buffered until then. This should just be treated as if there were no region servers.
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235013#comment-13235013 ] Jean-Daniel Cryans commented on HBASE-3134: --- I found one corner case: # start replication with a peer # insert some data # kill the peer # disable replication # start the peer The issue is that when the peer is restarted, the region servers will still replicate the batch they are currently trying to send, even though replication was disabled in the meantime. Here's the log: {noformat} 2012-03-21 20:40:14,775 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Slave cluster looks down: Failed setting up proxy interface org.apache.hadoop.hbase.ipc.HRegionInterface to sv4r9s44/10.4.9.44:62023 after attempts=1 2012-03-21 20:40:15,774 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Since we are unable to replicate, sleeping 1000 times 9 2012-03-21 20:40:24,786 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replicating 149 2012-03-21 20:40:30,826 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Going to report log #sv4r8s38%2C62023%2C1332361576338.1332361577015 for position 303630357 in hdfs://sv4r11s38:9100/hbase/.logs/sv4r8s38,62023,1332361576338/sv4r8s38%2C62023%2C1332361576338.1332361577015 2012-03-21 20:40:30,830 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replicated in total: 149 2012-03-21 20:40:30,830 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replication is disabled, sleeping 1000 times 1 {noformat} Right now {{shipEdits}} won't let you throw an {{Exception}} as it catches everything, so we could change that, or we could change the loop at the end of the method to stay in there until replication is re-enabled (at which point the edits will be sent directly). I'm also open to other suggestions. 
Another thing I saw is that both {{enable_peer}} and {{disable_peer}} in the shell still show CURRENTLY UNSUPPORTED in their help. I'm sorry it took me so long to get to this jira. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we wanted to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion.
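The second option in the comment above — re-check the peer state before shipping instead of letting the in-flight batch go out — can be sketched with a simple gate. `Shipper` and `peerEnabled` are hypothetical names; the real ReplicationSource's shipEdits is more involved and the caller loops/sleeps while shipping is refused:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of gating shipEdits on the peer's enabled state so a batch that
// was pending when the peer died isn't shipped after disable_peer.
// Illustrative only; not the actual ReplicationSource code.
public class Shipper {
    private final AtomicBoolean peerEnabled; // stands in for the zk peer state

    public Shipper(AtomicBoolean peerEnabled) {
        this.peerEnabled = peerEnabled;
    }

    // Returns the number of edits actually shipped; 0 means the caller
    // should sleep and retry until replication is re-enabled.
    public int shipEdits(int batchSize) {
        if (!peerEnabled.get()) {
            return 0; // kill switch checked right before every send attempt
        }
        return batchSize;
    }
}
```

Checking the flag immediately before the send closes the window in the corner case above, where the batch queued before "disable replication" went out once the peer came back.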
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235055#comment-13235055 ] Jean-Daniel Cryans commented on HBASE-3134: --- I did some more testing, couldn't find anything else.
[jira] [Commented] (HBASE-5586) [replication] NPE in ReplicationSource when creating a stream to an inexistent cluster
[ https://issues.apache.org/jira/browse/HBASE-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235204#comment-13235204 ] Jean-Daniel Cryans commented on HBASE-5586: --- bq. return Collections.emptyList() instead? But then it won't be a one-liner as I'll have to refactor the other lines like that! :) It would probably be better to use that, yeah, and more readable.
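The Collections.emptyList() suggestion above amounts to something like the following. `Sinks`/`chooseSinks` are illustrative stand-ins, not the actual ReplicationSource method:

```java
import java.util.Collections;
import java.util.List;

// Sketch of the NPE fix discussed above: return an immutable empty list
// instead of null when no peer region servers exist yet, so callers can
// iterate safely. Names are illustrative, not the actual HBase code.
public class Sinks {
    static List<String> chooseSinks(List<String> peerAddresses) {
        if (peerAddresses == null || peerAddresses.isEmpty()) {
            return Collections.emptyList(); // safe for for-each loops
        }
        return peerAddresses;
    }
}
```

This matches the issue's intent that an inexistent cluster "should just be treated as if there were no region servers": callers see an empty list and keep buffering logs.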
[jira] [Commented] (HBASE-5328) Small changes to Master to make it more testable
[ https://issues.apache.org/jira/browse/HBASE-5328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235229#comment-13235229 ] Jean-Daniel Cryans commented on HBASE-5328: --- +1 on v12 Small changes to Master to make it more testable Key: HBASE-5328 URL: https://issues.apache.org/jira/browse/HBASE-5328 Project: HBase Issue Type: Task Reporter: stack Assignee: stack Attachments: 5328.txt, 5328v12.txt, 5328v2.txt, 5328v2.txt, 5328v3.txt, 5328v4.txt, 5328v8.txt Here are some small changes in Master that make it more testable. Included tests stand up a Master and then fake it into thinking that three regionservers are registering making master assign root and meta, etc.
[jira] [Commented] (HBASE-3692) Handle RejectedExecutionException in HTable
[ https://issues.apache.org/jira/browse/HBASE-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232995#comment-13232995 ] Jean-Daniel Cryans commented on HBASE-3692: --- Chinmay, Could it be that you are re-inserting closed HTables into that pool? At this point no one has been able to provide code here that we could test that shows a bug. As far as I can tell there's no bug, just API misuse (and poor error reporting on HBase's end). Handle RejectedExecutionException in HTable --- Key: HBASE-3692 URL: https://issues.apache.org/jira/browse/HBASE-3692 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Attachments: test_datanucleus.zip A user on IRC yesterday had an issue with RejectedExecutionException coming out of HTable sometimes. Apart from being very confusing to the user as it comes with no message at all, it exposes the HTable internals. I think we should handle it and instead throw something like DontUseHTableInMultipleThreadsException or something more clever. In his case, the user had an HTable leak with the pool that he was able to figure out once I told him what to look for. It could be an unchecked exception and we could consider adding it in 0.90, but it's marked for 0.92 at the moment.
[jira] [Commented] (HBASE-5594) Unable to stop a master that's waiting on -ROOT- during initialization
[ https://issues.apache.org/jira/browse/HBASE-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233000#comment-13233000 ] Jean-Daniel Cryans commented on HBASE-5594: --- Hey Ram, Didn't you write that that thread dump on Dec. 9th wasn't for 4951 and that the issue wasn't in 0.92? Unable to stop a master that's waiting on -ROOT- during initialization -- Key: HBASE-5594 URL: https://issues.apache.org/jira/browse/HBASE-5594 Project: HBase Issue Type: Bug Affects Versions: 0.92.1 Reporter: Jean-Daniel Cryans Fix For: 0.92.2, 0.94.0, 0.96.0 We just had a case where the master (that was just restarted) was having a hard time assigning -ROOT- (all the PRI handlers were full already) so we tried to shutdown the cluster and even though all the RS closed down properly the master kept running being blocked on: {noformat} master-sv4r20s12,10302,1331916142866 prio=10 tid=0x7f3708008800 nid=0x4b20 in Object.wait() [0x7f370d1d] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on 0x0006030be3f8 (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker) at java.lang.Object.wait(Object.java:485) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:131) - locked 0x0006030be3f8 (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker) at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.blockUntilAvailable(ZooKeeperNodeTracker.java:104) - locked 0x0006030be3f8 (a org.apache.hadoop.hbase.zookeeper.RootRegionTracker) at org.apache.hadoop.hbase.catalog.CatalogTracker.waitForRoot(CatalogTracker.java:313) at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:571) at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:501) at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:336) at java.lang.Thread.run(Thread.java:662) {noformat} I haven't checked the 0.90 code, we got this on 0.92.1
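The stuck master above is the classic unbounded-wait problem: blockUntilAvailable sleeps on a monitor with no way to notice shutdown. A common fix, sketched here with JDK primitives only (illustrative names, not the actual ZooKeeperNodeTracker), is a bounded wait that re-checks a stop flag:

```java
// Sketch of a shutdown-aware tracker wait: wake up periodically and
// check a stop flag instead of waiting forever. Illustrative only.
public class NodeTracker {
    private Object data;              // set when the znode shows up
    private volatile boolean stopped; // set on master shutdown

    public synchronized void setData(Object d) { data = d; notifyAll(); }

    public synchronized void stop() { stopped = true; notifyAll(); }

    // Returns the data, or null if stopped or the timeout elapsed first.
    public synchronized Object blockUntilAvailable(long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (data == null && !stopped && System.currentTimeMillis() < deadline) {
            try {
                wait(100); // bounded wait so the stop flag is re-checked
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // sketch: give up on interrupt
                break;
            }
        }
        return data;
    }
}
```

The shutdown path then only has to call stop() (and notifyAll()) to break the master out of waitForRoot-style loops.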
[jira] [Commented] (HBASE-5580) Publish Thrift-generated files for other languages
[ https://issues.apache.org/jira/browse/HBASE-5580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229459#comment-13229459 ] Jean-Daniel Cryans commented on HBASE-5580: --- IMO it could be added as early as 0.90; it's just adding code that users won't have to generate. Zero side effects. Publish Thrift-generated files for other languages -- Key: HBASE-5580 URL: https://issues.apache.org/jira/browse/HBASE-5580 Project: HBase Issue Type: New Feature Affects Versions: 0.90.4 Reporter: Patrick Angeles Fix For: 0.96.0 HBase ships with Thrift-generated Java files for use with the ThriftServer. For convenience (and to save users the frustration of having to compile and install the Thrift compiler), HBase can ship with the Thrift-generated files for other languages as well.
[jira] [Commented] (HBASE-5556) The JRuby jar we're shipping has a readline problem on some OS
[ https://issues.apache.org/jira/browse/HBASE-5556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226546#comment-13226546 ] Jean-Daniel Cryans commented on HBASE-5556: --- 1.6.7 isn't any better in that regard. The JRuby jar we're shipping has a readline problem on some OS -- Key: HBASE-5556 URL: https://issues.apache.org/jira/browse/HBASE-5556 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.92.2, 0.94.0, 0.96.0 I started seeing this problem on our Ubuntu servers since 0.92.0, ^H isn't detected correctly anymore in the readline rb version that's shipped with jruby 1.6.5 It works when I use the 1.6.0 jar.
[jira] [Commented] (HBASE-5525) Truncate and preserve region boundaries option
[ https://issues.apache.org/jira/browse/HBASE-5525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13226651#comment-13226651 ] Jean-Daniel Cryans commented on HBASE-5525: --- Sounds like a plan, if by "remove all content from fs under the table's dir" you mean under the table's regions' dirs. Truncate and preserve region boundaries option -- Key: HBASE-5525 URL: https://issues.apache.org/jira/browse/HBASE-5525 Project: HBase Issue Type: New Feature Reporter: Jean-Daniel Cryans Fix For: 0.96.0 A tool that would be useful for testing (and maybe in prod too) would be a truncate option to keep the current region boundaries. Right now what you have to do is completely kill the table and recreate it with the correct regions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
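A sketch of what such a boundary-preserving truncate could look like against the client API of that era (HBaseAdmin/HTable). Only the pure helper below is runnable; the admin calls are shown as comments because they need a live cluster, and every name other than the HBase ones is made up for illustration.

```java
import java.util.Arrays;

// Hypothetical helper for a "truncate but keep boundaries" tool. The only real
// logic is turning the table's region start keys into the split-key array that
// HBaseAdmin.createTable(descriptor, splits) expects: the first region's start
// key is the empty byte[] and must be dropped.
public class TruncatePreserveSketch {
  static byte[][] startKeysToSplitKeys(byte[][] startKeys) {
    // startKeys[0] is always empty; the remaining keys are the split points.
    return Arrays.copyOfRange(startKeys, 1, startKeys.length);
  }

  // The surrounding (untested) cluster calls would look roughly like:
  //   byte[][] splits = startKeysToSplitKeys(table.getStartKeys());
  //   admin.disableTable(name);
  //   admin.deleteTable(name);
  //   admin.createTable(descriptor, splits);

  public static void main(String[] args) {
    byte[][] startKeys = { new byte[0], "f".getBytes(), "m".getBytes(), "t".getBytes() };
    byte[][] splits = startKeysToSplitKeys(startKeys);
    System.out.println(splits.length + " split keys for " + startKeys.length + " regions");
  }
}
```

Four regions yield three split keys, so recreating the table with those splits restores the original boundaries.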
[jira] [Commented] (HBASE-5325) Expose basic information about the master-status through jmx beans
[ https://issues.apache.org/jira/browse/HBASE-5325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13225794#comment-13225794 ] Jean-Daniel Cryans commented on HBASE-5325: --- Just a note here, this patch added yet another service to our mbeans. Before we had HBase and RegionServer, now this patch adds org.apache.hbase. Expose basic information about the master-status through jmx beans --- Key: HBASE-5325 URL: https://issues.apache.org/jira/browse/HBASE-5325 Project: HBase Issue Type: Improvement Reporter: Hitesh Shah Assignee: Hitesh Shah Priority: Minor Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5325.1.patch, HBASE-5325.2.patch, HBASE-5325.3.branch-0.92.patch, HBASE-5325.3.patch, HBASE-5325.wip.patch Similar to the Namenode and Jobtracker, it would be good if the hbase master could expose some information through mbeans. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5537) MXBean shouldn't have a dependence on InterfaceStability until 0.96
[ https://issues.apache.org/jira/browse/HBASE-5537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13224577#comment-13224577 ] Jean-Daniel Cryans commented on HBASE-5537: --- +1 MXBean shouldn't have a dependence on InterfaceStability until 0.96 --- Key: HBASE-5537 URL: https://issues.apache.org/jira/browse/HBASE-5537 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: stack Fix For: 0.92.1, 0.94.0 Attachments: 5537.txt HBASE-5325 has a dependence on InterfaceStability.Evolving in 0.92 and it shouldn't have it until 0.96. One problem it currently causes is that 0.92 can't be compiled against CDH3u3. Assigning to Stack. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4890) fix possible NPE in HConnectionManager
[ https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13221358#comment-13221358 ] Jean-Daniel Cryans commented on HBASE-4890: --- Sorry, I got distracted (I even forgot about this issue) so nothing new. fix possible NPE in HConnectionManager -- Key: HBASE-4890 URL: https://issues.apache.org/jira/browse/HBASE-4890 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.92.1 I was running YCSB against a 0.92 branch and encountered this error message: {code} 11/11/29 08:47:16 WARN client.HConnectionManager$HConnectionImplementation: Failed all from region=usertable,user3917479014967760871,1322555655231.f78d161e5724495a9723bcd972f97f41., hostname=c0316.hal.cloudera.com, port=57020 java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.NullPointerException at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222) at java.util.concurrent.FutureTask.get(FutureTask.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1501) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1353) at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:898) at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:775) at org.apache.hadoop.hbase.client.HTable.put(HTable.java:750) at com.yahoo.ycsb.db.HBaseClient.update(Unknown Source) at com.yahoo.ycsb.DBWrapper.update(Unknown Source) at com.yahoo.ycsb.workloads.CoreWorkload.doTransactionUpdate(Unknown Source) at com.yahoo.ycsb.workloads.CoreWorkload.doTransaction(Unknown Source) at com.yahoo.ycsb.ClientThread.run(Unknown Source) Caused by: java.lang.RuntimeException: java.lang.NullPointerException at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1315) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1327) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1325) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:619) Caused by: java.lang.NullPointerException at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:158) at $Proxy4.multi(Unknown Source) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1330) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1328) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1309) ... 7 more {code} It looks like the NPE is caused by server being null in the MultiResponse call() method. {code}
public MultiResponse call() throws IOException {
  return getRegionServerWithoutRetries(
    new ServerCallable<MultiResponse>(connection, tableName, null) {
      public MultiResponse call() throws IOException {
        return server.multi(multi);
      }
      @Override
      public void connect(boolean reload) throws IOException {
        server = connection.getHRegionConnection(loc.getHostname(), loc.getPort());
      }
    });
}
{code} -- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
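A toy model of the quoted callable pattern helps show where the NPE comes from and one defensive option: connect() is supposed to set `server` before call() uses it, so a connect() that silently leaves it null turns the next call into an NPE. All names here (NullServerGuard, Server, callWithGuard) are invented for illustration; this is not the HBase code path itself.

```java
import java.io.IOException;

// Minimal model of the ServerCallable pattern quoted above. A guard after
// connect() converts a null 'server' into a retriable IOException instead of
// letting call() throw a NullPointerException deep inside the RPC layer.
public class NullServerGuard {
  interface Server { String multi(); }

  static abstract class Callable<T> {
    Server server;
    abstract void connect(boolean reload) throws IOException;
    abstract T call() throws IOException;
  }

  static <T> T callWithGuard(Callable<T> c) throws IOException {
    c.connect(false);
    if (c.server == null) {
      // Fail fast with something a retry loop can handle, instead of an NPE.
      throw new IOException("connect() left server null");
    }
    return c.call();
  }

  public static void main(String[] args) throws IOException {
    Callable<String> ok = new Callable<String>() {
      void connect(boolean reload) { server = () -> "ok"; }
      String call() { return server.multi(); }
    };
    System.out.println(callWithGuard(ok));

    Callable<String> bad = new Callable<String>() {
      void connect(boolean reload) { /* lost the region location; server stays null */ }
      String call() { return server.multi(); } // would NPE without the guard
    };
    try {
      callWithGuard(bad);
    } catch (IOException expected) {
      System.out.println("retriable: " + expected.getMessage());
    }
  }
}
```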
[jira] [Commented] (HBASE-4890) fix possible NPE in HConnectionManager
[ https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216515#comment-13216515 ] Jean-Daniel Cryans commented on HBASE-4890: --- Very unlikely, 5336's NPE is coming from Hadoop land instead of our IPC and to me it looks like that file was already closed. fix possible NPE in HConnectionManager -- Key: HBASE-4890 URL: https://issues.apache.org/jira/browse/HBASE-4890 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.92.1
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215798#comment-13215798 ] Jean-Daniel Cryans commented on HBASE-4365: --- FWIW running a 5TB upload took 18h. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Labels: usability Fix For: 0.94.0 Attachments: 4365-v2.txt, 4365-v3.txt, 4365-v4.txt, 4365-v5.txt, 4365.txt A few of us were brainstorming this morning about what the default region size should be. There were a few general points made: - in some ways it's better to be too-large than too-small, since you can always split a table further, but you can't merge regions currently - with HFile v2 and multithreaded compactions there are fewer reasons to avoid very-large regions (10GB+) - for small tables you may want a small region size just so you can distribute load better across a cluster - for big tables, multi-GB is probably best -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
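The "too-large vs. too-small" trade-off in the brainstorm above suggests a split threshold that grows with the number of regions: start splitting near the flush size so a new table fans out quickly, and approach the configured maximum as regions accumulate. The sketch below uses a quadratic growth factor as an assumption for illustration, not necessarily what the committed patch implements.

```java
// Sketch of a "grow the split size with the region count" heuristic. The
// quadratic factor is an illustrative assumption.
public class SplitSizeHeuristic {
  static long effectiveSplitSize(long flushSize, long maxFileSize, int regionCount) {
    if (regionCount <= 0) regionCount = 1;
    long grown = flushSize * (long) regionCount * (long) regionCount;
    return Math.min(maxFileSize, grown);
  }

  public static void main(String[] args) {
    long flush = 128L << 20; // 128MB flush size
    long max = 20L << 30;    // 20GB configured split size
    // The first region splits at ~128MB instead of waiting for 20GB...
    System.out.println(effectiveSplitSize(flush, max, 1));
    // ...and by 16 regions the threshold is already capped at the 20GB max.
    System.out.println(effectiveSplitSize(flush, max, 16));
  }
}
```

This is why such a heuristic "jumpstarts" an import from a single region while leaving steady-state behavior governed by the configured maximum.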
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215803#comment-13215803 ] Jean-Daniel Cryans commented on HBASE-4365: --- Oh and no concurrent mode failures, as I don't use dumb configurations. Also my ZK timeout is set to 20s. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Assignee: stack Priority: Critical Labels: usability Fix For: 0.94.0 Attachments: 4365-v2.txt, 4365-v3.txt, 4365-v4.txt, 4365-v5.txt, 4365.txt
[jira] [Commented] (HBASE-4890) fix possible NPE in HConnectionManager
[ https://issues.apache.org/jira/browse/HBASE-4890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13216196#comment-13216196 ] Jean-Daniel Cryans commented on HBASE-4890: --- More progress, we're setting a null exception on the call in this code: {code}
protected void cleanupCalls(long rpcTimeout) {
  Iterator<Entry<Integer, Call>> itor = calls.entrySet().iterator();
  while (itor.hasNext()) {
    Call c = itor.next().getValue();
    long waitTime = System.currentTimeMillis() - c.getStartTime();
    if (waitTime >= rpcTimeout) {
      c.setException(closeException); // local exception
      synchronized (c) {
        c.notifyAll();
      }
{code} Now adding some debugging in there (printing a WARN and doing a continue instead of setting the exception), the call never gets a SocketTimeoutException set like it's supposed to. It's just hanging around... fix possible NPE in HConnectionManager -- Key: HBASE-4890 URL: https://issues.apache.org/jira/browse/HBASE-4890 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jonathan Hsieh Priority: Blocker Fix For: 0.92.1
[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload
[ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214944#comment-13214944 ] Jean-Daniel Cryans commented on HBASE-5349: --- @Mubarak We already have that through HeapSize, it's really just a matter of knowing what to auto-tune and when. Automagically tweak global memstore and block cache sizes based on workload --- Key: HBASE-5349 URL: https://issues.apache.org/jira/browse/HBASE-5349 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.94.0 Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores) and Block Cache based on the workload. If you need an image, scroll down at the bottom of this link: http://www.hypertable.com/documentation/architecture/ That'd be one less thing to configure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
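The Hypertable-style behavior described here amounts to moving heap between two consumers whose fractions always sum to a constant. A toy sketch under assumed bounds (TOTAL, MIN, and the linear policy are all invented for illustration; as the comment says, the real question is what signal to auto-tune on and when):

```java
// Toy sketch of workload-driven retuning: the memstore and block cache
// fractions sum to a fixed total, and the split between them follows the
// observed write share. All constants are illustrative assumptions.
public class HeapTuner {
  static final double TOTAL = 0.60; // combined fraction of heap (assumption)
  static final double MIN = 0.10;   // floor so neither consumer starves

  /** writeShare in [0,1]: fraction of recent operations that were writes. */
  static double[] retune(double writeShare) {
    writeShare = Math.max(0.0, Math.min(1.0, writeShare));
    double memstore = MIN + (TOTAL - 2 * MIN) * writeShare;
    return new double[] { memstore, TOTAL - memstore };
  }

  public static void main(String[] args) {
    double[] readHeavy = retune(0.1);  // mostly reads: favor the block cache
    double[] writeHeavy = retune(0.9); // mostly writes: favor the memstores
    System.out.printf("reads:  memstore=%.2f blockCache=%.2f%n", readHeavy[0], readHeavy[1]);
    System.out.printf("writes: memstore=%.2f blockCache=%.2f%n", writeHeavy[0], writeHeavy[1]);
  }
}
```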
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214984#comment-13214984 ] Jean-Daniel Cryans commented on HBASE-3134: --- Patch looks good. Was it tested outside of unit tests? [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134-v3.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215080#comment-13215080 ] Jean-Daniel Cryans commented on HBASE-4365: --- Conclusion for the 1TB upload:
Flush size: 512MB
Split size: 20GB
Without patch: 18012s
With patch: 12505s
It's 1.44x better, so a huge improvement. The difference here is due to the fact that it takes an awfully long time to split the first few regions without the patch. In the past I was starting the test with a smaller split size and then once I got a good distribution I was doing an online alter to set it to 20GB. Not anymore with this patch :) Another observation: the upload in general is slowed down by too many store files blocking. I could trace this to compactions taking a long time to get rid of reference files (3.5GB taking more than 10 minutes) and during that time you can hit the block multiple times. We really ought to see how we can optimize the compactions, consider compacting those big files in many threads instead of only one, and enable referencing reference files to skip some compactions altogether. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt
[jira] [Commented] (HBASE-5415) FSTableDescriptors should handle random folders in hbase.root.dir better
[ https://issues.apache.org/jira/browse/HBASE-5415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215175#comment-13215175 ] Jean-Daniel Cryans commented on HBASE-5415: --- bq. Whats difference between miscellaneous dirs under hbase.rootdir and an actual table directory that is missing its .tableinfo file? Former's HTD is null, latter gets a FNFE. bq. We're changing our API when we remove TEE from public methods? Technically no, TEE (and FNFE FWIW) are both IOEs so there's no change there. I removed TEE specifically because it isn't thrown anymore. FSTableDescriptors should handle random folders in hbase.root.dir better Key: HBASE-5415 URL: https://issues.apache.org/jira/browse/HBASE-5415 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Critical Fix For: 0.92.1, 0.94.0 Attachments: HBASE-5415.patch I faked an upgrade on a test cluster using our dev data so I had to distcp the data between the two clusters, but after starting up and doing the migration and whatnot the web UI didn't show any table. 
The reason was in the master's log: {quote} org.apache.hadoop.hbase.TableExistsException: No descriptor for _distcp_logs_e0ehek at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164) at org.apache.hadoop.hbase.util.FSTableDescriptors.getAll(FSTableDescriptors.java:182) at org.apache.hadoop.hbase.master.HMaster.getHTableDescriptors(HMaster.java:1554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364) at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1326) {quote} I don't think we need to show a full stack (just a WARN maybe), this shouldn't kill the request (still see tables in the web UI), and why is that a TableExistsException? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13215226#comment-13215226 ] Jean-Daniel Cryans commented on HBASE-4365: --- bq. One question I had: Did you observe write blocking - due to the number of store files - more frequently than without the patch (because with the patch we tend to get more store-files). It does happen a lot more in the beginning; growing out of the first few regions is really hard. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214122#comment-13214122 ] Jean-Daniel Cryans commented on HBASE-4365: --- The latest patch is looking good on my test cluster, will let the import finish before giving my +1 tho. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214134#comment-13214134 ] Jean-Daniel Cryans commented on HBASE-4365: --- bq. Got any comparison numbers for total import time, for say 100G load? Not yet, but I can definitely see that it jumpstarts the import. bq. Would be good to know that the new heuristic is definitely advantageous. It is, I don't need numbers to tell you that. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt
[jira] [Commented] (HBASE-4365) Add a decent heuristic for region size
[ https://issues.apache.org/jira/browse/HBASE-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13214226#comment-13214226 ] Jean-Daniel Cryans commented on HBASE-4365: --- Alright, I did a macrotest with 100GB. Configuration: good old 15-machine test cluster (1 master), 2x quad-core, 14GB given to HBase, 4x SATA. The table is configured to flush at 256MB and split at 2GB. 40 clients that use a 12MB buffer, collocated on the RS. Higher threshold for compactions.
Without patch: 1558s
With patch: 1457s
A 1.07x improvement. What I saw is that once we've split a few times and the load got balanced, the performance is exactly the same. That's expected. It also seems that my split-after-flush patch goes into full effect. I'm running another experiment right now uploading 1TB with flush set at 512MB and split at 20GB; I expect an even bigger difference. The reason to use 20GB is that with bigger data sets you need bigger regions, and starting such a load from scratch is currently horrible, but that is what this jira is about. Add a decent heuristic for region size -- Key: HBASE-4365 URL: https://issues.apache.org/jira/browse/HBASE-4365 Project: HBase Issue Type: Improvement Affects Versions: 0.92.1, 0.94.0 Reporter: Todd Lipcon Priority: Critical Labels: usability Attachments: 4365-v2.txt, 4365.txt
[jira] [Commented] (HBASE-5402) PerformanceEvaluation creates the wrong number of rows in randomWrite
[ https://issues.apache.org/jira/browse/HBASE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209531#comment-13209531 ] Jean-Daniel Cryans commented on HBASE-5402: --- bq. The problem is that the resulting table is what is used in any subsequent scan tests using PE, which are then double reading some rows, rather than reading every row once Following this logic, how would a random read test work with keys that are UUIDs? You'll have to be lucky to get a couple of hits :) bq. This is counter intuitive, and also introduces the possibility of cache hits, which I think is not what is expected by users doing a scan test. Considering that blocks are 64KB and rows are ~1.5KB (keys+value), cache hits is going to happen no matter what. PerformanceEvaluation creates the wrong number of rows in randomWrite - Key: HBASE-5402 URL: https://issues.apache.org/jira/browse/HBASE-5402 Project: HBase Issue Type: Bug Components: test Reporter: Oliver Meyn The command line 'hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 10' should result in a table with 10 * (1024 * 1024) rows (so 10485760). Instead what happens is that the randomWrite job reports writing that many rows (exactly) but running rowcounter against the table reveals only e.g 6549899 rows. A second attempt to build the table produced slightly different results (e.g. 6627689). I see a similar discrepancy when using 50 instead of 10 clients (~35% smaller than expected). Further experimentation reveals that the problem is key collision - by removing the % totalRows in getRandomRow I saw a reduction in collisions (table was ~8M rows instead of 6.6M). Replacing the random row key with UUIDs instead of Integers solved the problem and produced exactly 10485760 rows. But that makes the key size 16 bytes instead of the current 10, so I'm not sure that's an acceptable solution. 
Here's the UUID code I used: public static byte[] format(final UUID uuid) { long msb = uuid.getMostSignificantBits(); long lsb = uuid.getLeastSignificantBits(); byte[] buffer = new byte[16]; for (int i = 0; i < 8; i++) { buffer[i] = (byte) (msb >>> 8 * (7 - i)); } for (int i = 8; i < 16; i++) { buffer[i] = (byte) (lsb >>> 8 * (7 - i)); } return buffer; } which is invoked within getRandomRow with return format(UUID.randomUUID());
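The ~6.5M rows left out of 10485760 writes line up with the standard expectation for this kind of collision: drawing n keys uniformly from a space of n values leaves about n * (1 - 1/e), i.e. roughly 63.2% distinct, and 0.632 * 10485760 is about 6.63M. A small sketch of that arithmetic (illustrative only, not PerformanceEvaluation code; the class and method names are made up):

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class CollisionSketch {
    // Mimics getRandomRow's "random int % totalRows" key choice and counts
    // how many distinct keys survive after totalRows writes.
    static int distinctKeys(int totalRows, long seed) {
        Random rnd = new Random(seed);
        Set<Integer> distinct = new HashSet<>();
        for (int i = 0; i < totalRows; i++) {
            distinct.add(rnd.nextInt(totalRows));
        }
        return distinct.size();
    }

    public static void main(String[] args) {
        int n = 1_000_000; // scaled down from 10485760 to keep the run short
        // expected distinct count: n * (1 - 1/e), about 63.2% of n
        System.out.printf("distinct=%d expected=%.0f%n",
                distinctKeys(n, 42L), n * (1 - 1 / Math.E));
    }
}
```

The simulated distinct fraction lands within a fraction of a percent of 63.2%, matching the ~6.55M/6.63M rowcounter results reported in the description.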
[jira] [Commented] (HBASE-5402) PerformanceEvaluation creates the wrong number of rows in randomWrite
[ https://issues.apache.org/jira/browse/HBASE-5402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208618#comment-13208618 ] Jean-Daniel Cryans commented on HBASE-5402: --- I don't see this as an issue, it does create the right amount of rows if you consider counting versions. PerformanceEvaluation creates the wrong number of rows in randomWrite - Key: HBASE-5402 URL: https://issues.apache.org/jira/browse/HBASE-5402 Project: HBase Issue Type: Bug Components: test Reporter: Oliver Meyn The command line 'hbase org.apache.hadoop.hbase.PerformanceEvaluation randomWrite 10' should result in a table with 10 * (1024 * 1024) rows (so 10485760). Instead what happens is that the randomWrite job reports writing that many rows (exactly) but running rowcounter against the table reveals only e.g 6549899 rows. A second attempt to build the table produced slightly different results (e.g. 6627689). I see a similar discrepancy when using 50 instead of 10 clients (~35% smaller than expected). Further experimentation reveals that the problem is key collision - by removing the % totalRows in getRandomRow I saw a reduction in collisions (table was ~8M rows instead of 6.6M). Replacing the random row key with UUIDs instead of Integers solved the problem and produced exactly 10485760 rows. But that makes the key size 16 bytes instead of the current 10, so I'm not sure that's an acceptable solution. Here's the UUID code I used: public static byte[] format(final UUID uuid) { long msb = uuid.getMostSignificantBits(); long lsb = uuid.getLeastSignificantBits(); byte[] buffer = new byte[16]; for (int i = 0; i < 8; i++) { buffer[i] = (byte) (msb >>> 8 * (7 - i)); } for (int i = 8; i < 16; i++) { buffer[i] = (byte) (lsb >>> 8 * (7 - i)); } return buffer; } which is invoked within getRandomRow with return format(UUID.randomUUID());
[jira] [Commented] (HBASE-5407) Show the per-region level request count in the web ui
[ https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208736#comment-13208736 ] Jean-Daniel Cryans commented on HBASE-5407: --- Isn't that already in 0.92.0? Here's what it looks like: {quote} numberOfStores=1, numberOfStorefiles=1, storefileUncompressedSizeMB=20984, storefileSizeMB=20307, compressionRatio=0.9677, memstoreSizeMB=0, storefileIndexSizeMB=0, readRequestsCount=0, writeRequestsCount=0, rootIndexSizeKB=13, totalStaticIndexSizeKB=18788, totalStaticBloomSizeKB=23296, totalCompactingKVs=40636900, currentCompactedKVs=20887752, compactionProgressPct=0.0, coprocessors=[] {quote} Show the per-region level request count in the web ui - Key: HBASE-5407 URL: https://issues.apache.org/jira/browse/HBASE-5407 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang It would be nice to show the per-region level request count in the web ui, especially when debugging the hot region problem. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5407) Show the per-region level request/sec count in the web ui
[ https://issues.apache.org/jira/browse/HBASE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13208961#comment-13208961 ] Jean-Daniel Cryans commented on HBASE-5407: --- It's the total number. Show the per-region level request/sec count in the web ui - Key: HBASE-5407 URL: https://issues.apache.org/jira/browse/HBASE-5407 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang It would be nice to show the per-region level request/sec count in the web ui, especially when debugging the hot region problem.
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13207099#comment-13207099 ] Jean-Daniel Cryans commented on HBASE-3134: --- bq. I think it is better to remove the peerStateTrackers and manage peers with one HashMap(the peerClusters) by moving the PeerStateTracker to ReplicationPeer Yeah it'd be worth trying. bq. The sourceEnabled and ReplicationPeer.peerEnabled seem to have a same role. Can the sourceEnable be removed? Yeah sourceEnabled was a placeholder, if you think it's better in ReplicationPeer then go for it. Just don't forget to do the check in there to verify what's in ZK. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5161) Compaction algorithm should prioritize reference files
[ https://issues.apache.org/jira/browse/HBASE-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13206330#comment-13206330 ] Jean-Daniel Cryans commented on HBASE-5161: --- Just got this again running a 5TB upload, started seeing regions of 50GB that couldn't split. Compaction algorithm should prioritize reference files -- Key: HBASE-5161 URL: https://issues.apache.org/jira/browse/HBASE-5161 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Priority: Minor Fix For: 0.94.0, 0.92.1 I got myself into a state where my table was un-splittable as long as the insert load was coming in. Emergency flushes because of the low memory barrier don't check the number of store files so it never blocks, to a point where I had in one case 45 store files and the compactions were almost never done on the reference files (had 15 of them, went down by one in 20 minutes). Since you can't split regions with reference files, that region couldn't split and was doomed to just get more store files until the load stopped. Marking this as a minor issue, what we really need is a better pushback mechanism but not prioritizing reference files seems wrong. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5381) Make memstore.flush.size as a table level configuration
[ https://issues.apache.org/jira/browse/HBASE-5381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205617#comment-13205617 ] Jean-Daniel Cryans commented on HBASE-5381: --- It already is, see MEMSTORE_FLUSHSIZE in the shell or HTD.setMemStoreFlushSize(). Am I missing something? Make memstore.flush.size as a table level configuration --- Key: HBASE-5381 URL: https://issues.apache.org/jira/browse/HBASE-5381 Project: HBase Issue Type: Improvement Reporter: Liyin Tang Assignee: Liyin Tang Currently the region server will flush mem store of the region based on the limitation of the global mem store flush size and global low water mark. However, It will cause the hot tables, which serve more write traffic, to flush too frequently even though the overall mem store heap usage is quite low. Too frequently flush would also contribute to too many minor compactions. So if we can make memstore.flush.size as a table level configuration, it would be more flexible to config different tables with different desired mem store flush size based on compaction ratio, recovery time and put ops. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5382) Test that we always cache index and bloom blocks
[ https://issues.apache.org/jira/browse/HBASE-5382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205652#comment-13205652 ] Jean-Daniel Cryans commented on HBASE-5382: --- +1 Test that we always cache index and bloom blocks Key: HBASE-5382 URL: https://issues.apache.org/jira/browse/HBASE-5382 Project: HBase Issue Type: Test Reporter: Mikhail Bautin Assignee: Mikhail Bautin Attachments: TestForceCacheImportantBlocks-2012-02-10_11_07_15.patch This is a unit test that should have been part of HBASE-4683 but was not committed. The original test was reviewed as part of https://reviews.facebook.net/D807. Submitting unit test as a separate JIRA and patch, and extending the scope of the test to also handle the case when block cache is enabled for the column family. The new review is at https://reviews.facebook.net/D1695. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13205764#comment-13205764 ] Jean-Daniel Cryans commented on HBASE-3134: --- bq. Its usage is in line with that of peerClusters. Since the key is peer id and peer should be registered first, I don't see a problem here. It should be changed too, originally it wasn't used by multiple threads because there was a maximum of one peer. Just to be safe. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: 3134-v2.txt, 3134.txt, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and when want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params
[ https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204701#comment-13204701 ] Jean-Daniel Cryans commented on HBASE-2375: --- A bunch of things changed since this jira was created: - we now split based on the store size - regions split at 1GB - memstores flush at 128MB - there's been a lot of work on tuning the store file selection algorithm My understanding of this jira is that it aims at making the out of the box mass import experience better. Now that we have bulk loads and pre-splitting this use case is becoming less and less important... although we still see people trying to benchmark it (hi hypertable). I see three things we could do: - Trigger splits after flushes, I hacked a patch and it works awesomely - Have a lower split size for newly created tables. Hypertable does this with a soft limit that gets doubled every time the table splits until it reaches the normal split size - Have multi-way splits (Todd's idea), so that if you have enough data that you know you're going to be splitting after the current split then just spawn as many daughters as you need. I'm planning on just fixing the first bullet point in the context of this jira. Maybe there's another stuff from the patch in this jira that we could fit in. Make decision to split based on aggregate size of all StoreFiles and revisit related config params -- Key: HBASE-2375 URL: https://issues.apache.org/jira/browse/HBASE-2375 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.20.3 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Labels: moved_from_0_20_5 Attachments: HBASE-2375-v8.patch Currently we will make the decision to split a region when a single StoreFile in a single family exceeds the maximum region size. 
This issue is about changing the decision to split to be based on the aggregate size of all StoreFiles in a single family (but still not aggregating across families). This would move a check to split after flushes rather than after compactions. This issue should also deal with revisiting our default values for some related configuration parameters. The motivating factor for this change comes from watching the behavior of RegionServers during heavy write scenarios. Today the default behavior goes like this: - We fill up regions, and as long as you are not under global RS heap pressure, you will write out 64MB (hbase.hregion.memstore.flush.size) StoreFiles. - After we get 3 StoreFiles (hbase.hstore.compactionThreshold) we trigger a compaction on this region. - Compaction queues notwithstanding, this will create a 192MB file, not triggering a split based on max region size (hbase.hregion.max.filesize). - You'll then flush two more 64MB MemStores and hit the compactionThreshold and trigger a compaction. - You end up with 192 + 64 + 64 in a single compaction. This will create a single 320MB and will trigger a split. - While you are performing the compaction (which now writes out 64MB more than the split size, so is about 5X slower than the time it takes to do a single flush), you are still taking on additional writes into MemStore. - Compaction finishes, decision to split is made, region is closed. The region now has to flush whichever edits made it to MemStore while the compaction ran. This flushing, in our tests, is by far the dominating factor in how long data is unavailable during a split. We measured about 1 second to do the region closing, master assignment, reopening. Flushing could take 5-6 seconds, during which time the region is unavailable. - The daughter regions re-open on the same RS. Immediately when the StoreFiles are opened, a compaction is triggered across all of their StoreFiles because they contain references. 
Since we cannot currently split a split, we need to not hang on to these references for long. This described behavior is really bad because of how often we have to rewrite data onto HDFS. Imports are usually just IO bound as the RS waits to flush and compact. In the above example, the first cell to be inserted into this region ends up being written to HDFS 4 times (initial flush, first compaction w/ no split decision, second compaction w/ split decision, third compaction on daughter region). In addition, we leave a large window where we take on edits (during the second compaction of 320MB) and then must make the region unavailable as we flush it. If we increased the compactionThreshold to be 5 and determined splits based on aggregate size, the behavior becomes: - We
[jira] [Commented] (HBASE-2375) Make decision to split based on aggregate size of all StoreFiles and revisit related config params
[ https://issues.apache.org/jira/browse/HBASE-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204797#comment-13204797 ] Jean-Daniel Cryans commented on HBASE-2375: --- bq. Doing first bullet point only sounds good. Lets file issues for the split other suggestions. Kewl. bq. Upping compactionThreshold from 3 to 5 where 5 is than the number of flushes it would take to make us splittable; i.e. the intent is no compaction before first split. Sounds like a change that can have a bigger impact but that mostly helps this specific use case... bq. Instead up the compactionThreshold and down the default regionsize from 1G to 512M and keep flush at 128M? I'd rather split earlier for the first regions. Make decision to split based on aggregate size of all StoreFiles and revisit related config params -- Key: HBASE-2375 URL: https://issues.apache.org/jira/browse/HBASE-2375 Project: HBase Issue Type: Improvement Components: regionserver Affects Versions: 0.20.3 Reporter: Jonathan Gray Assignee: Jonathan Gray Priority: Critical Labels: moved_from_0_20_5 Attachments: HBASE-2375-v8.patch Currently we will make the decision to split a region when a single StoreFile in a single family exceeds the maximum region size. This issue is about changing the decision to split to be based on the aggregate size of all StoreFiles in a single family (but still not aggregating across families). This would move a check to split after flushes rather than after compactions. This issue should also deal with revisiting our default values for some related configuration parameters. The motivating factor for this change comes from watching the behavior of RegionServers during heavy write scenarios. Today the default behavior goes like this: - We fill up regions, and as long as you are not under global RS heap pressure, you will write out 64MB (hbase.hregion.memstore.flush.size) StoreFiles. 
- After we get 3 StoreFiles (hbase.hstore.compactionThreshold) we trigger a compaction on this region. - Compaction queues notwithstanding, this will create a 192MB file, not triggering a split based on max region size (hbase.hregion.max.filesize). - You'll then flush two more 64MB MemStores and hit the compactionThreshold and trigger a compaction. - You end up with 192 + 64 + 64 in a single compaction. This will create a single 320MB and will trigger a split. - While you are performing the compaction (which now writes out 64MB more than the split size, so is about 5X slower than the time it takes to do a single flush), you are still taking on additional writes into MemStore. - Compaction finishes, decision to split is made, region is closed. The region now has to flush whichever edits made it to MemStore while the compaction ran. This flushing, in our tests, is by far the dominating factor in how long data is unavailable during a split. We measured about 1 second to do the region closing, master assignment, reopening. Flushing could take 5-6 seconds, during which time the region is unavailable. - The daughter regions re-open on the same RS. Immediately when the StoreFiles are opened, a compaction is triggered across all of their StoreFiles because they contain references. Since we cannot currently split a split, we need to not hang on to these references for long. This described behavior is really bad because of how often we have to rewrite data onto HDFS. Imports are usually just IO bound as the RS waits to flush and compact. In the above example, the first cell to be inserted into this region ends up being written to HDFS 4 times (initial flush, first compaction w/ no split decision, second compaction w/ split decision, third compaction on daughter region). In addition, we leave a large window where we take on edits (during the second compaction of 320MB) and then must make the region unavailable as we flush it. 
If we increased the compactionThreshold to be 5 and determined splits based on aggregate size, the behavior becomes: - We fill up regions, and as long as you are not under global RS heap pressure, you will write out 64MB (hbase.hregion.memstore.flush.size) StoreFiles. - After each MemStore flush, we calculate the aggregate size of all StoreFiles. We can also check the compactionThreshold. For the first three flushes, both would not hit the limit. On the fourth flush, we would see total aggregate size = 256MB and determine to make a split. - Decision to split is made, region is closed. This time, the region just has to flush out whichever edits made it to the MemStore during the snapshot/flush
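The "soft limit" option from the second bullet of J-D's comment (a small split size for newly created tables that doubles on each split until it reaches the normal size, as Hypertable does) can be sketched as follows. This is an illustrative sketch, not HBase's actual split policy; the names and the 128MB/1GB constants are assumptions taken from the defaults discussed in this thread:

```java
public class SoftSplitLimit {
    // Hypertable-style soft limit: start at initialSize and double after
    // every split until the configured maximum is reached.
    static long effectiveSplitSize(long initialSize, long maxSize, int splitsSoFar) {
        long size = initialSize;
        for (int i = 0; i < splitsSoFar && size < maxSize; i++) {
            size *= 2;
        }
        return Math.min(size, maxSize);
    }

    public static void main(String[] args) {
        long flushSize = 128L << 20; // 128MB memstore flush size
        long maxSize = 1L << 30;     // 1GB normal split size
        for (int splits = 0; splits <= 4; splits++) {
            System.out.printf("after %d splits: split at %dMB%n",
                    splits, effectiveSplitSize(flushSize, maxSize, splits) >> 20);
        }
        // prints 128, 256, 512, 1024, 1024 -- growth stops at maxSize
    }
}
```

A fresh table under heavy import would then split after its very first flush and fan out across the cluster quickly, while settled tables keep the normal 1GB behavior.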
[jira] [Commented] (HBASE-5349) Automagically tweak global memstore and block cache sizes based on workload
[ https://issues.apache.org/jira/browse/HBASE-5349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13204087#comment-13204087 ] Jean-Daniel Cryans commented on HBASE-5349: --- Good question, I don't think looking at requests is good enough... instead we could look at how both are used and if there's adjustment to be made. For example, if you have a read heavy workload then the memstores would not see a lot of usage... same with write heavy, the block cache would be close to empty. Those two are clear cuts, now for those workloads in between it gets a bit harder. Maybe at first we shouldn't even try to optimize them. I think it should also be done incrementally, move like 3-5% of the heap from one place to the other every few minutes until it settles. Automagically tweak global memstore and block cache sizes based on workload --- Key: HBASE-5349 URL: https://issues.apache.org/jira/browse/HBASE-5349 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.94.0 Hypertable does a neat thing where it changes the size given to the CellCache (our MemStores) and Block Cache based on the workload. If you need an image, scroll down at the bottom of this link: http://www.hypertable.com/documentation/architecture/ That'd be one less thing to configure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
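The incremental adjustment J-D describes (shift 3-5% of the heap between memstores and block cache every few minutes based on observed usage, and leave mixed workloads alone) might look roughly like this. Everything here is an assumption for illustration; the class, thresholds, and step size are invented, not HBase code:

```java
public class HeapTuner {
    double memstoreFraction = 0.40;   // share of heap given to memstores
    double blockCacheFraction = 0.40; // share of heap given to the block cache
    static final double STEP = 0.05;  // move 5% of the heap per adjustment
    static final double FLOOR = 0.10; // never shrink either side below 10%

    // memstoreUsed and blockCacheUsed are utilization ratios in [0, 1],
    // sampled every few minutes.
    void adjust(double memstoreUsed, double blockCacheUsed) {
        if (memstoreUsed - blockCacheUsed > 0.25 && blockCacheFraction - STEP >= FLOOR) {
            blockCacheFraction -= STEP;  // write-heavy: cache is mostly idle,
            memstoreFraction += STEP;    // so hand its heap to the memstores
        } else if (blockCacheUsed - memstoreUsed > 0.25 && memstoreFraction - STEP >= FLOOR) {
            memstoreFraction -= STEP;    // read-heavy: the opposite move
            blockCacheFraction += STEP;
        }
        // mixed workloads fall through untouched: per the comment above,
        // don't even try to optimize those at first
    }
}
```

Repeated small steps let the split settle gradually instead of oscillating, which is the point of moving only a few percent per period.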
[jira] [Commented] (HBASE-4072) zoo.cfg inconsistently used and sometimes we use non-zk names for zk attributes
[ https://issues.apache.org/jira/browse/HBASE-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13202558#comment-13202558 ] Jean-Daniel Cryans commented on HBASE-4072: --- Personally that wouldn't bother me, but maybe it will break a few installs. zoo.cfg inconsistently used and sometimes we use non-zk names for zk attributes --- Key: HBASE-4072 URL: https://issues.apache.org/jira/browse/HBASE-4072 Project: HBase Issue Type: Bug Reporter: stack This issue was found by Lars: http://search-hadoop.com/m/n04sthNcji2/zoo.cfg+vs+hbase-site.xmlsubj=Re+zoo+cfg+vs+hbase+site+xml Lets fix the inconsistency found and fix the places where we use non-zk attribute name for a zk attribute in hbase (There's only a few places that I remember -- maximum client connections is one IIRC) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-3134) [replication] Add the ability to enable/disable streams
[ https://issues.apache.org/jira/browse/HBASE-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13201862#comment-13201862 ] Jean-Daniel Cryans commented on HBASE-3134: --- We can't hit ZK every time we replicate in order to see what the state is; each RS instead should have a watcher and the check should be done locally. The rest looks good, thanks a lot for working on this. [replication] Add the ability to enable/disable streams --- Key: HBASE-3134 URL: https://issues.apache.org/jira/browse/HBASE-3134 Project: HBase Issue Type: New Feature Components: replication Reporter: Jean-Daniel Cryans Assignee: Teruyoshi Zenmyo Priority: Minor Labels: replication Fix For: 0.94.0 Attachments: HBASE-3134.patch, HBASE-3134.patch, HBASE-3134.patch This jira was initially in the scope of HBASE-2201, but was pushed out since it has low value compared to the required effort (and we want to ship 0.90.0 rather soonish). We need to design a way to enable/disable replication streams in a determinate fashion.
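The pattern asked for in that comment -- watch the peer-state znode once and answer the hot-path question from a local flag -- can be sketched like this. The ZooKeeper watcher wiring is elided and the names are illustrative, not the actual ReplicationPeer code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CachedPeerState {
    private final AtomicBoolean enabled = new AtomicBoolean(true);

    // Invoked from the ZK watcher callback when the peer-state znode
    // changes; the watcher re-registers itself after each event.
    void onZnodeChanged(boolean newState) {
        enabled.set(newState);
    }

    // Invoked on the replication hot path for every batch of edits:
    // a local volatile read, no ZooKeeper round trip.
    boolean isEnabled() {
        return enabled.get();
    }
}
```

Since ZooKeeper watches fire once and must be re-set, the callback side would also re-read the znode state when re-registering, so the cached flag cannot silently go stale.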
[jira] [Commented] (HBASE-5333) Introduce Memstore backpressure for writes
[ https://issues.apache.org/jira/browse/HBASE-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13201880#comment-13201880 ] Jean-Daniel Cryans commented on HBASE-5333: --- I've done some brainstorming with Stack and the result was HBASE-5162. Introduce Memstore backpressure for writes Key: HBASE-5333 URL: https://issues.apache.org/jira/browse/HBASE-5333 Project: HBase Issue Type: Improvement Reporter: Lars Hofhansl Currently if the memstore/flush/compaction cannot keep up with the write load, we block writers up to hbase.hstore.blockingWaitTime milliseconds (default is 90000). Would be nice if there was a concept of a soft backpressure that slows writing clients gracefully *before* we reach this condition. From the log: 2012-02-04 00:00:06,963 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region table,,1328313512779.c2761757621ddf8fb78baf5288d71271. has too many store files; delaying flush up to 90000ms
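A "soft" version of that backpressure could ramp a small per-write delay up as the store-file count approaches the hard blocking threshold, rather than jumping from no delay straight to a full block. This is a hypothetical sketch of the idea; the formula, names, and the 1000ms cap are invented, not what HBASE-5162 implements:

```java
public class SoftBackpressure {
    // Returns how long (in ms) a client write should be delayed given the
    // current store-file count: zero below softLimit, a quadratic ramp in
    // between, and the maximum delay once the hard blocking limit is hit.
    static long writeDelayMillis(int storeFiles, int softLimit, int hardLimit) {
        if (storeFiles <= softLimit) return 0;      // flushes are keeping up
        if (storeFiles >= hardLimit) return 1000;   // about to block outright
        double over = (double) (storeFiles - softLimit) / (hardLimit - softLimit);
        return (long) (1000 * over * over);         // grows gently, then steeply
    }

    public static void main(String[] args) {
        for (int files = 5; files <= 12; files++) {
            System.out.printf("%d store files -> %dms delay%n",
                    files, writeDelayMillis(files, 7, 12));
        }
    }
}
```

The quadratic ramp keeps well-behaved clients nearly unaffected while leaning progressively harder on writers that are outrunning compactions.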
[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default
[ https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13201884#comment-13201884 ] Jean-Daniel Cryans commented on HBASE-5267: --- I'm +1 with the patch, but eventually I'd still like to see something in the book about it. Add a configuration to disable the slab cache by default Key: HBASE-5267 URL: https://issues.apache.org/jira/browse/HBASE-5267 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Li Pi Priority: Blocker Fix For: 0.94.0, 0.92.1 Attachments: 5267.txt, 5267v2.txt, 5267v3.txt From what I commented at the tail of HBASE-4027: {quote} I changed the release note, the patch doesn't have a hbase.offheapcachesize configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize (which is actually a big problem when you consider this: http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). {quote} We need to add hbase.offheapcachesize and set it to false by default. Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default
[ https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13201909#comment-13201909 ] Jean-Daniel Cryans commented on HBASE-5267: --- Pointing to it might be a good option, along with a line or two on how to use it. Add a configuration to disable the slab cache by default Key: HBASE-5267 URL: https://issues.apache.org/jira/browse/HBASE-5267 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Li Pi Priority: Blocker Fix For: 0.94.0, 0.92.1 Attachments: 5267.txt, 5267v2.txt, 5267v3.txt From what I commented at the tail of HBASE-4027: {quote} I changed the release note, the patch doesn't have a hbase.offheapcachesize configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize (which is actually a big problem when you consider this: http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). {quote} We need to add hbase.offheapcachesize and set it to false by default. Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default
[ https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192304#comment-13192304 ] Jean-Daniel Cryans commented on HBASE-5267: --- I'd like to see this configuration added to hbase-default.xml with some documentation, also it should be documented in the book. hbase-env.sh also needs some fixin: bq. # Set hbase.offheapcachesize in hbase-site.xml Add a configuration to disable the slab cache by default Key: HBASE-5267 URL: https://issues.apache.org/jira/browse/HBASE-5267 Project: HBase Issue Type: Bug Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Assignee: Li Pi Priority: Blocker Fix For: 0.94.0, 0.92.1 Attachments: 5267.txt From what I commented at the tail of HBASE-4027: {quote} I changed the release note, the patch doesn't have a hbase.offheapcachesize configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize (which is actually a big problem when you consider this: http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). {quote} We need to add hbase.offheapcachesize and set it to false by default. Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
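The comments above converge on a documented switch in hbase-default.xml. A sketch of what such an entry could look like, using the property name from this discussion (the description wording is illustrative, not necessarily what shipped):

```xml
<!-- Sketch only: property name taken from the discussion above,
     description text is illustrative. -->
<property>
  <name>hbase.offheapcachesize</name>
  <value>false</value>
  <description>
    Enables the off-heap slab cache. Disabled by default so that setting
    -XX:MaxDirectMemorySize no longer turns the cache on as a side effect.
  </description>
</property>
```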
[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.90
[ https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189244#comment-13189244 ] Jean-Daniel Cryans commented on HBASE-5231: --- I'm -1, let's keep 0.90 stable and not add new behaviors in late releases. Also HBASE-3373 wasn't even applied to 0.92 and has no assignee. Backport HBASE-3373 (per-table load balancing) to 0.90 -- Key: HBASE-5231 URL: https://issues.apache.org/jira/browse/HBASE-5231 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu This JIRA backports per-table load balancing to 0.90 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92
[ https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189255#comment-13189255 ] Jean-Daniel Cryans commented on HBASE-5231: --- I'm -0 on 0.92. Backport HBASE-3373 (per-table load balancing) to 0.92 -- Key: HBASE-5231 URL: https://issues.apache.org/jira/browse/HBASE-5231 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu This JIRA backports per-table load balancing to 0.90 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92
[ https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189294#comment-13189294 ] Jean-Daniel Cryans commented on HBASE-5231: --- bq. If user has to use script to achieve the above, we should implement it in core. I think you already did. Backport HBASE-3373 (per-table load balancing) to 0.92 -- Key: HBASE-5231 URL: https://issues.apache.org/jira/browse/HBASE-5231 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu This JIRA backports per-table load balancing to 0.90 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92
[ https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189349#comment-13189349 ] Jean-Daniel Cryans commented on HBASE-5231: --- {quote} Some of my work in load balancer was checked into 0.92 half a year ago. I wish customers can use the basic feature early. {quote} We've been testing 0.92 for about that time, this release has way too much load. bq. HBASE-3373 being in TRUNK wouldn't mean much to majority of users. The way it works is if you can get a +1 and no -1 then you can commit, my current -0 won't block you from backporting. Backport HBASE-3373 (per-table load balancing) to 0.92 -- Key: HBASE-5231 URL: https://issues.apache.org/jira/browse/HBASE-5231 Project: HBase Issue Type: Improvement Reporter: Zhihong Yu This JIRA backports per-table load balancing to 0.90 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5207) Apply HBASE-5155 to trunk
[ https://issues.apache.org/jira/browse/HBASE-5207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13187156#comment-13187156 ] Jean-Daniel Cryans commented on HBASE-5207: --- Collision with HBASE-5206? Apply HBASE-5155 to trunk -- Key: HBASE-5207 URL: https://issues.apache.org/jira/browse/HBASE-5207 Project: HBase Issue Type: Bug Reporter: ramkrishna.s.vasudevan The issue HBASE-5155 has been fixed on branch(0.90). The same has to be applied on trunk also. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor
[ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185697#comment-13185697 ] Jean-Daniel Cryans commented on HBASE-5174: --- bq. Failed or aborted tasks should not be displayed after the retry is succeeded. Otherwise, will it cause confusion? I'd rather want to know that something went wrong, and since it's ordered by time you can see that it eventually succeeds. Coalesce aborted tasks in the TaskMonitor - Key: HBASE-5174 URL: https://issues.apache.org/jira/browse/HBASE-5174 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.94.0, 0.92.1 Some tasks can get repeatedly canceled like flushing when splitting is going on, in the logs it looks like this: {noformat} 2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=1.6g 2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. 
due to global heap pressure 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=1.6g 2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false {noformat} But in the TaskMonitor UI you'll get MAX_TASKS (1000) displayed on top of the regions. Basically 1000x: {noformat} Tue Jan 10 19:28:29 UTC 2012 Flushing test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. ABORTED (since 31sec ago) Not flushing since writes not enabled (since 31sec ago) {noformat} It's ugly and I'm sure some users will freak out seeing this, plus you have to scroll down all the way to see your regions. Coalescing consecutive aborted tasks seems like a good solution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
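The coalescing idea at the end of the description can be sketched as follows. This is an illustration with made-up names, not HBase's actual TaskMonitor code: consecutive aborted tasks with the same status string are merged into one entry with a repeat counter, so 1000 identical aborts show up as a single row.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative stand-in for the TaskMonitor's task list (hypothetical names):
// consecutive aborted tasks with the same status collapse into one entry,
// and the entry's timestamp is refreshed on every merge.
public class CoalescingTaskList {
  static final class Entry {
    final String status;
    int repeats = 1;
    long lastSeenMs;
    Entry(String status, long nowMs) { this.status = status; this.lastSeenMs = nowMs; }
  }

  private final Deque<Entry> tasks = new ArrayDeque<>();

  public void addAborted(String status, long nowMs) {
    Entry tail = tasks.peekLast();
    if (tail != null && tail.status.equals(status)) {
      tail.repeats++;          // coalesce instead of appending a new row
      tail.lastSeenMs = nowMs; // reset the timer so the entry sticks around
    } else {
      tasks.addLast(new Entry(status, nowMs));
    }
  }

  public int size() { return tasks.size(); }

  public static void main(String[] args) {
    CoalescingTaskList monitor = new CoalescingTaskList();
    for (int i = 0; i < 1000; i++) {
      monitor.addAborted("Flushing test1 ABORTED: writes not enabled", i);
    }
    // 1000 identical aborted flush tasks collapse to a single display row
    System.out.println(monitor.size());
  }
}
```

Resetting the timestamp on each merge also keeps a frequently-repeating aborted task visible longer, which fits the later suggestion that failed tasks should stay on screen for a while.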
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185909#comment-13185909 ] Jean-Daniel Cryans commented on HBASE-5190: --- About #3. I like the elegance of that solution since we don't have to keep track of the calls in flight, but I see 2 big issues: - if you set a max call size you need to keep both clients and servers in sync and also decide who's going to do the check. - if you plan for big calls by default, you may end up with a tiny size for the queue. For example, let's say you cap calls at 10% of the heap and set their max individual size at 10MB, it means that you can only allow 10 items in the queue (and you don't account for listeners). Limit the IPC queue size based on calls' payload size - Key: HBASE-5190 URL: https://issues.apache.org/jira/browse/HBASE-5190 Project: HBase Issue Type: Improvement Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Fix For: 0.94.0 Currently we limit the number of calls in the IPC queue only on their count. It used to be really high and was dropped down recently to num_handlers * 10 (so 100 by default) because it was easy to OOME yourself when huge calls were being queued. It's still possible to hit this problem if you use really big values and/or a lot of handlers, so the idea is that we should take into account the payload size. I can see 3 solutions: - Do the accounting outside of the queue itself for all calls coming in and out and when a call doesn't fit, throw a retryable exception. - Same accounting but instead block the call when it comes in until space is made available. - Add a new parameter for the maximum size (in bytes) of a Call and then set the size of the IPC queue (in terms of the number of items) so that it could only contain as many items as some predefined maximum size (in bytes) for the whole queue. -- This message is automatically generated by JIRA.
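Of the three solutions in the description, the second (do the byte accounting and block the caller until space frees up) can be sketched with a semaphore guarding a plain queue. A minimal illustration, not the attached patch:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.function.ToIntFunction;

// Sketch of option 2 above: do the byte accounting outside the queue with a
// semaphore, so a producer blocks until enough payload bytes are free instead
// of the queue being bounded only by item count.
public class ByteBoundedCallQueue<T> {
  private final LinkedBlockingQueue<T> calls = new LinkedBlockingQueue<>();
  private final Semaphore bytes;        // permits == free payload bytes
  private final ToIntFunction<T> sizer; // hypothetical payload-size function

  public ByteBoundedCallQueue(int maxBytes, ToIntFunction<T> sizer) {
    this.bytes = new Semaphore(maxBytes);
    this.sizer = sizer;
  }

  public void put(T call) throws InterruptedException {
    bytes.acquire(sizer.applyAsInt(call)); // blocks until the call fits
    calls.put(call);
  }

  public T take() throws InterruptedException {
    T call = calls.take();
    bytes.release(sizer.applyAsInt(call)); // free the space on dequeue
    return call;
  }

  public int availableBytes() { return bytes.availablePermits(); }

  public static void main(String[] args) throws InterruptedException {
    // a 100-byte queue holding byte[] payloads sized by their length
    ByteBoundedCallQueue<byte[]> q = new ByteBoundedCallQueue<>(100, b -> b.length);
    q.put(new byte[60]);
    q.put(new byte[30]);
    System.out.println(q.availableBytes()); // 10 free bytes: a big call would now block
    q.take();                               // a handler drains the 60-byte call
    System.out.println(q.availableBytes()); // 70 free bytes again
  }
}
```

Note that with an unfair semaphore a large call can wait indefinitely behind a stream of small ones, which echoes the starvation concern raised against option 1.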
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185943#comment-13185943 ] Jean-Daniel Cryans commented on HBASE-5190: --- About #1. The issue with this is that by throwing the call back to the client you generate a lot more IO and CPU (serializing, deserializing) and there's the possibility of starving those clients that have bigger calls. Limit the IPC queue size based on calls' payload size - Key: HBASE-5190 URL: https://issues.apache.org/jira/browse/HBASE-5190 Project: HBase Issue Type: Improvement Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Fix For: 0.94.0 Currently we limit the number of calls in the IPC queue only on their count. It used to be really high and was dropped down recently to num_handlers * 10 (so 100 by default) because it was easy to OOME yourself when huge calls were being queued. It's still possible to hit this problem if you use really big values and/or a lot of handlers, so the idea is that we should take into account the payload size. I can see 3 solutions: - Do the accounting outside of the queue itself for all calls coming in and out and when a call doesn't fit, throw a retryable exception. - Same accounting but instead block the call when it comes in until space is made available. - Add a new parameter for the maximum size (in bytes) of a Call and then set the size of the IPC queue (in terms of the number of items) so that it could only contain as many items as some predefined maximum size (in bytes) for the whole queue. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5190) Limit the IPC queue size based on calls' payload size
[ https://issues.apache.org/jira/browse/HBASE-5190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186064#comment-13186064 ] Jean-Daniel Cryans commented on HBASE-5190: --- bq. The TODO above can be dropped My understanding of that todo is that it shouldn't be setting as many PRI as there are normal handlers, so it still stands. bq. Todd's comment makes sense I agree. bq. But since it requires more work, we can checkin this patch for now. I disagree, I'll investigate his idea and I'm in no hurry to check this in since it's targeted for 0.94. Limit the IPC queue size based on calls' payload size - Key: HBASE-5190 URL: https://issues.apache.org/jira/browse/HBASE-5190 Project: HBase Issue Type: Improvement Affects Versions: 0.90.5 Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Fix For: 0.94.0 Attachments: HBASE-5190.patch Currently we limit the number of calls in the IPC queue only on their count. It used to be really high and was dropped down recently to num_handlers * 10 (so 100 by default) because it was easy to OOME yourself when huge calls were being queued. It's still possible to hit this problem if you use really big values and/or a lot of handlers, so the idea is that we should take into account the payload size. I can see 3 solutions: - Do the accounting outside of the queue itself for all calls coming in and out and when a call doesn't fit, throw a retryable exception. - Same accounting but instead block the call when it comes in until space is made available. - Add a new parameter for the maximum size (in bytes) of a Call and then set the size of the IPC queue (in terms of the number of items) so that it could only contain as many items as some predefined maximum size (in bytes) for the whole queue. -- This message is automatically generated by JIRA.
[jira] [Commented] (HBASE-5174) Coalesce aborted tasks in the TaskMonitor
[ https://issues.apache.org/jira/browse/HBASE-5174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185237#comment-13185237 ] Jean-Daniel Cryans commented on HBASE-5174: --- bq. so perhaps failed or aborted tasks could remain displayed for a longer period of time. Agreed, and if for each time you coalesce tasks together you reset the timer then it could stick around for a while. Coalesce aborted tasks in the TaskMonitor - Key: HBASE-5174 URL: https://issues.apache.org/jira/browse/HBASE-5174 Project: HBase Issue Type: Improvement Affects Versions: 0.92.0 Reporter: Jean-Daniel Cryans Fix For: 0.94.0, 0.92.1 Some tasks can get repeatedly canceled like flushing when splitting is going on, in the logs it looks like this: {noformat} 2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=1.6g 2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. due to global heap pressure 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush thread woke up because memory above low water=1.6g 2012-01-10 19:28:29,164 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Flush of region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. 
due to global heap pressure 2012-01-10 19:28:29,164 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: NOT flushing memstore for region test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c., flushing=false, writesEnabled=false {noformat} But in the TaskMonitor UI you'll get MAX_TASKS (1000) displayed on top of the regions. Basically 1000x: {noformat} Tue Jan 10 19:28:29 UTC 2012 Flushing test1,,1326223218996.3eea0d89af7b851c3a9b4246389a4f2c. ABORTED (since 31sec ago) Not flushing since writes not enabled (since 31sec ago) {noformat} It's ugly and I'm sure some users will freak out seeing this, plus you have to scroll down all the way to see your regions. Coalescing consecutive aborted tasks seems like a good solution. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4224) Need a flush by regionserver rather than by table option
[ https://issues.apache.org/jira/browse/HBASE-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185244#comment-13185244 ] Jean-Daniel Cryans commented on HBASE-4224: --- Not a fan of the patch, the first problem I see is that flushing a list of regions and keeping the handler open will trigger socket exceptions too easily when you have more than a handful of regions. Also it adds a lot of code. Unfortunately I don't have a better option in mind and the original problem seems sketchy to me. There's no easy way to tell when all flushes are done (as shown by the acrobatics of the latest patch) and currently flushes are done in sync with the client call. At this point I'm of the opinion that we should just: - add an async call to flush (that does what compact currently does). - add the functionality in HBA to request the flushing of all regions on one region server. It would call the RS x times and those calls would be quick since it just queues the flush request. You can then ask for the flush queue size through JMX (and I'm sure there are other means) so that when you are close to one you call for the log roll. This last step could also be manual or scripted. Need a flush by regionserver rather than by table option Key: HBASE-4224 URL: https://issues.apache.org/jira/browse/HBASE-4224 Project: HBase Issue Type: Bug Components: shell Reporter: stack Assignee: Akash Ashok Attachments: HBase-4224-v2.patch, HBase-4224.patch This evening needed to clean out logs on the cluster. logs are by regionserver. to let go of logs, we need to have all edits emptied from memory. only flush is by table or region. We need to be able to flush the regionserver. Need to add this. -- This message is automatically generated by JIRA. 
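The async-flush proposal above (queue the flush request, return immediately, let the operator poll the queue size before rolling logs) can be sketched like this. Names are hypothetical, not HBase's actual flush path:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch of the proposal: the region server side just queues
// flush requests and returns immediately (like the existing compact request
// path), and the caller watches the queue size (via JMX in the proposal) to
// decide when it is safe to roll the logs.
public class AsyncFlushQueue {
  private final Queue<String> flushQueue = new ConcurrentLinkedQueue<>();

  // async RPC handler: enqueue and return right away, so each of the
  // per-region calls from HBA stays quick
  public void requestFlush(String regionName) { flushQueue.add(regionName); }

  // the proposal exposes this through JMX; a plain getter here
  public int getFlushQueueSize() { return flushQueue.size(); }

  // a background flush thread would drain the queue; simulated here
  public String pollNext() { return flushQueue.poll(); }

  public static void main(String[] args) {
    AsyncFlushQueue rs = new AsyncFlushQueue();
    rs.requestFlush("TestTable,region-a"); // one quick call per region
    rs.requestFlush("TestTable,region-b");
    System.out.println(rs.getFlushQueueSize()); // 2 pending flushes
    while (rs.pollNext() != null) { /* flush happens here */ }
    System.out.println(rs.getFlushQueueSize()); // 0: safe to request the log roll
  }
}
```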
[jira] [Commented] (HBASE-3692) Handle RejectedExecutionException in HTable
[ https://issues.apache.org/jira/browse/HBASE-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185306#comment-13185306 ] Jean-Daniel Cryans commented on HBASE-3692: --- @Andy, I tried the attached test (which gave me a hard time with mvn 3.0), by adding some debugging in HTable I can see that HBasePersistenceHandler repeatedly closes and reuses the same HTables. That close() method got beefed up it seems during our 0.90 point releases and it now closes the TPE which generates the rejected execution you see. It also closes the connection to HBase. The fix would be to not reuse the HTables that are closed. To add to this jira, we should handle RejectedExecutionException when it comes from a closed HTable. Handle RejectedExecutionException in HTable --- Key: HBASE-3692 URL: https://issues.apache.org/jira/browse/HBASE-3692 Project: HBase Issue Type: Improvement Affects Versions: 0.90.1 Reporter: Jean-Daniel Cryans Attachments: test_datanucleus.zip A user on IRC yesterday had an issue with RejectedExecutionException coming out of HTable sometimes. Apart from being very confusing to the user as it comes with no message at all, it exposes the HTable internals. I think we should handle it and instead throw something like DontUseHTableInMultipleThreadsException or something more clever. In his case, the user had a HTable leak with the pool that he was able to figure out once I told him what to look for. It could be an unchecked exception and we could consider adding in 0.90 but marking for 0.92 at the moment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
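The suggestion at the end of the comment (handle RejectedExecutionException when it comes from a closed HTable) amounts to translating the pool's bare rejection into a clearer error. A sketch with illustrative names, not HBase's actual HTable internals:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Sketch of the suggested fix (hypothetical names): once close() has shut the
// table's thread pool down, further use fails with a clear message instead of
// surfacing the pool's bare RejectedExecutionException.
public class PooledTable implements AutoCloseable {
  private final ExecutorService pool = Executors.newFixedThreadPool(2);
  private volatile boolean closed = false;

  public Future<?> submitBatch(Runnable batch) {
    if (closed) {
      throw new IllegalStateException("table is closed; do not reuse closed instances");
    }
    try {
      return pool.submit(batch);
    } catch (RejectedExecutionException ree) {
      // close() raced with the submit: same answer, clearer exception
      throw new IllegalStateException("table is closed", ree);
    }
  }

  @Override
  public void close() {
    closed = true;
    pool.shutdown();
  }

  public static void main(String[] args) {
    PooledTable table = new PooledTable();
    table.submitBatch(() -> {});
    table.close();
    try {
      table.submitBatch(() -> {}); // reuse after close, as in the attached test
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```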
[jira] [Commented] (HBASE-5120) Timeout monitor races with table disable handler
[ https://issues.apache.org/jira/browse/HBASE-5120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185369#comment-13185369 ] Jean-Daniel Cryans commented on HBASE-5120: --- Testing the patch with a low timeout, I can answer the question in the code that asks "We don't abort if the delete node returns false. Is there any such corner case?" and yes, here it is: {noformat} 2012-01-13 00:53:39,053 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. state=PENDING_CLOSE, ts=1326415997208, server=null 2012-01-13 00:53:39,053 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. 2012-01-13 00:53:39,053 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. (offlining) 2012-01-13 00:53:39,254 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. which is already PENDING_CLOSE but forcing to send a CLOSE RPC again 2012-01-13 00:53:39,255 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: update TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. state=PENDING_CLOSE, ts=1326416019254, server=null the timestamp. 2012-01-13 00:53:39,256 INFO org.apache.hadoop.hbase.master.AssignmentManager: Server sv4r12s38,62023,1326415651391 returned org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: org.apache.hadoop.hbase.regionserver.RegionAlreadyInTransitionException: Received:CLOSE for the region:TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. ,which we are already trying to CLOSE.
for 0784c045e00205949461cb21b8f4cd6a 2012-01-13 00:54:09,051 INFO org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed out: TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. state=PENDING_CLOSE, ts=1326416019256, server=null 2012-01-13 00:54:09,051 INFO org.apache.hadoop.hbase.master.AssignmentManager: Region has been PENDING_CLOSE for too long, running forced unassign again on region=TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. 2012-01-13 00:54:09,051 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. (offlining) 2012-01-13 00:54:09,126 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Attempting to unassign region TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a. which is already PENDING_CLOSE but forcing to send a CLOSE RPC again 2012-01-13 00:54:09,127 INFO org.apache.hadoop.hbase.master.AssignmentManager: While trying to recover the table TestTable to DISABLED state the region {NAME = 'TestTable,0006605550,1326415764458.0784c045e00205949461cb21b8f4cd6a.', STARTKEY = '0006605550', ENDKEY = '0006616035', ENCODED = 0784c045e00205949461cb21b8f4cd6a,} was offlined but the table was in DISABLING state 2012-01-13 00:54:09,127 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db06d62 Deleting existing unassigned node for 0784c045e00205949461cb21b8f4cd6a that is in expected state M_ZK_REGION_CLOSING 2012-01-13 00:54:09,128 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db06d62 Attempting to delete unassigned node 0784c045e00205949461cb21b8f4cd6a in M_ZK_REGION_CLOSING state but node is in RS_ZK_REGION_CLOSED state 2012-01-13 00:54:09,128 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db06d62 Deleting existing unassigned node for 0784c045e00205949461cb21b8f4cd6a that is in expected state 
RS_ZK_REGION_CLOSED 2012-01-13 00:54:09,140 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Handling transition=RS_ZK_REGION_CLOSED, server=sv4r12s38,62023,1326415651391, region=0784c045e00205949461cb21b8f4cd6a 2012-01-13 00:54:09,140 WARN org.apache.hadoop.hbase.master.AssignmentManager: Received CLOSED for region 0784c045e00205949461cb21b8f4cd6a from server sv4r12s38,62023,1326415651391 but region was in the state null and not in expected PENDING_CLOSE or CLOSING states 2012-01-13 00:54:09,148 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: master:62003-0x134589d3db06d62 Successfully deleted unassigned node for region 0784c045e00205949461cb21b8f4cd6a in expected state RS_ZK_REGION_CLOSED 2012-01-13 00:54:09,148 ERROR org.apache.hadoop.hbase.master.AssignmentManager: The deletion of the CLOSED node for the region 0784c045e00205949461cb21b8f4cd6a returned true 2012-01-13 00:54:09,148 INFO