[jira] [Created] (HBASE-8423) Allow Major Compaction to Use Different Compression

2013-04-24 Thread Nicolas Spiegelberg (JIRA)
Nicolas Spiegelberg created HBASE-8423: -- Summary: Allow Major Compaction to Use Different Compression Key: HBASE-8423 URL: https://issues.apache.org/jira/browse/HBASE-8423 Project: HBase

[jira] [Created] (HBASE-8155) Allow Record Compression for HLogs

2013-03-20 Thread Nicolas Spiegelberg (JIRA)
Nicolas Spiegelberg created HBASE-8155: -- Summary: Allow Record Compression for HLogs Key: HBASE-8155 URL: https://issues.apache.org/jira/browse/HBASE-8155 Project: HBase Issue Type: New

Re: Buzzwords 2012

2012-04-10 Thread Nicolas Spiegelberg
I will be attending Berlin Buzzwords. I have a presentation entitled Multi-tenant HBase Solutions at Facebook that I will present there. On 4/10/12 4:35 AM, Lars George lars.geo...@gmail.com wrote: Hi, Any of you going to Buzzwords this year? Just curious. Lars

Re: Pull instant schema updating out?

2012-04-03 Thread Nicolas Spiegelberg
We're using a variant of the Online schema update in our 89 production. There are significant differences because of the master rewrite, so I can't speak about the stability on trunk. We don't run with online splitting, so that's also a large variant from the use cases some of you are trying to

Re: Arcanist + HBase

2012-03-22 Thread Nicolas Spiegelberg
Jonathan, please read HBASE-4896 for help documentation. It's the #2 search result when I searched for 'hbase arc'. It answers both issues you filed On 3/22/12 1:45 AM, Jonathan Hsieh j...@cloudera.com wrote: I wanted to try the arcanist 'arc lint' mechanism but need some help getting started.

Re: DISCUSS: Have hbase require at least hadoop 1.0.0 in hbase 0.96.0?

2012-03-02 Thread Nicolas Spiegelberg
I'm wondering why HDFS security support should be mandatory? Append makes sense because there's no way to have a durable system without it. Security is currently an optional feature implemented as an HBase co-processor (vs core), correct? Is there a problem (other than minor inconvenience) with

Re: manual trigger of major compaction does not seems to work on 0.92

2012-02-24 Thread Nicolas Spiegelberg
In 0.92, we can handle multiple compactions but we wait on major compact if minor compactions are ongoing for the same store. It should be enqueued and work after the minor compaction finishes, it's just not immediately enqueued. See HBASE-5330 for some discussion on this. There should be a log

Re: LIRS cache as an alternative to LRU cache

2012-02-21 Thread Nicolas Spiegelberg
We had the author of LIRS come to Facebook last year to talk about his algorithm and general benefits. At the time, we were looking at increasing block cache efficiency. The general consensus was that it wasn't an exponential perf gain, so we could get bigger wins from cache-on-write

Re: LIRS cache as an alternative to LRU cache

2012-02-21 Thread Nicolas Spiegelberg
: vrodio...@carrieriq.com From: Nicolas Spiegelberg [nspiegelb...@fb.com] Sent: Tuesday, February 21, 2012 9:01 AM To: dev@hbase.apache.org Subject: Re: LIRS cache as an alternative to LRU cache We had the author of LIRS come to Facebook last year to talk about

Re: LIRS cache as an alternative to LRU cache

2012-02-21 Thread Nicolas Spiegelberg
: If it was ARC (which uses both LRU and LFU) we'd have patenting issues with IBM, what we have is closer to a 2Q: http://www.vldb.org/conf/1994/P439.PDF J-D On Tue, Feb 21, 2012 at 9:19 AM, Nicolas Spiegelberg nspiegelb...@fb.com wrote: Vlad, You're correct. The existing algorithm

Re: Major Compaction Concerns

2012-02-19 Thread Nicolas Spiegelberg
1. my CF were already working with BF they used ROWCOL, (i didn't pay attention to that at the time i wrote my answers) 2. I see from the logs that the BF is already 100% - is it bad? should I add more memory for BF? Since Bloom Filters are a probabilistic optimization, it's kinda hard to analyze

Re: svn commit: r1239930 - in /hbase/branches/0.89-fb/src: main/java/org/apache/hadoop/hbase/master/SplitLogManager.java test/java/org/apache/hadoop/hbase/master/TestSplitLogManager.java

2012-02-03 Thread Nicolas Spiegelberg
@Stack: this is an 89-master change. It should have that tag. I'll look to figure out why the precommit hook isn't working properly. On 2/2/12 8:59 PM, Stack st...@duboce.net wrote: Mikhail, the below doesn't have the hbase jira in it? St.Ack On Thu, Feb 2, 2012 at 3:30 PM,

Re: hbase 0.94.0

2012-02-02 Thread Nicolas Spiegelberg
I think HFile upgrade in particular is more complicated than you think. We currently have production traffic running with HFileV1. It has a 5-min SLA. We can't afford to take the entire downtime to rewrite 100GB (or whatever) worth of data. We need to do this while the cluster is live. AFAIK

Re: hbase 0.94.0

2012-01-31 Thread Nicolas Spiegelberg
1) The trunk has more master work. Specifically 89-master related features that we implemented for our grid architecture. You have a list of issues Nicolas? We have a list, but it's non-trivial. You're more than welcome to help if you want. :) 2) Client/Server compatibility. Even if we

Re: hbase 0.94.0

2012-01-30 Thread Nicolas Spiegelberg
Currently, we at Facebook are developing mostly on 89 but also doing some significant exploratory work on trunk. I think that most of our development will continue to be done on 89 in the near future. We plan to release some lower-risk projects on 94. However, we won't even entertain a

Re: hbase 0.94.0

2012-01-30 Thread Nicolas Spiegelberg
I think it's also good to make a crucial distinction here: it's not like Facebook has a concrete wide-scale upgrade strategy, these 3 points are deal-breakers that won't allow us to even entertain an upgrade strategy. This is just as critical as HDFS data loss was in 0.90: it's something we can't

Re: hbase 0.94.0

2012-01-30 Thread Nicolas Spiegelberg
Traditionally there have been no guarantees of cross major version compatibility. RPC especially. Never a rolling upgrade from an 0.90 to an 0.92, for example. For persistent data, there is a migration utility for upping from one major release to the next. I'm advocating that RPC compatibility

Re: hbase 0.94.0

2012-01-30 Thread Nicolas Spiegelberg
1) The trunk has more master work. Specifically 89-master related features that we implemented for our grid architecture. You have a list of issues Nicolas? It's at least 3 major features. We're more than willing to work with you on porting. It sounds like eBay & Cloudera have some possible

Re: Changing our logo purple to International Orange (Engineering)

2012-01-11 Thread Nicolas Spiegelberg
+1 It's much more visible and bold. I like the Golden Gate reference... On 1/11/12 8:42 AM, Stack st...@duboce.net wrote: I attached to https://issues.apache.org/jira/browse/HBASE-5115 what it looks like. International Orange (Engineering) is the color of the golden gate [1], and is a strong

Re: Major Compaction Concerns

2012-01-09 Thread Nicolas Spiegelberg
Mikael, Hi, I wrote the current compaction algorithm, so I should be able to answer most questions that you have about the feature. It sounds like you're creating quite a task list of work to do, but I don't understand what your use case is so a lot of that work may not be critical and you

Re: Code review request for hbase-4120 table priority

2011-12-12 Thread Nicolas Spiegelberg
I would really like to see the isolation feature in 0.94+, so I intend to work with Jia and, if there is a successful result, support it going forward in a manner like Stargate, even though I may not run it personally (like I don't run Stargate). I take an expansive view of what open source

Re: Rename book to manual, redux

2011-12-05 Thread Nicolas Spiegelberg
+1 On 12/5/11 4:28 PM, Doug Meil doug.m...@explorysmedical.com wrote: Hi folks- Sorry to reopen this, but hopefully only briefly. I'm fine with renaming it something else not book, but in taking a few samples of others would anybody object to it being called Reference Guide? Reference Manual

Re: SequenceFileLogReader uses a reflection hack resulting in runtime failures

2011-12-02 Thread Nicolas Spiegelberg
I think it's good to remove the reflection when we can, more because it's easier to catch compile-time errors than run-time. The perf is negligible when you cache. As I recall, the problem here is the function was private in older versions. We just need to make sure that we don't support

Re: patch maturity and HBase release Was: HBASE-4120 table level priority

2011-11-02 Thread Nicolas Spiegelberg
I agree with Todd's sentiments on unnecessarily coupling feature priority with the availability of a patch. There's patches that we've developed internally, then threw away because we couldn't tie it to a production use case or thought a better design might be necessary. There's also patches

Commit Log format

2011-11-01 Thread Nicolas Spiegelberg
Committer questions. I accidentally checked in some changes to HBase trunk with the Phabricator checkin format. I used svn propedit to correct the log comments, so everything should look correct. However, this raised a couple questions about committing code to HBase. 1. I know the required

Re: Commit Log format

2011-11-01 Thread Nicolas Spiegelberg
To kick a dead horse, coupling the code diff with the CHANGES.txt alteration is also a massive PITA because it's much harder to run 'git cherry-pick' across branches with it. It's far easier to directly port a change from 92 to trunk to 89-fb if I don't have that pesky little file in the way. On

0.89 Branch

2011-10-11 Thread Nicolas Spiegelberg
I've just created a new branch, 0.89. It has the current state of our largest HBase deploy, Facebook Messaging. 0.89 was branched from HBase trunk back in 2010 before the release of HBase 0.90.0 so it's pretty old. The current HBase trunk has most of what's in 0.89 (and more) except for some

Re: 0.89 Branch

2011-10-11 Thread Nicolas Spiegelberg
, Nicolas Spiegelberg nspiegelb...@fb.com wrote: I've just created a new branch, 0.89. It has the current state of our largest HBase deploy, Facebook Messaging. 0.89 was branched from HBase trunk back in 2010 before the release of HBase 0.90.0 so it's pretty old. The current HBase trunk has most

HBase Hackathon: August 22 @ Facebook

2011-08-08 Thread Nicolas Spiegelberg
/28754491/ Focus is 0.92 stabilization. I don't think we can avoid wanting to talk about 0.94 features, though :P Thanks, Nicolas Spiegelberg

Proposal: HBase Hackathon Aug 22 @ FB

2011-08-04 Thread Nicolas Spiegelberg
Chatted on IRC with Stack and a couple others. We would like to host a Hackathon + Meetup on August 22 at Facebook in Palo Alto. All welcome. Unless objection, I'll put up a posting on meetup.com. The hacking theme would be HBase 0.92 stabilization. I'm sure 0.94 feature discussion will also

Re: Proposal: HBase Hackathon Aug 22 @ FB

2011-08-04 Thread Nicolas Spiegelberg
...@gmail.com wrote: I was thinking about this too. Is it possible to choose a day other than Monday (we have team meeting)? Thanks On Thu, Aug 4, 2011 at 1:50 PM, Nicolas Spiegelberg nspiegelb...@fb.com wrote: Chatted on IRC with Stack and a couple others. We would like to host a Hackathon + Meetup

Re: [DISCUSSION] Move hfile to v2 for 0.92 rather than 0.94?

2011-07-27 Thread Nicolas Spiegelberg
+1 for 0.92. This has received a lot of testing and code review internally. We have already found/fixed a number of advanced bugs before even submitting the initial patch. Very high quality. On Jul 26, 2011, at 7:18 PM, Stack st...@duboce.net wrote: Shall we commit the patch at

Re: Interesting YCBS Benchmark

2011-02-11 Thread Nicolas Spiegelberg
These slides were presented at FOSDEM, the conference where I just spoke about FB HBase. I went to the talk. There were some glaringly obvious problems that came out during the Q&A. Someone asked how many regions he had. Didn't know. Are you network or IO or CPU bottlenecked? Didn't know.

Re: Overhead of Bloomfilters

2011-01-25 Thread Nicolas Spiegelberg
A great article for Bloom Filter rules of thumb: http://corte.si/posts/code/bloom-filter-rules-of-thumb/ Note that only rules #1 & #2 apply for our use case. Rule #3, while true, isn't as big a worry because we use combinatorial generation for hashes, so the number of 'expensive' hash
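
A minimal sketch of what "combinatorial generation for hashes" could look like, assuming it refers to deriving all k bit positions from two base hashes (the double-hashing scheme usually credited to Kirsch and Mitzenmacher) instead of computing k independent hashes. The names `bloom_indexes` and `BloomFilter`, and the use of SHA-256 as the base hash, are illustrative assumptions and are not taken from HBase.

```python
import hashlib


def bloom_indexes(key, k, m):
    """Return k bit positions in [0, m) derived from two base hashes."""
    digest = hashlib.sha256(key).digest()      # one "expensive" hash call
    h1 = int.from_bytes(digest[:8], "big")     # first base hash
    h2 = int.from_bytes(digest[8:16], "big") | 1  # second base hash, forced nonzero
    # Every further index is a cheap linear combination of h1 and h2.
    return [(h1 + i * h2) % m for i in range(k)]


class BloomFilter:
    """Toy Bloom filter with m bits and k probes per key (hypothetical helper)."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)

    def add(self, key):
        for idx in bloom_indexes(key, self.k, self.m):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[idx // 8] & (1 << (idx % 8))
            for idx in bloom_indexes(key, self.k, self.m)
        )
```

Under this assumption, raising k adds only multiplications and additions per lookup, which would explain why the number of hash functions is less of a cost concern.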

Re: bloom filter types

2010-12-29 Thread Nicolas Spiegelberg
I don't think there's an explicit wiki. Which option depends on whether your use case is calling get() for entire rows or for specific columns in a row. It also depends on analyzing your workload to determine how likely a row will be in every store file vs. a specific column. Also, since a row
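
As a rough illustration of the choice described above, the sketch below assumes a row-level filter is keyed on the row alone and a row+column filter is keyed on the row plus the column qualifier. The function `bloom_key`, the string constants, and the length-prefixed encoding are hypothetical and only show the idea; they are not HBase's actual key format.

```python
def bloom_key(row, qualifier, bloom_type):
    """Build the bytes that would be inserted into (or probed in) the filter."""
    if bloom_type == "ROW":
        # Every column of the row maps to the same filter key.
        return row
    if bloom_type == "ROWCOL":
        # Length-prefix the row so (row, qualifier) pairs cannot collide
        # by simple concatenation, e.g. (b"ab", b"c") vs (b"a", b"bc").
        return len(row).to_bytes(2, "big") + row + qualifier
    raise ValueError("unknown bloom type: %r" % (bloom_type,))
```

With a row-keyed filter, a get() for a whole row can probe the filter directly. With a row+column key, the probe needs a specific qualifier, so it helps mainly when gets name individual columns.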

Re: Good VLDB paper on WALs

2010-12-29 Thread Nicolas Spiegelberg
+1 for ELR. I think having some data structure where we prepare the next stage of sync() operations instead of holding the row lock over the sync would be a big win for hot regions without a huge refactor. I think the other two optimizations are useful to think about, but wouldn't have the same

[jira] Created: (HBASE-2625) Make testDynamicBloom()'s randomness deterministic

2010-05-28 Thread Nicolas Spiegelberg (JIRA)
Affects Versions: 0.21.0 Reporter: Nicolas Spiegelberg Assignee: Nicolas Spiegelberg Priority: Minor Fix For: 0.21.0 Had a failure with testDynamicBloom on Hudson today. Will investigate, however it would be nice to reproduce the problem to make sure
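
One common way to make a randomized test reproducible is to draw its inputs from a generator with a fixed, recorded seed. The sketch below is a generic illustration of that idea, not the change made for this issue; `random_keys` and the seed value are hypothetical.

```python
import random


def random_keys(count, seed):
    """Generate `count` pseudo-random 64-bit keys from a fixed seed."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return [rng.getrandbits(64) for _ in range(count)]


# The same seed always yields the same keys, so a failing run can be replayed.
first_run = random_keys(1000, seed=2625)
second_run = random_keys(1000, seed=2625)
```

Logging the seed alongside a failure lets the exact key sequence be regenerated later.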