Re: Mocking classes for unit tests Was: asynchbase-1.2.0-rc1 is available for download

2012-02-22 Thread Ted Dunning
JMockit has worked well for me for both mocking and stubbing. My problem was System.currentTimeMillis, and if you can patch that you can patch almost anything. Sent from my iPhone On Feb 22, 2012, at 9:23 AM, Ted Yu yuzhih...@gmail.com wrote: Benoit's comment is directly related to our

Re: Mocking classes for unit tests Was: asynchbase-1.2.0-rc1 is available for download

2012-02-22 Thread Ted Dunning
Actually, JMockit uses bytecode patching, so you may suffer less reflection overhead than expected. My guess is that PowerMock is doing something quite similar. Sent from my iPhone On Feb 22, 2012, at 9:29 AM, Jesse Yates jesse.k.ya...@gmail.com wrote: Only long standing problem I've found

Re: what is fuse used for?

2011-12-10 Thread Ted Dunning
As long as vendors are being mentioned, it should be pointed out that MapR's NFS support also provides this capability. Followups should be off-list since this is off-topic relative to HBase. On Sat, Dec 10, 2011 at 11:27 PM, David Engfer david.eng...@gmail.com wrote: With Cloudera's Hue

Re: patch maturity and HBase release Was: HBASE-4120 table level priority

2011-11-02 Thread Ted Dunning
Todd, I am curious what you mean here. How is adding a test suite better than annotating different tests in the TestNG or JUnit 4 style? On Wed, Nov 2, 2011 at 5:25 PM, Todd Lipcon t...@cloudera.com wrote: Adding a system test suite would do us some good here. Accumulo has a very nice one

Re: patch maturity and HBase release Was: HBASE-4120 table level priority

2011-11-02 Thread Ted Dunning
On Wed, Nov 2, 2011 at 5:33 PM, Todd Lipcon t...@cloudera.com wrote: On Wed, Nov 2, 2011 at 5:31 PM, Ted Dunning tdunn...@maprtech.com wrote: Todd, I am curious what you mean here. How is adding a test suite better than annotating different tests in the TestNG or JUnit 4 style

Re: CSLM performance Was: SILT - nice keyvalue store paper

2011-10-25 Thread Ted Dunning
No. The problem is that you want to emulate real-world behavior which is probably closer to 1000 threads each doing a single transaction and yielding than anything else. For instance, if your traffic originates from a web-farm, each transaction will be in a thread that yields when it finishes

Re: Maven 2 vs. 3

2011-10-09 Thread Ted Dunning
The book is already slightly out of date. Maven 3 is better and handles almost all Maven 2 builds pretty seamlessly. I don't see a problem with moving forward rather than back. On Sun, Oct 9, 2011 at 1:03 PM, Jesse Yates jesse.k.ya...@gmail.com wrote: So the book says to use Maven 2, rather

Re: Maven 2 vs. 3

2011-10-09 Thread Ted Dunning
). If that's not the case anymore I'll update the book. On 10/9/11 4:31 PM, Ted Dunning tdunn...@maprtech.com wrote: The book is already slightly out of date. Maven 3 is better and handles almost all Maven 2 builds pretty seamlessly. I don't see a problem with moving forward rather than back

Re: Maven 2 vs. 3

2011-10-09 Thread Ted Dunning
update the book. On 10/9/11 4:31 PM, Ted Dunning tdunn...@maprtech.com wrote: The book is already slightly out of date. Maven 3 is better and handles almost all Maven 2 builds pretty seamlessly. I don't see a problem with moving forward rather than back. On Sun, Oct 9, 2011 at 1:03

Re: Bay Area HBase User Group organizer change (?)

2011-10-01 Thread Ted Dunning
I can get some sponsorship going on my end as well. On Sun, Oct 2, 2011 at 12:09 AM, Ted Yu yuzhih...@gmail.com wrote: I agree. We should share the payment. On Sat, Oct 1, 2011 at 5:05 PM, Todd Lipcon t...@cloudera.com wrote: Thanks, Andrew! Let us know if we can chip in for the dues.

Re: Should I use HBASE?

2011-09-14 Thread Ted Dunning
This sounds like you will have about 500 million rows in your database after 6 months. To my mind, this is at the level of inconvenient for a conventional database, but hardly impossible. HBase will definitely hold this much data. It would probably help you to do some slightly clever tricks to

Re: [DISCUSSION] Accumulo, another BigTable clone, has shown up on Apache Incubator as a proposal

2011-09-06 Thread Ted Dunning
HBase offers coprocessors, which should be able to do this. And median *can* be accumulated in a small amount of memory. It is a little trickier than mean, but still doable. On Tue, Sep 6, 2011 at 11:21 AM, Duane Moore duane.mo...@issinc.com wrote: - Aggregation Accumulo offers the ability
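
The point that a median can be accumulated in a small amount of memory can be illustrated with a stochastic-approximation estimator. This is only a minimal sketch under stated assumptions, not what the thread or any HBase coprocessor actually does: the class name StreamingMedian and the stepScale parameter are hypothetical, and the estimate is approximate rather than exact.

```java
/** Hypothetical constant-memory median estimator; illustration only. */
class StreamingMedian {
    // Assumed tuning knob: should be on the order of the spread of the data.
    private final double stepScale;
    private double estimate;
    private long count;

    StreamingMedian(double stepScale) {
        this.stepScale = stepScale;
    }

    /** Nudge the estimate toward the sample by a step that shrinks as 1/n. */
    void add(double x) {
        count++;
        if (count == 1) {
            estimate = x; // the first sample seeds the estimate
            return;
        }
        double step = stepScale / count;
        if (x > estimate) {
            estimate += step; // sample above the estimate: move up
        } else if (x < estimate) {
            estimate -= step; // sample below the estimate: move down
        }
    }

    /** Current approximation of the median of everything seen so far. */
    double estimate() {
        return estimate;
    }
}
```

The state is two doubles and a counter regardless of how many values are fed in, which is the sense in which it is "trickier than mean, but still doable". Unlike a running sum and count, this particular estimator does not merge cleanly across partitions, so a real aggregation would likely need a mergeable quantile summary instead.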

Re: Hadoop real time

2011-09-04 Thread Ted Dunning
There are additional off-shoots of Hadoop, such as Spark, S4 and HStreaming, that can specifically address real-time needs. Most real-time-ish applications come, however, with a 100% uptime guarantee. Most simply put, a system that is down and is going to take tens to hundreds of minutes to come back

Re: Decisionmaking [was: Re: New HBase Logo]

2011-09-01 Thread Ted Dunning
Oh lordie, let's not. Logo discussions drown out everything else and never result in getting a better logo. At most, how about just having an up or down vote on this one logo? Then everybody can get on with something important. On Thu, Sep 1, 2011 at 10:46 PM, Stack st...@duboce.net wrote:

Re: Thanks -- and intro to Jon and Rehmi

2011-08-18 Thread Ted Dunning
to highly varied evaluations. Few people have huge variation in problem size and thus can focus on the interplay between resources and run-time at a (nearly) constant problem size. Thanks, Jon On Mon, Aug 15, 2011 at 4:53 PM, Ted Dunning tdunn...@maprtech.com wrote: Sure. The attached

Re: Two requests from a grumpy old man

2011-08-15 Thread Ted Dunning
Actually, I think that anybody who maintains any kind of reference to a JIRA whether in a distribution or in their own head would like the meaning of a JIRA to be relatively static. So even if the community doesn't explicitly know that they care about this, I bet they care about the consequences.

Re: Starting the Hadoop DataNode inside the HBase process?

2011-07-18 Thread Ted Dunning
On Mon, Jul 18, 2011 at 9:32 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: My gut is that this would be a maintenance headache What specifically do you think would cause a problem? Tracking versions for one. Everybody has a different favorite. That is the nice thing about

Re: Starting the Hadoop DataNode inside the HBase process?

2011-07-16 Thread Ted Dunning
On Sat, Jul 16, 2011 at 6:28 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Running the DataNode inside of an HBase process seems like this could be a good option to enable? My gut is that this would be a maintenance headache. Specifically because it would reduce the number of

Re: Converting byte[] to ByteBuffer

2011-07-10 Thread Ted Dunning
jason.rutherg...@gmail.com wrote: I'm a little confused, I was told none of the HBase code changed with MapR, if the HBase (not the OS) block cache has a JNI implementation then that part of the HBase code changed. On Jul 9, 2011 11:19 AM, Ted Dunning tdunn...@maprtech.com wrote: MapR does help

Re: Converting byte[] to ByteBuffer

2011-07-09 Thread Ted Dunning
MapR does help with the GC because it *does* have a JNI interface into an external block cache. Typical configurations with MapR trim HBase down to the minimal viable size and increase the file system cache correspondingly. On Fri, Jul 8, 2011 at 7:52 PM, Jason Rutherglen

Re: editing JIRA comments

2011-06-25 Thread Ted Dunning
No. But you certainly can draw a lot of attention to something you didn't mean to say. On Sat, Jun 25, 2011 at 1:02 PM, Todd Lipcon t...@cloudera.com wrote: Once you say something on the internet, you can't unsay it :)

Re: Thrift: byte[] vs ByteBuffer

2011-05-31 Thread Ted Dunning
Yes. Thrift since 0.6 is all byte buffers (thank goodness). But the HBase dependency is 0.5.0, which is before the great change. On Tue, May 31, 2011 at 8:03 PM, Stack st...@duboce.net wrote: Isn't thrift all bytebuffers now? Are you using a new thrift but your target is 0.89 which may

Re: BBUZZ Berlin Sightseeing?

2011-05-26 Thread Ted Dunning
Stack, We will hoist a few in your honor. You don't need to feel bad. On Thu, May 26, 2011 at 2:46 PM, Stack st...@duboce.net wrote: I find this topic completely inappropriate for dev list. Discussions of a group of hbasers hanging together in exotic locations, and reading between the

Re: Thoughts about partitioning retention and other stuff...

2011-05-14 Thread Ted Dunning
You should be able to set split points for every customer/date combination of interest. This would allow you to localize the data to be deleted to single regions. On Sat, May 14, 2011 at 12:25 PM, Ophir Cohen oph...@gmail.com wrote: About the partitioning: I talked about something more
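
The idea of one split point per customer/date combination can be sketched as follows. This is an illustration under assumptions, not the poster's schema: the row-key layout (customer id, a "|" separator, then a yyyyMMdd date), the class name SplitPoints, and the method name are all hypothetical. The result is a sorted array of byte[] keys, which is the general shape that pre-splitting a table at creation time calls for.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper that builds one split key per customer/date pair. */
class SplitPoints {
    /**
     * Returns sorted, de-duplicated split keys of the assumed form
     * customer + "|" + date. Assumes ASCII ids and dates, so that String
     * ordering matches the byte ordering a key-value store would use.
     */
    static byte[][] forCustomersAndDates(List<String> customers, List<String> dates) {
        TreeSet<String> keys = new TreeSet<>();
        for (String customer : customers) {
            for (String date : dates) {
                keys.add(customer + "|" + date);
            }
        }
        byte[][] out = new byte[keys.size()][];
        int i = 0;
        for (String key : keys) {
            out[i++] = key.getBytes(StandardCharsets.UTF_8);
        }
        return out;
    }
}
```

With keys laid out this way, all rows for one customer and one date fall between two adjacent split points, so expiring that slice touches a single region rather than scattering deletes across the table.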

Re: Hackathon notes 3/21/2011

2011-03-26 Thread Ted Dunning
Todd, I see YCSB on your list. Where did that go? We have been beating on it as well and have pretty much decided that it is worthless as it stands. My thought is that we need a multi-node version that takes directions about what load to generate via ZK. That is better than a map-reduce based

Re: Hackathon notes 3/21/2011

2011-03-26 Thread Ted Dunning
the map-phase of your program. On Sat, Mar 26, 2011 at 3:55 PM, Todd Lipcon t...@cloudera.com wrote: On Sat, Mar 26, 2011 at 3:53 PM, Ted Dunning tdunn...@maprtech.com wrote: Hmm... Yeah. I hear that scrapping YCSB meme a lot. Do you not worry about verifying intermediate results when over

Re: move meta table to ZK

2011-03-18 Thread Ted Dunning
Excuse me? How does that affect the issue of snapshotting a table? And how can replication NOT involve meta-data? On Thu, Mar 17, 2011 at 10:55 PM, jiangwen w wjiang...@gmail.com wrote: replication does not involve meta data. On Fri, Mar 18, 2011 at 12:32 PM, Ryan Rawson ryano...@gmail.com

Re: move meta table to ZK

2011-03-17 Thread Ted Dunning
When does ZK plan to adopt this extension? In general, I agree with Ryan that ZK is a good coordination layer, but the data (and associated meta-data) should be self-hosted to simplify the consistency problem. On Thu, Mar 17, 2011 at 9:54 PM, jiangwen w wjiang...@gmail.com wrote: yes, it is

Re: linked in account attached to dev@?

2011-03-11 Thread Ted Dunning
I already did that. I spidered the right pages at Apache and got a list of 3780 Apache mailing lists to send to LI. This is an interim measure until they implement an opt-out link on their invitation emails. But they kicked dev@hbase.a.o back at me saying that it was associated with an account.

Re: Let's Switch to TestNG

2011-02-23 Thread Ted Dunning
One nice feature is the ability to mark tests as skipped while still reporting the skipped tests. On Wed, Feb 23, 2011 at 10:45 AM, Ryan Rawson ryano...@gmail.com wrote: I filed HBASE-3555, and I listed the following reasons; - test groups allow us to separate slow/fast tests from each other

Re: Interesting YCBS Benchmark

2011-02-11 Thread Ted Dunning
Also, YCSB at least doesn't check the data that is returned. That means that their numbers for /dev/null would have been even better. On Fri, Feb 11, 2011 at 7:36 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: Finally, in research the most important step is validation of the results. They

Re: Can region be merged with others automatically when all data in the region has expired and removed ?

2011-02-08 Thread Ted Dunning
Online merge is a bit dangerous. Lots of applications require that the table be set up pre-split. This is probably more common than the need for merging. Having such a pre-split table collapse before it is full would be a disaster. It should be pretty easy to script taking a few regions

Re: Overhead of Bloomfilters

2011-01-25 Thread Ted Dunning
See http://en.wikipedia.org/wiki/Double_hashing for information on double hashing. On Tue, Jan 25, 2011 at 8:11 AM, Nicolas Spiegelberg nspiegelb...@fb.com wrote: A great article for Bloom Filter rules of thumb: http://corte.si/posts/code/bloom-filter-rules-of-thumb/ Note that only rules #1
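
As a rough illustration of how double hashing is commonly applied to Bloom filters, the k bit positions for an item can be derived from just two base hashes as (h1 + i * h2) mod m. This is a minimal sketch, not HBase's Bloom filter code; the class name DoubleHashing and the method signature are hypothetical, and the two base hash values are assumed to come from whatever hash function the caller already uses.

```java
/** Hypothetical helper: derive k Bloom-filter bit positions from two hashes. */
class DoubleHashing {
    /**
     * Returns k indexes in [0, m), where index i is (h1 + i * h2) mod m.
     * Arithmetic is done in long to avoid int overflow, and floorMod keeps
     * the result non-negative even when a base hash is negative.
     */
    static int[] indexes(int h1, int h2, int k, int m) {
        int[] out = new int[k];
        for (int i = 0; i < k; i++) {
            long combined = (long) h1 + (long) i * h2;
            out[i] = (int) Math.floorMod(combined, (long) m);
        }
        return out;
    }
}
```

For example, indexes(7, 3, 4, 10) yields 7, 0, 3, 6. The appeal is that only two hash computations are needed per item no matter how many bit positions the filter uses.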

modified ycsb

2011-01-24 Thread Ted Dunning
I was frustrated over the weekend using YCSB because it doesn't check the data it gets and because of general code hygiene issues. Rather than just kvetch, I have modified YCSB and pushed it back onto GitHub. See https://github.com/tdunning/YCSB My changes include: a) switched to Maven to

Re: HBase roadmaps

2011-01-21 Thread Ted Dunning
It is possible to do (but a pain) by setting up the smaller tasks as sub-tasks on major bugs and then only setting the version for the major bugs. I don't like doing that so much since it is fairly intricate. If you have a marker for majorness, you can probably build a custom report to do the

Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

2011-01-21 Thread Ted Dunning
Nice work! On Fri, Jan 21, 2011 at 12:40 PM, Mingjie Lai mingjie_...@trendmicro.com wrote: Guys. There is a discussion regarding testing HBASE with YCSB on Whirr or EC2. Send to @dev so more people can be involved. Lars. I have an automatic YCSB test for HBase running on EC2. It was derived

Re: Heap fragmentation

2011-01-20 Thread Ted Dunning
Nice work Todd. Were these numbers extracted using jconsole? On Thu, Jan 20, 2011 at 9:33 AM, Todd Lipcon t...@cloudera.com wrote: I did some experiments to understand our full GC issues better last night. Here are the results: http://people.apache.org/~todd/hbase-fragmentation/

Re: Heap fragmentation

2011-01-20 Thread Ted Dunning
Also, can you say what YCSB workload that used? On Thu, Jan 20, 2011 at 9:33 AM, Todd Lipcon t...@cloudera.com wrote: I did some experiments to understand our full GC issues better last night. Here are the results: http://people.apache.org/~todd/hbase-fragmentation/

Re: Heap fragmentation

2011-01-20 Thread Ted Dunning
/browse/HBASE-3455 https://issues.apache.org/jira/browse/HBASE-3455 -Todd On Thu, Jan 20, 2011 at 9:48 AM, Ted Dunning tdunn...@maprtech.com wrote: Also, can you say what YCSB workload that used? On Thu, Jan 20, 2011 at 9:33 AM, Todd Lipcon t...@cloudera.com wrote: I did some

Re: java.lang.NoSuchMethodException: hbase-0.90

2011-01-07 Thread Ted Dunning
This is on 0.90, right? Were you using HDFS to store your region tables? I just ran into the same thing and looked into the org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader$WALReaderFSDataInputStream.getPos method. That method does some truly hideous reflection things

Re: java.lang.NoSuchMethodException: hbase-0.90

2011-01-07 Thread Ted Dunning
On Fri, Jan 7, 2011 at 10:30 AM, Stack st...@duboce.net wrote: As to your question Ted, it does seem like we could do the reflection once-only in the constructor rather than every time we do a getPos. Let me ask Nicolas. Maybe he had reason for having to do it each time. As to its

Re: java.lang.NoSuchMethodException: hbase-0.90

2011-01-07 Thread Ted Dunning
I think a simple check for the presence of the method is better. On Fri, Jan 7, 2011 at 11:32 AM, M. C. Srivas mcsri...@gmail.com wrote: How about checking to see if in is an instance of DFSInputStream before doing the rest of the stuff? St.Ack

Re: java.lang.NoSuchMethodException: hbase-0.90

2011-01-07 Thread Ted Dunning
Great. I will file a patch to move the check to the constructor and fall back to the old process if the method is missing. For our case, I just implemented getFileLength and all is happy (on that front). On Fri, Jan 7, 2011 at 12:38 PM, Stack st...@duboce.net wrote: Let me open an issue to add
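
The approach discussed in this thread, doing the reflective lookup once in the constructor and falling back when the method is missing, might look roughly like the sketch below. It is not the actual HBase patch: the class name OptionalLongMethod, its constructor arguments, and the call method are hypothetical, and it assumes the target method is public, takes no arguments, and returns a number.

```java
import java.lang.reflect.Method;

/** Hypothetical wrapper: look a method up once, fall back if it is absent. */
class OptionalLongMethod {
    private final Object target;
    private final Method method; // null when the target has no such method

    OptionalLongMethod(Object target, String methodName) {
        this.target = target;
        Method found = null;
        try {
            // Reflection happens here, once, instead of on every call.
            found = target.getClass().getMethod(methodName);
        } catch (NoSuchMethodException e) {
            // Method is missing: leave it null so calls use the fallback.
        }
        this.method = found;
    }

    /** Returns the method's numeric result, or fallback if it cannot be used. */
    long call(long fallback) {
        if (method == null) {
            return fallback;
        }
        try {
            Object result = method.invoke(target);
            return (result instanceof Number) ? ((Number) result).longValue() : fallback;
        } catch (ReflectiveOperationException e) {
            return fallback;
        }
    }
}
```

Checking for the method itself, rather than testing the concrete class of the stream, means any implementation that happens to provide the method gets the fast path, which seems to be the preference expressed earlier in the thread.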

Re: provide a 0.20-append tarball?

2010-12-23 Thread Ted Dunning
If you have a PMC member who is willing to be release manager, what is the beef? On Thu, Dec 23, 2010 at 2:49 PM, Ryan Rawson ryano...@gmail.com wrote: Looks like the fight does not go well. A lot of hdfs developers are concerned that it would detract resources. I'm not sure who's

Re: Hypertable claiming upto 900% random-read throughput vs HBase

2010-12-15 Thread Ted Dunning
heaps would blunt the advantage of a C++ program using malloc. -ryan On Wed, Dec 15, 2010 at 11:15 AM, Ted Dunning tdunn...@maprtech.com wrote: From the few comments I have heard, the RAM versus disk difference is mostly what they were testing. On Wed, Dec 15