Peter, that doesn't make sense.
 
I mean, I believe what you are saying, but I don't see how connecting over a 
VPN would cause this variance in results.

Do you have any speculative execution turned on?

Are you counting just the number of rows in the result set, or are you using 
counters in the map/reduce job? (I'm assuming that you are running a map/reduce 
job, and not just a simple connection and single-threaded scan...).
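
To make the counting question concrete, here is a minimal sketch of counting 
rows and key/value pairs on the client while iterating a scan in batches. It 
does not use the HBase API: the int[] "table", fetchBatch, and scanAndCount 
are hypothetical stand-ins for a real table and scanner, and the caching 
argument only mimics how many rows are fetched per round trip. The point is 
the invariant being tested: the totals should not change with the caching 
value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a table scan (not the HBase client API).
// Each entry of cellsPerRow is the number of key/value pairs in one row.
final class BatchedScanCount {

    /** Totals gathered on the client while iterating a scan. */
    static final class Counts {
        final long rows;
        final long cells;
        Counts(long rows, long cells) { this.rows = rows; this.cells = cells; }
    }

    /** Returns at most `caching` rows starting at `offset`, like one scanner round trip. */
    static List<Integer> fetchBatch(int[] cellsPerRow, int offset, int caching) {
        List<Integer> batch = new ArrayList<>();
        for (int i = offset; i < cellsPerRow.length && batch.size() < caching; i++) {
            batch.add(cellsPerRow[i]);
        }
        return batch;
    }

    /** Iterates the whole table batch by batch, counting rows and cells. */
    static Counts scanAndCount(int[] cellsPerRow, int caching) {
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        long rows = 0;
        long cells = 0;
        int offset = 0;
        while (true) {
            List<Integer> batch = fetchBatch(cellsPerRow, offset, caching);
            if (batch.isEmpty()) {
                break; // no more rows
            }
            for (int cellCount : batch) {
                rows++;
                cells += cellCount;
            }
            offset += batch.size();
        }
        return new Counts(rows, cells);
    }

    public static void main(String[] args) {
        int[] table = new int[10000];
        Arrays.fill(table, 3); // assumed: 3 key/value pairs per row
        for (int caching : new int[] {1, 1000, 5000}) {
            Counts c = scanAndCount(table, caching);
            System.out.println(caching + " " + c.rows + " " + c.cells);
        }
    }
}
```

If a real scan is counted the same way and the totals differ between caching 
values, the difference comes from the scan itself and not from how the 
results are tallied.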

I apologize if this has already been answered; I haven't been following this 
thread too closely.

Sent from a remote device. Please excuse any typos...

Mike Segel

On Mar 22, 2012, at 8:01 PM, Peter Wolf <opus...@gmail.com> wrote:

> Hello again Lars and Lars,
> 
> Here is some additional information that may help you track this down.
> 
> I think this behavior has something to do with my VPN.  My servers are on the 
> Amazon Cloud and I normally run my client on my laptop via a VPN 
> (Tunnelblick: OS X 10.7.3; Tunnelblick 3.2.3 (build 2891.2932)).  This is 
> where I see the buggy behavior I describe.
> 
> However, when my client runs on an EC2 machine, I get different behavior.  
> I cannot prove that it is always correct, but in at least one case my 
> current code does not work on my laptop, yet gets the correct number of 
> results on an EC2 machine.  Note that my scans are also much faster on the 
> EC2 machine.
> 
> I will do more tests to see if I can localize it further.
> 
> Hope this helps
> Thank you again
> Peter
> 
> 
> On 3/19/12 2:24 PM, Peter Wolf wrote:
>> Hello Lars and Lars,
>> 
>> Thank you for your help and attention.
>> 
>> I wrote a standalone test that exhibits the bug.
>> 
>> http://dl.dropbox.com/u/68001072/HBaseScanCacheBug.java
>> 
>> Here is the output.  It shows how the number of results and key value pairs 
>> vary as caching is changed and families are included.  It shows the bug 
>> starting with 3 families and 5000 caching.  It also shows a new bug, where 
>> the query always fails with an IOException with 4 families.
>> 
>> CacheSize  FamilyCount  ResultCount  KeyValueCount
>> 1000       1            10000        10
>> 5000       1            10000        10
> 
