Thanks, I will try List<Get> later.
On Tue, Apr 15, 2014 at 3:39 AM, Doug Meil wrote:
re: "my first version is using 20,000 Get"
Just throwing this out there, but have you looked at multi-get? Multi-get
will group the gets by RegionServer internally.
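To make the grouping idea concrete, here is a minimal stand-alone sketch of what "group the gets by RegionServer" means. It is not HBase code: the class name, the region map, and the server names are all hypothetical, and plain String ordering stands in for HBase's byte-wise row key ordering. In the real client you would just hand a List<Get> to the table's get(List<Get>) call and let it do this for you.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustration only: simulates grouping row keys by the server hosting their region.
class MultiGetGrouping {

    /**
     * regionStarts maps each region's start key to the (hypothetical) server
     * hosting it. The first region is assumed to start at the empty key "".
     */
    static Map<String, List<String>> groupByServer(
            NavigableMap<String, String> regionStarts, List<String> rowKeys) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String row : rowKeys) {
            // The region holding a row is the one with the greatest start key <= row.
            Map.Entry<String, String> region = regionStarts.floorEntry(row);
            if (region == null) {
                throw new IllegalArgumentException("no region covers row " + row);
            }
            grouped.computeIfAbsent(region.getValue(), s -> new ArrayList<>()).add(row);
        }
        return grouped;
    }
}
```

With 20,000 row keys this produces one batch per server instead of 20,000 separate round trips, which is where the saving comes from.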
You are doing a lot of IO for a web-app so this is going to be tough to
make "fast", but there are ways to make it "faster."
But
This might help you: http://phoenix.incubator.apache.org/
JM
On 2014-04-14 07:53, "Li Li" wrote:
I need to get about 20,000 rows from the table. The table has about
1,000,000 rows.
My first version is using 20,000 Get and I found it very slow. So I
modified it to a scan and filter out unrelated rows in the client.
Maybe I should write a coprocessor. BTW, is there any filter available
for me? some
Hi Li Li,
If you have more than one region, it might be useful. MR will scan all the
regions in parallel. If you do a full scan from a client API with no
parallelism, then the MR job might be faster. But it will take more
resources on the cluster and might impact the SLA of the other clients, if
any,
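For what it's worth, the "scan the regions in parallel" idea can be sketched without MapReduce at all. The snippet below is only a simulation under stated assumptions: an in-memory sorted map stands in for the table, the split points stand in for region start keys, and every name in it is hypothetical. It shows the shape of the approach (one scan task per key range, results merged in range order), not an HBase API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustration only: one scan task per key range, run on a thread pool.
class ParallelRangeScan {

    /** Scans [start, stop) and keeps only the wanted rows; stop == null means "to the end". */
    static List<String> scanRange(NavigableMap<String, String> table,
                                  String start, String stop, Set<String> wanted) {
        NavigableMap<String, String> slice = (stop == null)
                ? table.tailMap(start, true)
                : table.subMap(start, true, stop, false);
        List<String> hits = new ArrayList<>();
        for (String row : slice.keySet()) {
            if (wanted.contains(row)) {
                hits.add(row);
            }
        }
        return hits;
    }

    /** splitPoints must be sorted and begin with "" so the ranges cover the whole table. */
    static List<String> parallelScan(NavigableMap<String, String> table,
                                     List<String> splitPoints, Set<String> wanted, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < splitPoints.size(); i++) {
                final String start = splitPoints.get(i);
                final String stop = (i + 1 < splitPoints.size()) ? splitPoints.get(i + 1) : null;
                futures.add(pool.submit(() -> scanRange(table, start, stop, wanted)));
            }
            // Collect in range order so the merged result stays sorted.
            List<String> all = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off is the one described above: every range is read at the same time, so the scan finishes sooner but puts more concurrent load on the cluster.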
Well, it depends. Could you please provide some more details? It will help
us in giving a proper answer.
Warm Regards,
Tariq
cloudfront.blogspot.com
On Mon, Apr 14, 2014 at 11:38 AM, Li Li wrote:
I have a full table scan which takes about 10 minutes. It seems to be a
bottleneck for our application. If I use map-reduce to rewrite it, will
it be faster?