Thanks. I now have an idea of how DISTINCT works and should be able to solve
this issue.
Regards
Sumanta
-Juvenn Woo wrote: -
===
To: user@phoenix.apache.org
From: Juvenn Woo
Date: 02/10/2017 09:08PM
Subject: Re: Query TimeOut on Azure HDInsight
=
Sumanta,
Actually, DISTINCT makes a big difference: it may need to scan as many rows as
necessary to find 10 (LIMIT 10) distinct rows. If your COL1 has fewer than 10
distinct values, it will scan the whole table just to learn that there are
fewer than that.
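
To make the point above concrete, here is a minimal Python sketch. It is a toy model of a row-by-row scan, not Phoenix's actual execution path; the function names and the sample data are assumptions made purely for illustration. It counts how many rows must be read before a LIMIT is satisfied, with and without DISTINCT.

```python
def plain_limit(rows, limit):
    """Return (values, rows_scanned) for a scan that stops after `limit` rows."""
    out = []
    scanned = 0
    for value in rows:
        scanned += 1
        out.append(value)
        if len(out) == limit:
            break
    return out, scanned


def distinct_with_limit(rows, limit):
    """Return (distinct_values, rows_scanned) for a scan that can only stop
    once `limit` distinct values have been seen."""
    seen = []
    seen_set = set()
    scanned = 0
    for value in rows:
        scanned += 1
        if value not in seen_set:
            seen_set.add(value)
            seen.append(value)
            if len(seen) == limit:
                break
    return seen, scanned


# Hypothetical data: 1000 matching rows, but only 3 distinct values.
rows = [i % 3 for i in range(1000)]

_, scanned_plain = plain_limit(rows, 10)
values, scanned_distinct = distinct_with_limit(rows, 10)

print(scanned_plain)     # 10: the scan stops as soon as 10 rows are read
print(scanned_distinct)  # 1000: every row is read, yet only 3 values exist
```

In this model the plain LIMIT stops after 10 rows, while the DISTINCT variant has to read every row because it never reaches 10 distinct values.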
On Feb 10, 2017 11:25 PM, "Sumanta Gh" wrote:
If we remove DISTINCT from the query below, everything works fine.
Any pointers on why DISTINCT could fail?
Regards
Sumanta
-Mark Heppner wrote: -
===
To: user@phoenix.apache.org
From: Mark Heppner
Date: 02/10/2017 08:02PM
Subject: Re: Query TimeOut on Azure
Sumanta:
bq. at region=TABLE1,,1450429763940.e30cec826e39df2e3b21e0baa6e1d9c0.,
Please check the log of the region server that hosted the above region around
the time of your query.
Which Phoenix / HBase release are you using?
Thanks
On Fri, Feb 10, 2017 at 6:31 AM, Mark Heppner wrote:
Sumanta,
Doing a full scan over 100 million rows is going to be costly. How many
region servers do you have? If this is a common query, you could add a
secondary index on COL1 and INCLUDE(COLX). Otherwise, you'll have to
increase hbase.rpc.timeout to something higher than 6 and maybe even
pho
Hi,
We have a production system on Azure HDInsight.
There is a table called TABLE1 which has approximately 100 million rows.
Recently, the following query has been timing out every time:
SELECT DISTINCT COLX FROM TABLE1 WHERE COL1=1 LIMIT 10;
java.lang.RuntimeException: org.apache.phoenix.exception.PhoenixIOExc