[jira] Commented: (DERBY-3937) Select count(*) scans all the rows (and is therefore slow with big tables), is the amount of rows not available/known for example in index ?

Trejkaz (JIRA) Wed, 24 Jun 2009 21:35:32 -0700

    [ 
https://issues.apache.org/jira/browse/DERBY-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723893#action_12723893
 ]


Trejkaz commented on DERBY-3937:
--------------------------------

The query optimiser works on an estimation of the number of rows, not the 
actual number of rows.

Related to this: in our case if we had a way to do APPROXCOUNT(*) instead of 
COUNT(*), it would be fast and accurate "enough", as we are only using this on 
big tables to provide the user with an idea of how much stuff is left to do.  
It doesn't have to be the exact number of items, just in the ballpark.  But we 
couldn't find a way to use the estimate, so we used COUNT(*) and are thus 
bitten by this bug.


> Select count(*) scans all the rows (and is therefore slow with big tables), 
> is the amount of rows not available/known for example in index ?
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3937
>                 URL: https://issues.apache.org/jira/browse/DERBY-3937
>             Project: Derby
>          Issue Type: Improvement
>          Components: Performance
>         Environment: Any
>            Reporter: Martin Hajduch
>
> Create table with 5000000 rows. Create index on unique ID. Select count(*) on 
> such table is going to take quite some time.
> Shouldn't the index contain amount of indexed rows and the value taken from 
> there ?
> Additionally, queries of the form select count(*) from table where 
> col1=value; take lots of time (depending on amount of rows satisfying WHERE 
> clause) even if index on col1 exists. Isn't it possible to find first and 
> last occurence in the index, and then calculate amount of rows more 
> effectively then scanning through all of them ?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-3937) Select count(*) scans all the rows (and is therefore slow with big tables), is the amount of rows not available/known for example in index ?

Reply via email to