[ 
https://issues.apache.org/jira/browse/ASTERIXDB-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498362#comment-17498362
 ] 

ASF subversion and git services commented on ASTERIXDB-3016:
------------------------------------------------------------

Commit 3d79c9f39392d6e2e5127b716788e4335014606b in asterixdb's branch refs/heads/master from Dmitry Lychagin
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=3d79c9f ]

[ASTERIXDB-3016][RT] Fix failure in hash groupby

- user model changes: no
- storage format changes: no
- interface changes: no

Details:
- Modify hash group by to force garbage collection on the
  hash table if a tuple could not be inserted into it
- Make hash group by clean up its run files in case
  of an error

Change-Id: I7a133fa1d0555ebbcb7a9e3cb7445757716c9a2a
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/15325
Integration-Tests: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Tested-by: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Reviewed-by: Dmitry Lychagin <dmitry.lycha...@couchbase.com>
Reviewed-by: Till Westmann <t...@couchbase.com>
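
The two changes listed under "Details" above can be illustrated with a small, self-contained Java sketch. It is not the code from the commit: every name in it (HashGroupBySketch, BoundedTable, insertWithGc, buildWithCleanup, Pass) is hypothetical, and the toy table only counts slots instead of storing tuples. It assumes that a failed insert may succeed once dead entries are reclaimed, and that run files created before an error should be deleted before the error propagates.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the Hyracks code base. */
class HashGroupBySketch {

    /** Toy stand-in for a memory-bounded hash table that only counts slots. */
    static class BoundedTable {
        private final int capacity;
        private int live;     // slots holding live entries
        private int garbage;  // slots still occupied by dead entries

        BoundedTable(int capacity) {
            this.capacity = capacity;
        }

        /** Returns false when no free slot is left. */
        boolean insert() {
            if (live + garbage >= capacity) {
                return false;
            }
            live++;
            return true;
        }

        /** Marks n live entries as dead; their slots stay occupied until collected. */
        void markDead(int n) {
            if (n < 0 || n > live) {
                throw new IllegalArgumentException("n out of range: " + n);
            }
            live -= n;
            garbage += n;
        }

        /** Reclaims all dead slots and returns how many were freed. */
        int collectGarbage() {
            int freed = garbage;
            garbage = 0;
            return freed;
        }
    }

    /** Tries to insert; on failure, forces garbage collection and retries once. */
    static boolean insertWithGc(BoundedTable table) {
        if (table.insert()) {
            return true;
        }
        table.collectGarbage();
        return table.insert();
    }

    /** A unit of work that may register run files before it fails. */
    interface Pass {
        void run(List<Path> runFiles) throws IOException;
    }

    /** Runs the pass; if it fails, deletes every registered run file and rethrows. */
    static List<Path> buildWithCleanup(Pass pass) throws IOException {
        List<Path> runFiles = new ArrayList<>();
        try {
            pass.run(runFiles);
            return runFiles;
        } catch (IOException | RuntimeException e) {
            for (Path file : runFiles) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException cleanupFailure) {
                    // Keep the original error primary; record the cleanup failure on it.
                    e.addSuppressed(cleanupFailure);
                }
            }
            throw e;
        }
    }
}
```

In this sketch the caller treats a false result from insertWithGc as the signal that the table is genuinely full, and relies on buildWithCleanup so that a failure part-way through a pass does not leave run files behind.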


> Failure in hash groupby: Failed to insert a new buffer into the aggregate operator
> ----------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-3016
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3016
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: RT - Runtime
>    Affects Versions: 0.9.6
>            Reporter: Dmitry Lychagin
>            Assignee: Dmitry Lychagin
>            Priority: Major
>             Fix For: 0.9.8
>
>
> Load the data as follows:
> {noformat}
> drop dataverse tpcds if exists;
> create dataverse tpcds;
> use tpcds;
> create dataset item(i_item_sk string not unknown) open type primary key i_item_sk;
> create dataset inventory(inv_date_sk string not unknown, inv_item_sk string not unknown,
>   inv_warehouse_sk string not unknown) open type primary key inv_date_sk, inv_item_sk, inv_warehouse_sk;
> {noformat}
> {noformat}
> use tpcds;
> set `import-private-functions` `true`;
> insert into item (select value object_remove(t, "table_name") from tpcds_datagen("item", 0.5) t);
> insert into inventory (select value object_remove(t, "table_name") from tpcds_datagen("inventory", 0.5) t);
> {noformat}
> Run the following query:
> {noformat}
> SELECT  i.i_product_name, AVG(inv.inv_quantity_on_hand) qoh
> FROM  inventory inv, item i
> WHERE inv.inv_item_sk /*+hash-bcast*/ = i.i_item_sk
> /*+ hash */ GROUP BY i.i_product_name
>  ORDER BY qoh, i.i_product_name
> LIMIT 1;
> {noformat}
> The query fails with:
> {noformat}
> org.apache.hyracks.api.exceptions.HyracksDataException: Failed to insert a new buffer into the aggregate operator!
>         at org.apache.hyracks.dataflow.std.group.external.ExternalHashGroupBy.insert(ExternalHashGroupBy.java:57) ~[classes/:?]
>         at org.apache.hyracks.dataflow.std.group.external.ExternalGroupWriteOperatorNodePushable.buildGroup(ExternalGroupWriteOperatorNodePushable.java:175) ~[classes/:?]
>         at org.apache.hyracks.dataflow.std.group.external.ExternalGroupWriteOperatorNodePushable.doPass(ExternalGroupWriteOperatorNodePushable.java:146) ~[classes/:?]
>         at org.apache.hyracks.dataflow.std.group.external.ExternalGroupWriteOperatorNodePushable.initialize(ExternalGroupWriteOperatorNodePushable.java:108) ~[classes/:?]
> {noformat}
> Also the test framework prints the following error:
> {noformat}
> java.lang.AssertionError: There are 7 leaked run files.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
