[jira] [Resolved] (IMPALA-2787) Support de-duplicate records in Impala

Tim Armstrong (Jira) Wed, 23 Dec 2020 16:13:07 -0800


     [ 
https://issues.apache.org/jira/browse/IMPALA-2787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Tim Armstrong resolved IMPALA-2787.
-----------------------------------
    Resolution: Won't Fix

It's a little underspecified what the use case is, but I don't know that we 
necessarily want to add a customized operation for this (as opposed to INSERT 
OVERWRITE .. SELECT DISTINCT).
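
For Use Case 1, the INSERT OVERWRITE .. SELECT DISTINCT route might look 
roughly like the sketch below. The table name `events` is hypothetical, and 
this assumes an unpartitioned table of a type for which Impala supports 
INSERT OVERWRITE, so it is worth trying on a copy first.

```sql
-- Hypothetical table name; not taken from the issue.
-- Rewrites the table in place, keeping exactly one copy of each
-- fully identical row.
INSERT OVERWRITE TABLE events
SELECT DISTINCT * FROM events;
```

A partitioned table would also need a PARTITION clause on the INSERT.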

> Support de-duplicate records in Impala
> --------------------------------------
>
>                 Key: IMPALA-2787
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2787
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend
>    Affects Versions: Impala 2.2.4
>            Reporter: Eric Lin
>            Priority: Minor
>
> Two use cases:
> Use Case 1: Remove duplicate rows where all the data in the row is identical
> Use Case 2: Remove duplicate rows where all the data in the row is identical 
> except for a small number of columns
> Rather than using SELECT DISTINCT from one table to another table, it would 
> be great if Impala could support this natively and remove duplicate records 
> from the table itself, without needing a new table.
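
For Use Case 2, one possible workaround with existing features is to rank 
rows within each group of "otherwise identical" rows and keep only the first. 
This is a sketch only: the table `events` and the columns `id`, `payload`, and 
`load_ts` are hypothetical, with `load_ts` standing in for the small number of 
columns that are allowed to differ. It assumes an unpartitioned table of a 
type for which Impala supports INSERT OVERWRITE.

```sql
-- Hypothetical schema: events(id, payload, load_ts).
-- Rows count as duplicates when id and payload match; load_ts may differ.
INSERT OVERWRITE TABLE events
SELECT id, payload, load_ts
FROM (
  SELECT id, payload, load_ts,
         -- Number the rows within each duplicate group, newest first.
         ROW_NUMBER() OVER (PARTITION BY id, payload
                            ORDER BY load_ts DESC) AS rn
  FROM events
) ranked
WHERE rn = 1;  -- keep one row per duplicate group
```

The ORDER BY decides which row of each group survives; here it is the one 
with the latest `load_ts`, which is an arbitrary choice for illustration.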



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

