[ 
https://issues.apache.org/jira/browse/KYLIN-5000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17355493#comment-17355493
 ] 

star_dev edited comment on KYLIN-5000 at 6/2/21, 5:34 AM:
----------------------------------------------------------

@[~zhangyaqian]

1. 分页形式独立请求

一次执行

SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
 GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
 LIMIT 1000 OFFSET 0

一次执行

SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
 GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
 LIMIT 1000 OFFSET 1000

结果有重复,如图

!image-2021-06-02-13-31-44-052.png!

 

2. union 形式请求

select COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID, kkk from (
 select * from (SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk 
FROM MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
 GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
 LIMIT 1000 OFFSET 0)
 UNION 
 SELECT * from (
 SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
 GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
 LIMIT 1000 OFFSET 1000)
 UNION 
 SELECT * from (
 SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
 GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
 LIMIT 1000 OFFSET 2000)
 UNION 
 SELECT * from (
 SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
 GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
 LIMIT 1000 OFFSET 3000)
 ) tmp

结果无重复,如图

!image-2021-06-02-13-33-02-400.png!

3. kylin导出数据

 [^0-1000.csv]

^[^1000-2000.csv]^

^^[^0-4000.csv]^^

 


was (Author: star_dev):
[~zhangyaqian]

1. 分页形式独立请求

一次执行

SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 0

一次执行

SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 1000

结果有重复,如图

!image-2021-06-02-13-31-44-052.png!

 

2. union 形式请求

select COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID, kkk from (
select * from (SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk 
FROM MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 0)
UNION 
SELECT * from (
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 1000)
UNION 
SELECT * from (
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 2000)
UNION 
SELECT * from (
SELECT COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,COUNT (1) as kkk FROM 
MIRROR_DATABUS_PARTNER_TRANSACTION_ACTIVATE WHERE etl_dt>='20210501' AND 
etl_dt<='20210525' AND TRANSACTION_MONTH = '202105' AND BUSINESS_CLASS='DTH' 
GROUP BY TRANSACTION,COMPANY_ID,TRANSACTION_MONTH,PARTNER_ID,BUSINESS_CLASS
LIMIT 1000 OFFSET 3000)
) tmp

结果无重复,如图

!image-2021-06-02-13-33-02-400.png!

3. [^0-1000.csv]

> kylin query分页多次请求间返回结果有重复
> -------------------------
>
>                 Key: KYLIN-5000
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5000
>             Project: Kylin
>          Issue Type: Task
>          Components: Query Engine
>    Affects Versions: v3.1.1
>            Reporter: star_dev
>            Priority: Major
>         Attachments: 0-1000.csv, 0-4000.csv, 1000-2000.csv, 
> image-2021-06-02-13-31-44-052.png, image-2021-06-02-13-33-02-400.png
>
>
> 请以分页形式分多次请求kylin查询时,如第一次
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 0
> 第二次
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 1000
> 两次查询的结果有重复数据
>  
> 但当执行
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 0
> union 
> SELECT 分组条件, COUNT (1) as kkk FROM table
> WHERE 过滤条件
> GROUP BY 分组条件
> LIMIT 1000 OFFSET 1000
> 时,返回了2000条不重复的数据
>  
> 问题:
>  # 当以分页形式分别独立请求kylin查询时,为何不同的请求间会有重复数据?
>  # 当以union 形式请求kylin查询时,能按照需求返回不重复的满足条数的数据,和1有什么不同吗?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to