Hi Ananthanarayanan.M.S,

Actually, we add a new column `__tmst` for each table, to persist the timestamp 
of each data row.
But the sql you list seems not like our exact logic, so I want to see your 
accuracy rule statement to have a check.


if your accuracy rule statement is like:
"source.product_id = target.product_id, source.market_id = target.market_id, 
source.period_id = target.period_id",
the sql for that step should be:
"SELECT source.* FROM source LEFT JOIN target ON coalesce(source. product_id, 
'null') = coalesce(target. product_id, 'null') and coalesce(source.market_id, 
'null') = coalesce(target.market_id, 'null') and coalesce(source. period_id, 
'null') = coalesce(target. period_id, 'null') WHERE (NOT (source.product_id IS 
NULL AND source.market_id IS NULL AND source. period_id IS NULL)) AND 
target.product_id IS NULL AND source.market_id IS NULL AND source. period_id IS 
NULL".


Hope this can help you, thanks.



--

Regards,
Lionel, Liu



At 2017-11-21 06:50:09, "William Guo" <gu...@apache.org> wrote:
>hello Ananthanarayanan.M.S,
>
>We support some partitioned tables, could you show us your partitioned table 
>description for check and some sample data will be helpful?
>
>Thanks,
>William
>________________________________
>From: Ananthanarayanan Ms <ananthanarayanan.ms...@nielsen.com>
>Sent: Tuesday, November 21, 2017 3:53 AM
>To: William Guo
>Cc: dev@griffin.incubator.apache.org; Abishek Kunduru
>Subject: Re: Re: Griffin Accuracy Doubts
>
>Hello William / Lionel,
>Could you please let us know if we can use 1.6. Could you please let know if 
>the partitioned tables alone can be input as src/tgt as below err is got when 
>we build and run the master code.
>
>17/11/20 14:51:13 ERROR engine.SparkSqlEngine: run spark sql [ SELECT `__tmst` 
>AS `__tmst`, source.product_id, source.market_id, 
>source.period_id,s5_volume_rom_distrib FROM source LEFT JOIN target ON 
>coalesce(source.period_id, 'null') = coalesce(target.period_id, 'null') and 
>coalesce(source.market_id, 'null') = coalesce(target.market_id, 'null') and 
>coalesce(source.product_id, 'null') = coalesce(target.product_id, 'null') 
>WHERE (NOT (source.product_id IS NULL )) GROUP BY `__tmst` ] error: Reference 
>'__tmst' is ambiguous, could be: __tmst#145L, __tmst#146L.; line 1 pos 413
>
>
>
>Regards,
>Ananthanarayanan.M.S
>
>On Wed, Oct 4, 2017 at 2:39 PM, William Guo 
><gu...@apache.org<mailto:gu...@apache.org>> wrote:
>
>hi Abishek,
>
>
>For SQL support, the version should be 0.1.6, we will release in Middle of 
>October.
>
>
>
>Thanks,
>
>William
>
>
>________________________________
>From: Abishek Kunduru 
><abishek.kund...@nielsen.com<mailto:abishek.kund...@nielsen.com>>
>Sent: Tuesday, October 3, 2017 12:35 AM
>To: 刘力力
>Cc: dev@griffin.incubator.apache.org<mailto:dev@griffin.incubator.apache.org>; 
>Vidyaranya Malladi; gu...@apache.org<mailto:gu...@apache.org>; Mara Preotescu; 
>Ananthanarayanan Ms
>Subject: Re: Re: Griffin Accuracy Doubts
>
>Thanks William and Lionel for the quick response. Can you please tell us when 
>would the Griffin with SQL support be released and its version number?
>
>Regarding the second question, Vidya is not referring to changing the source 
>or target data but instead DQ feature to compare source and target columns and 
>output the difference in terms of percentage change (Vidya gave an example as 
>well in the below table which outputs the difference between source and target 
>numeric columns).
>
>Best,
>Abishek
>
>
>On Fri, Sep 29, 2017 at 10:13 PM, 刘力力 
><bhlx3l...@163.com<mailto:bhlx3l...@163.com>> wrote:
>Hi Vidyaranya,
>
>I have some more comments about the answers from William. Pls check the 
>following comments.
>
>
>1. Does Griffin plan to support multi table being src eg. src table frame from 
>a qry joined thro tabl1,tabl2,tabln and selected few cols and then finally use 
>that as src and similarly for destination as well
>
>--Yes, Since we have change our backend to fully support sql.
>
>2. Would Griffin support the result metric col wise instead of overall matched 
>count.
>eg if src table is having 4 cols and dest is having 4 or more cols and user 
>wants to check the accuracy of col4 which is 'price' in src and destination by 
>having a new col added to the src table as '%change' which would be 
>abs(src_price - dest_price)/100
>
>--Yes, Since we have change our backend to fully support sql.
>
>Fact_tag        fact_sDesc      fat_Ldesc       SRC Fact_precision      SRC 
>Data        TGT Data        % Chg
>f000000000000000000100000000000000000000        Units   Sales Units     1      
> 25510.00        2551.00 0.00
>f000000000000000000100000000000000000001        Units (any promo)       Sales 
>Units (any promo) 1       5846.00 584.59  0.00
>f000000000000000000100000000000000000002        Units (no promo)        Sales 
>Units (no promo)  1       17172.00        1717.25 0.00
>
>
>--[commont from Lionel] Actually, I think the modification of data source is 
>like some ETL task, not the DQ domain, we consider that the DQ result from 
>griffin should be simple like a value or percentage or group of it, that might 
>be better for DQ check.
>
>
>3. Support udfs on the cols before comparing, eg one of our src is a table 
>from mainframe format where dots/decimals are ignore and we have to apply a 
>logic to get the decimals
>
>--For UDFs, if it registered in your computing cluster, it should be fine from 
>command line. we are considering to refactor ui to support sql editor directly.
>but how to provide better user experience from front end for UDF, we are still 
>considering it.
>
>4. Last time we discussed few features were said to be discussed to support 
>could you please let us know if we have any road map for the same. like 
>advanced sql features and adding groupby clause/udfs etc
>
>--Support SQL is already done, will have a quick release after we fully tested 
>and reviewed.
>
>5. Griffin compares the cols selected by using them in joins, do we plan to 
>enhance to check if the user could select few cols other than the mentioned in 
>joins and we need to compare the src col's value vs dest col's value lets say 
>without the join and only by order by clause.
>
>
>select col from src order by prikey
>select col from tgt order by prikey
>
>src.col vs tgt.col instead of the join key . Please let us know.
>
>-- For this case, it depends on how would you like to map src.col with 
>tgt.col, we don't need join key if you can elaborate mapping in sql since we 
>are fully supporting sql.
>
>--[comment from Lionel] We support sql directly, if you can describe your case 
>in sql, we can do that.
>
>Thanks very much.
>
>
>--
>Regards,
>Lionel, Liu
>
>
>At 2017-09-30 08:23:39, "William Guo" 
><guo...@outlook.com<mailto:guo...@outlook.com>> wrote:
>>hi Vidyaranya,
>>
>>
>>Please check my comments inline.
>>
>>
>>Thanks,
>>
>>William
>>
>>
>>________________________________
>>From: Vidyaranya Malladi 
>><vidyaranya.malladi...@nielsen.com<mailto:vidyaranya.malladi...@nielsen.com>>
>>Sent: Friday, September 29, 2017 10:06 PM
>>To: gu...@apache.org<mailto:gu...@apache.org>; 
>>guo...@outlook.com<mailto:guo...@outlook.com>
>>Cc: 
>>dev@griffin.incubator.apache.org<mailto:dev@griffin.incubator.apache.org>; 
>>Mara Preotescu; Abishek Kunduru; Ananthanarayanan Ms
>>Subject: Griffin Accuracy Doubts
>>
>>Dear William,
>> Good day. Please let us know on the below questions that we have
>>
>>1. Does Griffin plan to support multi table being src eg. src table frame 
>>from a qry joined thro tabl1,tabl2,tabln and selected few cols and then 
>>finally use that as src and similarly for destination as well
>>
>>--Yes, Since we have change our backend to fully support sql.
>>
>>2. Would Griffin support the result metric col wise instead of overall 
>>matched count.
>>eg if src table is having 4 cols and dest is having 4 or more cols and user 
>>wants to check the accuracy of col4 which is 'price' in src and destination 
>>by having a new col added to the src table as '%change' which would be 
>>abs(src_price - dest_price)/100
>>
>>--Yes, Since we have change our backend to fully support sql.
>>
>>Fact_tag        fact_sDesc      fat_Ldesc       SRC Fact_precision      SRC 
>>Data        TGT Data        % Chg
>>f000000000000000000100000000000000000000        Units   Sales Units     1     
>>  25510.00        2551.00 0.00
>>f000000000000000000100000000000000000001        Units (any promo)       Sales 
>>Units (any promo) 1       5846.00 584.59  0.00
>>f000000000000000000100000000000000000002        Units (no promo)        Sales 
>>Units (no promo)  1       17172.00        1717.25 0.00
>>
>>
>>3. Support udfs on the cols before comparing, eg one of our src is a table 
>>from mainframe format where dots/decimals are ignore and we have to apply a 
>>logic to get the decimals
>>
>>--For UDFs, if it registered in your computing cluster, it should be fine 
>>from command line. we are considering to refactor ui to support sql editor 
>>directly.
>>but how to provide better user experience from front end for UDF, we are 
>>still considering it.
>>
>>4. Last time we discussed few features were said to be discussed to support 
>>could you please let us know if we have any road map for the same. like 
>>advanced sql features and adding groupby clause/udfs etc
>>
>>--Support SQL is already done, will have a quick release after we fully 
>>tested and reviewed.
>>
>>5. Griffin compares the cols selected by using them in joins, do we plan to 
>>enhance to check if the user could select few cols other than the mentioned 
>>in joins and we need to compare the src col's value vs dest col's value lets 
>>say without the join and only by order by clause.
>>
>>
>>select col from src order by prikey
>>select col from tgt order by prikey
>>
>>src.col vs tgt.col instead of the join key . Please let us know.
>>
>>-- For this case, it depends on how would you like to map src.col with 
>>tgt.col, we don't need join key if you can elaborate mapping in sql since we 
>>are fully supporting sql.
>>
>>Thank you very much.
>>
>>
>>
>>Thanks & Regards
>>M. Vidyaranya
>>NDX
>>TCS - Nielsen Relationship
>>Mobile  : +91 9790591311
>
>
>
>【网易自营|30天无忧退货】仅售同款价1/4!MUJI制造商“2017秋冬舒适家居拖鞋系列”限时仅34.9元>>  
><http://you.163.com/item/detail?id=1165011&from=web_gg_mail_jiaobiao_9>
>
>

Reply via email to