Hi, lifei

After checking your model.json, I found you used "HOUR_START" as your 
partition_date_column, which is not correct. 
I think you should change it to "timestamp" and try again. 

Source code at 
https://github.com/apache/kylin/blob/master/source-kafka/src/main/java/org/apache/kylin/source/kafka/TimedJsonStreamParser.java#L111
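For reference, the relevant part of model.json would then look something like the sketch below (the table/column names and date format here are assumptions based on this thread, not copied from your model):

```json
{
  "partition_desc": {
    "partition_date_column": "TEST_WEATHER.WEATHER.TIMESTAMP",
    "partition_date_format": "yyyy-MM-dd HH:mm:ss",
    "partition_type": "APPEND"
  }
}
```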

If you find any mistake, please let me know.

----------------
Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

    Hello, I am evaluating Kylin and tried to join a streaming table and a
    Hive table, but got unexpected behavior.
    
    All the scripts can be found in
    https://gist.github.com/OstCollector/a4ac396e3169aa42a416d96db3021195
    (you may need to modify some scripts to match your environment)
    
    Environment: 
    Centos 7
    Hadoop on CDH-5.8
    dedicated Kafka-2.1 (not included in CDH)
    
    How to reproduce this problem:
    
    1. run gen_station.pl to generate dim table data
    2. run import-data.sh to build dim table in Hive
    3. run factdata.pl and pipe its output into kafka
    4. create tables TEST_WEATHER.STATION_INFO(hive)
    TEST_WEATHER.WEATHER(streaming) in Kylin
    5. create model and cube in Kylin, join WEATHER.STATION_ID = STATION.ID
    6. build the cube
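For context on step 3, a minimal sketch of what one JSON message piped into Kafka might look like (the field names are assumptions matching the WEATHER streaming table, not taken from the gist):

```python
import datetime
import json


def make_record(station_id, ts, temperature):
    # Hypothetical message shape for the WEATHER streaming table;
    # the "timestamp" field is the one a timestamp-based JSON parser keys on.
    return json.dumps({
        "station_id": station_id,
        "timestamp": ts.strftime("%Y-%m-%d %H:%M:%S"),
        "temperature": temperature,
    })


print(make_record("003032", datetime.datetime(2010, 1, 1, 7, 0, 0), 1706))
```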
    
    Expected behavior:
    The cube is built correctly and I can get data when querying.
    
    Actual behavior:
    On apache-kylin-2.6.0-bin-cdh57: build failed at step #2 (Create
    Intermediate Flat Hive Table)
    On apache-kylin-2.5.2-bin-cdh57: got empty cube
    
    I also tried this case without streaming, with the format of the
    timestamp column changed to "%Y-%m-%d %H:%M:%S", and an additional table
    storing the mapping of timestamp to {hour,day,month,year}_start.
    In that case, the cube was built as expected. 
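The derived *_start values in such a mapping table can be computed directly from the timestamp; a quick sketch in plain Python (not Kylin code):

```python
import datetime

ts = datetime.datetime(2009, 12, 31, 7, 0, 0)

# Truncate the timestamp to the start of each period
hour_start    = ts.replace(minute=0, second=0, microsecond=0)
day_start     = ts.date()
month_start   = day_start.replace(day=1)
quarter_start = day_start.replace(month=((ts.month - 1) // 3) * 3 + 1, day=1)
year_start    = day_start.replace(month=1, day=1)

# day/month/quarter/year starts for 2009-12-31:
# 2009-12-31, 2009-12-01, 2009-10-01, 2009-01-01
print(day_start, month_start, quarter_start, year_start)
```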
    
    
    In both failed cases, the intermediate fact table built on Hive in step
    #2 seems to have the wrong column order.
    E.g., on version 2.5.2-cdh57, the schema and contents of the temp table
    are shown below:
    
    CREATE EXTERNAL TABLE IF NOT EXISTS
    kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact
    (
    DAY_START date
    ,YEAR_START date
    ,STATION_ID string
    ,QUARTER_START date
    ,MONTH_START date
    ,TEMPERATURE bigint
    ,HOUR_START timestamp
    )
    STORED AS SEQUENCEFILE
    LOCATION
    
'hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact';
    ALTER TABLE
    kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact SET
    TBLPROPERTIES('auto.purge'='true');
    
    hive> select * from
    kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact limit
    10;
    OK
    NULL    2010-01-01      2010-01-01      2010-01-01      2010-01-01      NULL    NULL
    NULL    2009-01-01      2009-10-01      2009-12-01      2009-12-31      NULL    NULL
    NULL    2009-01-01      2009-10-01      2009-12-01      2009-12-31      NULL    NULL
    NULL    2009-01-01      2009-10-01      2009-12-01      2009-12-31      NULL    NULL
    NULL    2009-01-01      2009-10-01      2009-12-01      2009-12-31      NULL    NULL
    NULL    2010-01-01      2010-01-01      2010-01-01      2010-01-01      NULL    NULL
    NULL    2010-01-01      2010-01-01      2010-01-01      2010-01-01      NULL    NULL
    NULL    2009-01-01      2009-10-01      2009-12-01      2009-12-31      NULL    NULL
    NULL    2009-01-01      2009-10-01      2009-12-01      2009-12-31      NULL    NULL
    NULL    2010-01-01      2010-01-01      2010-01-01      2010-01-01      NULL    NULL
    Time taken: 0.421 seconds, Fetched: 10 row(s)
    
    While the content of the temp file is:
    # hdfs dfs -text
    
hdfs://hz-dev-hdfs-service/user/admin/kylin-2/kylin_metadata/kylin-5dbe40eb-55ba-2245-c0b5-1e9efcb67937/kylin_intermediate_weather_f32241e6_53c6_2949_b737_d9a88a4618df_fact/part-m-00001
    | head -n 10
    19/02/13 11:44:12 INFO zlib.ZlibFactory: Successfully loaded & initialized
    native-zlib library
    19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
    [.deflate]
    19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
    [.deflate]
    19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
    [.deflate]
    19/02/13 11:44:12 INFO compress.CodecPool: Got brand-new decompressor
    [.deflate]
        0030322010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:001706
        0075762010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:002605
        0113882010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:002963
        0214922010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:001769
        0303062010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:00432
        0377712010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:00808
        0443462010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:001400
        0500512010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:00342
        0537982010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:001587
        0597122010-01-012010-01-012010-01-012010-01-012010-01-01 07:00:00-1309
    (the '\x01' delimiter characters were not copied correctly)
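The NULLs in the query output are consistent with a column-order mismatch: the sequence file stores STATION_ID first, but the schema declares DAY_START (a date) first, so the first field fails to parse as its declared type and reads back as NULL. A rough illustration of the effect (hypothetical Python, not Hive internals):

```python
import datetime


def read_as_date(field):
    # Hive-like behavior: a value that cannot be parsed as the declared
    # type reads back as NULL (None here).
    try:
        return datetime.datetime.strptime(field, "%Y-%m-%d").date()
    except ValueError:
        return None


# Fields in the order actually written to the sequence file: station id first
fields = ["003032", "2010-01-01", "2010-01-01", "2010-01-01", "2010-01-01"]

# But the schema expects DAY_START (a date) in the first position
print(read_as_date(fields[0]))  # None -> shown as NULL by Hive
print(read_as_date(fields[1]))  # 2010-01-01
```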
    
    So what am I doing wrong?
    
    --
    Sent from: http://apache-kylin.74782.x6.nabble.com/
    
