[ 
https://issues.apache.org/jira/browse/KYLIN-4683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173887#comment-17173887
 ] 

ASF GitHub Bot commented on KYLIN-4683:
---------------------------------------

hit-lacus edited a comment on pull request #1351:
URL: https://github.com/apache/kylin/pull/1351#issuecomment-671059337


   Before we scale up the topic partitions, we should do the following steps:
   1. Disable the cube, so that all consumption tasks are cancelled.
   2. Use the REST API 
`http://${KYLIN_INSTANCE_IP}7236/kylin/cubes/view/${CUBE_NAME}/instancejson` to 
check the `CubeInstance.json`. You will find that each READY segment has a property 
named `stream_source_checkpoint`. Here is part of its content: 
   ```json
   {
         "uuid": "aab58181-29f7-0593-d7f8-ca9d9b97d49b",
         "name": "20200809200000_20200809210000",
         "storage_location_identifier": "APACHE:REALTIME_OLAP_ISH2P3HM30",
         "date_range_start": 1597003200000,
         "date_range_end": 1597006800000,
         "source_offset_start": 0,
         "source_offset_end": 0,
         "status": "READY",
         "size_kb": 102984,
         "is_merged": false,
         "estimate_ratio": null,
         "input_records": 118216,
         "input_records_size": 0,
         "last_build_time": 1596982095554,
         "last_build_job_id": "8f3660b8-72f7-7aac-e3bd-1e06ea30d8e2",
         "create_time_utc": 1596981735548,
         "cuboid_shard_nums": {},
         "total_shards": 1,
         "blackout_cuboids": [],
         "binary_signature": null,
         "dictionaries": {
           "USERACTIONSTREAM.DEVICE_BRAND": 
"/dict/APACHE.USERACTIONSTREAM/DEVICE_BRAND/afb80e63-4fef-d575-3483-1e0314bf4bef.dict",
           "USERACTIONSTREAM.DEVIDE_TYPE": 
"/dict/APACHE.USERACTIONSTREAM/DEVIDE_TYPE/61bf9051-3cdd-ec82-bc0f-1bb3226bb411.dict",
           "USERACTIONSTREAM.LOCATION_CITY": 
"/dict/APACHE.USERACTIONSTREAM/LOCATION_CITY/292ce446-62a1-0b12-e4d0-13bc249b0dbe.dict",
           "USERACTIONSTREAM.PAGE_ID": 
"/dict/APACHE.USERACTIONSTREAM/PAGE_ID/12fa188b-0f89-db3d-560a-8aea5b970349.dict",
           "USERACTIONSTREAM.NETWORK_TYPE": 
"/dict/APACHE.USERACTIONSTREAM/NETWORK_TYPE/99d38dea-25ef-c03d-73b7-19a6fda2ce4c.dict",
           "USERACTIONSTREAM.STR_MINUTE_SECOND": 
"/dict/APACHE.USERACTIONSTREAM/STR_MINUTE_SECOND/2b23e9aa-dee0-b88f-7e3e-0a3d74d89f89.dict",
           "USERACTIONSTREAM.ACT_TYPE": 
"/dict/APACHE.USERACTIONSTREAM/ACT_TYPE/e77f008d-6bd8-bc1f-ba94-ff62764c3e14.dict",
           "USERACTIONSTREAM.UID": 
"/dict/APACHE.USERACTIONSTREAM/UID/665546b1-424a-fc42-a35b-1a58fcd1fb5f.dict"
         },
         "snapshots": null,
         "rowkey_stats": [
           [
             "ACT_TYPE",
             10,
             1
           ],
           [
             "NETWORK_TYPE",
             4,
             1
           ],
           [
             "LOCATION_CITY",
             7,
             1
           ],
           [
             "STR_MINUTE_SECOND",
             3600,
             2
           ],
           [
             "PAGE_ID",
             50,
             1
           ],
           [
             "DEVICE_BRAND",
             5,
             1
           ],
           [
             "DEVIDE_TYPE",
             60,
             1
           ],
           [
             "UID",
             19543,
             2
           ]
         ],
         "stream_source_checkpoint": 
"{\"0\":363171,\"1\":363198,\"2\":363249,\"3\":363171,\"4\":363199,\"5\":363250,\"6\":363170,\"7\":363196,\"8\":363250,\"9\":363170}"
       }
   ```
   
   3. Check `${KYLIN_RECEIVER_HOME}/logs/kylin_streaming_receiver.log`, where you can 
find output like the following:
   
   ```
   2020-08-09 22:34:05,381 INFO  [UserAnalysisCube_channel] 
storage.StreamingSegmentManager:645 : Print check point for cube 
UserAnalysisCube 
,CheckPoint{sourceConsumePosition='{"0":381733,"1":381763,"2":381820,"3":381733,"4":381764,"5":381820,"6":381732,"7":381760,"8":381821,"9":381733}',
 persistedIndexes={1597006800000=13, 1597010400000=8}, 
longLatencyInfo=LongLatencyInfo{longLatencyEventCnts={20200808000000_20200808010000=3,
 20200808060000_20200808070000=2, 20200809000000_20200809010000=2, 
20200809060000_20200809070000=2}, totalLongLatencyEventCnt=9}, 
segmentSourceStartPosition={1597006800000={"0":363171,"1":363198,"2":363249,"3":363171,"4":363199,"5":363250,"6":363170,"7":363196,"8":363250,"9":363170},
 
1597010400000={"0":375013,"1":375040,"2":375099,"3":375013,"4":375041,"5":375099,"6":375012,"7":375038,"8":375100,"9":375013}},
 checkPointTime=1596983645381, totalCount=3817689, checkPointCount=5801}
   ```
   
   These logs indicate that the data ingested and indexed on the receiver side is 
checkpointed at the following position:
   
   ```json
   {
       "0":375013,
       "1":375040,
       "2":375099,
       "3":375013,
       "4":375041,
       "5":375099,
       "6":375012,
       "7":375038,
       "8":375100,
       "9":375013
   }
   ```
   
   4. When the cube is disabled, the data ingested and indexed on the receiver side is 
removed, so after scaling up we expect the receiver to continue its consumption 
after the following position:
   ```json
   {
       "0":363171,
       "1":363198,
       "2":363249,
       "3":363171,
       "4":363199,
       "5":363250,
       "6":363170,
       "7":363196,
       "8":363250,
       "9":363170
   }
   ```
   
   5. So let's check whether this is correct.
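
   One way to do the check in step 5 is to decode `stream_source_checkpoint` and compare it, partition by partition, with the position the receiver reports after the cube is enabled again. The following is only a minimal sketch, not part of Kylin: the helper names `decode_checkpoint` and `diff_positions` are made up for illustration, the segment is a trimmed copy of the one in step 2, and `reported` is copied by hand from the `segmentSourceStartPosition` entry for `1597006800000` in the log of step 3. Note that `stream_source_checkpoint` is a JSON string stored inside JSON, so it has to be decoded twice.

   ```python
   import json
   from typing import Dict, Optional, Tuple

   Positions = Dict[int, int]  # Kafka partition id -> offset


   def decode_checkpoint(segment: dict) -> Positions:
       """Decode the doubly encoded `stream_source_checkpoint` of one segment."""
       raw = segment["stream_source_checkpoint"]  # still a JSON string here
       return {int(p): int(o) for p, o in json.loads(raw).items()}


   def diff_positions(
       expected: Positions, actual: Positions
   ) -> Dict[int, Tuple[Optional[int], Optional[int]]]:
       """Return {partition: (expected, actual)} for every partition that differs.

       A partition present on only one side shows up with None on the other side.
       """
       partitions = sorted(set(expected) | set(actual))
       return {
           p: (expected.get(p), actual.get(p))
           for p in partitions
           if expected.get(p) != actual.get(p)
       }


   # Trimmed copy of the READY segment from step 2 (all other fields omitted).
   segment_json = r'''
   {
     "name": "20200809200000_20200809210000",
     "status": "READY",
     "stream_source_checkpoint": "{\"0\":363171,\"1\":363198,\"2\":363249,\"3\":363171,\"4\":363199,\"5\":363250,\"6\":363170,\"7\":363196,\"8\":363250,\"9\":363170}"
   }
   '''
   expected = decode_checkpoint(json.loads(segment_json))

   # Copied by hand from segmentSourceStartPosition[1597006800000] in step 3.
   reported = {0: 363171, 1: 363198, 2: 363249, 3: 363171, 4: 363199,
               5: 363250, 6: 363170, 7: 363196, 8: 363250, 9: 363170}

   mismatches = diff_positions(expected, reported)
   print("mismatching partitions:", mismatches)
   ```

   After scaling up, the newly added partitions have no entry in the old checkpoint, so they would appear in the result with `None` as the expected offset. Whether that is acceptable depends on where the receiver is supposed to start on a new partition, which this sketch does not decide.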


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fail to consume kafka when partition number get larger
> ------------------------------------------------------
>
>                 Key: KYLIN-4683
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4683
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v3.0.2
>            Reporter: tianhui
>            Priority: Major
>         Attachments: image-2020-08-05-17-20-37-270.png
>
>
> I run a testing streaming cube with kafka. At first, the topic has 3 
> partitions, and the cube running smoothly. But after I alter kafka topic to 7 
> partitions, all receivers stop consuming. !image-2020-08-05-17-20-37-270.png!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
