Hello Ketan,

Thanks for reporting this.

Firstly, the HBase table name is generated at the moment a segment is
created; seeing the table name does not mean the table already exists in
HBase. Only when the build job reaches the "Create HTable" step does Kylin
actually create the table in HBase. So for a "NEW" segment, it is expected
that the table does not exist yet.
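
To double-check on the HBase side, a minimal sketch with the plain HBase Java
client (not Kylin code; the ZooKeeper quorum below is a placeholder, and the
table name is taken from your segment JSON) can tell whether the HTable really
exists:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CheckSegmentHTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // Placeholder quorum; use the value from your hbase-site.xml.
        conf.set("hbase.zookeeper.quorum", "zk-host:2181");
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Table name copied from the segment's storage_location_identifier.
            TableName htable = TableName.valueOf("KYLIN_Z3C9IU2QNY");
            // For a "NEW" segment this is expected to print "false" until the
            // build job has passed the "Create HTable" step.
            System.out.println(htable + " exists: " + admin.tableExists(htable));
        }
    }
}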

For the problem "some of the existing segment tables get deleted from
HBase": as far as I know, Kylin normally won't delete an HTable (unless it
is merging segments or running the StorageCleanupJob). Could you please
check the Kylin and HBase logs to see when and how the table was dropped?
If a table is deleted, something should be logged, and the logs will be
helpful for analyzing the issue.
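
If it helps, a small sketch like the one below (again not part of Kylin; the
log path is only an assumption for illustration) can scan a Kylin or HBase
master log for disable/drop/delete lines that mention the table; the
timestamps on the matching lines show when the drop happened:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class FindTableDropEvents {
    public static void main(String[] args) throws IOException {
        // Table name from the segment JSON; the log path is a placeholder.
        String table = "KYLIN_Z3C9IU2QNY";
        String logFile = "/var/log/hbase/hbase-master.log";
        Pattern action = Pattern.compile("disable|drop|delete", Pattern.CASE_INSENSITIVE);

        try (Stream<String> lines = Files.lines(Paths.get(logFile))) {
            lines.filter(line -> line.contains(table))
                 .filter(line -> action.matcher(line).find())
                 .forEach(System.out::println);
        }
    }
}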


2018-04-06 16:18 GMT+08:00 ketan dikshit <kdcool6...@yahoo.com.invalid>:

> Hi Team,
> We recently upgraded from Kylin 2.1 to Kylin 2.3.1
>
> Since then we are facing some issues with our cube building pipeline.
>
> While building new segments, sometimes (not always) some of the existing
> segment tables get deleted from HBase.
> For example, here is the segment JSON for an empty segment; it shows the
> table name, but this table gets dropped from HBase.
> {
>          "uuid":"90972f20-d64d-4224-a1f1-cdb6a0ddb69c",
>          "name":"20180402110000_20180402120000",
>          "storage_location_identifier":"KYLIN_Z3C9IU2QNY",
>          "date_range_start":1522666800000,
>          "date_range_end":1522670400000,
>          "source_offset_start":0,
>          "source_offset_end":0,
>          "status":"NEW",
>          "size_kb":0,
>          "input_records":0,
>          "input_records_size":0,
>          "last_build_time":0,
>          "last_build_job_id":null,
>          "create_time_utc":1522700641299,
>          "cuboid_shard_nums":{  <>
>
>          },
>          "total_shards":0,
>          "blackout_cuboids":[  <>
>
>          ],
>          "binary_signature":null,
>          "dictionaries":{  <>
>
>          },
>          "snapshots":null,
>          "rowkey_stats":[  <>
>
>          ]
>       }
>
> This data loss is actually having a very heavy business impact, as we are
> always going back, restoring the previous day's snapshots, and building the
> new segments again, hoping it doesn't fail.
> Here are my kylin.properties:
> kylin.web.timezone=US/Pacific
> kylin.metadata.url=kylin2.1MetadataProduction@hbase
> kylin.storage.url=hbase
> kylin.env.hdfs-working-dir=/tmp/kylin-2.1-prod
> kylin.engine.mr.reduce-input-mb=300
> kylin.server.mode=all
> kylin.job.use-remote-cli=false
> kylin.job.remote-cli-working-dir=/tmp/kylin-2.1
> kylin.job.max-concurrent-jobs=10
> kylin.engine.mr.yarn-check-interval-seconds=10
> kylin.source.hive.database-for-flat-table=tmp_kylin
> kylin.storage.hbase.table-name-prefix=KYLIN_
> kylin.storage.hbase.compression-codec=lz4
> kylin.storage.hbase.region-cut-gb=3
> kylin.storage.hbase.min-region-count=1
> kylin.storage.hbase.max-region-count=500
> kylin.storage.partition.max-scan-bytes=16106127360
> kylin.storage.hbase.coprocessor-mem-gb=6
> kylin.security.profile=testing
>
> kylin.query.cache-enabled=true
> kylin.query.cache-threshold-duration=500
> kylin.query.cache-threshold-scan-count=10240
> kylin.storage.hbase.scan-cache-rows=4096
>
>
> Any idea how and why this corruption might happen? How can data even get
> dropped while building some other segments?
>
> Thanks,
> Ketan@Exponential




-- 
Best regards,

Shaofeng Shi 史少锋
