Thanks, ShaoFeng,

That's just what we need. I will use the 'to-add' part as a reference, along
with the online 'howto'; that explains it all.

Thanks,
ke...@exponential.com

Sent from my Samsung Galaxy smartphone.

-------- Original message --------
From: ShaoFeng Shi <shaofeng...@apache.org>
Date: 30/10/2018 7:49 am (GMT+05:30)
To: dev <dev@kylin.apache.org>
Subject: Re: Merge Job in inconsistent state

I'm updating the document for modifying metadata in a selective way, but
Jenkins has a problem today. Here is the
to-add part in https://kylin.apache.org/docs/howto/howto_backup_metadata.html:

## Restore metadata selectively (Recommended)

If only a couple of metadata files have changed, the administrator can pick
just those files to restore, without having to overwrite all the metadata.
Compared to a full recovery, this approach is more efficient and safer, so it
is recommended.

Create a new empty directory, and then create subdirectories in it according
to the location of the metadata files to restore; for example, to restore a
Cube instance, you should create a "cube" subdirectory:

{% highlight Groff markup %}
mkdir /path/to/restore_new
mkdir /path/to/restore_new/cube
{% endhighlight %}

Copy the metadata file to be restored to this new directory:

{% highlight Groff markup %}
cp meta_backups/meta_2016_06_10_20_24_50/cube/kylin_sales_cube.json /path/to/restore_new/cube/
{% endhighlight %}

At this point, you can modify/fix the metadata manually.

Restore from this directory:

{% highlight Groff markup %}
cd $KYLIN_HOME
./bin/metastore.sh restore /path/to/restore_new
{% endhighlight %}

Only the files in the folder will be uploaded to the Kylin metastore.
Similarly, after the recovery is finished, click the Reload Metadata button on
the Web UI to flush the cache.

kdcool6932 <kdcool6...@yahoo.com.invalid> wrote on Mon, Oct 29, 2018 at 7:54 PM:

> Thanks guys, really appreciate the prompt response.
>
> @ShaoFeng, yes, we
> have the data in Hive (or we can load it if needed), and we will be
> rebuilding those segments. Do we have any mail thread, document, or blog to
> refer to for manually editing and restoring metadata? That would be a great
> help, as we often get into situations like this and don't want to take
> risks with 3+ years (120+ TB) of data in HBase for Kylin.
>
> Again, really appreciate the help provided.
>
> Thanks,
> Ketan
>
> Sent from my Samsung Galaxy smartphone.
>
> -------- Original message --------
> From: ShaoFeng Shi <shaofeng...@apache.org>
> Date: 29/10/2018 12:36 pm (GMT+05:30)
> To: dev <dev@kylin.apache.org>
> Subject: Re: Merge Job in inconsistent state
>
> It is a
> known issue; the auto-merge is triggered on each segment change. Maybe Kylin
> should not trigger the auto-merge on canceling/deleting a job/segment?
>
> But you can keep that error job/segment; it won't impact queries. The only
> consequence is that an error job stays there.
>
> Do you know the root cause of "No input paths specified in job"? Did you
> delete some folders from HDFS?
>
> If you have the source data in Hive, you can rebuild those segments. You can
> take a backup of the metadata, dump the metadata to local disk, copy that
> cube's JSON to a clean folder, edit it to delete these segments, and then
> restore the metadata from the clean folder (same structure, e.g.
> /cube/yourcube.json). After the restore, build the segments for the missing
> date range.
>
> Chao Long <wayn...@qq.com> wrote on Mon, Oct 29, 2018 at 2:12 PM:
>
>> Hi Ketan,
>>
>> As this merge job is triggered automatically, it starts again when you
>> discard it. If you don't want this job to be triggered again, you can
>> remove the "Auto Merge" related configuration on the cube design page
>> until the problem is resolved or fixed (if it's a bug).
>>
>> This is the merging segment [20181005080000_20181012170000]. The exception
>> occurred during the merge job, so the segment is in an incorrect state.
>>
>>> Segment: 20181005080000_20181012170000
>>> //**This segment Table was deleted from HBase (somehow, we don’t have the reason)**//
>>> Start Time: 2018-10-05 08:00:00
>>> End Time: 2018-10-12 17:00:00
>>> Source Count: 0
>>> HBase Table: KYLIN_CFLY2CKMCU
>>> Region Count: 3
>>> Size: less than 1 MB
>>
>> To identify the root cause, you may provide more log around the error
>> message.
>>
>> ------------------ Original Message ------------------
>> From: "ketan dikshit" <kdcool6...@yahoo.com.INVALID>
>> Sent: Sunday, October 28, 2018, 12:25 AM
>> To: "dev" <dev@kylin.apache.org>
>> Subject: Merge Job in inconsistent state
>>
>> Hi Team,
>>
>> We are using Kylin 2.3.1, and in the merge job (which gets triggered
>> automatically) we are getting this error in the Merge Cuboid Data step:
>>
>> java.io.IOException: No input paths specified in job
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:239)
>>     at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
>>     at org.apache.kylin.engine.mr.common.AbstractHadoopJob.getTotalMapInputMB(AbstractHadoopJob.java:622)
>>     at org.apache.kylin.storage.hbase.steps.HBaseMROutput2Transition$HBaseMergeMROutputFormat.configureJobOutput(HBaseMROutput2Transition.java:166)
>>     at org.apache.kylin.engine.mr.steps.MergeCuboidJob.run(MergeCuboidJob.java:82)
>>     at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:130)
>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:67)
>>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:300)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> result code:2
>>
>> As it is a merge
>> job, whenever I discard this job, it starts again automatically. So we
>> need a way (if there is one) of permanently discarding this job.
>>
>> Also, in case this can be done by changing the metadata for the cube
>> segment, do let me know how it can be done (even if I have to delete/remove
>> the segments for the days listed below).
>>
>> We are open to all options (except dropping the cube, as it has more than
>> 1 year of data and is crucial for the organisation). We have been stuck on
>> this for some days now; help would really be appreciated.
>>
>> Merge Thresholds:
>> 8 (Hours)
>> 1 (Days)
>> 7 (Days)
>> 15 (Days)
>>
>> Merge Job (start and end):
>> MERGE CUBE - XXXX - 20181005080000_20181012170000
>>
>> Here is the list of segments:
>>
>> Segment: 20181005080000_20181006080000
>> Start Time: 2018-10-05 08:00:00
>> End Time: 2018-10-06 08:00:00
>> Source Count: 14899048
>> HBase Table: KYLIN_R1MUK56K71
>> Region Count: 1
>> Size: 860 MB
>>
>> Segment: 20181005080000_20181012170000
>> //**This segment Table was deleted from HBase (somehow, we don’t have the reason)**//
>> Start Time: 2018-10-05 08:00:00
>> End Time: 2018-10-12 17:00:00
>> Source Count: 0
>> HBase Table: KYLIN_CFLY2CKMCU
>> Region Count: 3
>> Size: less than 1 MB
>>
>> Segment: 20181006080000_20181008000000
>> Start Time: 2018-10-06 08:00:00
>> End Time: 2018-10-08 00:00:00
>> Source Count: 24455686
>> HBase Table: KYLIN_0KH6PHTEM2
>> Region Count: 1
>> Size: 1.0498 GB
>>
>> Segment: 20181008000000_20181009000000
>> Start Time: 2018-10-08 00:00:00
>> End Time: 2018-10-09 00:00:00
>> Source Count: 14882090
>> HBase Table: KYLIN_V1CC4LDSIR
>> Region Count: 1
>> Size: 598 MB
>>
>> Segment: 20181009000000_20181010000000
>> Start Time: 2018-10-09 00:00:00
>> End Time: 2018-10-10 00:00:00
>> Source Count: 16245847
>> HBase Table: KYLIN_4A44K2VJEU
>> Region Count: 1
>> Size: 628 MB
>>
>> Segment: 20181010000000_20181010080000
>> Start Time: 2018-10-10 00:00:00
>> End Time: 2018-10-10 08:00:00
>> Source Count: 5213022
>> HBase Table: KYLIN_EHO316VC7M
>> Region Count: 1
>> Size: 397 MB
>>
>> Segment: 20181010080000_20181010090000
>> Start Time: 2018-10-10 08:00:00
>> End Time: 2018-10-10 09:00:00
>> Source Count: 865722
>> HBase Table: KYLIN_I9LEJ2JDZ8
>> Region Count: 1
>> Size: 181 MB
>>
>> Segment: 20181010090000_20181010100000
>> Start Time: 2018-10-10 09:00:00
>> End Time: 2018-10-10 10:00:00
>> Source Count: 859127
>> HBase Table: KYLIN_9IBX3W4UNL
>> Region Count: 1
>> Size: 180 MB
>>
>> Segment: 20181010100000_20181010110000
>> Start Time: 2018-10-10 10:00:00
>> End Time: 2018-10-10 11:00:00
>> Source Count: 855752
>> HBase Table: KYLIN_HRDJ16B3O8
>> Region Count: 1
>> Size: 179 MB
>>
>> Segment: 20181010110000_20181010120000
>> Start Time: 2018-10-10 11:00:00
>> End Time: 2018-10-10 12:00:00
>> Source Count: 849363
>> HBase Table: KYLIN_6BFHFA5LU1
>> Region Count: 1
>> Size: 178 MB
>>
>> Segment: 20181010120000_20181010130000
>> Start Time: 2018-10-10 12:00:00
>> End Time: 2018-10-10 13:00:00
>> Source Count: 851162
>> HBase Table: KYLIN_H41KZXUIRN
>> Region Count: 1
>> Size: 177 MB
>>
>> Segment: 20181010130000_20181010140000
>> Start Time: 2018-10-10 13:00:00
>> End Time: 2018-10-10 14:00:00
>> Source Count: 836481
>> HBase Table: KYLIN_8RXPI7T0PA
>> Region Count: 1
>> Size: 173 MB
>>
>> Segment: 20181010140000_20181010150000
>> Start Time: 2018-10-10 14:00:00
>> End Time: 2018-10-10 15:00:00
>> Source Count: 780337
>> HBase Table: KYLIN_7L3WHR3ZQY
>> Region Count: 1
>> Size: 164 MB
>>
>> Segment: 20181010150000_20181010160000
>> Start Time: 2018-10-10 15:00:00
>> End Time: 2018-10-10 16:00:00
>> Source Count: 723669
>> HBase Table: KYLIN_RM0ICHV5EP
>> Region Count: 1
>> Size: 155 MB
>>
>> Segment: 20181010160000_20181011170000
>> Start Time: 2018-10-10 16:00:00
>> End Time: 2018-10-11 17:00:00
>> Source Count: 17476745
>> HBase Table: KYLIN_Y8ZOSIWNJP
>> Region Count: 1
>> Size: 941 MB
>>
>> Segment: 20181011170000_20181012170000
>> Start Time: 2018-10-11 17:00:00
>> End Time: 2018-10-12 17:00:00
>> Source Count: 15485276
>> HBase Table: KYLIN_RWTQZFY6J4
>> Region Count: 1
>> Size: 887 MB
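
A quick way to see how the failed merge segment relates to the others in the list above is to compare the time ranges encoded in the segment names. The following is only a sketch: it assumes the names are `<start>_<end>` with each half formatted as `yyyyMMddHHmmss` (as they appear in this thread), and the helper names `parse_segment` and `tiles_range` are made up for illustration, not part of Kylin.

```python
from datetime import datetime

# Assumption: segment names are <start>_<end>, each half yyyyMMddHHmmss.
FMT = "%Y%m%d%H%M%S"


def parse_segment(name):
    """Split a segment name into (start, end) datetimes."""
    start, end = name.split("_")
    return datetime.strptime(start, FMT), datetime.strptime(end, FMT)


def tiles_range(merged, parts):
    """Return True if `parts` cover `merged` exactly, with no gap or overlap."""
    merged_start, merged_end = parse_segment(merged)
    cursor = merged_start
    for start, end in sorted(parse_segment(p) for p in parts):
        if start != cursor:
            return False
        cursor = end
    return cursor == merged_end


# The failed merge segment and the 15 other segments listed in the thread.
MERGED = "20181005080000_20181012170000"
PARTS = [
    "20181005080000_20181006080000",
    "20181006080000_20181008000000",
    "20181008000000_20181009000000",
    "20181009000000_20181010000000",
    "20181010000000_20181010080000",
    "20181010080000_20181010090000",
    "20181010090000_20181010100000",
    "20181010100000_20181010110000",
    "20181010110000_20181010120000",
    "20181010120000_20181010130000",
    "20181010130000_20181010140000",
    "20181010140000_20181010150000",
    "20181010150000_20181010160000",
    "20181010160000_20181011170000",
    "20181011170000_20181012170000",
]

print(tiles_range(MERGED, PARTS))
```

If the check passes, the other segments span the same range as the merge segment without gaps. Whether anything still needs rebuilding depends on which segment entries and HBase tables actually survive, which this name-only check cannot tell.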
>>
>> Thanks,
>> Ketan@Exponential
>
> --
> Best regards,
> Shaofeng Shi 史少锋

--
Best regards,
Shaofeng Shi 史少锋
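
The "copy that cube's JSON to a clean folder, edit it to delete these segments" step from the thread could be scripted roughly as follows. This is a hedged sketch, not an official Kylin tool: it assumes the cube JSON keeps its segments in a top-level `segments` list whose entries carry the range in a `name` field, and the function name `prepare_selective_restore`, the demo file contents, and the `status` values are invented for illustration. Check a real backup file before relying on it.

```python
import json
import tempfile
from pathlib import Path


def prepare_selective_restore(backup_dir, restore_dir, cube_name, drop_segments):
    """Write <restore_dir>/cube/<cube_name>.json without the named segments.

    Returns the path written and the number of segment entries removed.
    """
    src = Path(backup_dir) / "cube" / f"{cube_name}.json"
    cube = json.loads(src.read_text(encoding="utf-8"))

    # Assumption: segments live in a top-level "segments" list and each
    # entry identifies itself by "name".
    segments = cube.get("segments", [])
    kept = [s for s in segments if s.get("name") not in drop_segments]
    cube["segments"] = kept

    # Mirror the metastore layout; fail if a cube/ folder already exists
    # there, so the restore folder starts out clean.
    dst_dir = Path(restore_dir) / "cube"
    dst_dir.mkdir(parents=True, exist_ok=False)
    dst = dst_dir / src.name
    dst.write_text(json.dumps(cube, indent=2), encoding="utf-8")
    return dst, len(segments) - len(kept)


# Demo on a throwaway directory with a made-up, heavily trimmed cube file.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "meta_backup"
    (backup / "cube").mkdir(parents=True)
    (backup / "cube" / "yourcube.json").write_text(json.dumps({
        "name": "yourcube",
        "segments": [
            {"name": "20181005080000_20181006080000", "status": "READY"},
            {"name": "20181005080000_20181012170000", "status": "NEW"},
        ],
    }), encoding="utf-8")

    written, removed = prepare_selective_restore(
        backup, Path(tmp) / "restore_new", "yourcube",
        {"20181005080000_20181012170000"},
    )
    print(written.name, removed)
```

The script only prepares the folder. The upload itself would still be done with `./bin/metastore.sh restore /path/to/restore_new`, followed by Reload Metadata in the Web UI, as described in the how-to excerpt. Reviewing the written JSON by hand before restoring is a sensible precaution.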