Thanks, ShaoFeng,

That's just what we need. I will use the 'to-add' part as a reference, along with the online 'howto'; that explains it all.

Thanks,
ke...@exponential.com

Sent from my Samsung Galaxy smartphone.

-------- Original message --------
From: ShaoFeng Shi <shaofeng...@apache.org>
Date: 30/10/2018 7:49 am (GMT+05:30)
To: dev <dev@kylin.apache.org>
Subject: Re: Merge Job in inconsistent state

I'm updating the document on modifying metadata selectively, but Jenkins has a problem today. Here is the part to be added to https://kylin.apache.org/docs/howto/howto_backup_metadata.html:

## Restore metadata selectively (Recommended)

If only a couple of metadata files have changed, the administrator can pick just these files to restore, without having to overwrite all the metadata. Compared to a full recovery, this approach is more efficient and safer, so it is recommended.

Create a new empty directory, and then create subdirectories in it according to the location of the metadata files to restore; for example, to restore a Cube instance, you should create a "cube" subdirectory:

{% highlight Groff markup %}
mkdir /path/to/restore_new
mkdir /path/to/restore_new/cube
{% endhighlight %}

Copy the metadata file to be restored to this new directory:

{% highlight Groff markup %}
cp meta_backups/meta_2016_06_10_20_24_50/cube/kylin_sales_cube.json /path/to/restore_new/cube/
{% endhighlight %}

At this point, you can modify/fix the metadata manually.

Restore from this directory:

{% highlight Groff markup %}
cd $KYLIN_HOME
./bin/metastore.sh restore /path/to/restore_new
{% endhighlight %}

Only the files in the folder will be uploaded to the Kylin metastore. Similarly, after the recovery is finished, click the Reload Metadata button on the Web UI to flush the cache.

kdcool6932 <kdcool6...@yahoo.com.invalid> wrote on Mon, Oct 29, 2018 at 7:54 PM:

> Thanks guys, really appreciate the prompt response.
>
> @ShaoFeng, yes, we
> have the data in Hive (or we can load it if needed), and we will be rebuilding those segments. Is there a mail thread, document, or blog we can refer to for manually editing and restoring metadata? That would be a great help, as we often get into situations like this and we don't want to take risks with 3+ years (120+ TB) of data in HBase for Kylin.
>
> Again, really appreciate the help provided.
>
> Thanks,
> Ketan
>
> Sent from my Samsung Galaxy smartphone.
>
> -------- Original message --------
> From: ShaoFeng Shi <shaofeng...@apache.org>
> Date: 29/10/2018 12:36 pm (GMT+05:30)
> To: dev <dev@kylin.apache.org>
> Subject: Re: Merge Job in inconsistent state
>
> It is a known issue; the auto-merge was triggered on each segment change. Maybe Kylin should not trigger the auto-merge on canceling/deleting a job/segment?
>
> But you can keep that error job/segment; it won't impact queries. The only downside is that an error job stays there.
>
> Do you know the root cause of "No input paths specified in job"? Did you delete some folders from HDFS?
>
> If you have the source data in Hive, you can rebuild those segments. You can take a backup of the metadata, dump the metadata to local disk, copy that cube's JSON to a clean folder, edit it to delete these segments, and then restore the metadata from the clean folder (same structure, e.g. /cube/yourcube.json). After the restore, build the segments for the missing date range.
>
> Chao Long <wayn...@qq.com> wrote on Mon, Oct 29, 2018 at 2:12 PM:
>
> Hi Ketan,
>
> As this merge job is triggered automatically, it starts again when you discard it. If you don't want this job to be triggered again, you can remove the "Auto Merge" related configuration on the cube design page until the problem is resolved or fixed (if it's a bug).
>
> This is the merging segment [20181005080000_20181012170000]. The exception occurred during the merge job, so the segment is in an incorrect state.
>
>    Segment: 20181005080000_20181012170000 //**This segment Table was deleted from HBase (somehow, we don’t have the reason)**//
>    Start Time: 2018-10-05 08:00:00
>    End Time: 2018-10-12 17:00:00
>    Source Count: 0
>    HBase Table: KYLIN_CFLY2CKMCU
>    Region Count: 3
>    Size: less than 1 MB
>
> To identify the root cause, you may provide more of the log around the error message.
>
> ------------------ Original message ------------------
> From: "ketan dikshit" <kdcool6...@yahoo.com.INVALID>
> Sent: Sunday, October 28, 2018, 12:25 AM
> To: "dev" <dev@kylin.apache.org>
> Subject: Merge Job in inconsistent state
>
> Hi Team,
>
> We are using Kylin 2.3.1, and in the merge job (which gets triggered automatically) we are getting this error in the Merge Cuboid Data step:
>
> java.io.IOException: No input paths specified in job
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:239)
>         at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:59)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:387)
>         at org.apache.kylin.engine.mr.common.AbstractHadoopJob.getTotalMapInputMB(AbstractHadoopJob.java:622)
>         at org.apache.kylin.storage.hbase.steps.HBaseMROutput2Transition$HBaseMergeMROutputFormat.configureJobOutput(HBaseMROutput2Transition.java:166)
>         at org.apache.kylin.engine.mr.steps.MergeCuboidJob.run(MergeCuboidJob.java:82)
>         at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:130)
>         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>         at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:67)
>         at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
>         at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:300)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
>
> result code:2
>
> As it is a merge
> job, whenever I discard this job, it starts again automatically. So we need a way (if there is one) of permanently discarding this job.
>
> Also, if this can be done by changing the metadata for the cube segment, do let me know how (even if I have to delete/remove the segments for the days listed below). We are open to all options (except dropping the cube, as it has more than 1 year of data and is crucial for the organisation). We have been stuck on this for some days now; help would really be appreciated.
>
> Merge Thresholds:
> 8 (Hours)
> 1 (Days)
> 7 (Days)
> 15 (Days)
>
> Merge Job (start and end):
> MERGE CUBE - XXXX - 20181005080000_20181012170000
>
> Here is the list of
> segments:
>
> Segment: 20181005080000_20181006080000
> Start Time: 2018-10-05 08:00:00
> End Time: 2018-10-06 08:00:00
> Source Count: 14899048
> HBase Table: KYLIN_R1MUK56K71
> Region Count: 1
> Size: 860 MB
>
> Segment: 20181005080000_20181012170000 //**This segment Table was deleted from HBase (somehow, we don’t have the reason)**//
> Start Time: 2018-10-05 08:00:00
> End Time: 2018-10-12 17:00:00
> Source Count: 0
> HBase Table: KYLIN_CFLY2CKMCU
> Region Count: 3
> Size: less than 1 MB
>
> Segment: 20181006080000_20181008000000
> Start Time: 2018-10-06 08:00:00
> End Time: 2018-10-08 00:00:00
> Source Count: 24455686
> HBase Table: KYLIN_0KH6PHTEM2
> Region Count: 1
> Size: 1.0498 GB
>
> Segment: 20181008000000_20181009000000
> Start Time: 2018-10-08 00:00:00
> End Time: 2018-10-09 00:00:00
> Source Count: 14882090
> HBase Table: KYLIN_V1CC4LDSIR
> Region Count: 1
> Size: 598 MB
>
> Segment: 20181009000000_20181010000000
> Start Time: 2018-10-09 00:00:00
> End Time: 2018-10-10 00:00:00
> Source Count: 16245847
> HBase Table: KYLIN_4A44K2VJEU
> Region Count: 1
> Size: 628 MB
>
> Segment: 20181010000000_20181010080000
> Start Time: 2018-10-10 00:00:00
> End Time: 2018-10-10 08:00:00
> Source Count: 5213022
> HBase Table: KYLIN_EHO316VC7M
> Region Count: 1
> Size: 397 MB
>
> Segment: 20181010080000_20181010090000
> Start Time: 2018-10-10 08:00:00
> End Time: 2018-10-10 09:00:00
> Source Count: 865722
> HBase Table: KYLIN_I9LEJ2JDZ8
> Region Count: 1
> Size: 181 MB
>
> Segment: 20181010090000_20181010100000
> Start Time: 2018-10-10 09:00:00
> End Time: 2018-10-10 10:00:00
> Source Count: 859127
> HBase Table: KYLIN_9IBX3W4UNL
> Region Count: 1
> Size: 180 MB
>
> Segment: 20181010100000_20181010110000
> Start Time: 2018-10-10 10:00:00
> End Time: 2018-10-10 11:00:00
> Source Count: 855752
> HBase Table: KYLIN_HRDJ16B3O8
> Region Count: 1
> Size: 179 MB
>
> Segment: 20181010110000_20181010120000
> Start Time: 2018-10-10 11:00:00
> End Time: 2018-10-10 12:00:00
> Source Count: 849363
> HBase Table: KYLIN_6BFHFA5LU1
> Region Count: 1
> Size: 178 MB
>
> Segment: 20181010120000_20181010130000
> Start Time: 2018-10-10 12:00:00
> End Time: 2018-10-10 13:00:00
> Source Count: 851162
> HBase Table: KYLIN_H41KZXUIRN
> Region Count: 1
> Size: 177 MB
>
> Segment: 20181010130000_20181010140000
> Start Time: 2018-10-10 13:00:00
> End Time: 2018-10-10 14:00:00
> Source Count: 836481
> HBase Table: KYLIN_8RXPI7T0PA
> Region Count: 1
> Size: 173 MB
>
> Segment: 20181010140000_20181010150000
> Start Time: 2018-10-10 14:00:00
> End Time: 2018-10-10 15:00:00
> Source Count: 780337
> HBase Table: KYLIN_7L3WHR3ZQY
> Region Count: 1
> Size: 164 MB
>
> Segment: 20181010150000_20181010160000
> Start Time: 2018-10-10 15:00:00
> End Time: 2018-10-10 16:00:00
> Source Count: 723669
> HBase Table: KYLIN_RM0ICHV5EP
> Region Count: 1
> Size: 155 MB
>
> Segment: 20181010160000_20181011170000
> Start Time: 2018-10-10 16:00:00
> End Time: 2018-10-11 17:00:00
> Source Count: 17476745
> HBase Table: KYLIN_Y8ZOSIWNJP
> Region Count: 1
> Size: 941 MB
>
> Segment: 20181011170000_20181012170000
> Start Time: 2018-10-11 17:00:00
> End Time: 2018-10-12 17:00:00
> Source Count: 15485276
> HBase Table: KYLIN_RWTQZFY6J4
> Region Count: 1
> Size: 887 MB
>
> Thanks,
> Ketan@Exponential
>
> --
> Best regards,
> Shaofeng Shi 史少锋

--
Best regards,
Shaofeng Shi 史少锋
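
The manual edit described in this thread (copy the cube's JSON to a clean folder, delete the broken segment from it, then restore from that folder) could be sketched in Python roughly as follows. This is only an illustration, not an official Kylin tool: the helper name `drop_segment` is hypothetical, and it assumes the cube JSON has a top-level "segments" list whose entries carry a "name" field such as 20181005080000_20181012170000. Check this against your own dumped file before relying on it.

```python
import json
from pathlib import Path


def drop_segment(cube_json: Path, segment_name: str) -> int:
    """Remove every segment called `segment_name` from a cube JSON file.

    Assumption (not confirmed in this thread): the file has a top-level
    "segments" list and each entry has a "name" field.
    Returns the number of segments removed; the file is rewritten only
    when at least one segment was removed.
    """
    cube = json.loads(cube_json.read_text(encoding="utf-8"))
    segments = cube.get("segments", [])
    kept = [seg for seg in segments if seg.get("name") != segment_name]
    removed = len(segments) - len(kept)
    if removed:
        cube["segments"] = kept
        cube_json.write_text(json.dumps(cube, indent=2), encoding="utf-8")
    return removed
```

The idea would be to run it only on the copy in the clean folder (for example /path/to/restore_new/cube/yourcube.json), after taking a full metadata backup, and then to restore from that folder with ./bin/metastore.sh restore /path/to/restore_new as described above. A return value of 0 means nothing matched and the file was left untouched.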
