Re: apache kylin upgrade question

2019-02-10 Thread nichunen
Hi 王林,


Please check http://kylin.apache.org/docs/howto/howto_upgrade.html


--


Best regards,

 

Ni Chunen / George




On 2019-02-11 14:43:14, "Wang Lin" <1059790...@qq.com> wrote:
>Hello:
>   How can I upgrade apache kylin from version 2.2.x to 2.6.x?
>   Could you provide some upgrade documentation?
>
>
>Thanks. Regards, Wang Lin.



Re: Hdfs Working directory usage

2019-02-10 Thread kdcool6932
Thanks a lot, Xiaoxiang. Really appreciate your support in imparting clarity
to us on this.

Thanks,
Ketan

Sent from my Samsung Galaxy smartphone.

 Original message 
From: Xiaoxiang Yu
Date: 11/02/2019 9:16 am (GMT+05:30)
To: dev@kylin.apache.org
Cc: Xiaoxiang Yu
Subject: Re: Hdfs Working directory usage

Re: Hdfs Working directory usage

2019-02-10 Thread Xiaoxiang Yu
Hi, Ketan.

This is what I find:

- cuboid
    - This dir contains the cuboid data; each row holds a dimensions array
and a MeasureAggregator array.
    - Its size depends on the cardinality of each column and is often very
large.
    - When a merge job completes, the cuboid files of all successfully merged
segments are deleted automatically.
- fact_distinct_columns
    - This dir contains the distinct values of each column.
    - It should be deleted after the current segment build job succeeds.
- hfile
    - This dir contains the data files that are bulk loaded into HBase.
    - It should be deleted after the current segment build job succeeds.
- rowkey_stats
    - Files under this dir are often very small; you may not need to delete
them yourself.
    - These files are used to partition the hfile.

I think you should update your auto-merge settings so that auto-merge runs
more often. If you find any mistakes, please let me know. Thank you!
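For reference, the auto-merge thresholds live in the cube descriptor JSON as
an array of time windows in milliseconds. A minimal sketch of the relevant
fragment (the cube name is hypothetical, and the field names are as I recall
them from the cube desc format, so please verify against your Kylin version):

```json
{
  "name": "my_cube_desc",
  "auto_merge_time_ranges": [604800000, 2419200000],
  "retention_range": 0
}
```

Here 604800000 ms is 7 days and 2419200000 ms is 28 days; a `retention_range`
of 0 keeps all history. Adding a smaller first window makes merges happen more
often, which keeps fewer small cuboid directories alive in the working dir.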



Best wishes,
Xiaoxiang Yu 
 

On [DATE], "[NAME]" <[ADDRESS]> wrote:

Hi team,
Any updates on the same ?  

Thanks,
Ketan

> On 01-Feb-2019, at 11:39 AM, ketan dikshit  wrote:
> 
> Hi Team,
> 
> We have a lot of data accumulated in our hdfs-working-directory, so we want
> to understand the usage of the following job data once the job has completed
> and the segment is successfully created.
> 
> cuboid
> fact_distinct_columns
> hfile
> rowkey_stats
> 
> Basically I need to understand the purpose of cuboid, fact_distinct_columns,
> hfile, and rowkey_stats after the job has built the cube segment (assuming we
> don't use any merging/auto-merging of segments on the cube later).
> 
> The space taken up by these data in the hdfs-working-dir is quite huge
> (affecting our costs), and it is not getting cleaned by the cleanup job
> (org.apache.kylin.tool.StorageCleanupJob). So we need to understand whether
> we can manually clean this up without running into issues later.
> 
> Thanks,
> Ketan@Exponential





Re: Kylin exist same segement

2019-02-10 Thread ShaoFeng Shi
Hi Bingmei,

Kylin doesn't allow two segments to have the same or an overlapping time
range, which is why you got the above error. You said you already have two
segments for the same range; could you please provide the Cube instance JSON?
Is your Kylin a multi-node cluster? If yes, how many "job" or "all" nodes are
in the cluster?
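As an illustration of the rule above, segment ranges behave like half-open
intervals, and two half-open intervals [s1, e1) and [s2, e2) overlap exactly
when s1 < e2 and s2 < e1. This is a self-contained sketch of that check, not
Kylin's actual implementation:

```java
// Sketch: overlap check on half-open [start, end) segment ranges.
// Illustrative only; Kylin's real validation lives in its segment classes.
public class SegmentOverlap {

    /** Two half-open ranges [s1, e1) and [s2, e2) overlap iff s1 < e2 && s2 < e1. */
    public static boolean overlaps(long s1, long e1, long s2, long e2) {
        return s1 < e2 && s2 < e1;
    }

    public static void main(String[] args) {
        // Identical ranges, as in the reported error, do overlap:
        System.out.println(overlaps(3L, 4L, 3L, 4L));   // true
        // Adjacent ranges share only a boundary and do not:
        System.out.println(overlaps(3L, 4L, 4L, 5L));   // false
    }
}
```

Note that building the exact same range twice produces two identical ranges,
which this predicate correctly reports as overlapping, matching the error
message in the original post.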

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Na Zhai wrote on Thursday, February 7, 2019 at 1:35 AM:

> Hi, XiebingMei.
>
> I cannot reproduce this case. Can you provide more
> information? When you built the segment for the third time, was it still in
> the build phase? If there is already a job for this segment, the system will
> raise this error.
>
> Sent from Mail for Windows 10
>
>
>
> 
> From: XiebingMei 
> Sent: Sunday, February 3, 2019 11:48:02 AM
> To: dev@kylin.apache.org
> Subject: Kylin exist same segement
>
> Hi Team,
>
> Kylin 2.5.1: I built the same cube segment
> (2019020303_2019020304) twice, and both builds succeeded. When I
> built the same segment a third time, the system alerted
>
> "Segments overlap: cube_name[2019020303_2019020304] and
> cube_name[2019020303_2019020304]"
>
> {"code":"999","data":null,"msg":"Segments overlap:
> cube_name[2019020303_2019020304] and
> cube_name[2019020303_2019020304]","stacktrace":"org.apache.kylin.rest.exception.InternalErrorException:
> Segments overlap: cube_name[2019020303_2019020304] and
> cube_name[2019020303_2019020304]\n
> at org.apache.kylin.rest.controller.CubeController.buildInternal(CubeController.java:402)\n
> at org.apache.kylin.rest.controller.CubeController.rebuild(CubeController.java:354)\n
> at org.apache.kylin.rest.controller.CubeController.build(CubeController.java:343)\n
> at sun.reflect.GeneratedMethodAccessor257.invoke(Unknown Source)\n
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n
> at java.lang.reflect.Method.invoke(Method.java:498)\n
> at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)\n
> at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)\n
> at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)\n
> at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)\n
> at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)\n
> at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)\n
> at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)\n
> at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)\n
> at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)\n
> at org.springframework.web.servlet.FrameworkServlet.doPut(FrameworkServlet.java:883)\n
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:653)\n
> at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)\n
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)\n
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)\n
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n
> at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)\n
> at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)\n
> at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)\n
> at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:317)\n
> at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.invoke(FilterSecurityInterceptor.java:127)\n
> at org.springframework.security.web.access.intercept.FilterSecurityInterceptor.doFilter(FilterSecurityInterceptor.java:91)\n
> at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n
> at org.springframework.security.web.access.ExceptionTranslationFilter.doFilter(ExceptionTranslationFilter.java:114)\n
> at org.springframework.security.web.FilterChainProxy$VirtualFilterChain.doFilter(FilterChainProxy.java:331)\n
> at

Re: Reply: kylin manual merge error

2019-02-10 Thread ShaoFeng Shi
From the source code where the NPE is thrown, it seems the cube statistics
file wasn't found in the Kylin meta store, i.e. your metadata is incomplete:

https://github.com/apache/kylin/blob/2.2.x/engine-mr/src/main/java/org/apache/kylin/engine/mr/steps/MergeStatisticsStep.java#L80
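To make the failure mode concrete: the merge step reads each merged segment's
statistics resource from the metadata store, and if that resource is missing
the lookup comes back null and dereferencing it produces exactly this kind of
NPE. This is a hypothetical, simplified guard, not Kylin's actual code, with a
made-up resource path, showing how a missing resource can be surfaced with a
clear message instead:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a metadata lookup with a descriptive failure
// instead of a NullPointerException. Not Kylin's real ResourceStore API.
public class StatsLookup {
    // Stands in for the metadata store: resource path -> statistics blob.
    private final Map<String, byte[]> store = new HashMap<>();

    public void put(String path, byte[] blob) { store.put(path, blob); }

    /** Returns the statistics blob, or throws a descriptive error if it is missing. */
    public byte[] getStatistics(String resourcePath) {
        byte[] blob = store.get(resourcePath);
        if (blob == null) {
            throw new IllegalStateException(
                    "Statistics resource not found in meta store: " + resourcePath
                    + " (segment statistics may have been deleted)");
        }
        return blob;
    }

    public static void main(String[] args) {
        StatsLookup lookup = new StatsLookup();
        // Hypothetical path; real Kylin statistics paths differ.
        lookup.put("/cube_statistics/my_cube/seg1.seq", new byte[]{42});
        System.out.println(lookup.getStatistics("/cube_statistics/my_cube/seg1.seq").length);
    }
}
```

The practical takeaway matches the diagnosis above: if the cube_statistics
data for an older segment was deleted, the merge step has nothing to read and
fails at that line.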

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Wang Lin <1059790...@qq.com> wrote on Thursday, January 31, 2019 at 5:11 PM:

> 2.2.0
>
>
>
>
> -- Original message --
> From: "Na Zhai";
> Sent: Thursday, January 31, 2019, 2:54 PM
> To: "dev@kylin.apache.org";
>
> Subject: Reply: kylin manual merge error
>
>
>
> Hi, wanglin.
>
> What's your Kylin version? There is an issue about auto merge:
> https://issues.apache.org/jira/browse/KYLIN-3718.
>
> But I think your error is not related to that issue. It may be caused by the
> deletion of the cube_statistics directory.
>
>
>
> Sent from Mail for Windows 10
>
>
>
> 
> From: Wang Lin <1059790...@qq.com>
> Sent: Monday, January 28, 2019 10:23:37 AM
> To: dev
> Subject: kylin manual merge error
>
> Hello:
>   I created a cube in Kylin and enabled auto merge, with merge periods of 7
> days and 28 days. However, auto merge never took effect, so I merged the cube
> manually. Merging the most recent days works fine, but merging older segments
> fails with the error below:
>
>
> Failing step: #2 Step Name: Merge Cuboid Statistics Duration:
> 0.01 mins Waiting: 0 seconds
> Error log: java.lang.NullPointerException
> at org.apache.kylin.engine.mr.steps.MergeStatisticsStep.doWork(MergeStatisticsStep.java:80)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:64)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:125)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:144)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> What could be the cause?
> Thanks


Re: kylin auto merge question

2019-02-10 Thread ShaoFeng Shi
Hi Lin,

Could you please describe the problem you found in detail? Thanks!

Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: shaofeng@kyligence.io
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: user-subscr...@kylin.apache.org
Join Kylin dev mail group: dev-subscr...@kylin.apache.org




Wang Lin <1059790...@qq.com> wrote on Thursday, January 31, 2019 at 11:57 AM:

> Hello:
>   I found that Kylin's auto merge did not take effect, so I traced through
> the source code. Auto merge is triggered here:
>
> public void updateOnNewSegmentReady(String cubeName) {
>     final KylinConfig kylinConfig = KylinConfig.getInstanceFromEnv();
>     String serverMode = kylinConfig.getServerMode();
>     if (Constant.SERVER_MODE_JOB.equals(serverMode.toLowerCase())
>             || Constant.SERVER_MODE_ALL.equals(serverMode.toLowerCase())) {
>         CubeInstance cube = getCubeManager().getCube(cubeName);
>         if (cube != null) {
>             CubeSegment seg = cube.getLatestBuiltSegment();
>             if (seg != null && seg.getStatus() == SegmentStatusEnum.READY) {
>                 keepCubeRetention(cubeName);
>                 mergeCubeSegment(cubeName);
>             }
>         }
>     }
> }
>
> private void mergeCubeSegment(String cubeName) {
>     CubeInstance cube = getCubeManager().getCube(cubeName);
>     if (!cube.needAutoMerge())
>         return;
>
>     synchronized (CubeService.class) {
>         try {
>             cube = getCubeManager().getCube(cubeName);
>             SegmentRange offsets = cube.autoMergeCubeSegments();
>             if (offsets != null) {
>                 CubeSegment newSeg = getCubeManager().mergeSegments(cube, null,
>                         offsets, true); // This call takes the merge path for
>                                         // streaming cubes, but my cube is not a
>                                         // streaming cube, so an exception is thrown.
>                 logger.debug("Will submit merge job on " + newSeg);
>                 DefaultChainedExecutable job =
>                         EngineFactory.createBatchMergeJob(newSeg, "SYSTEM");
>                 getExecutableManager().addJob(job);
>             } else {
>                 logger.debug("Not ready for merge on cube " + cubeName);
>             }
>         } catch (IOException e) {
>             logger.error("Failed to auto merge cube " + cubeName, e);
>         }
>     }
> }
>
> public CubeSegment mergeSegments(CubeInstance cube, TSRange tsRange,
>         SegmentRange segRange, boolean force) throws IOException {
>     if (cube.getSegments().isEmpty())
>         throw new IllegalArgumentException("Cube " + cube + " has no segments");
>
>     checkInputRanges(tsRange, segRange);
>     checkBuildingSegment(cube);
>     checkCubeIsPartitioned(cube);
>
>     if (cube.getSegments().getFirstSegment().isOffsetCube()) {
>         // offset cube, merge by date range?
>         if (segRange == null && tsRange != null) {
>             Pair pair = cube.getSegments(SegmentStatusEnum.READY)
>                     .findMergeOffsetsByDateRange(tsRange, Long.MAX_VALUE);
>             if (pair == null)
>                 throw new IllegalArgumentException("Find no segments to merge by "
>                         + tsRange + " for cube " + cube);
>             segRange = new SegmentRange(pair.getFirst().getSegRange().start,
>                     pair.getSecond().getSegRange().end);
>         }
>         tsRange = null;
>         Preconditions.checkArgument(segRange != null);
>     } else {
>         segRange = null;
>         Preconditions.checkArgument(tsRange != null); // The exception is thrown here.
>     }
>
> Tracing through this code, auto merge appears to have a problem. I am using
> the Kylin 2.2.x source. Is this a bug?
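The branch Wang Lin traced can be sketched in isolation. In this simplified
model (hypothetical names, not Kylin's actual classes), a non-offset batch
cube requires a non-null time range; calling the merge with only a
SegmentRange, as the auto-merge path above does, trips the precondition:

```java
// Simplified model of the precondition branch in mergeSegments.
// Ranges are modeled as Long[]{start, end}; null means "not supplied".
public class MergePrecondition {

    public static String chooseMergeRange(boolean isOffsetCube, Long[] tsRange, Long[] segRange) {
        if (isOffsetCube) {
            // Streaming (offset) cube: merge is driven by the segment/offset range.
            if (segRange == null)
                throw new IllegalArgumentException("offset cube needs segRange");
            return "merge by offsets [" + segRange[0] + "," + segRange[1] + ")";
        } else {
            // Batch cube: segRange is discarded; a time range must be present.
            if (tsRange == null)
                throw new IllegalArgumentException("batch cube needs tsRange");
            return "merge by time [" + tsRange[0] + "," + tsRange[1] + ")";
        }
    }

    public static void main(String[] args) {
        // Auto-merge as traced above: batch cube, tsRange == null, only an
        // offset range supplied -> the precondition fails.
        try {
            chooseMergeRange(false, null, new Long[]{0L, 100L});
        } catch (IllegalArgumentException e) {
            System.out.println("precondition failed: " + e.getMessage());
        }
    }
}
```

This mirrors the report: autoMergeCubeSegments returns a SegmentRange, which
mergeSegments accepts only for offset (streaming) cubes, so a batch cube hits
the tsRange precondition.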