1. To stop it from running as mapred local, I had to explicitly add the
following to the kylin job config; it wasn't taking it from HADOOP_HOME/etc/
<property>
<name>mapreduce.cluster.administrators</name>
<value>hadoop</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
2. Kylin was defaulting to 8020 for hdfs when making hbase requests, and
was using 9000 when making hive requests. I had to explicitly
set kylin.hbase.cluster.fs=hdfs://HOSTXX:9000/
I guess this is less of an issue with Kylin and more with what combination of
HBase, Hive, DFS and YARN people use. HDP solves it to a certain degree,
but it's not free of its own mess either. Compounded with the fact that
kylin goes through hbase to launch its tomcat binary, it's a bit of a
challenge to get all the ducks in a row.
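
For anyone who wants to sanity-check their own setup for the two problems above, here is a minimal sketch in Python. It is not part of Kylin. It assumes that the relevant settings are mapreduce.framework.name in the job config, fs.defaultFS in core-site.xml and hbase.rootdir in hbase-site.xml; the function names and the sample values (HOSTXX, the ports) are made up for illustration, and in practice you would read the real files from disk.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def read_hadoop_conf(xml_text):
    """Return {name: value} for each <property> in a Hadoop-style XML config."""
    props = {}
    for prop in ET.fromstring(xml_text).iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def hdfs_port(uri):
    """Port of an hdfs:// URI, or None when the URI does not name one."""
    return urlparse(uri).port


def check(job_conf, core_site, hbase_site):
    """Return a list of human-readable warnings (empty if nothing looks off)."""
    warnings = []
    # Hadoop falls back to "local" when mapreduce.framework.name is unset.
    framework = read_hadoop_conf(job_conf).get("mapreduce.framework.name", "local")
    if framework != "yarn":
        warnings.append("MR jobs would run as '%s', not on yarn" % framework)
    # Assumption: HBase and the default filesystem should agree on the HDFS port.
    fs_port = hdfs_port(read_hadoop_conf(core_site).get("fs.defaultFS", ""))
    hbase_port = hdfs_port(read_hadoop_conf(hbase_site).get("hbase.rootdir", ""))
    if fs_port != hbase_port:
        warnings.append("HDFS port mismatch: fs.defaultFS=%s, hbase.rootdir=%s"
                        % (fs_port, hbase_port))
    return warnings


# Hypothetical sample contents standing in for the real config files.
JOB_CONF = ("<configuration><property><name>mapreduce.framework.name</name>"
            "<value>yarn</value></property></configuration>")
CORE_SITE = ("<configuration><property><name>fs.defaultFS</name>"
             "<value>hdfs://HOSTXX:9000/</value></property></configuration>")
HBASE_SITE = ("<configuration><property><name>hbase.rootdir</name>"
              "<value>hdfs://HOSTXX:8020/hbase</value></property></configuration>")

if __name__ == "__main__":
    for line in check(JOB_CONF, CORE_SITE, HBASE_SITE):
        print(line)
```

With the sample values it reports only the port mismatch, since the job config already names yarn. An empty job config would additionally be flagged as running local.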
It would be good if the wider community shared the combinations of binary
versions that have worked for them "error free".
FYI - I am using HBase 1.1.1, Hadoop (fs, yarn) 2.5.1, Hive 0.14 and Kylin
1.3 prebuilt.
Another thing to note is in the docs at
http://kylin.apache.org/download/
They list the requirement as HBase 1.1.3+. There hasn't been any release
after 1.1.3 thus far, and even 1.1.3 itself is currently only available as
RC0. Is this a typo?
On Sun, Jan 10, 2016 at 5:08 PM, ShaoFeng Shi <[email protected]>
wrote:
> Hi Michael, I saw your original message but had no idea;
>
> I don't understand "realized that kylin by default was trying to do mapred
> local instead of on submitting to yarn clusters"; Kylin will load the
> default hadoop configs from the user's installation, so the user needs to
> make sure the server that runs Kylin has been properly configured with the
> cluster. You mentioned that "fixed that with params in the kylin.properties
> and yarn-site.xml", could you please elaborate on which parameter is related
> to this and what change you made?
>
> As for the HDFS port issue, I couldn't find where Kylin hard-codes 8020
> (except the test case configuration, which assumes a connection to an HDP
> sandbox). If this is a bug, could you file a JIRA with your environment
> information as well as the error logs? That would be helpful for
> understanding the issue, thank you.
>
>
>
> 2016-01-10 12:44 GMT+08:00 michael jones <[email protected]>:
>
> > After a bit of digging, I realized that kylin by default was trying to do
> > mapred local instead of on submitting to yarn clusters, fixed that with
> > params in the kylin.properties and yarn-site.xml. These errors were not
> > quite intuitive in the logs; they probably need more trace in debug mode.
> >
> > Another issue is with the default dfs port. From Hadoop 2.5 onwards the
> > frequently used port is 9000, while older versions default to 8020.
> >
> > In kylin 1.2 and 1.3, explicitly overriding the dfs port didn't work for
> > me, or I didn't find the right property to set.
> > Instead I restarted dfs to listen on 8020 instead of 9000. This solved the
> > subsequent issue.
> >
> > FYI for those who are using this configuration
> > Kylin 1.2 or 1.3
> > Hbase 1.1.X
> > Hadoop 2.5.x or 2.6.x
> >
> >
> >
> > On Fri, Jan 8, 2016 at 1:32 PM, michael jones <[email protected]>
> > wrote:
> >
> > > I am consistently getting an NPE on stage 2 of the build process, for
> > > any cube.
> > > The sessionId consistently comes as null. Most likely I am missing a
> > > setting in the conf. I am using the default kylin.properties of the
> > > prebuilt kylin 1.3 binary download.
> > >
> > >
> > >
> > > +------------------------------------------------------------------------------------------------------+
> > > | Extract Fact Table Distinct Columns |
> > > +------------------------------------------------------------------------------------------------------+
> > > 2016-01-08 13:26:24,895 INFO [pool-7-thread-1] Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
> > > 2016-01-08 13:26:24,896 INFO [pool-7-thread-1] jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, *sessionId=*
> > >
> > > ==> /opt/kylin/tomcat/logs/kylin_job.log <==
> > > [pool-7-thread-1]:[2016-01-08 13:26:24,905][ERROR][org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:109)] - error running Executable
> > > java.lang.NullPointerException
> > >     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(*MapReduceExecutable.java:79*)
> > >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> > >     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
> > >     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
> > >     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >     at java.lang.Thread.run(Thread.java:745)
> > >
> > >
> > >
> > >
> > > https://github.com/apache/kylin/blob/master/job/src/main/java/org/apache/kylin/job/common/MapReduceExecutable.java#L79
> > >
> > > Configuration conf = HadoopUtil.getCurrentConfiguration();
> > > Job job = new Cluster(conf).getJob(JobID.forName(mrJobId));
> > > if (job.getJobState() == JobStatus.State.FAILED) {
> > > //remove previous mr job info
> > > super.onExecuteStart(executableContext);
> > > }
> > >
> > > Either Kylin is not able to get the hadoop cluster or it is not able to
> > > submit the job.
> > > Has anyone encountered this issue before? I couldn't find in the docs
> > > how to set yarn cluster endpoint properties in kylin.properties
> > >
> > > Thanks
> > >
> > >
> > >
> > >
> > >
> > > On Thu, Jan 7, 2016 at 4:59 PM, michael jones <[email protected]>
> > > wrote:
> > >
> > >> Kylin - apache-kylin-1.3-HBase-1.1-SNAPSHOT-bin
> > >> Hbase - 1.1.2
> > >> Hadoop - 2.7.1
> > >> Hive 1.2.1 - Updated hive metastore to use mysqldb (instead of derby)
> > >> Hive/HBase work fine by themselves; tested by creating tables etc.
> > >>
> > >> Kylin Sandbox/Local setup.
> > >>
> > >> Kylin starts properly except for this error
> > >> java.io.FileNotFoundException:
> > >> /opt/hadoop-2.7.1/contrib/capacity-scheduler/*.jar (No such file or
> > >> directory)
> > >> (I figure this is OK to ignore, since the default task scheduler in
> > >> local mode should work fine?)
> > >>
> > >> After running bin/sample.sh
> > >>
> > >> I tried to follow the steps in the tutorial. When building the
> > >> cube, stage 1 succeeded.
> > >> #2 Step Name: Extract Fact Table Distinct Columns
> > >> fails with an NPE.
> > >>
> > >> -conf /opt/kylin/conf/kylin_job_conf.xml -cubename kylin_sales_cube
> > >> -output /kylin/kylin_metadata/kylin-1575f8f6-98df-44da-8ac2-b480ccd0380a/kylin_sales_cube/fact_distinct_columns
> > >> -jobname Kylin_Fact_Distinct_Columns_kylin_sales_cube_Step
> > >> -tablename default.kylin_intermediate_kylin_sales_cube_desc_19700101000000_20160112000000_1575f8f6_98df_44da_8ac2_b480ccd0380a
> > >>
> > >> java.lang.NullPointerException
> > >>     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:79)
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> > >>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
> > >>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
> > >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >>     at java.lang.Thread.run(Thread.java:745)
> > >>
> > >>
> > >> kylin.log
> > >>
> > >> [http-bio-7070-exec-4]:[2016-01-07 15:31:37,797][DEBUG][org.apache.kylin.rest.filter.KylinApiFilter.logRequest(KylinApiFilter.java:120)] - REQUEST: REQUESTER=ADMIN;REQ_TIME=GMT-08:00 2016-01-07 15:31:37;URI=/kylin/api/jobs/1575f8f6-98df-44da-8ac2-b480ccd0380a/resume;METHOD=PUT;QUERY_STRING=null;PAYLOAD=;RESP_STATUS=200;
> > >> [pool-6-thread-1]:[2016-01-07 15:32:17,880][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:102)] - CubingJob{id=1575f8f6-98df-44da-8ac2-b480ccd0380a, name=kylin_sales_cube - 19700101000000_20160112000000 - BUILD - GMT-08:00 2016-01-07 14:39:59, state=READY} prepare to schedule
> > >> [pool-6-thread-1]:[2016-01-07 15:32:17,880][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:106)] - CubingJob{id=1575f8f6-98df-44da-8ac2-b480ccd0380a, name=kylin_sales_cube - 19700101000000_20160112000000 - BUILD - GMT-08:00 2016-01-07 14:39:59, state=READY} scheduled
> > >> +------------------------------------------------------------------------------------------------------+
> > >> [pool-6-thread-1]:[2016-01-07 15:32:17,880][INFO][org.apache.kylin.job.impl.threadpool.DefaultScheduler$FetcherRunner.run(DefaultScheduler.java:112)] - Job Fetcher: 0 running, 1 actual running, 1 ready, 0 others
> > >> | kylin_sales_cube - 19700101000000_20160112000000 - BUILD - GMT-08:00 2016-01-07 14:39:59 |
> > >> +------------------------------------------------------------------------------------------------------+
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,883][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:200)] - Saving resource /execute_output/1575f8f6-98df-44da-8ac2-b480ccd0380a (Store kylin_metadata@hbase)
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,885][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:241)] - job id:1575f8f6-98df-44da-8ac2-b480ccd0380a from READY to RUNNING
> > >>
> > >> +------------------------------------------------------------------------------------------------------+
> > >> | Extract Fact Table Distinct Columns |
> > >> +------------------------------------------------------------------------------------------------------+
> > >> 2016-01-07 15:32:17,917 INFO [pool-7-thread-2] jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,918][ERROR][org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:109)] - error running Executable
> > >> java.lang.NullPointerException
> > >>     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:79)
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> > >>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
> > >>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
> > >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >>     at java.lang.Thread.run(Thread.java:745)
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,923][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:200)] - Saving resource /execute_output/1575f8f6-98df-44da-8ac2-b480ccd0380a-01 (Store kylin_metadata@hbase)
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,927][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:200)] - Saving resource /execute_output/1575f8f6-98df-44da-8ac2-b480ccd0380a-01 (Store kylin_metadata@hbase)
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,929][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:241)] - job id:1575f8f6-98df-44da-8ac2-b480ccd0380a-01 from READY to ERROR
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,929][ERROR][org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:109)] - error running Executable
> > >> org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:111)
> > >>     at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:51)
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:107)
> > >>     at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:130)
> > >>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > >>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > >>     at java.lang.Thread.run(Thread.java:745)
> > >> Caused by: java.lang.NullPointerException
> > >>     at org.apache.kylin.job.common.MapReduceExecutable.onExecuteStart(MapReduceExecutable.java:79)
> > >>     at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:106)
> > >>     ... 6 more
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,933][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:200)] - Saving resource /execute_output/1575f8f6-98df-44da-8ac2-b480ccd0380a (Store kylin_metadata@hbase)
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,937][DEBUG][org.apache.kylin.common.persistence.ResourceStore.putResource(ResourceStore.java:200)] - Saving resource /execute_output/1575f8f6-98df-44da-8ac2-b480ccd0380a (Store kylin_metadata@hbase)
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,938][INFO][org.apache.kylin.job.manager.ExecutableManager.updateJobOutput(ExecutableManager.java:241)] - job id:1575f8f6-98df-44da-8ac2-b480ccd0380a from RUNNING to ERROR
> > >> [pool-7-thread-2]:[2016-01-07 15:32:17,938][ERROR][org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:134)] - ExecuteException job:1575f8f6-98df-44da-8ac2-b480ccd0380a
> > >> org.apache.kylin.job.exception.ExecuteException: org.apache.kylin.job.exception.ExecuteException: java.lang.NullPointerException
> > >>
> > >>
> > >>
> > >>
> > >> Am I missing some setup in the kylin properties, or missing jars?
> > >>
> > >>
> > >>
> > >
> >
>
>
>
> --
> Best regards,
>
> Shaofeng Shi
>