Please double-check it; the error message is clear, and it is worth searching for it together with Spark + Hive. If possible, we suggest using the sequence file format (the default config) for the intermediate Hive table.
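For example, a minimal sketch of the corresponding kylin.properties change (assuming the TEXTFILE override quoted below is what you want to undo; SEQUENCEFILE is Kylin's default value for this key):

    # revert the intermediate flat table to the default sequence file format
    kylin.source.hive.flat-table-storage-format=SEQUENCEFILE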
Best regards,

Shaofeng Shi 史少锋
Apache Kylin PMC
Work email: [email protected]
Kyligence Inc: https://kyligence.io/

Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
Join Kylin user mail group: [email protected]
Join Kylin dev mail group: [email protected]

Kang-Sen Lu <[email protected]> wrote on Monday, December 3, 2018 at 9:33 PM:

> Hi, Shaofeng:
>
> Thanks for the reply.
>
> This is a line in my kylin.properties:
>
> kylin.source.hive.flat-table-storage-format=TEXTFILE
>
> I copied hive-site.xml into spark/conf and tried to resume the cube build:
> (cp /etc/hive2/2.5.6.0-40/0/hive-site.xml spark/conf)
>
> The cube build still failed; the stderr log is as follows:
>
> 18/12/03 08:27:02 INFO metastore.ObjectStore: Initialized ObjectStore
> 18/12/03 08:27:02 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
> 18/12/03 08:27:02 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
> 18/12/03 08:27:02 INFO metastore.HiveMetaStore: Added admin role in metastore
> 18/12/03 08:27:02 INFO metastore.HiveMetaStore: Added public role in metastore
> 18/12/03 08:27:02 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
> 18/12/03 08:27:02 INFO metastore.HiveMetaStore: 0: get_all_databases
> 18/12/03 08:27:02 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_all_databases
> 18/12/03 08:27:02 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
> 18/12/03 08:27:02 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_functions: db=default pat=*
> 18/12/03 08:27:02 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
> 18/12/03 08:27:03 INFO session.SessionState: Created local directory: /data1/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0098/container_e05_1543422353836_0098_02_000001/tmp/yarn
> 18/12/03 08:27:03 INFO session.SessionState: Created local directory: /data1/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0098/container_e05_1543422353836_0098_02_000001/tmp/1dc9dffd-a306-4929-a387-833486436fb8_resources
> 18/12/03 08:27:03 INFO session.SessionState: Created HDFS directory: /tmp/hive/zettics/1dc9dffd-a306-4929-a387-833486436fb8
> 18/12/03 08:27:03 INFO session.SessionState: Created local directory: /data1/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0098/container_e05_1543422353836_0098_02_000001/tmp/yarn/1dc9dffd-a306-4929-a387-833486436fb8
> 18/12/03 08:27:03 INFO session.SessionState: Created HDFS directory: /tmp/hive/zettics/1dc9dffd-a306-4929-a387-833486436fb8/_tmp_space.db
> 18/12/03 08:27:03 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/data1/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0098/container_e05_1543422353836_0098_02_000001/spark-warehouse
> 18/12/03 08:27:03 INFO metastore.HiveMetaStore: 0: get_database: default
> 18/12/03 08:27:03 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_database: default
> 18/12/03 08:27:03 INFO metastore.HiveMetaStore: 0: get_database: global_temp
> 18/12/03 08:27:03 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_database: global_temp
> 18/12/03 08:27:03 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
> 18/12/03 08:27:03 INFO execution.SparkSqlParser: Parsing command: zetticsdw.kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef
> 18/12/03 08:27:03 INFO metastore.HiveMetaStore: 0: get_table : db=zetticsdw tbl=kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef
> 18/12/03 08:27:03 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_table : db=zetticsdw tbl=kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef
> 18/12/03 08:27:03 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef' not found in database 'zetticsdw';
> java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef' not found in database 'zetticsdw';
>     at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>     at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:636)
> Caused by: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef' not found in database 'zetticsdw';
>     at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:74)
>     at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:74)
>     at scala.Option.getOrElse(Option.scala:121)
>     at org.apache.spark.sql.hive.client.HiveClient$class.getTable(HiveClient.scala:74)
>     at org.apache.spark.sql.hive.client.HiveClientImpl.getTable(HiveClientImpl.scala:78)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable$1.apply(HiveExternalCatalog.scala:118)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable$1.apply(HiveExternalCatalog.scala:118)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable(HiveExternalCatalog.scala:117)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:628)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:628)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.getTable(HiveExternalCatalog.scala:627)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:124)
>     at org.apache.spark.sql.hive.HiveSessionCatalog.lookupRelation(HiveSessionCatalog.scala:70)
>     at org.apache.spark.sql.SparkSession.table(SparkSession.scala:586)
>     at org.apache.spark.sql.SparkSession.table(SparkSession.scala:582)
>     at org.apache.kylin.engine.spark.SparkUtil.hiveRecordInputRDD(SparkUtil.java:157)
>     at org.apache.kylin.engine.spark.SparkFactDistinct.execute(SparkFactDistinct.java:186)
>     at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>     ... 6 more
> 18/12/03 08:27:03 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_968d6f19_8f2b_38d7_69a2_bea7278058ef' not found in database 'zetticsdw';)
> 18/12/03 08:27:03 INFO spark.SparkContext: Invoking stop() from shutdown hook
> 18/12/03 08:27:03 INFO server.ServerConnector: Stopped Spark@56c07a8e{HTTP/1.1}{0.0.0.0:0}
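(A quick diagnostic sketch for the error above: check from the Hive side whether the intermediate table actually exists. The table name is taken from the log; note that Kylin normally drops the intermediate table once a job finishes or is discarded, so this check is only meaningful while the job is still in the failed state.)

    hive -e "USE zetticsdw; SHOW TABLES LIKE 'kylin_intermediate_ma_aggs_topn_cube_test*';"

If Hive does list the table but the Spark job still cannot find it, the log hints at why: it reports a local file:/... spark-warehouse path, and the Nov 30 log further down shows "Using direct SQL, underlying DB is DERBY", which suggests the Spark application is falling back to an embedded Derby metastore instead of the cluster's Hive metastore — again pointing at hive-site.xml not being picked up from spark/conf.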
> *From:* ShaoFeng Shi <[email protected]>
> *Sent:* Sunday, December 02, 2018 2:04 AM
> *To:* user <[email protected]>
> *Subject:* Re: anybody used spark to build cube in kylin 2.5.1?
>
> Hi Kang-sen,
>
> When the intermediate table's file format is not sequence file, Kylin uses the Hive catalog to parse the data into an RDD. In that case, it needs the "hive-site.xml" in the spark/conf folder. Please confirm whether this is the case; if so, put the file there and then try again.
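(For reference, a minimal sketch of that step. The hive-site.xml path is the one Kang-Sen quotes earlier in this thread for this HDP 2.5.6.0-40 cluster, and $KYLIN_HOME/spark is assumed to be the Spark that Kylin submits jobs with; adjust both for your environment.)

    cp /etc/hive2/2.5.6.0-40/0/hive-site.xml $KYLIN_HOME/spark/conf/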
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: [email protected]
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: [email protected]
> Join Kylin dev mail group: [email protected]
>
> Kang-Sen Lu <[email protected]> wrote on Saturday, December 1, 2018 at 12:30 AM:
>
> Hi, Shaofeng:
>
> Your suggestion made some progress. Now step 3 of the cube build goes further but fails with another problem. Here is the stderr log:
>
> 18/11/30 11:14:20 INFO spark.SparkFactDistinct: counter path hdfs://anovadata6.anovadata.local:8020/user/zettics/kylin/25x/anova_kylin_25x_metadata/kylin-26646d80-3923-8ce4-1972-d24d197bcef7/ma_aggs_topn_cube_test/counter
> 18/11/30 11:14:20 WARN spark.SparkContext: Using an existing SparkContext; some configuration may not take effect.
> 18/11/30 11:14:20 INFO internal.SharedState: Warehouse path is 'file:/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0093/container_e05_1543422353836_0093_02_000001/spark-warehouse'.
> 18/11/30 11:14:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@68871c45{/SQL,null,AVAILABLE,@Spark}
> 18/11/30 11:14:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3071483{/SQL/json,null,AVAILABLE,@Spark}
> 18/11/30 11:14:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@53f1ff78{/SQL/execution,null,AVAILABLE,@Spark}
> 18/11/30 11:14:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@740d6f25{/SQL/execution/json,null,AVAILABLE,@Spark}
> 18/11/30 11:14:20 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6e2af876{/static/sql,null,AVAILABLE,@Spark}
> 18/11/30 11:14:20 INFO hive.HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
> 18/11/30 11:14:21 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
> 18/11/30 11:14:21 INFO metastore.ObjectStore: ObjectStore, initialize called
> 18/11/30 11:14:21 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
> 18/11/30 11:14:21 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
> 18/11/30 11:14:22 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
> 18/11/30 11:14:24 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
> 18/11/30 11:14:24 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
> 18/11/30 11:14:24 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
> 18/11/30 11:14:24 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
> 18/11/30 11:14:25 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
> 18/11/30 11:14:25 INFO metastore.ObjectStore: Initialized ObjectStore
> 18/11/30 11:14:25 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
> 18/11/30 11:14:25 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: Added admin role in metastore
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: Added public role in metastore
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: 0: get_all_databases
> 18/11/30 11:14:25 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_all_databases
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
> 18/11/30 11:14:25 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_functions: db=default pat=*
> 18/11/30 11:14:25 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
> 18/11/30 11:14:25 INFO session.SessionState: Created local directory: /hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0093/container_e05_1543422353836_0093_02_000001/tmp/yarn
> 18/11/30 11:14:25 INFO session.SessionState: Created local directory: /hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0093/container_e05_1543422353836_0093_02_000001/tmp/0cd659c1-1104-4364-a9fb-878539d9208c_resources
> 18/11/30 11:14:25 INFO session.SessionState: Created HDFS directory: /tmp/hive/zettics/0cd659c1-1104-4364-a9fb-878539d9208c
> 18/11/30 11:14:25 INFO session.SessionState: Created local directory: /hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0093/container_e05_1543422353836_0093_02_000001/tmp/yarn/0cd659c1-1104-4364-a9fb-878539d9208c
> 18/11/30 11:14:25 INFO session.SessionState: Created HDFS directory: /tmp/hive/zettics/0cd659c1-1104-4364-a9fb-878539d9208c/_tmp_space.db
> 18/11/30 11:14:25 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is file:/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0093/container_e05_1543422353836_0093_02_000001/spark-warehouse
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: 0: get_database: default
> 18/11/30 11:14:25 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_database: default
> 18/11/30 11:14:25 INFO metastore.HiveMetaStore: 0: get_database: global_temp
> 18/11/30 11:14:25 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_database: global_temp
> 18/11/30 11:14:25 WARN metastore.ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
> 18/11/30 11:14:25 INFO execution.SparkSqlParser: Parsing command: zetticsdw.kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69
> 18/11/30 11:14:26 INFO metastore.HiveMetaStore: 0: get_table : db=zetticsdw tbl=kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69
> 18/11/30 11:14:26 INFO HiveMetaStore.audit: ugi=zettics ip=unknown-ip-addr cmd=get_table : db=zetticsdw tbl=kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69
> 18/11/30 11:14:26 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69' not found in database 'zetticsdw';
> java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69' not found in database 'zetticsdw';
>     at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:42)
>     at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:636)
> Caused by: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69' not found in database 'zetticsdw';
>     at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:74)
>     at org.apache.spark.sql.hive.client.HiveClient$$anonfun$getTable$1.apply(HiveClient.scala:74)
>     at scala.Option.getOrElse(Option.scala:121)
>     at org.apache.spark.sql.hive.client.HiveClient$class.getTable(HiveClient.scala:74)
>     at org.apache.spark.sql.hive.client.HiveClientImpl.getTable(HiveClientImpl.scala:78)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable$1.apply(HiveExternalCatalog.scala:118)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable$1.apply(HiveExternalCatalog.scala:118)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.org$apache$spark$sql$hive$HiveExternalCatalog$$getRawTable(HiveExternalCatalog.scala:117)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:628)
>     at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$getTable$1.apply(HiveExternalCatalog.scala:628)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>     at org.apache.spark.sql.hive.HiveExternalCatalog.getTable(HiveExternalCatalog.scala:627)
>     at org.apache.spark.sql.hive.HiveMetastoreCatalog.lookupRelation(HiveMetastoreCatalog.scala:124)
>     at org.apache.spark.sql.hive.HiveSessionCatalog.lookupRelation(HiveSessionCatalog.scala:70)
>     at org.apache.spark.sql.SparkSession.table(SparkSession.scala:586)
>     at org.apache.spark.sql.SparkSession.table(SparkSession.scala:582)
>     at org.apache.kylin.engine.spark.SparkUtil.hiveRecordInputRDD(SparkUtil.java:157)
>     at org.apache.kylin.engine.spark.SparkFactDistinct.execute(SparkFactDistinct.java:186)
>     at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>     ... 6 more
> 18/11/30 11:14:26 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.RuntimeException: error execute org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Table or view 'kylin_intermediate_ma_aggs_topn_cube_test_c870139e_7a00_f5e8_4c6c_bad29be12b69' not found in database 'zetticsdw';)
> 18/11/30 11:14:26 INFO spark.SparkContext: Invoking stop() from shutdown hook
>
> Kang-sen
>
> *From:* ShaoFeng Shi <[email protected]>
> *Sent:* Friday, November 30, 2018 8:53 AM
> *To:* user <[email protected]>
> *Subject:* Re: anybody used spark to build cube in kylin 2.5.1?
>
> A solution is to put a "java-opts" file in the spark/conf folder, adding the 'hdp.version' configuration, like this:
>
> cat /usr/local/spark/conf/java-opts
> -Dhdp.version=2.4.0.0-169
>
> Best regards,
>
> Shaofeng Shi 史少锋
> Apache Kylin PMC
> Work email: [email protected]
> Kyligence Inc: https://kyligence.io/
>
> Apache Kylin FAQ: https://kylin.apache.org/docs/gettingstarted/faq.html
> Join Kylin user mail group: [email protected]
> Join Kylin dev mail group: [email protected]
>
> Kang-Sen Lu <[email protected]> wrote on Friday, November 30, 2018 at 9:04 PM:
>
> Thanks for the reply from Yichen and Aron. This is my kylin.properties:
>
> kylin.engine.spark-conf.spark.yarn.archive=hdfs://192.168.230.199:8020/user/zettics/spark/spark-libs.jar
> ##kylin.engine.spark-conf.spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec
> #
> ## uncomment for HDP
> kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=2.5.6.0-40
> kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=2.5.6.0-40
> kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=2.5.6.0-40
>
> But I still get the same error.
>
> Stack trace: ExitCodeException exitCode=1: /data5/hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0091/container_e05_1543422353836_0091_02_000001/launch_container.sh: line 26: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:944)
>     at org.apache.hadoop.util.Shell.run(Shell.java:848)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1142)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:237)
>
> I also saw in stderr:
>
> Log Type: stderr
> Log Upload Time: Fri Nov 30 07:54:45 -0500 2018
> Log Length: 88
> Error: Could not find or load main class org.apache.spark.deploy.yarn.ApplicationMaster
>
> I suspect my problem is related to the fact that "${hdp.version}" was not resolved somehow. It seems that kylin.properties parameters like "extraJavaOptions=-Dhdp.version=2.5.6.0-40" were not enough.
>
> Kang-sen
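(A minimal sketch of the java-opts approach Shaofeng describes above, applied to this cluster. The 2.5.6.0-40 version string comes from the configs quoted in this thread; confirm the exact installed version by listing /usr/hdp, and note that $KYLIN_HOME/spark/conf is an assumption — use whichever spark/conf directory your Kylin actually submits jobs from.)

    ls /usr/hdp/                                    # should show e.g. 2.5.6.0-40 and 'current'
    echo '-Dhdp.version=2.5.6.0-40' > $KYLIN_HOME/spark/conf/java-opts
    cat $KYLIN_HOME/spark/conf/java-opts            # -Dhdp.version=2.5.6.0-40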
>
> *From:* Yichen Zhou <[email protected]>
> *Sent:* Thursday, November 29, 2018 9:08 PM
> *To:* [email protected]
> *Subject:* Re: anybody used spark to build cube in kylin 2.5.1?
>
> Hi Kang-Sen,
>
> I think Jiatao is right. If you want to use Spark to build cubes in an HDP cluster, you need to configure -Dhdp.version in $KYLIN_HOME/conf/kylin.properties:
>
> ## uncomment for HDP
> #kylin.engine.spark-conf.spark.driver.extraJavaOptions=-Dhdp.version=current
> #kylin.engine.spark-conf.spark.yarn.am.extraJavaOptions=-Dhdp.version=current
> #kylin.engine.spark-conf.spark.executor.extraJavaOptions=-Dhdp.version=current
>
> Please refer to this: http://kylin.apache.org/docs/tutorial/cube_spark.html
>
> Regards,
> Yichen
>
> JiaTao Tao <[email protected]> wrote on Friday, November 30, 2018 at 9:57 AM:
>
> Hi
>
> I took a look on the Internet and found these links; give them a try and I hope they help.
>
> https://community.hortonworks.com/questions/23699/bad-substitution-error-running-spark-on-yarn.html
>
> https://stackoverflow.com/questions/32341709/bad-substitution-when-submitting-spark-job-to-yarn-cluster
>
> --
>
> Regards!
> Aron Tao
>
> Kang-Sen Lu <[email protected]> wrote on Thursday, November 29, 2018 at 3:11 PM:
>
> We are running Kylin 2.5.1. For a specific cube, the build for one hour of data took 200 minutes, so I am thinking about building the cube with Spark instead of MapReduce.
>
> I selected Spark in the cube design's advanced settings.
>
> The cube build failed at step 3, with the following error log:
>
> OS command error exit with return code: 1, error message: 18/11/29 09:50:33 INFO client.RMProxy: Connecting to ResourceManager at anovadata6.anovadata.local/192.168.230.199:8050
> 18/11/29 09:50:33 INFO yarn.Client: Requesting a new application from cluster with 1 NodeManagers
> 18/11/29 09:50:33 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (191488 MB per container)
> 18/11/29 09:50:33 INFO yarn.Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
> 18/11/29 09:50:33 INFO yarn.Client: Setting up container launch context for our AM
> 18/11/29 09:50:33 INFO yarn.Client: Setting up the launch environment for our AM container
> 18/11/29 09:50:33 INFO yarn.Client: Preparing resources for our AM container
> 18/11/29 09:50:35 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
> 18/11/29 09:50:38 INFO yarn.Client: Uploading resource file:/tmp/spark-507691d4-f131-4bc5-bf6c-c8ff7606e201/__spark_libs__6261254232609828730.zip -> hdfs://anovadata6.anovadata.local:8020/user/zettics/.sparkStaging/application_1543422353836_0088/__spark_libs__6261254232609828730.zip
> 18/11/29 09:50:39 INFO yarn.Client: Uploading resource file:/home/zettics/kylin/apache-kylin-2.5.1-anovadata-bin/lib/kylin-job-2.5.1-anovadata.jar -> hdfs://anovadata6.anovadata.local:8020/user/zettics/.sparkStaging/application_1543422353836_0088/kylin-job-2.5.1-anovadata.jar
> 18/11/29 09:50:39 WARN yarn.Client: Same path resource file:/home/zettics/kylin/apache-kylin-2.5.1-anovadata-bin/lib/kylin-job-2.5.1-anovadata.jar added multiple times to distributed cache.
> 18/11/29 09:50:39 INFO yarn.Client: Uploading resource file:/tmp/spark-507691d4-f131-4bc5-bf6c-c8ff7606e201/__spark_conf__1525388499029792228.zip -> hdfs://anovadata6.anovadata.local:8020/user/zettics/.sparkStaging/application_1543422353836_0088/__spark_conf__.zip
> 18/11/29 09:50:39 WARN yarn.Client: spark.yarn.am.extraJavaOptions will not take effect in cluster mode
> 18/11/29 09:50:39 INFO spark.SecurityManager: Changing view acls to: zettics
> 18/11/29 09:50:39 INFO spark.SecurityManager: Changing modify acls to: zettics
> 18/11/29 09:50:39 INFO spark.SecurityManager: Changing view acls groups to:
> 18/11/29 09:50:39 INFO spark.SecurityManager: Changing modify acls groups to:
> 18/11/29 09:50:39 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zettics); groups with view permissions: Set(); users with modify permissions: Set(zettics); groups with modify permissions: Set()
> 18/11/29 09:50:39 INFO yarn.Client: Submitting application application_1543422353836_0088 to ResourceManager
> 18/11/29 09:50:39 INFO impl.YarnClientImpl: Submitted application application_1543422353836_0088
> 18/11/29 09:50:40 INFO yarn.Client: Application report for application_1543422353836_0088 (state: ACCEPTED)
> 18/11/29 09:50:40 INFO yarn.Client:
>      client token: N/A
>      diagnostics: AM container is launched, waiting for AM container to Register with RM
>      ApplicationMaster host: N/A
>      ApplicationMaster RPC port: -1
>      queue: default
>      start time: 1543503039903
>      final status: UNDEFINED
>      tracking URL: http://anovadata6.anovadata.local:8088/proxy/application_1543422353836_0088/
>      user: zettics
> 18/11/29 09:50:41 INFO yarn.Client: Application report for application_1543422353836_0088 (state: ACCEPTED)
> 18/11/29 09:50:42 INFO yarn.Client: Application report for application_1543422353836_0088 (state: ACCEPTED)
> 18/11/29 09:50:43 INFO yarn.Client: Application report for application_1543422353836_0088 (state: FAILED)
> 18/11/29 09:50:43 INFO yarn.Client:
>      client token: N/A
>      diagnostics: Application application_1543422353836_0088 failed 2 times due to AM Container for appattempt_1543422353836_0088_000002 exited with exitCode: 1
> For more detailed output, check the application tracking page: http://anovadata6.anovadata.local:8088/cluster/app/application_1543422353836_0088 Then click on links to logs of each attempt.
> Diagnostics: Exception from container-launch.
> Container id: container_e05_1543422353836_0088_02_000001
> Exit code: 1
> Exception message: /hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0088/container_e05_1543422353836_0088_02_000001/launch_container.sh: line 26: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>
> Stack trace: ExitCodeException exitCode=1: /hadoop/yarn/local/usercache/zettics/appcache/application_1543422353836_0088/container_e05_1543422353836_0088_02_000001/launch_container.sh: line 26: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
>
>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:944)
>     at org.apache.hadoop.util.Shell.run(Shell.java:848)
>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1142)
>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:237)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> Thanks.
>
> Kang-sen
