[ https://issues.apache.org/jira/browse/KYLIN-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453651#comment-16453651 ]
Hokyung Song commented on KYLIN-3349:
-------------------------------------

Hello Shaofeng :)

In my case, the column type is already set to *decimal(19,0)* in the Hive table, and querying it in Hive returns 0. But after loading the Hive table into Kylin, the column type shows as *decimal(19,4)* in the Kylin Admin UI ({color:#FF0000}Kylin Admin > Model Menu > Data Source Tab{color}):

||type||column type||data||
|hive|decimal(19,0)|0|
|kylin datasource|decimal(19,4)|0.0000|

I suspect this type mismatch is the reason: Kylin reads the value as the string "0.0000", which cannot be parsed as a long.

> Cube Build NumberFormatException when using Spark
> -------------------------------------------------
>
>                 Key: KYLIN-3349
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3349
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v2.2.0, v2.3.0, v2.3.1
>            Reporter: Hokyung Song
>            Priority: Major
>
> When I use the Spark engine to build a cube, the build fails with the error below.
> In my opinion, the data contains "0.00" as a string, which cannot be cast to long or double.
> stack trace as follows:
> {code:java}
> 2018-04-24 12:54:11,685 INFO [Scheduler 1401715751 Job c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : 18/04/24 12:54:11 WARN TaskSetManager: Lost task 193.0 in stage 0.0 (TID 1, hadoop, executor 1): java.lang.NumberFormatException: For input string: "0.0000"
> 	at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> 	at java.lang.Long.parseLong(Long.java:589)
> 	at java.lang.Long.valueOf(Long.java:803)
> 	at org.apache.kylin.measure.basic.LongIngester.valueOf(LongIngester.java:38)
> 	at org.apache.kylin.measure.basic.LongIngester.valueOf(LongIngester.java:28)
> 	at org.apache.kylin.engine.mr.common.BaseCuboidBuilder.buildValueOf(BaseCuboidBuilder.java:163)
> 	at org.apache.kylin.engine.mr.common.BaseCuboidBuilder.buildValueObjects(BaseCuboidBuilder.java:128)
> 	at org.apache.kylin.engine.spark.SparkCubingByLayer$EncodeBaseCuboid.call(SparkCubingByLayer.java:309)
> 	at org.apache.kylin.engine.spark.SparkCubingByLayer$EncodeBaseCuboid.call(SparkCubingByLayer.java:271)
> 	at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1043)
> 	at org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1043)
> 	at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> 	at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
> 	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:99)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
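The failure mode in the stack trace can be reproduced in isolation: `Long.parseLong` (which `LongIngester.valueOf` reaches via `Long.valueOf`, per the trace) rejects the decimal-formatted string "0.0000" that Kylin receives once the column is treated as decimal(19,4). The `BigDecimal` route below is a hypothetical workaround sketch for illustration only, not Kylin's actual fix:

```java
import java.math.BigDecimal;

public class DecimalIngest {
    // What the ingestion path effectively does per the stack trace:
    // Long.valueOf(value) -> Long.parseLong(value).
    // Throws NumberFormatException for "0.0000".
    static long parseStrict(String value) {
        return Long.parseLong(value);
    }

    // Hypothetical lenient alternative: parse through BigDecimal and
    // convert exactly. Accepts "0.0000" (zero fractional part) but still
    // rejects values with a real fraction, e.g. "0.5".
    static long parseViaBigDecimal(String value) {
        return new BigDecimal(value).longValueExact();
    }

    public static void main(String[] args) {
        try {
            parseStrict("0.0000");
        } catch (NumberFormatException e) {
            // Same exception as in the Spark task log above.
            System.out.println("strict parse failed: " + e.getMessage());
        }
        System.out.println("via BigDecimal: " + parseViaBigDecimal("0.0000"));
    }
}
```

This also explains why the Hive-side value "0" builds fine while the Kylin-side "0.0000" does not: both represent zero, but only the first is a valid input to `Long.parseLong`.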