[jira] [Assigned] (KYLIN-3354) KeywordDefaultDirtyHack cannot handle double-quoted defaultCatalog identifier

2018-04-27 Thread Dong Li (JIRA)

[ https://issues.apache.org/jira/browse/KYLIN-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Li reassigned KYLIN-3354:
------------------------------

Assignee: Dong Li

> KeywordDefaultDirtyHack cannot handle double-quoted defaultCatalog identifier
> ------------------------------------------------------------------------------
>
> Key: KYLIN-3354
> URL: https://issues.apache.org/jira/browse/KYLIN-3354
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Dong Li
>Assignee: Dong Li
>Priority: Major
>
> For example, the following SQL cannot be escaped by KeywordDefaultDirtyHack:
> {quote}
> select count(*) from "defaultCatalog"."DEFAULT"."TABLE"
> {quote}
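For readers unfamiliar with the hack: it is a string-level query rewrite. Below is a minimal sketch (not the actual Kylin source; the single-method transformer is a simplification) of the kind of rewrite involved, extended to also strip the double-quoted catalog prefix reported here as unhandled:

{code:java}
// A minimal sketch, NOT the actual Kylin source. It shows the kind of
// string-level rewrite the dirty hack performs, handling the catalog prefix
// in both its double-quoted and bare forms.
public class KeywordDefaultDirtyHackSketch {

    public static String transform(String sql) {
        // Strip the catalog prefix; handling only the bare form is the bug
        // reported in this issue.
        sql = sql.replace("\"defaultCatalog\".", "");
        sql = sql.replace("defaultCatalog.", "");
        // DEFAULT is a SQL keyword, so quote it when it appears as a schema
        // prefix (an already-quoted "DEFAULT". is not matched by the regex).
        sql = sql.replaceAll("(?i)\\bdefault\\.", "\"DEFAULT\".");
        return sql;
    }

    public static void main(String[] args) {
        System.out.println(transform(
                "select count(*) from \"defaultCatalog\".\"DEFAULT\".\"TABLE\""));
        // prints: select count(*) from "DEFAULT"."TABLE"
    }
}
{code}

Even with the word boundary, a plain string replacement can still rewrite matching text inside string literals, which is presumably why it is called a dirty hack rather than a parser-level fix.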





[jira] [Updated] (KYLIN-3354) KeywordDefaultDirtyHack cannot handle double-quoted defaultCatalog identifier

2018-04-27 Thread Dong Li (JIRA)

[ https://issues.apache.org/jira/browse/KYLIN-3354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Li updated KYLIN-3354:
---------------------------
Description: 
For example, the following SQL cannot be escaped by KeywordDefaultDirtyHack:
{code:sql}
select count(*) from "defaultCatalog"."DEFAULT"."TABLE"
{code}

  was:
For example, the following SQL cannot be escaped by KeywordDefaultDirtyHack:
select count(*) from "defaultCatalog"."DEFAULT"."TABLE"


> KeywordDefaultDirtyHack cannot handle double-quoted defaultCatalog identifier
> ------------------------------------------------------------------------------
>
> Key: KYLIN-3354
> URL: https://issues.apache.org/jira/browse/KYLIN-3354
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Dong Li
>Priority: Major
>
> For example, the following SQL cannot be escaped by KeywordDefaultDirtyHack:
> {code:sql}
> select count(*) from "defaultCatalog"."DEFAULT"."TABLE"
> {code}





[jira] [Created] (KYLIN-3354) KeywordDefaultDirtyHack cannot handle double-quoted defaultCatalog identifier

2018-04-27 Thread Dong Li (JIRA)
Dong Li created KYLIN-3354:
---------------------------

 Summary: KeywordDefaultDirtyHack cannot handle double-quoted defaultCatalog identifier
 Key: KYLIN-3354
 URL: https://issues.apache.org/jira/browse/KYLIN-3354
 Project: Kylin
  Issue Type: Improvement
Reporter: Dong Li


For example, the following SQL cannot be escaped by KeywordDefaultDirtyHack:
select count(*) from "defaultCatalog"."DEFAULT"."TABLE"





[jira] [Assigned] (KYLIN-3345) Use Apache Parent POM 19

2018-04-27 Thread Dong Li (JIRA)

[ https://issues.apache.org/jira/browse/KYLIN-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Li reassigned KYLIN-3345:
------------------------------

Assignee: Dong Li

> Use Apache Parent POM 19
> ------------------------
>
> Key: KYLIN-3345
> URL: https://issues.apache.org/jira/browse/KYLIN-3345
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Dong Li
>Priority: Major
>
> Kylin is still using Apache Parent POM 16. Apache Parent POM 19 is out.
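For reference, the upgrade itself is a one-line change to the parent declaration in the root pom.xml. A sketch, assuming the standard Apache parent coordinates (org.apache:apache):

{code:xml}
<!-- Root pom.xml: bump the Apache parent POM from 16 to 19 -->
<parent>
    <groupId>org.apache</groupId>
    <artifactId>apache</artifactId>
    <version>19</version>
</parent>
{code}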





[jira] [Updated] (KYLIN-3353) Merge job should not be blocked by "kylin.cube.max-building-segments"

2018-04-27 Thread Shaofeng SHI (JIRA)

[ https://issues.apache.org/jira/browse/KYLIN-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shaofeng SHI updated KYLIN-3353:

Description: 
Currently there is a config "kylin.cube.max-building-segments" (default 10) that sets the maximum number of concurrent building jobs for a cube.

In a frequent-build case it is possible to have 10 segments being built concurrently, leaving no room for merge jobs. If a merge job is blocked, more segments accumulate, which then hurts query performance.

So I suggest disabling this check for merge jobs.
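A minimal sketch of the suggested exemption (hypothetical names, not Kylin's actual scheduler code): only non-merge jobs count against the limit, so a merge can always be submitted.

{code:java}
public class SegmentLimitCheck {

    enum JobType { BUILD, MERGE, REFRESH }

    // Value of kylin.cube.max-building-segments, default 10.
    private final int maxBuildingSegments;

    SegmentLimitCheck(int maxBuildingSegments) {
        this.maxBuildingSegments = maxBuildingSegments;
    }

    boolean canSubmit(JobType type, int segmentsBeingBuilt) {
        // A merge reduces the number of segments, so blocking it only grows
        // the backlog; let it through unconditionally.
        if (type == JobType.MERGE) {
            return true;
        }
        return segmentsBeingBuilt < maxBuildingSegments;
    }
}
{code}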

 

 

> Merge job should not be blocked by "kylin.cube.max-building-segments"
> ---------------------------------------------------------------------
>
> Key: KYLIN-3353
> URL: https://issues.apache.org/jira/browse/KYLIN-3353
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Shaofeng SHI
>Priority: Minor
>
> Currently there is a config "kylin.cube.max-building-segments" (default 10) 
> that sets the maximum number of concurrent building jobs for a cube.
> In a frequent-build case it is possible to have 10 segments being built 
> concurrently, leaving no room for merge jobs. If a merge job is blocked, 
> more segments accumulate, which then hurts query performance.
> So I suggest disabling this check for merge jobs.





[jira] [Created] (KYLIN-3353) Merge job should not be blocked by "kylin.cube.max-building-segments"

2018-04-27 Thread Shaofeng SHI (JIRA)
Shaofeng SHI created KYLIN-3353:
--------------------------------

 Summary: Merge job should not be blocked by "kylin.cube.max-building-segments"
 Key: KYLIN-3353
 URL: https://issues.apache.org/jira/browse/KYLIN-3353
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Reporter: Shaofeng SHI








[jira] [Updated] (KYLIN-3352) Segment pruning bug, e.g. date_col > "max_date+1"

2018-04-27 Thread liyang (JIRA)

[ https://issues.apache.org/jira/browse/KYLIN-3352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang updated KYLIN-3352:
--------------------------
Summary: Segment pruning bug, e.g. date_col > "max_date+1"  (was: Segment pruning bug when date_col > "max_date+1")

> Segment pruning bug, e.g. date_col > "max_date+1"
> --------------------------------------------------
>
> Key: KYLIN-3352
> URL: https://issues.apache.org/jira/browse/KYLIN-3352
> Project: Kylin
>  Issue Type: Bug
>Reporter: liyang
>Assignee: liyang
>Priority: Major
>
> Currently {{date_col > "max_date+1"}} is rounded down to {{date_col > 
> "max_date"}} during encoding and further evaluated as {{date_col >= 
> "max_date"}} during segment pruning. As a result, a segment that could be 
> pruned is not pruned.
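To make the failure concrete, here is a small self-contained illustration (not Kylin's pruning code; the segment bound and dates are invented) of why the round-down plus the relaxed operator defeats pruning, and of the rewrite {{x > v}} == {{x >= v+1}} that preserves the semantics for dates:

{code:java}
import java.time.LocalDate;

// A segment whose latest date is max_date, probed with the predicate
// date_col > max_date+1, which can never match the segment.
public class PruneSketch {
    public static void main(String[] args) {
        LocalDate segmentMax = LocalDate.parse("2018-04-27"); // max_date
        LocalDate v = segmentMax.plusDays(1);                 // max_date+1

        // Exact check: "date_col > v" can match the segment iff segmentMax > v.
        System.out.println(segmentMax.isAfter(v));            // false -> prunable

        // Reported behavior: v is rounded DOWN to max_date during encoding, and
        // pruning then evaluates ">=". "date_col >= max_date" overlaps the
        // segment, so the segment is scanned for nothing.
        System.out.println(!segmentMax.isBefore(segmentMax)); // true -> not pruned

        // Semantics-preserving rewrite for dates: x > v  <=>  x >= v+1.
        System.out.println(!segmentMax.isBefore(v.plusDays(1))); // false -> pruned
    }
}
{code}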





[jira] [Created] (KYLIN-3352) Segment pruning bug when date_col > "max_date+1"

2018-04-27 Thread liyang (JIRA)
liyang created KYLIN-3352:
--------------------------

 Summary: Segment pruning bug when date_col > "max_date+1"
 Key: KYLIN-3352
 URL: https://issues.apache.org/jira/browse/KYLIN-3352
 Project: Kylin
  Issue Type: Bug
Reporter: liyang
Assignee: liyang


Currently {{date_col > "max_date+1"}} is rounded down to {{date_col > 
"max_date"}} during encoding and further evaluated as {{date_col >= 
"max_date"}} during segment pruning. As a result, a segment that could be 
pruned is not pruned.





[jira] [Commented] (KYLIN-3349) Cube Build NumberFormatException when using Spark

2018-04-27 Thread Hokyung Song (JIRA)

[ https://issues.apache.org/jira/browse/KYLIN-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455810#comment-16455810 ]

Hokyung Song commented on KYLIN-3349:
-------------------------------------

I tried this again, but it was reproduced.
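For context, the failure in the quoted description below is easy to reproduce in isolation: java.lang.Long.parseLong rejects decimal strings such as "0.00". A sketch of a tolerant fallback (a hypothetical helper, not Kylin's actual LongIngester) that routes such values through BigDecimal instead of failing the whole Spark task:

{code:java}
import java.math.BigDecimal;

public class LongIngestSketch {

    static long tolerantLongOf(String value) {
        String v = value.trim();
        try {
            return Long.parseLong(v);                 // throws for "0.00" / "0."
        } catch (NumberFormatException e) {
            // Decimal strings parse fine as BigDecimal; longValueExact() still
            // rejects values with a nonzero fractional part.
            return new BigDecimal(v).longValueExact();
        }
    }

    public static void main(String[] args) {
        System.out.println(tolerantLongOf("0.00"));   // 0, instead of an exception
    }
}
{code}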

> Cube Build NumberFormatException when using Spark
> -------------------------------------------------
>
> Key: KYLIN-3349
> URL: https://issues.apache.org/jira/browse/KYLIN-3349
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v2.2.0, v2.3.0, v2.3.1
>Reporter: Hokyung Song
>Priority: Major
>
> When I use the Spark engine to build a cube, I get the following error in 
> Spark while building.
> In my opinion, the data contains "0.00" as a string, which cannot be cast 
> to long or double.
> The stack trace is as follows:
> {code:java}
> 2018-04-24 12:54:11,685 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : 18/04/24 
> 12:54:11 WARN TaskSetManager: Lost task 193.0 in stage 0.0 (TID 1, hadoop, 
> executor 1): java.lang.NumberFormatException: For input string: "0."
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> java.lang.Long.parseLong(Long.java:589)
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> java.lang.Long.valueOf(Long.java:803)
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.kylin.measure.basic.LongIngester.valueOf(LongIngester.java:38)
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.kylin.measure.basic.LongIngester.valueOf(LongIngester.java:28)
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.kylin.engine.mr.common.BaseCuboidBuilder.buildValueOf(BaseCuboidBuilder.java:163)
> 2018-04-24 12:54:11,686 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.kylin.engine.mr.common.BaseCuboidBuilder.buildValueObjects(BaseCuboidBuilder.java:128)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.kylin.engine.spark.SparkCubingByLayer$EncodeBaseCuboid.call(SparkCubingByLayer.java:309)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.kylin.engine.spark.SparkCubingByLayer$EncodeBaseCuboid.call(SparkCubingByLayer.java:271)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1043)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.api.java.JavaPairRDD$$anonfun$pairFunToScalaFun$1.apply(JavaPairRDD.scala:1043)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:193)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> 2018-04-24 12:54:11,687 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> 2018-04-24 12:54:11,688 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
> org.apache.spark.scheduler.Task.run(Task.scala:99)
> 2018-04-24 12:54:11,688 INFO [Scheduler 1401715751 Job 
> c1e5e47c-89fc-4ad6-8ae0-629879919aa5-264] spark.SparkExecutable:38 : at 
>