[jira] [Created] (KYLIN-3616) Create Intermediate Flat Hive Table fails

2018-10-07 Thread LinRongRen (JIRA)
LinRongRen created KYLIN-3616:
-

 Summary: Create Intermediate Flat Hive Table fails
 Key: KYLIN-3616
 URL: https://issues.apache.org/jira/browse/KYLIN-3616
 Project: Kylin
  Issue Type: Bug
  Components: Environment 
Affects Versions: v1.5.3
Reporter: LinRongRen
 Fix For: v1.5.3


When creating a cube and watching it in the Monitor module, the first step
(Count Source Table) completes, but the second step (Create Intermediate Flat
Hive Table) ends with Error.
The log shows:
total input rows = 4541
expected input rows per mapper = 100
reducers for RedistributeFlatHiveTableStep = 1
Create and distribute table, cmd: 
hive -e "SET dfs.replication=2;
SET hive.exec.compress.output=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=1;
SET 
mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET 
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET mapreduce.job.split.metainfo.maxsize=-1;

set mapreduce.job.reduces=1;

set hive.merge.mapredfiles=false;

USE default;
DROP TABLE IF EXISTS 
kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100;
CREATE EXTERNAL TABLE IF NOT EXISTS 
kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100
(
DEFAULT_KYLIN_SALES_PART_DT date
,DEFAULT_KYLIN_SALES_LEAF_CATEG_ID bigint
,DEFAULT_KYLIN_SALES_LSTG_SITE_ID int
,DEFAULT_KYLIN_CATEGORY_GROUPINGS_META_CATEG_NAME string
,DEFAULT_KYLIN_CATEGORY_GROUPINGS_CATEG_LVL2_NAME string
,DEFAULT_KYLIN_CATEGORY_GROUPINGS_CATEG_LVL3_NAME string
,DEFAULT_KYLIN_SALES_LSTG_FORMAT_NAME string
,DEFAULT_KYLIN_SALES_PRICE decimal(19,4)
,DEFAULT_KYLIN_SALES_SELLER_ID bigint
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'
STORED AS SEQUENCEFILE
LOCATION 
'/kylin/kylin_metadata/kylin-cc7811cc-3494-4db0-b24e-4a3d76d22186/kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100';
SET dfs.replication=2;
SET hive.exec.compress.output=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=1;
SET 
mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET 
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET mapreduce.job.split.metainfo.maxsize=-1;
INSERT OVERWRITE TABLE 
kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100 SELECT
KYLIN_SALES.PART_DT
,KYLIN_SALES.LEAF_CATEG_ID
,KYLIN_SALES.LSTG_SITE_ID
,KYLIN_CATEGORY_GROUPINGS.META_CATEG_NAME
,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL2_NAME
,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL3_NAME
,KYLIN_SALES.LSTG_FORMAT_NAME
,KYLIN_SALES.PRICE
,KYLIN_SALES.SELLER_ID
FROM DEFAULT.KYLIN_SALES as KYLIN_SALES 
INNER JOIN DEFAULT.KYLIN_CAL_DT as KYLIN_CAL_DT
ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT
INNER JOIN DEFAULT.KYLIN_CATEGORY_GROUPINGS as KYLIN_CATEGORY_GROUPINGS
ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND 
KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID
WHERE (KYLIN_SALES.PART_DT >= '2012-01-01' AND KYLIN_SALES.PART_DT < 
'2012-12-01')
 DISTRIBUTE BY RAND();

"

Logging initialized using configuration in 
jar:file:/home/hadoop/apps/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
OK
Time taken: 0.884 seconds
OK
Time taken: 0.577 seconds
OK
Time taken: 0.815 seconds
Query ID = hadoop_20181008182637_13291646-6c23-4431-8a8e-401ced7aa67a
Total jobs = 1
18/10/08 18:26:54 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Execution log at: 
/tmp/hadoop/hadoop_20181008182637_13291646-6c23-4431-8a8e-401ced7aa67a.log
2018-10-08 18:26:57 Starting to launch local task to process map join; maximum 
memory = 518979584
2018-10-08 18:26:59 Dump the side-table for tag: 1 with group count: 144 into 
file: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
2018-10-08 18:26:59 Uploaded 1 File to: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
 (10893 bytes)
2018-10-08 18:26:59 Dump the side-table for tag: 0 with group count: 334 into 
file: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile10--.hashtable
2018-10-08 18:26:59 Uploaded 1 File to: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile10--.hashtable
 (123354 bytes)
2018-10-08 18:26:59 End of local task; Time Taken: 2.845 sec.
Execution completed successfully
MapredLo

[jira] [Created] (KYLIN-3615) Create Intermediate Flat Hive Table

2018-10-07 Thread LinRongRen (JIRA)
LinRongRen created KYLIN-3615:
-

 Summary: Create Intermediate Flat Hive Table
 Key: KYLIN-3615
 URL: https://issues.apache.org/jira/browse/KYLIN-3615
 Project: Kylin
  Issue Type: Bug
  Components: Environment 
Affects Versions: v1.5.3
 Environment: Cluster environment:
hadoop 2.6.4
hive 1.2.1(apache-hive-2.3.3-bin)
hbase-1.1.3-bin.tar
zookeeper 3.4.5
apache-kylin-1.5.3-HBase1.x-bin.tar
Reporter: LinRongRen


When creating a cube, the first step (Count Source Table) completes, but the
second step (Create Intermediate Flat Hive Table) reports Error.

The log shows:
total input rows = 4541
expected input rows per mapper = 100
reducers for RedistributeFlatHiveTableStep = 1
Create and distribute table, cmd: 
hive -e "SET dfs.replication=2;
SET hive.exec.compress.output=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=1;
SET 
mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET 
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET mapreduce.job.split.metainfo.maxsize=-1;

set mapreduce.job.reduces=1;

set hive.merge.mapredfiles=false;

USE default;
DROP TABLE IF EXISTS 
kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100;
CREATE EXTERNAL TABLE IF NOT EXISTS 
kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100
(
DEFAULT_KYLIN_SALES_PART_DT date
,DEFAULT_KYLIN_SALES_LEAF_CATEG_ID bigint
,DEFAULT_KYLIN_SALES_LSTG_SITE_ID int
,DEFAULT_KYLIN_CATEGORY_GROUPINGS_META_CATEG_NAME string
,DEFAULT_KYLIN_CATEGORY_GROUPINGS_CATEG_LVL2_NAME string
,DEFAULT_KYLIN_CATEGORY_GROUPINGS_CATEG_LVL3_NAME string
,DEFAULT_KYLIN_SALES_LSTG_FORMAT_NAME string
,DEFAULT_KYLIN_SALES_PRICE decimal(19,4)
,DEFAULT_KYLIN_SALES_SELLER_ID bigint
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\177'
STORED AS SEQUENCEFILE
LOCATION 
'/kylin/kylin_metadata/kylin-cc7811cc-3494-4db0-b24e-4a3d76d22186/kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100';
SET dfs.replication=2;
SET hive.exec.compress.output=true;
SET hive.auto.convert.join.noconditionaltask=true;
SET hive.auto.convert.join.noconditionaltask.size=1;
SET 
mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET 
mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET mapreduce.job.split.metainfo.maxsize=-1;
INSERT OVERWRITE TABLE 
kylin_intermediate_kylin_sales_cube_desc_2012010100_2012120100 SELECT
KYLIN_SALES.PART_DT
,KYLIN_SALES.LEAF_CATEG_ID
,KYLIN_SALES.LSTG_SITE_ID
,KYLIN_CATEGORY_GROUPINGS.META_CATEG_NAME
,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL2_NAME
,KYLIN_CATEGORY_GROUPINGS.CATEG_LVL3_NAME
,KYLIN_SALES.LSTG_FORMAT_NAME
,KYLIN_SALES.PRICE
,KYLIN_SALES.SELLER_ID
FROM DEFAULT.KYLIN_SALES as KYLIN_SALES 
INNER JOIN DEFAULT.KYLIN_CAL_DT as KYLIN_CAL_DT
ON KYLIN_SALES.PART_DT = KYLIN_CAL_DT.CAL_DT
INNER JOIN DEFAULT.KYLIN_CATEGORY_GROUPINGS as KYLIN_CATEGORY_GROUPINGS
ON KYLIN_SALES.LEAF_CATEG_ID = KYLIN_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND 
KYLIN_SALES.LSTG_SITE_ID = KYLIN_CATEGORY_GROUPINGS.SITE_ID
WHERE (KYLIN_SALES.PART_DT >= '2012-01-01' AND KYLIN_SALES.PART_DT < 
'2012-12-01')
 DISTRIBUTE BY RAND();

"

Logging initialized using configuration in 
jar:file:/home/hadoop/apps/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
OK
Time taken: 0.884 seconds
OK
Time taken: 0.577 seconds
OK
Time taken: 0.815 seconds
Query ID = hadoop_20181008182637_13291646-6c23-4431-8a8e-401ced7aa67a
Total jobs = 1
18/10/08 18:26:54 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Execution log at: 
/tmp/hadoop/hadoop_20181008182637_13291646-6c23-4431-8a8e-401ced7aa67a.log
2018-10-08 18:26:57 Starting to launch local task to process map join;  
maximum memory = 518979584
2018-10-08 18:26:59 Dump the side-table for tag: 1 with group count: 144 
into file: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
2018-10-08 18:26:59 Uploaded 1 File to: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile01--.hashtable
 (10893 bytes)
2018-10-08 18:26:59 Dump the side-table for tag: 0 with group count: 334 
into file: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile10--.hashtable
2018-10-08 18:26:59 Uploaded 1 File to: 
file:/tmp/hadoop/bba84efe-adae-431c-a167-14cd272ebd80/hive_2018-10-08_18-26-37_130_1316609404572806670-1/-local-10004/HashTable-Stage-3/MapJoin-mapfile10--

[jira] [Created] (KYLIN-3614) Kylin 2.4: error when a dimension table field value is too long

2018-10-07 Thread wlxie (JIRA)
wlxie created KYLIN-3614:


 Summary: Kylin 2.4: error when a dimension table field value is too long
 Key: KYLIN-3614
 URL: https://issues.apache.org/jira/browse/KYLIN-3614
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine, Tools, Build and Test
Affects Versions: v2.4.0
Reporter: wlxie
 Attachments: 维度表字段内容超长.txt

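Hello everyone,

    When Kylin 2.4 builds a cube that joins dimension tables, the build fails
if some field values in a dimension table are too long. Please see the
attachment for the error message.

I have two suggestions.

 1. The error message is unfriendly: it does not tell the user which table and
which field is too long, so the problem is hard to locate.

 2. Fields that the cube does not use should not be length-checked. At present
every field is checked, including fields that are not used.

   Thanks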



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3596) Creating the Hive temporary table fails in cube jobs when Kafka keys use special characters or HQL keywords

2018-10-07 Thread wlxie (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641370#comment-16641370
 ] 

wlxie commented on KYLIN-3596:
--

OK, thanks.

> Creating the Hive temporary table fails in cube jobs when Kafka keys use special characters or HQL keywords
> -
>
> Key: KYLIN-3596
> URL: https://issues.apache.org/jira/browse/KYLIN-3596
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Major
> Attachments: kafka使用关键字作为key.txt
>
>
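> Hello everyone,
>    When Kafka is used as a Streaming Table and a Kafka key starts with an
> underscore or is an HQL keyword, creating the Hive temporary table fails
> while the cube job runs. Is there a configuration that makes Kylin wrap all
> column names and table names in backticks when it creates the Hive temporary
> table? I tried setting hive.support.sql11.reserved.keywords to false, but
> names starting with an underscore and some keywords are still not supported.
> Please see the attachment for the exact error.
>          Thanks.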
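
To make the backtick request above concrete, here is a minimal sketch of what quoting every identifier could look like. It is not a Kylin option or Kylin code: the class HiveIdentifierQuoting and its methods are hypothetical, and it assumes a Hive setup where quoted identifiers are enabled, in which a backtick inside a quoted name is written as two backticks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Kylin or Hive.
final class HiveIdentifierQuoting {

    // Wrap a name in backticks. A backtick inside the name is doubled,
    // which is the escape Hive uses inside a quoted identifier.
    static String quote(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Build a column list for a CREATE TABLE statement where every
    // column gets the same (assumed) type.
    static String columnList(List<String> names, String type) {
        return names.stream()
                .map(n -> quote(n) + " " + type)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        // A leading underscore and a reserved word, as in the report.
        List<String> keys = Arrays.asList("_id", "select");
        System.out.println("CREATE TABLE " + quote("tmp_table")
                + " (" + columnList(keys, "string") + ")");
    }
}
```

Quoting on the generation side keeps the incoming key names unchanged, but every later statement that refers to those columns would need the same quoting.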





[jira] [Commented] (KYLIN-3596) Creating the Hive temporary table fails in cube jobs when Kafka keys use special characters or HQL keywords

2018-10-07 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641317#comment-16641317
 ] 

Shaofeng SHI commented on KYLIN-3596:
-

I agree with Xiaoxiang; please avoid using the invalid characters or keywords 
in the data flow.

> Creating the Hive temporary table fails in cube jobs when Kafka keys use special characters or HQL keywords
> -
>
> Key: KYLIN-3596
> URL: https://issues.apache.org/jira/browse/KYLIN-3596
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v2.4.0
>Reporter: wlxie
>Assignee: XiaoXiang Yu
>Priority: Major
> Attachments: kafka使用关键字作为key.txt
>
>
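> Hello everyone,
>    When Kafka is used as a Streaming Table and a Kafka key starts with an
> underscore or is an HQL keyword, creating the Hive temporary table fails
> while the cube job runs. Is there a configuration that makes Kylin wrap all
> column names and table names in backticks when it creates the Hive temporary
> table? I tried setting hive.support.sql11.reserved.keywords to false, but
> names starting with an underscore and some keywords are still not supported.
> Please see the attachment for the exact error.
>          Thanks.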





[jira] [Updated] (KYLIN-3613) Kylin with Standalone HBase Cluster (enabled kerberos) could not find the main cluster namespace at "Create HTable" step

2018-10-07 Thread powerinf (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

powerinf updated KYLIN-3613:

Attachment: Kylin_standalone_hbase.log

> Kylin with Standalone HBase Cluster (enabled kerberos) could not find the 
> main cluster namespace at  "Create HTable" step
> -
>
> Key: KYLIN-3613
> URL: https://issues.apache.org/jira/browse/KYLIN-3613
> Project: Kylin
>  Issue Type: Bug
>  Components: Environment 
>Affects Versions: v2.4.0
>Reporter: powerinf
>Priority: Major
> Attachments: Kylin_standalone_hbase.log
>
>
> I deployed two Hadoop clusters (both with Kerberos enabled and with
> cross-realm trust): the main cluster and an HBase cluster. The Kylin server
> can access both clusters using the HDFS shell with fully qualified paths, can
> submit MR jobs to the main cluster, and can use the Hive shell to access the
> data warehouse.
> On the Kylin server, the Hadoop and Hive configurations point to the main
> cluster, and the HBase cluster can be accessed using the HBase shell.
> When I build the cube, the "Create HTable" step reports the error
> "java.net.UnknownHostException: ctyunbigdata Set hbase.table.sanity.checks to
> false at conf or table descriptor if you want to bypass sanity checks", but
> after I restart the Kylin server and resume the job, it runs normally. Why?
> More details are in Kylin_standalone_hbase.log.





[jira] [Created] (KYLIN-3613) Kylin with Standalone HBase Cluster (enabled kerberos) could not find the main cluster namespace at "Create HTable" step

2018-10-07 Thread powerinf (JIRA)
powerinf created KYLIN-3613:
---

 Summary: Kylin with Standalone HBase Cluster (enabled kerberos) 
could not find the main cluster namespace at  "Create HTable" step
 Key: KYLIN-3613
 URL: https://issues.apache.org/jira/browse/KYLIN-3613
 Project: Kylin
  Issue Type: Bug
  Components: Environment 
Affects Versions: v2.4.0
Reporter: powerinf


I deployed two Hadoop clusters (both with Kerberos enabled and with
cross-realm trust): the main cluster and an HBase cluster. The Kylin server can
access both clusters using the HDFS shell with fully qualified paths, can
submit MR jobs to the main cluster, and can use the Hive shell to access the
data warehouse.
On the Kylin server, the Hadoop and Hive configurations point to the main
cluster, and the HBase cluster can be accessed using the HBase shell.
When I build the cube, the "Create HTable" step reports the error
"java.net.UnknownHostException: ctyunbigdata Set hbase.table.sanity.checks to
false at conf or table descriptor if you want to bypass sanity checks", but
after I restart the Kylin server and resume the job, it runs normally. Why?

More details are in Kylin_standalone_hbase.log.





[jira] [Resolved] (KYLIN-3594) Select with Catalog fails

2018-10-07 Thread XiaoXiang Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

XiaoXiang Yu resolved KYLIN-3594.
-
Resolution: Fixed

> Select with Catalog fails
> -
>
> Key: KYLIN-3594
> URL: https://issues.apache.org/jira/browse/KYLIN-3594
> Project: Kylin
>  Issue Type: Bug
>Reporter: Hosur Narahari
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.0
>
>
> If we get the catalog from DatabaseMetaData using the getCatalogs() method,
> it returns the value "defaultCatalog". getSchemas() returns the actual Hive
> schema.
> According to the JDBC contract, catalog.schema.table should be a valid FROM
> clause, and many query layers use it. But Kylin fails when we execute such a
> query.
> I've written a sample code snippet for that below.
>  
>         DatabaseMetaData db = conn.getMetaData();
>         ResultSet catalogSet = db.getCatalogs();
>         String catalog = "";
>         if(catalogSet.next()) {
>             catalog = catalogSet.getString("TABLE_CAT");
>         }
>         ResultSet schemaSet = db.getSchemas();
>         String schema = "";
>         if(schemaSet.next()) {
>             schema = schemaSet.getString("TABLE_SCHEM");
>         }
>         StringBuilder sb = new StringBuilder("SELECT * FROM ");
>         if(!catalog.isEmpty()) {
>             sb.append(catalog + ".");
>         }
>         if(!schema.isEmpty()) {
>             sb.append(schema + ".");
>         }
>         sb.append("kylin_sales limit 10");
>         String query = sb.toString();
>         Statement stat = conn.createStatement();
>         ResultSet rs = stat.executeQuery(query);
>         while(rs.next()) {
>             System.out.println(rs.getObject("trans_id"));
>         }
> In short, the above snippet executes the query
> select * from defaultCatalog.DEFAULT.kylin_sales.
>  
> The same thing happens with other schemas; for example,
> select * from defaultCatalog.test.kylin_sales also fails.
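
The name-building part of the reporter's snippet can be isolated and checked without a live connection. The sketch below is only an illustration of how a query layer might assemble catalog.schema.table; the class QualifiedName and the method qualify are hypothetical names, and nothing here is Kylin's JDBC driver code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the StringBuilder logic in the report.
final class QualifiedName {

    // Join the non-empty parts of catalog, schema and table with dots.
    static String qualify(String catalog, String schema, String table) {
        List<String> parts = new ArrayList<>();
        if (catalog != null && !catalog.isEmpty()) {
            parts.add(catalog);
        }
        if (schema != null && !schema.isEmpty()) {
            parts.add(schema);
        }
        parts.add(table);
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // Values taken from the report: getCatalogs() yields "defaultCatalog",
        // getSchemas() yields the Hive schema.
        String from = qualify("defaultCatalog", "DEFAULT", "kylin_sales");
        System.out.println("SELECT * FROM " + from + " LIMIT 10");
    }
}
```

With an empty catalog the helper falls back to schema.table, which is the form the report does not describe as failing.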





[jira] [Commented] (KYLIN-3491) Improve the cube building process when using global dictionary

2018-10-07 Thread Shaofeng SHI (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641080#comment-16641080
 ] 

Shaofeng SHI commented on KYLIN-3491:
-

Hi Ruslan, this optimization applies only to cube building and has no impact
on query performance.

For high cardinality columns, its performance won't be okay because the bitmap
is in a compressed format; please avoid heavy post-aggregation at query time.
The main difficulty might be building the global dictionary if the date type is
not integer.

> Improve the cube building process when using global dictionary
> --
>
> Key: KYLIN-3491
> URL: https://issues.apache.org/jira/browse/KYLIN-3491
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Reporter: Zhong Yanghong
>Assignee: Zhong Yanghong
>Priority: Major
> Fix For: v2.5.0
>
> Attachments: APACHE-KYLIN-3491-with-fix.patch, APACHE-KYLIN-3491.patch
>
>
> With the current cubing process, if the global dictionary is very large, it
> is hard to encode raw values into ids for the bitmap input, because the raw
> data records are unsorted and the dictionary slices are swapped frequently.
> We need a refined process. The idea is as follows:
>  # for each source data block, a mapper generates the distinct values and
> sorts them
>  # encode the sorted distinct values and generate a shrunken dict for each
> source data block
>  # when building the base cuboid, use the shrunken dict of each source data
> block for encoding
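
The three steps above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the patch attached to the issue: the global dictionary is stood in for by a plain in-memory Map, a source data block by a List of strings, and the names ShrunkenDictSketch, buildShrunkenDict and encode are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; a Map stands in for the (sliced, on-disk) global dictionary.
final class ShrunkenDictSketch {

    // Steps 1 and 2: take the sorted distinct values of one block and look each
    // one up in the global dictionary exactly once, in sorted order.
    static Map<String, Integer> buildShrunkenDict(List<String> block,
                                                  Map<String, Integer> globalDict) {
        Map<String, Integer> shrunken = new TreeMap<>();
        for (String value : new TreeSet<>(block)) {
            Integer id = globalDict.get(value);
            if (id == null) {
                throw new IllegalArgumentException("value not in global dictionary: " + value);
            }
            shrunken.put(value, id);
        }
        return shrunken;
    }

    // Step 3: encode the unsorted block using only its small shrunken dictionary.
    static int[] encode(List<String> block, Map<String, Integer> shrunken) {
        int[] ids = new int[block.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = shrunken.get(block.get(i));
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, Integer> globalDict = new HashMap<>();
        globalDict.put("a", 0);
        globalDict.put("b", 1);
        globalDict.put("c", 2);
        globalDict.put("d", 3);

        List<String> block = Arrays.asList("c", "a", "c");
        Map<String, Integer> shrunken = buildShrunkenDict(block, globalDict);
        System.out.println(shrunken + " -> " + Arrays.toString(encode(block, shrunken)));
    }
}
```

The point of the sorted lookup is that the large dictionary is walked once per block in order, while the per-row encoding only touches the small per-block dictionary.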





[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641077#comment-16641077
 ] 

ASF GitHub Bot commented on KYLIN-3586:
---

shaofengshi closed pull request #280: KYLIN-3586 Fix remaining boxing/unboxing 
problem
URL: https://github.com/apache/kylin/pull/280
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/core-common/src/main/java/org/apache/kylin/common/debug/BackdoorToggles.java b/core-common/src/main/java/org/apache/kylin/common/debug/BackdoorToggles.java
index efc5030fa9..405d36143e 100644
--- a/core-common/src/main/java/org/apache/kylin/common/debug/BackdoorToggles.java
+++ b/core-common/src/main/java/org/apache/kylin/common/debug/BackdoorToggles.java
@@ -114,7 +114,7 @@ public static int getQueryTimeout() {
 if (v == null)
 return -1;
 else
-return Integer.valueOf(v);
+return Integer.parseInt(v);
 }
 
 public static Pair getShardAssignment() {
diff --git a/core-metadata/src/main/java/org/apache/kylin/metadata/tuple/Tuple.java b/core-metadata/src/main/java/org/apache/kylin/metadata/tuple/Tuple.java
index cd2cfe316c..86db79b0f2 100644
--- a/core-metadata/src/main/java/org/apache/kylin/metadata/tuple/Tuple.java
+++ b/core-metadata/src/main/java/org/apache/kylin/metadata/tuple/Tuple.java
@@ -176,7 +176,7 @@ public String toString() {
 public static long getTs(ITuple row, TblColRef partitionCol) {
 //ts column type differentiate
 if (partitionCol.getDatatype().equals("date")) {
-return epicDaysToMillis(Integer.valueOf(row.getValue(partitionCol).toString()));
+return epicDaysToMillis(Integer.parseInt(row.getValue(partitionCol).toString()));
 } else {
 return Long.parseLong(row.getValue(partitionCol).toString());
 }
diff --git a/jdbc/src/main/java/org/apache/kylin/jdbc/KylinClient.java b/jdbc/src/main/java/org/apache/kylin/jdbc/KylinClient.java
index 6814f2d8bc..9826de5827 100644
--- a/jdbc/src/main/java/org/apache/kylin/jdbc/KylinClient.java
+++ b/jdbc/src/main/java/org/apache/kylin/jdbc/KylinClient.java
@@ -176,20 +176,20 @@ public static Object wrapObject(String value, int sqlType) {
 return new BigDecimal(value);
 case Types.BIT:
 case Types.BOOLEAN:
-return Boolean.parseBoolean(value);
+return Boolean.valueOf(value);
 case Types.TINYINT:
 return Byte.valueOf(value);
 case Types.SMALLINT:
 return Short.valueOf(value);
 case Types.INTEGER:
-return Integer.parseInt(value);
+return Integer.valueOf(value);
 case Types.BIGINT:
-return Long.parseLong(value);
+return Long.valueOf(value);
 case Types.FLOAT:
-return Float.parseFloat(value);
+return Float.valueOf(value);
 case Types.REAL:
 case Types.DOUBLE:
-return Double.parseDouble(value);
+return Double.valueOf(value);
 case Types.BINARY:
 case Types.VARBINARY:
 case Types.LONGVARBINARY:


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Boxing/unboxing to parse a primitive is suboptimal
> --
>
> Key: KYLIN-3586
> URL: https://issues.apache.org/jira/browse/KYLIN-3586
> Project: Kylin
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: Lijun Cao
>Priority: Major
> Fix For: v2.6.0
>
>
> An example is from HBaseLookupRowEncoder:
> {code}
> int valIdx = Integer.valueOf(Bytes.toString(qualifier));
> {code}
> valueOf returns an Integer object which would be unboxed and assigned to 
> valIdx.
> Integer.parseInt() should be used instead.
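
The distinction in the description above can be shown with a small self-contained example. It uses only java.lang methods; the class ParseVsValueOf and its two methods are hypothetical names made up for illustration and do not appear in Kylin.

```java
// Hypothetical demo class, not Kylin code.
final class ParseVsValueOf {

    // The target is a primitive int: parseInt returns an int directly,
    // so no Integer object is created and then unboxed.
    static int toPrimitive(String s) {
        return Integer.parseInt(s);
    }

    // The target is an Object: valueOf returns an Integer directly,
    // so there is no int that has to be boxed afterwards.
    static Object toObject(String s) {
        return Integer.valueOf(s);
    }

    public static void main(String[] args) {
        int valIdx = toPrimitive("42");
        Object wrapped = toObject("42");
        System.out.println(valIdx + " " + wrapped.getClass().getSimpleName());
    }
}
```

The same rule of thumb would explain why a method whose return type is Object might prefer valueOf, while an assignment to a primitive variable prefers the parse methods.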





[jira] [Commented] (KYLIN-3594) Select with Catalog fails

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641079#comment-16641079
 ] 

ASF GitHub Bot commented on KYLIN-3594:
---

shaofengshi closed pull request #279: KYLIN-3594 remove unnecessary sql files 
in integration test
URL: https://github.com/apache/kylin/pull/279
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/kylin-it/src/test/resources/query/sql_snowflake/query07.sql b/kylin-it/src/test/resources/query/sql_snowflake/query07.sql
deleted file mode 100644
index 2705f783d7..00
--- a/kylin-it/src/test/resources/query/sql_snowflake/query07.sql
+++ /dev/null
@@ -1,23 +0,0 @@
-SELECT
-
-count(*) as cnt
-
-FROM defaultCatalog.TEST_KYLIN_FACT as TEST_KYLIN_FACT
-INNER JOIN defaultCatalog.TEST_ORDER as TEST_ORDER
-ON TEST_KYLIN_FACT.ORDER_ID = TEST_ORDER.ORDER_ID
-INNER JOIN defaultCatalog.TEST_ACCOUNT as BUYER_ACCOUNT
-ON TEST_ORDER.BUYER_ID = BUYER_ACCOUNT.ACCOUNT_ID
-INNER JOIN defaultCatalog.TEST_ACCOUNT as SELLER_ACCOUNT
-ON TEST_KYLIN_FACT.SELLER_ID = SELLER_ACCOUNT.ACCOUNT_ID
-INNER JOIN defaultCatalog.EDW.TEST_CAL_DT as TEST_CAL_DT
-ON TEST_KYLIN_FACT.CAL_DT = TEST_CAL_DT.CAL_DT
-INNER JOIN defaultCatalog.TEST_CATEGORY_GROUPINGS as TEST_CATEGORY_GROUPINGS
-ON TEST_KYLIN_FACT.LEAF_CATEG_ID = TEST_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_CATEGORY_GROUPINGS.SITE_ID
-INNER JOIN defaultCatalog.EDW.TEST_SITES as TEST_SITES
-ON TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_SITES.SITE_ID
-INNER JOIN defaultCatalog.EDW.TEST_SELLER_TYPE_DIM as TEST_SELLER_TYPE_DIM
-ON TEST_KYLIN_FACT.SLR_SEGMENT_CD = TEST_SELLER_TYPE_DIM.SELLER_TYPE_CD
-INNER JOIN defaultCatalog.TEST_COUNTRY as BUYER_COUNTRY
-ON BUYER_ACCOUNT.ACCOUNT_COUNTRY = BUYER_COUNTRY.COUNTRY
-INNER JOIN defaultCatalog.TEST_COUNTRY as SELLER_COUNTRY
-ON SELLER_ACCOUNT.ACCOUNT_COUNTRY = SELLER_COUNTRY.COUNTRY
diff --git a/kylin-it/src/test/resources/query/sql_snowflake/query08.sql b/kylin-it/src/test/resources/query/sql_snowflake/query08.sql
deleted file mode 100644
index 454779c0af..00
--- a/kylin-it/src/test/resources/query/sql_snowflake/query08.sql
+++ /dev/null
@@ -1,14 +0,0 @@
-SELECT
-
-count(*) as cnt, sum(price) as sum_price, SELLER_COUNTRY.NAME
-
-FROM TEST_KYLIN_FACT as TEST_KYLIN_FACT 
-INNER JOIN defaultCatalog.TEST_ACCOUNT as SELLER_ACCOUNT
-ON TEST_KYLIN_FACT.SELLER_ID = SELLER_ACCOUNT.ACCOUNT_ID
-INNER JOIN defaultCatalog.TEST_CATEGORY_GROUPINGS as TEST_CATEGORY_GROUPINGS
-ON TEST_KYLIN_FACT.LEAF_CATEG_ID = TEST_CATEGORY_GROUPINGS.LEAF_CATEG_ID AND TEST_KYLIN_FACT.LSTG_SITE_ID = TEST_CATEGORY_GROUPINGS.SITE_ID
-INNER JOIN defaultCatalog.TEST_COUNTRY as SELLER_COUNTRY
-ON SELLER_ACCOUNT.ACCOUNT_COUNTRY = SELLER_COUNTRY.COUNTRY
-
-where SELLER_ACCOUNT.ACCOUNT_SELLER_LEVEL=1
-group by SELLER_COUNTRY.NAME
\ No newline at end of file


 




> Select with Catalog fails
> -
>
> Key: KYLIN-3594
> URL: https://issues.apache.org/jira/browse/KYLIN-3594
> Project: Kylin
>  Issue Type: Bug
>Reporter: Hosur Narahari
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.0
>
>
> If we get the catalog from DatabaseMetaData using the getCatalogs() method,
> it returns the value "defaultCatalog". getSchemas() returns the actual Hive
> schema.
> According to the JDBC contract, catalog.schema.table should be a valid FROM
> clause, and many query layers use it. But Kylin fails when we execute such a
> query.
> I've written a sample code snippet for that below.
>  
>         DatabaseMetaData db = conn.getMetaData();
>         ResultSet catalogSet = db.getCatalogs();
>         String catalog = "";
>         if(catalogSet.next()) {
>             catalog = catalogSet.getString("TABLE_CAT");
>         }
>         ResultSet schemaSet = db.getSchemas();
>         String schema = "";
>         if(schemaSet.next()) {
>             schema = schemaSet.getString("TABLE_SCHEM");
>         }
>         StringBuilder sb = new StringBuilder("SELECT * FROM ");
>         if(!catalog.isEmpty()) {
>             sb.append(catalog + ".");
>         }
>         if(!schema.isEmpty()) {
>             sb.append(schema + ".");
>         }
>         sb.append("kylin_sales limit 10");
>         String query = 

[jira] [Commented] (KYLIN-3594) Select with Catalog fails

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641072#comment-16641072
 ] 

ASF GitHub Bot commented on KYLIN-3594:
---

coveralls edited a comment on issue #279: KYLIN-3594 remove unnecessary sql 
files in integration test
URL: https://github.com/apache/kylin/pull/279#issuecomment-427654298
 
 
   ## Pull Request Test Coverage Report for [Build 3735](https://coveralls.io/builds/19392205)
   
   * **0** of **0**   changed or added relevant lines in **0** files are covered.
   * **2** unchanged lines in **1** file lost coverage.
   * Overall coverage increased (+**0.01%**) to **23.293%**
   
   ---
   
   
   |  Files with Coverage Reduction | New Missed Lines | % |
   | :-- | -- | --: |
   | [core-cube/src/main/java/org/apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://coveralls.io/builds/19392205/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2FTreeCuboidScheduler.java#L124) | 2 | 68.46% |
   
   
   |  Totals | [![Coverage Status](https://coveralls.io/builds/19392205/badge)](https://coveralls.io/builds/19392205) |
   | :-- | --: |
   | Change from base [Build 3732](https://coveralls.io/builds/19385953): |  0.01% |
   | Covered Lines: | 16301 |
   | Relevant Lines: | 69983 |
   
   ---
   # 💛  - [Coveralls](https://coveralls.io)
   




> Select with Catalog fails
> -
>
> Key: KYLIN-3594
> URL: https://issues.apache.org/jira/browse/KYLIN-3594
> Project: Kylin
>  Issue Type: Bug
>Reporter: Hosur Narahari
>Assignee: XiaoXiang Yu
>Priority: Major
> Fix For: v2.6.0
>
>
> If we get the catalog from DatabaseMetaData using the getCatalogs() method,
> it returns the value "defaultCatalog". getSchemas() returns the actual Hive
> schema.
> According to the JDBC contract, catalog.schema.table should be a valid FROM
> clause, and many query layers use it. But Kylin fails when we execute such a
> query.
> I've written a sample code snippet for that below.
>  
>         DatabaseMetaData db = conn.getMetaData();
>         ResultSet catalogSet = db.getCatalogs();
>         String catalog = "";
>         if(catalogSet.next()) {
>             catalog = catalogSet.getString("TABLE_CAT");
>         }
>         ResultSet schemaSet = db.getSchemas();
>         String schema = "";
>         if(schemaSet.next()) {
>             schema = schemaSet.getString("TABLE_SCHEM");
>         }
>         StringBuilder sb = new StringBuilder("SELECT * FROM ");
>         if(!catalog.isEmpty()) {
>             sb.append(catalog + ".");
>         }
>         if(!schema.isEmpty()) {
>             sb.append(schema + ".");
>         }
>         sb.append("kylin_sales limit 10");
>         String query = sb.toString();
>         Statement stat = conn.createStatement();
>         ResultSet rs = stat.executeQuery(query);
>         while(rs.next()) {
>             System.out.println(rs.getObject("trans_id"));
>         }
> In short, the above snippet executes the query
> select * from defaultCatalog.DEFAULT.kylin_sales.
>  
> The same thing happens with other schemas; for example,
> select * from defaultCatalog.test.kylin_sales also fails.





[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641070#comment-16641070
 ] 

ASF GitHub Bot commented on KYLIN-3586:
---

coveralls commented on issue #280: KYLIN-3586 Fix remaining boxing/unboxing 
problem
URL: https://github.com/apache/kylin/pull/280#issuecomment-427655794
 
 
   ## Pull Request Test Coverage Report for [Build 3734](https://coveralls.io/builds/19392189)
   
   * **1** of **7** **(14.29%)** changed or added relevant lines in **3** files are covered.
   * **3** unchanged lines in **2** files lost coverage.
   * Overall coverage increased (+**0.01%**) to **23.293%**
   
   ---
   
   | Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
   | :-- | --: | --: | --: |
   | [core-common/src/main/java/org/apache/kylin/common/debug/BackdoorToggles.java](https://coveralls.io/builds/19392189/source?filename=core-common%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcommon%2Fdebug%2FBackdoorToggles.java#L117) | 0 | 1 | 0.0% |
   | [core-metadata/src/main/java/org/apache/kylin/metadata/tuple/Tuple.java](https://coveralls.io/builds/19392189/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fmetadata%2Ftuple%2FTuple.java#L179) | 0 | 1 | 0.0% |
   | [jdbc/src/main/java/org/apache/kylin/jdbc/KylinClient.java](https://coveralls.io/builds/19392189/source?filename=jdbc%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fjdbc%2FKylinClient.java#L179) | 1 | 5 | 20.0% |
   
   
   | Files with Coverage Reduction | New Missed Lines | % |
   | :-- | --: | --: |
   | [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/19392189/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L553) | 1 | 78.12% |
   | [core-cube/src/main/java/org/apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://coveralls.io/builds/19392189/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2FTreeCuboidScheduler.java#L124) | 2 | 68.46% |
   
   
   | Totals | [![Coverage Status](https://coveralls.io/builds/19392189/badge)](https://coveralls.io/builds/19392189) |
   | :-- | --: |
   | Change from base [Build 3732](https://coveralls.io/builds/19385953): | 0.01% |
   | Covered Lines: | 16301 |
   | Relevant Lines: | 69983 |
   
   ---
   # 💛  - [Coveralls](https://coveralls.io)
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Boxing/unboxing to parse a primitive is suboptimal
> --
>
> Key: KYLIN-3586
> URL: https://issues.apache.org/jira/browse/KYLIN-3586
> Project: Kylin
> Issue Type: Bug
> Reporter: Ted Yu
> Assignee: Lijun Cao
> Priority: Major
> Fix For: v2.6.0
>
>
> An example is from HBaseLookupRowEncoder:
> {code}
> int valIdx = Integer.valueOf(Bytes.toString(qualifier));
> {code}
> valueOf returns an Integer object which would be unboxed and assigned to 
> valIdx.
> Integer.parseInt() should be used instead.
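
The difference between the two calls can be seen in a small standalone sketch. The class and method names below are hypothetical, and `new String(bytes, UTF_8)` is used as a stand-in for the HBase `Bytes.toString` call so that the example has no external dependencies.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch; class and method names are hypothetical, and
// new String(bytes, UTF_8) stands in for the HBase Bytes.toString call.
final class ParsePrimitiveDemo {

    // Pattern flagged in the issue: valueOf yields an Integer object,
    // which is then unboxed to fit the int return type.
    static int viaValueOf(byte[] qualifier) {
        return Integer.valueOf(new String(qualifier, StandardCharsets.UTF_8));
    }

    // Suggested pattern: parseInt returns the primitive int directly.
    static int viaParseInt(byte[] qualifier) {
        return Integer.parseInt(new String(qualifier, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] qualifier = "17".getBytes(StandardCharsets.UTF_8);
        // Both calls yield the same value; only the intermediate boxing differs.
        System.out.println(viaValueOf(qualifier) == viaParseInt(qualifier));
    }
}
```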





[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641069#comment-16641069
 ] 

ASF GitHub Bot commented on KYLIN-3586:
---

codecov-io commented on issue #280: KYLIN-3586 Fix remaining boxing/unboxing 
problem
URL: https://github.com/apache/kylin/pull/280#issuecomment-427655739
 
 
   # [Codecov](https://codecov.io/gh/apache/kylin/pull/280?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`master@eb01b69`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `14.28%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/280/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/280?src=pr&el=tree)
   
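   ```diff
   @@            Coverage Diff            @@
   ##             master     #280   +/-   ##
   =========================================
     Coverage          ?   21.28%
     Complexity        ?     4437
   =========================================
     Files             ?     1087
     Lines             ?    69983
     Branches          ?    10126
   =========================================
     Hits              ?    14896
     Misses            ?    53684
     Partials          ?     1403
   ```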
   
   
   | [Impacted Files](https://codecov.io/gh/apache/kylin/pull/280?src=pr&el=tree) | Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | [...in/java/org/apache/kylin/metadata/tuple/Tuple.java](https://codecov.io/gh/apache/kylin/pull/280/diff?src=pr&el=tree#diff-Y29yZS1tZXRhZGF0YS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vbWV0YWRhdGEvdHVwbGUvVHVwbGUuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...org/apache/kylin/common/debug/BackdoorToggles.java](https://codecov.io/gh/apache/kylin/pull/280/diff?src=pr&el=tree#diff-Y29yZS1jb21tb24vc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2t5bGluL2NvbW1vbi9kZWJ1Zy9CYWNrZG9vclRvZ2dsZXMuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (?)` | |
   | [...c/main/java/org/apache/kylin/jdbc/KylinClient.java](https://codecov.io/gh/apache/kylin/pull/280/diff?src=pr&el=tree#diff-amRiYy9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUva3lsaW4vamRiYy9LeWxpbkNsaWVudC5qYXZh) | `67.83% <20%> (ø)` | `33 <0> (?)` | |
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/280?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/280?src=pr&el=footer). Last update [eb01b69...4bc2939](https://codecov.io/gh/apache/kylin/pull/280?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (KYLIN-3596) Creating the Hive temporary table fails during a cube job when Kafka keys use special characters or HQL keywords

2018-10-07 Thread XiaoXiang Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641067#comment-16641067
 ] 

XiaoXiang Yu commented on KYLIN-3596:
-

[~xiewenliang] Is it possible to rename these columns before they are sent to 
Kylin? I think that would be a better approach.
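
A rough sketch of what such upstream renaming could look like follows. Everything in it is an assumption made for illustration: the class `KafkaKeyRenamer`, the two renaming rules, and the keyword set, which is a tiny illustrative subset and not Hive's actual reserved-word list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical pre-processing step applied before messages reach Kylin.
final class KafkaKeyRenamer {

    // Small illustrative subset only; NOT the full Hive reserved keyword list.
    private static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("date", "order", "group", "select", "table"));

    static String rename(String key) {
        String name = key;
        // Assumed rule 1: a name starting with an underscore gets a "c" prefix.
        if (name.startsWith("_")) {
            name = "c" + name;
        }
        // Assumed rule 2: a reserved word gets a "_col" suffix.
        if (RESERVED.contains(name.toLowerCase(Locale.ROOT))) {
            name = name + "_col";
        }
        return name;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"_id", "date", "price"}) {
            System.out.println(key + " -> " + rename(key));
        }
    }
}
```

Whatever rules are chosen, they would need to be applied consistently by every producer so that the renamed keys stay stable across messages.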

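> Creating the Hive temporary table fails during a cube job when Kafka keys use special characters or HQL keywords
> -
>
> Key: KYLIN-3596
> URL: https://issues.apache.org/jira/browse/KYLIN-3596
> Project: Kylin
> Issue Type: Bug
> Components: Tools, Build and Test
> Affects Versions: v2.4.0
> Reporter: wlxie
> Assignee: XiaoXiang Yu
> Priority: Major
> Attachments: kafka使用关键字作为key.txt
>
>
> Hello everyone,
> When using Kafka as a Streaming Table, if a Kafka key starts with an 
> underscore or is an HQL keyword, creating the Hive temporary table fails 
> while the cube job runs. Is there any configuration that makes Kylin wrap 
> all column and table names in backquotes when it creates the Hive temporary 
> table? I tried setting hive.support.sql11.reserved.keywords to false, but 
> names starting with an underscore and some keywords are still not supported. 
> Please see the attachment for the specific error.
> Thanks.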





[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread Lijun Cao (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641066#comment-16641066
 ] 

Lijun Cao commented on KYLIN-3586:
--

Thanks, Ted Yu. I also changed parseInt to valueOf in KylinClient#wrapObject 
because its return type is Object.
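
The reasoning here is the mirror image of the original issue: when the result must be an `Object`, an `Integer` instance is needed either way, so `valueOf` avoids a parse-then-box round trip. A minimal sketch, assuming a hypothetical stand-in method (the real `KylinClient#wrapObject` is not shown in this thread):

```java
// Hypothetical stand-in; the real KylinClient#wrapObject is not shown in this thread.
final class WrapObjectDemo {

    // The return type is Object, so an Integer instance is required either way.
    static Object wrapInt(String value) {
        // valueOf(String) yields the Integer directly; with parseInt the
        // compiler would insert a boxing conversion on return.
        return Integer.valueOf(value);
    }

    public static void main(String[] args) {
        Object wrapped = wrapInt("42");
        System.out.println(wrapped.getClass().getSimpleName() + " " + wrapped);
    }
}
```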



[jira] [Commented] (KYLIN-3594) Select with Catalog fails

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641062#comment-16641062
 ] 

ASF GitHub Bot commented on KYLIN-3594:
---

coveralls commented on issue #279: KYLIN-3594 remove unnecessary sql files in 
integration test
URL: https://github.com/apache/kylin/pull/279#issuecomment-427654298
 
 
   ## Pull Request Test Coverage Report for [Build 3733](https://coveralls.io/builds/19392081)
   
   * **0** of **0** changed or added relevant lines in **0** files are covered.
   * **4** unchanged lines in **3** files lost coverage.
   * Overall coverage increased (+**0.009%**) to **23.29%**
   
   ---
   
   
   | Files with Coverage Reduction | New Missed Lines | % |
   | :-- | --: | --: |
   | [core-metadata/src/main/java/org/apache/kylin/source/datagen/ColumnGenerator.java](https://coveralls.io/builds/19392081/source?filename=core-metadata%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fsource%2Fdatagen%2FColumnGenerator.java#L319) | 1 | 81.08% |
   | [core-cube/src/main/java/org/apache/kylin/cube/inmemcubing/MemDiskStore.java](https://coveralls.io/builds/19392081/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Finmemcubing%2FMemDiskStore.java#L553) | 1 | 78.12% |
   | [core-cube/src/main/java/org/apache/kylin/cube/cuboid/TreeCuboidScheduler.java](https://coveralls.io/builds/19392081/source?filename=core-cube%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fkylin%2Fcube%2Fcuboid%2FTreeCuboidScheduler.java#L124) | 2 | 68.46% |
   
   
   | Totals | [![Coverage Status](https://coveralls.io/builds/19392081/badge)](https://coveralls.io/builds/19392081) |
   | :-- | --: |
   | Change from base [Build 3732](https://coveralls.io/builds/19385953): | 0.009% |
   | Covered Lines: | 16299 |
   | Relevant Lines: | 69983 |
   
   ---
   # 💛  - [Coveralls](https://coveralls.io)
   



> Select with Catalog fails
> -
>
> Key: KYLIN-3594
> URL: https://issues.apache.org/jira/browse/KYLIN-3594
> Project: Kylin
> Issue Type: Bug
> Reporter: Hosur Narahari
> Assignee: XiaoXiang Yu
> Priority: Major
> Fix For: v2.6.0


[jira] [Commented] (KYLIN-3594) Select with Catalog fails

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641061#comment-16641061
 ] 

ASF GitHub Bot commented on KYLIN-3594:
---

codecov-io commented on issue #279: KYLIN-3594 remove unnecessary sql files in 
integration test
URL: https://github.com/apache/kylin/pull/279#issuecomment-427654295
 
 
   # [Codecov](https://codecov.io/gh/apache/kylin/pull/279?src=pr&el=h1) Report
   > :exclamation: No coverage uploaded for pull request base (`master@eb01b69`). [Click here to learn what that means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/kylin/pull/279/graphs/tree.svg?width=650&token=JawVgbgsVo&height=150&src=pr)](https://codecov.io/gh/apache/kylin/pull/279?src=pr&el=tree)
   
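   ```diff
   @@            Coverage Diff            @@
   ##             master     #279   +/-   ##
   =========================================
     Coverage          ?   21.28%
     Complexity        ?     4436
   =========================================
     Files             ?     1087
     Lines             ?    69983
     Branches          ?    10126
   =========================================
     Hits              ?    14893
     Misses            ?    53686
     Partials          ?     1404
   ```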
   
   
   
   --
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/kylin/pull/279?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/kylin/pull/279?src=pr&el=footer). Last update [eb01b69...d1eff61](https://codecov.io/gh/apache/kylin/pull/279?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   




[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641058#comment-16641058
 ] 

ASF GitHub Bot commented on KYLIN-3586:
---

caolijun1166 opened a new pull request #280: KYLIN-3586 Fix remaining 
boxing/unboxing problem
URL: https://github.com/apache/kylin/pull/280
 
 
   




[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641059#comment-16641059
 ] 

ASF GitHub Bot commented on KYLIN-3586:
---

asfgit commented on issue #280: KYLIN-3586 Fix remaining boxing/unboxing problem
URL: https://github.com/apache/kylin/pull/280#issuecomment-427654284
 
 
   Can one of the admins verify this patch?




[jira] [Commented] (KYLIN-3586) Boxing/unboxing to parse a primitive is suboptimal

2018-10-07 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641060#comment-16641060
 ] 

ASF GitHub Bot commented on KYLIN-3586:
---

asfgit commented on issue #280: KYLIN-3586 Fix remaining boxing/unboxing problem
URL: https://github.com/apache/kylin/pull/280#issuecomment-427654285
 
 
   Can one of the admins verify this patch?




[jira] [Commented] (KYLIN-3596) Creating the Hive temporary table fails during a cube job when Kafka keys use special characters or HQL keywords

2018-10-07 Thread XiaoXiang Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16641064#comment-16641064
 ] 

XiaoXiang Yu commented on KYLIN-3596:
-

Hi [~xiewenliang], I don't think adding back quotes to each column is a good 
idea. With that change Kylin could create the intermediate flat table 
properly, but other parts of the code (those that build and query the cube) 
would have to be modified at the same time, or they would throw errors for 
the same reason. 

 

I don't think adding back quotes this way is suitable, because it touches many 
places: the later cube build and cube query steps would all need code-level 
changes. I don't have a better idea yet, but I don't think this solution is 
appropriate.
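
For reference, the back-quoting under discussion could be sketched as below. The helper is hypothetical and not taken from the Kylin code base; it relies on Hive's quoted-identifier convention, where a backquote inside a quoted name is written as two backquotes.

```java
// Hypothetical helper for illustration; not taken from the Kylin code base.
final class HiveIdentifierQuoter {

    // Wraps a name in backquotes, doubling any embedded backquote
    // (Hive's quoted-identifier escaping convention).
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    // Renders one column definition for a CREATE TABLE statement.
    static String columnDef(String name, String hiveType) {
        return quote(name) + " " + hiveType;
    }

    public static void main(String[] args) {
        System.out.println(columnDef("_id", "string"));
        System.out.println(columnDef("date", "date"));
    }
}
```

As the comment above points out, quoting the names in the table DDL alone would not be enough: every later statement that references those columns would need the same treatment.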
