[jira] [Commented] (KYLIN-1676) High CPU in TrieDictionary due to incorrect use of HashMap
[ https://issues.apache.org/jira/browse/KYLIN-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283829#comment-15283829 ] liyang commented on KYLIN-1676:
---
Nice catch! Yanghong can do the merge and save me some time. :-)

> High CPU in TrieDictionary due to incorrect use of HashMap
> --
>
> Key: KYLIN-1676
> URL: https://issues.apache.org/jira/browse/KYLIN-1676
> Project: Kylin
> Issue Type: Bug
> Components: Metadata
> Affects Versions: v1.4.0
> Reporter: qianqiaoneng
> Assignee: liyang
> Attachments: fix_hashmap_concurrency_issue_1.4rc.patch,
> fix_hashmap_concurrency_issue_master.patch
>
> 10015 b_kylin 20 0 62.5g 6.7g 29m R 99.9 4.7 431:15.42 java
> 10723 b_kylin 20 0 62.5g 6.7g 29m R 99.9 4.7 432:30.48 java
> 10724 b_kylin 20 0 62.5g 6.7g 29m R 99.9 4.7 432:30.76 java
> 10781 b_kylin 20 0 62.5g 6.7g 29m R 99.9 4.7 429:02.64 java
> 30929 b_kylin 20 0 62.5g 6.7g 29m R 99.9 4.7 430:21.31 java
> 10014 b_kylin 20 0 62.5g 6.7g 29m R 99.6 4.7 432:32.71 java
> 10722 b_kylin 20 0 62.5g 6.7g 29m R 99.6 4.7 433:05.26 java
> 10827 b_kylin 20 0 62.5g 6.7g 29m R 99.6 4.7 430:27.80 java
>
> at java.util.HashMap.getEntry(HashMap.java:465)
> at java.util.HashMap.get(HashMap.java:417)
> at org.apache.kylin.dict.TrieDictionary.getIdFromValueImpl(TrieDictionary.java:151)
> at org.apache.kylin.dict.Dictionary.getIdFromValue(Dictionary.java:98)
> at org.apache.kylin.cube.gridtable.CubeCodeSystem$DictionarySerializer.serializeWithRounding(CubeCodeSystem.java:219)
> at org.apache.kylin.cube.gridtable.CubeCodeSystem.encodeColumnValue(CubeCodeSystem.java:130)
> at org.apache.kylin.gridtable.GTUtil$1.translate(GTUtil.java:207)
> at org.apache.kylin.gridtable.GTUtil$1.encodeConstants(GTUtil.java:140)
> at org.apache.kylin.gridtable.GTUtil$1.onSerialize(GTUtil.java:105)
> at org.apache.kylin.metadata.filter.TupleFilterSerializer.internalSerialize(TupleFilterSerializer.java:63)
> at org.apache.kylin.metadata.filter.TupleFilterSerializer.internalSerialize(TupleFilterSerializer.java:75)
> at org.apache.kylin.metadata.filter.TupleFilterSerializer.serialize(TupleFilterSerializer.java:55)
> at org.apache.kylin.gridtable.GTUtil.convertFilter(GTUtil.java:76)
> at org.apache.kylin.gridtable.GTUtil.convertFilterColumnsAndConstants(GTUtil.java:66)
> at org.apache.kylin.storage.hbase.cube.v2.CubeSegmentScanner.<init>(CubeSegmentScanner.java:89)
> at org.apache.kylin.storage.hbase.cube.v2.CubeStorageQuery.search(CubeStorageQuery.java:120)
> at org.apache.kylin.storage.cache.CacheFledgedStaticQuery.search(CacheFledgedStaticQuery.java:59)
> at org.apache.kylin.query.enumerator.OLAPEnumerator.queryStorage(OLAPEnumerator.java:125)
> at org.apache.kylin.query.enumerator.OLAPEnumerator.moveNext(OLAPEnumerator.java:71)
> at Baz$1$1.moveNext(Unknown Source)
> at org.apache.calcite.linq4j.EnumerableDefaults.aggregate(EnumerableDefaults.java:116)
> at org.apache.calcite.linq4j.DefaultEnumerable.aggregate(DefaultEnumerable.java:107)
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
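The symptom in the stack trace (threads spinning at ~100% CPU inside HashMap.getEntry) is the classic JDK 7 failure mode of reading a plain java.util.HashMap from multiple threads while it is being resized: the bucket chain can become circular and get() loops forever. The actual fix is in the attached patches; the sketch below only illustrates the usual repair for this class of bug, with hypothetical class and field names, by switching the shared cache to ConcurrentHashMap.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of the typical fix; names are illustrative, not Kylin's
// actual code (see fix_hashmap_concurrency_issue_master.patch for the real one).
public class ValueIdCache {
    // A plain HashMap shared across query threads can spin forever in
    // HashMap.getEntry() on JDK 7 if a resize happens concurrently.
    // ConcurrentHashMap is safe for concurrent get/put.
    private final Map<String, Integer> valueToId = new ConcurrentHashMap<>();

    public int getIdFromValue(String value) {
        Integer id = valueToId.get(value);
        return id == null ? -1 : id;  // -1 as a "not found" sentinel
    }

    public void put(String value, int id) {
        valueToId.put(value, id);
    }
}
```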
[jira] [Resolved] (KYLIN-1670) Unable to find measures, once after cube built successfully.
[ https://issues.apache.org/jira/browse/KYLIN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyang resolved KYLIN-1670.
---
Resolution: Not A Problem

> Unable to find measures, once after cube built successfully.
>
> Key: KYLIN-1670
> URL: https://issues.apache.org/jira/browse/KYLIN-1670
> Project: Kylin
> Issue Type: Bug
> Components: Tools, Build and Test
> Affects Versions: v1.5.1
> Environment: Testing
> Reporter: Bhanuprakash
> Assignee: hongbin ma
> Attachments: Cube.PNG, Measure.PNG
>
> We are facing a couple of issues after building the cube in Kylin:
> 1) Measures are not displayed once the cube is successfully built.
> 2) Can't find an option to join / group by between Dimensions & Measures for slice & dice.
> 3) Connected through Excel, but the complete data is getting loaded into Excel, and we again need to define the relationship between dimension and fact to pivot the data.
> 4) Excel is limited to 1 million rows; how will it support big data?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1670) Unable to find measures, once after cube built successfully.
[ https://issues.apache.org/jira/browse/KYLIN-1670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283832#comment-15283832 ] liyang commented on KYLIN-1670:
---
Unlike traditional cube engines, Kylin exposes a SQL interface rather than MDX. That's why, at query time, what the user sees is a relational model rather than a cube model. Exposing the cube model and supporting an MDX interface are on the long-term roadmap:
- https://issues.apache.org/jira/browse/KYLIN-776
- https://issues.apache.org/jira/browse/KYLIN-1525

> Unable to find measures, once after cube built successfully.
>
> Key: KYLIN-1670
> URL: https://issues.apache.org/jira/browse/KYLIN-1670
> Project: Kylin
> Issue Type: Bug
> Components: Tools, Build and Test
> Affects Versions: v1.5.1
> Environment: Testing
> Reporter: Bhanuprakash
> Assignee: hongbin ma
> Attachments: Cube.PNG, Measure.PNG
>
> We are facing a couple of issues after building the cube in Kylin:
> 1) Measures are not displayed once the cube is successfully built.
> 2) Can't find an option to join / group by between Dimensions & Measures for slice & dice.
> 3) Connected through Excel, but the complete data is getting loaded into Excel, and we again need to define the relationship between dimension and fact to pivot the data.
> 4) Excel is limited to 1 million rows; how will it support big data?
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1684) query on table "kylin_sales" return empty resultset after cube "kylin_sales_cube" which generated by sample.sh is ready
[ https://issues.apache.org/jira/browse/KYLIN-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283851#comment-15283851 ] hongbin ma commented on KYLIN-1684:
---
Hi [~whenwin], the check is for skipping segments that have 0 records. As I remember, if you remove the check, the query runtime will throw exceptions when a 0-record segment is met. Have you verified that?

> query on table "kylin_sales" return empty resultset after cube
> "kylin_sales_cube" which generated by sample.sh is ready
> ---
>
> Key: KYLIN-1684
> URL: https://issues.apache.org/jira/browse/KYLIN-1684
> Project: Kylin
> Issue Type: Bug
> Components: Query Engine
> Affects Versions: v1.5.1
> Environment: cluster:
> hadoop-2.6.0
> hbase-0.98.8
> hive-0.14.0
> Reporter: wangxianbin
> Assignee: wangxianbin
> Attachments:
> 1.5.1-release-hotfix-KYLIN-1684-query-on-table-kylin_sales-return-empty-r.patch,
> log for Build Base Cuboid Data.png, log when run query.png
>
> There is a check on "InputRecords" in CubeStorageQuery's search method which seems unnecessary, as follows:
>
> List<CubeSegmentScanner> scanners = Lists.newArrayList();
> for (CubeSegment cubeSeg : cubeInstance.getSegments(SegmentStatusEnum.READY)) {
>     CubeSegmentScanner scanner;
>     if (cubeSeg.getInputRecords() == 0) {
>         logger.info("Skip cube segment {} because its input record is 0", cubeSeg);
>         continue;
>     }
>     scanner = new CubeSegmentScanner(cubeSeg, cuboid, dimensionsD, groupsD, metrics, filterD, !isExactAggregation);
>     scanners.add(scanner);
> }
> if (scanners.isEmpty())
>     return ITupleIterator.EMPTY_TUPLE_ITERATOR;
> return new SequentialCubeTupleIterator(scanners, cuboid, dimensionsD, metrics, returnTupleInfo, context);
>
> This check will cause the query to return an empty resultset even when there is data in the storage engine.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
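The failure mode the reporter describes can be modeled in a few lines: if a segment's input-record counter reads 0 (for example, stale or never populated) while the storage actually holds rows, the skip in the loop above silently drops that data. This toy model is an illustration only; `Segment` is a stand-in, not Kylin's CubeSegment.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the skip logic quoted in KYLIN-1684.
class Segment {
    final long inputRecords;  // metadata counter, may be stale
    final long storedRows;    // what the storage engine actually holds
    Segment(long inputRecords, long storedRows) {
        this.inputRecords = inputRecords;
        this.storedRows = storedRows;
    }
}

public class SkipCheckDemo {
    // Mirrors the check in CubeStorageQuery.search(): segments whose
    // inputRecords counter reads 0 are never scanned.
    static long scannableRows(List<Segment> segments) {
        long total = 0;
        for (Segment seg : segments) {
            if (seg.inputRecords == 0) {
                continue;  // skipped even if storedRows > 0
            }
            total += seg.storedRows;
        }
        return total;
    }
}
```

A segment with inputRecords == 0 but 1000 stored rows contributes nothing, so a query touching only that segment comes back empty.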
[jira] [Commented] (KYLIN-1664) rest api '/kylin/api/admin/config' without security check
[ https://issues.apache.org/jira/browse/KYLIN-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283862#comment-15283862 ] hongbin ma commented on KYLIN-1664: --- hi, This API as well as some others are set to be authentication-free for some CLI tool's convenience. The issue is not fixed because kylin configs are treated as not sensitive. Do you have any security concerns on this? > rest api '/kylin/api/admin/config' without security check > - > > Key: KYLIN-1664 > URL: https://issues.apache.org/jira/browse/KYLIN-1664 > Project: Kylin > Issue Type: Bug > Components: REST Service >Affects Versions: v1.5.1 > Environment: Ubuntu 14.4 > Jdk 1.7.0 > Kylin 1.5.1 binary >Reporter: Hanhui LI >Assignee: Zhong,Jason > Labels: test > Original Estimate: 24h > Remaining Estimate: 24h > > rest api '/kylin/api/admin/config' without security check. > Please check the follwoing: > === > GET Request: > http://127.0.0.1:7070/kylin/api/admin/config > Response: > {"config":"kylin.hbase.region.cut.large=50\nkylin.hbase.default.compression.codec=snappy\ndeploy.env=QA\nacl.adminRole=ROLE_ADMIN\nkylin.sandbox=true\nkylin.hdfs.working.dir=/kylin\nldap.user.searchBase=\nkylin.job.concurrent.max.limit=10\nkylin.job.remote.cli.password=\nsaml.metadata.file=classpath:sso_metadata.xml\nkylin.job.yarn.app.rest.check.interval.seconds=10\nmail.sender=\nmail.password=\nkylin.job.remote.cli.username=\nmail.username=\nsaml.context.serverPort=443\nkylin.web.help.length=4\nkylin.job.run.as.remote.cmd=false\nldap.service.searchPattern=\nkylin.web.contact_mail=\nldap.user.groupSearchBase=\nkylin.hbase.region.cut.small=5\nkylin.web.hive.limit=20\nkylin.job.mapreduce.default.reduce.input.mb=500\nkylin.job.hive.database.for.intermediatetable=default\nkylin.metadata.url=kylin_metadata@hbase\nldap.password=\nldap.username=\nkylin.storage.url=hbase\nganglia.port=8664\nldap.user.searchPattern=\nkylin.job.status.with.kerberos=false\nganglia.group=\nkylin.hbase.cluster.fs=
\nacl.defaultRole=ROLE_ANALYST,ROLE_MODELER\nsaml.context.contextPath=/kylin\nmail.host=\nkylin.job.remote.cli.working.dir=/tmp/kylin\nkylin.web.diagnostic=\nsaml.context.scheme=https\nkylin.job.cubing.inmem.sampling.percent=100\nldap.service.groupSearchBase=\nsaml.metadata.entityBaseURL=https://hostname/kylin\nkylin.hbase.hfile.size.gb=5\nldap.service.searchBase=\nkylin.owner=who...@kylin.apache.org\nmail.enabled=false\nkylin.rest.servers=localhost:7070\nkylin.security.profile=testing\nkylin.job.retry=0\nsaml.context.serverName=hostname\nldap.server=ldap://ldap_server:389\nkylin.job.remote.cli.hostname=\nkylin.query.security.enabled=true\nkylin.server.mode=all\nkylin.web.help.3=onboard|Cube > Design Tutorial|\nkylin.web.help.2=tableau|Tableau > Guide|\nkylin.web.help.1=odbc|ODBC > Driver|\nkylin.hbase.region.cut.medium=10\nkylin.web.help.0=start|Getting > Started|\nkylin.web.hadoop=\nkylin.web.streaming.guide=http://kylin.apache.org/\n"} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (KYLIN-1664) rest api '/kylin/api/admin/config' without security check
[ https://issues.apache.org/jira/browse/KYLIN-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hongbin ma reassigned KYLIN-1664: - Assignee: hongbin ma (was: Zhong,Jason) > rest api '/kylin/api/admin/config' without security check > - > > Key: KYLIN-1664 > URL: https://issues.apache.org/jira/browse/KYLIN-1664 > Project: Kylin > Issue Type: Bug > Components: REST Service >Affects Versions: v1.5.1 > Environment: Ubuntu 14.4 > Jdk 1.7.0 > Kylin 1.5.1 binary >Reporter: Hanhui LI >Assignee: hongbin ma > Labels: test > Original Estimate: 24h > Remaining Estimate: 24h > > rest api '/kylin/api/admin/config' without security check. > Please check the follwoing: > === > GET Request: > http://127.0.0.1:7070/kylin/api/admin/config > Response: > {"config":"kylin.hbase.region.cut.large=50\nkylin.hbase.default.compression.codec=snappy\ndeploy.env=QA\nacl.adminRole=ROLE_ADMIN\nkylin.sandbox=true\nkylin.hdfs.working.dir=/kylin\nldap.user.searchBase=\nkylin.job.concurrent.max.limit=10\nkylin.job.remote.cli.password=\nsaml.metadata.file=classpath:sso_metadata.xml\nkylin.job.yarn.app.rest.check.interval.seconds=10\nmail.sender=\nmail.password=\nkylin.job.remote.cli.username=\nmail.username=\nsaml.context.serverPort=443\nkylin.web.help.length=4\nkylin.job.run.as.remote.cmd=false\nldap.service.searchPattern=\nkylin.web.contact_mail=\nldap.user.groupSearchBase=\nkylin.hbase.region.cut.small=5\nkylin.web.hive.limit=20\nkylin.job.mapreduce.default.reduce.input.mb=500\nkylin.job.hive.database.for.intermediatetable=default\nkylin.metadata.url=kylin_metadata@hbase\nldap.password=\nldap.username=\nkylin.storage.url=hbase\nganglia.port=8664\nldap.user.searchPattern=\nkylin.job.status.with.kerberos=false\nganglia.group=\nkylin.hbase.cluster.fs=\nacl.defaultRole=ROLE_ANALYST,ROLE_MODELER\nsaml.context.contextPath=/kylin\nmail.host=\nkylin.job.remote.cli.working.dir=/tmp/kylin\nkylin.web.diagnostic=\nsaml.context.scheme=https\nkylin.job.cubing.inmem.sampling.percent=100\nlda
p.service.groupSearchBase=\nsaml.metadata.entityBaseURL=https://hostname/kylin\nkylin.hbase.hfile.size.gb=5\nldap.service.searchBase=\nkylin.owner=who...@kylin.apache.org\nmail.enabled=false\nkylin.rest.servers=localhost:7070\nkylin.security.profile=testing\nkylin.job.retry=0\nsaml.context.serverName=hostname\nldap.server=ldap://ldap_server:389\nkylin.job.remote.cli.hostname=\nkylin.query.security.enabled=true\nkylin.server.mode=all\nkylin.web.help.3=onboard|Cube > Design Tutorial|\nkylin.web.help.2=tableau|Tableau > Guide|\nkylin.web.help.1=odbc|ODBC > Driver|\nkylin.hbase.region.cut.medium=10\nkylin.web.help.0=start|Getting > Started|\nkylin.web.hadoop=\nkylin.web.streaming.guide=http://kylin.apache.org/\n"} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1689) bug when a column being dimension as well as in a sum metric
hongbin ma created KYLIN-1689:
---
Summary: bug when a column being dimension as well as in a sum metric
Key: KYLIN-1689
URL: https://issues.apache.org/jira/browse/KYLIN-1689
Project: Kylin
Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma

Hi all, I recently built a cube named c1, using 2 columns as dimensions, "rule_name" and "PARTNER_GAIN_PAY_PT_DOC_CNT", and also using "sum(PARTNER_GAIN_PAY_PT_DOC_CNT)" as a measure. C1 was built successfully. So I ran a test query: "select rule_name, PARTNER_GAIN_PAY_PT_DOC_CNT, count(*), sum(PARTNER_GAIN_PAY_PT_DOC_CNT) from CUB_PARTNER_GAIN_PAY_PT_PRE0_AT0_S where rule_name='1号店3C产品' group by rule_name, PARTNER_GAIN_PAY_PT_DOC_CNT;", but the result is not correct:

RULE_NAME PARTNER... EXPR$2 EXPR$3
1号店3C产品 1860 30 1860
1号店3C产品 700 2 700
1号店3C产品 7410 38 7410
1号店3C产品 2940 60 2940

In my opinion, "count(*)" means the number of records with the same rule_name and PARTNER_GAIN_PAY_PT_DOC_CNT, so I think sum(PARTNER…) should equal count(*) * PARTNER_GAIN_PAY_PT_DOC_CNT, but it does not. I wonder if there is something wrong with my understanding? Insight snapshot as below:
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
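The reporter's expectation can be checked mechanically: within a group where the grouped column has a fixed value v and count(*) is n, a sum over that same column must be n * v, since the column is constant inside the group. A small check, using the first row of the table above, showing the invariant the returned rows violate:

```java
public class SumInvariantCheck {
    // For a GROUP BY on a column, every result row must satisfy
    // sum(col) == count(*) * col, because col is constant within its group.
    static boolean holds(long colValue, long count, long sum) {
        return sum == count * colValue;
    }

    public static void main(String[] args) {
        // First row reported in KYLIN-1689: value 1860, count(*) 30, sum 1860.
        System.out.println(holds(1860, 30, 1860));       // false: reported sum violates the invariant
        System.out.println(holds(1860, 30, 30 * 1860));  // true: expected sum is 55800
    }
}
```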
[jira] [Created] (KYLIN-1690) always returning 0 or 1 for sum(a)/sum(b) for integer type a and b
hongbin ma created KYLIN-1690:
---
Summary: always returning 0 or 1 for sum(a)/sum(b) for integer type a and b
Key: KYLIN-1690
URL: https://issues.apache.org/jira/browse/KYLIN-1690
Project: Kylin
Issue Type: Bug
Reporter: hongbin ma
Assignee: hongbin ma

I want to get a value which is defined as sum(a)/sum(b); how can I do this kind of analysis? I built a cube which has sum(a) and sum(b); when I execute "select sum(a)/sum(b) from table1 group by c", the result is wrong: sum(a)/sum(b) is all 0 and sum(b)/sum(a) is all 1.

MMENE_NAME SUCC ATT SUCC/ATT
CSMME15BZX 336981 368366 1
CSMME32BZX 338754 366842 1
CSMME07BZX 687965 747694 1
CSMME03BHW 703269 747623 1
CSMME12BZX 705856 764656 1
CSMME16BHW 1962293142173 1

MMENE_NAME SUCC ATT ATT/SUCC
CSMME15BZX 336981 368366 0
CSMME32BZX 338754 366842 0
CSMME07BZX 687965 747694 0
CSMME03BHW 703269 747623 0
CSMME12BZX 705856 764656 0
CSMME16BHW 1962293142173 0
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
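The behavior described is consistent with plain integer division: when both operands of `/` are integer types, the quotient truncates toward zero, so a ratio below 1 becomes 0 and its inverse (between 1 and 2) becomes 1. A minimal illustration with values from the first row above, plus the usual workaround of promoting one operand to floating point; whether Kylin's SQL layer accepts an equivalent `CAST(... AS double)` is not confirmed here.

```java
public class IntDivisionDemo {
    public static void main(String[] args) {
        // Values from the first row of the KYLIN-1690 report.
        long succ = 336981;
        long att  = 368366;

        // Integer division truncates toward zero.
        System.out.println(succ / att); // 0  (0.914... truncated)
        System.out.println(att / succ); // 1  (1.093... truncated)

        // Workaround: promote one operand to double before dividing.
        double ratio = (double) succ / att;
        System.out.println(ratio > 0.91 && ratio < 0.92); // true
    }
}
```

In SQL the equivalent workaround would be something like `select cast(sum(a) as double) / sum(b) ...`, assuming the query engine supports the cast.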
[jira] [Commented] (KYLIN-1641) Spark - pagination
[ https://issues.apache.org/jira/browse/KYLIN-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283870#comment-15283870 ] hongbin ma commented on KYLIN-1641:
---
It seems this was posted in the wrong project.

> Spark - pagination
> --
>
> Key: KYLIN-1641
> URL: https://issues.apache.org/jira/browse/KYLIN-1641
> Project: Kylin
> Issue Type: Improvement
> Reporter: Dileep
>
> Issue: we have inserted around 10 million records into Hive and show the
> results in a web interface through a Spark dataframe. We cannot fetch all 10
> million rows and paginate in the front end, so we paginate in the Spark
> dataframe using the following approach:
> df1 = df.limit(rowsPerPage * pageNumber)
> df2 = df1.limit(rowsPerPage * (pageNumber - 1))
> df1.subtract(df2).collect()
> This works fine, but as the page number grows (toward the last page) it slows
> down and results stop coming back to the front end.
> Just want to check whether we are doing this right, or if there is another
> solution for this problem.
> Thanks
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1641) Spark - pagination
[ https://issues.apache.org/jira/browse/KYLIN-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283879#comment-15283879 ] Dileep commented on KYLIN-1641:
---
May I know which project I need to post to?

> Spark - pagination
> --
>
> Key: KYLIN-1641
> URL: https://issues.apache.org/jira/browse/KYLIN-1641
> Project: Kylin
> Issue Type: Improvement
> Reporter: Dileep
>
> Issue: we have inserted around 10 million records into Hive and show the
> results in a web interface through a Spark dataframe. We cannot fetch all 10
> million rows and paginate in the front end, so we paginate in the Spark
> dataframe using the following approach:
> df1 = df.limit(rowsPerPage * pageNumber)
> df2 = df1.limit(rowsPerPage * (pageNumber - 1))
> df1.subtract(df2).collect()
> This works fine, but as the page number grows (toward the last page) it slows
> down and results stop coming back to the front end.
> Just want to check whether we are doing this right, or if there is another
> solution for this problem.
> Thanks
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1641) Spark - pagination
[ https://issues.apache.org/jira/browse/KYLIN-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284044#comment-15284044 ] Dong Li commented on KYLIN-1641:
---
Is this a Spark issue? If so, please go to the Spark project.

> Spark - pagination
> --
>
> Key: KYLIN-1641
> URL: https://issues.apache.org/jira/browse/KYLIN-1641
> Project: Kylin
> Issue Type: Improvement
> Reporter: Dileep
>
> Issue: we have inserted around 10 million records into Hive and show the
> results in a web interface through a Spark dataframe. We cannot fetch all 10
> million rows and paginate in the front end, so we paginate in the Spark
> dataframe using the following approach:
> df1 = df.limit(rowsPerPage * pageNumber)
> df2 = df1.limit(rowsPerPage * (pageNumber - 1))
> df1.subtract(df2).collect()
> This works fine, but as the page number grows (toward the last page) it slows
> down and results stop coming back to the front end.
> Just want to check whether we are doing this right, or if there is another
> solution for this problem.
> Thanks
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
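The slowdown reported in this thread is expected with the limit/subtract scheme: producing page n forces the engine to materialize the first n * pageSize rows and then subtract, so work per page grows linearly with the page number. A plain-Java sketch (not Spark code; names are illustrative) of that access pattern and its cost:

```java
import java.util.ArrayList;
import java.util.List;

public class PaginationCostDemo {
    static long rowsScanned = 0;  // tracks how much work each page costs

    // Mimics limit(k): always scans from the start of the data.
    static List<Integer> limit(List<Integer> data, int k) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < Math.min(k, data.size()); i++) {
            out.add(data.get(i));
            rowsScanned++;
        }
        return out;
    }

    // The limit/subtract scheme from the report: page n costs O(n * pageSize)
    // scanned rows, which is why late pages get slower and slower.
    static List<Integer> page(List<Integer> data, int pageNumber, int rowsPerPage) {
        List<Integer> upToHere = limit(data, rowsPerPage * pageNumber);
        List<Integer> previous = limit(data, rowsPerPage * (pageNumber - 1));
        return upToHere.subList(previous.size(), upToHere.size());
    }
}
```

Alternatives that avoid the repeated rescan include pushing the offset into storage or ranking rows once with a window function (e.g. row_number) and filtering on the precomputed rank.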
[jira] [Commented] (KYLIN-1690) always returning 0 or 1 for sum(a)/sum(b) for integer type a and b
[ https://issues.apache.org/jira/browse/KYLIN-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284046#comment-15284046 ] Dong Li commented on KYLIN-1690: This is a duplicated JIRA, which is resolved with workaround. Link them for tracking. > always returning 0 or 1 for sum(a)/sum(b) for integer type a and b > -- > > Key: KYLIN-1690 > URL: https://issues.apache.org/jira/browse/KYLIN-1690 > Project: Kylin > Issue Type: Bug >Reporter: hongbin ma >Assignee: hongbin ma > > I want to get a value which is defined as sum(a)/sum(b), how can I do > this kind of anlysis. > Now I build a cube which have sum(a) and sum(b), when I execute “select > sum(a)/sum(b) from table1 group by c” ,the result is wrong. sum(a)/sum(b) the > result is all 0 and sum(b)/sum(a) result is all 1. > MMENE_NAMESUCC ATTSUCC/ATT > CSMME15BZX 336981 368366 1 > CSMME32BZX 338754 366842 1 > CSMME07BZX 687965 747694 1 > CSMME03BHW 703269 747623 1 > CSMME12BZX 705856 764656 1 > CSMME16BHW 1962293142173 1 >MMENE_NAME SUCC ATT ATT/SUCC > CSMME15BZX 336981 368366 0 > CSMME32BZX 338754 366842 0 > CSMME07BZX 687965 747694 0 > CSMME03BHW 703269 747623 0 > CSMME12BZX 705856 764656 0 > CSMME16BHW 1962293142173 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KYLIN-1690) always returning 0 or 1 for sum(a)/sum(b) for integer type a and b
[ https://issues.apache.org/jira/browse/KYLIN-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284046#comment-15284046 ] Dong Li edited comment on KYLIN-1690 at 5/16/16 1:13 AM: - This is a duplicated JIRA KYLIN-1630, which is resolved with workaround. Link them for tracking. was (Author: lidong_sjtu): This is a duplicated JIRA, which is resolved with workaround. Link them for tracking. > always returning 0 or 1 for sum(a)/sum(b) for integer type a and b > -- > > Key: KYLIN-1690 > URL: https://issues.apache.org/jira/browse/KYLIN-1690 > Project: Kylin > Issue Type: Bug >Reporter: hongbin ma >Assignee: hongbin ma > > I want to get a value which is defined as sum(a)/sum(b), how can I do > this kind of anlysis. > Now I build a cube which have sum(a) and sum(b), when I execute “select > sum(a)/sum(b) from table1 group by c” ,the result is wrong. sum(a)/sum(b) the > result is all 0 and sum(b)/sum(a) result is all 1. > MMENE_NAMESUCC ATTSUCC/ATT > CSMME15BZX 336981 368366 1 > CSMME32BZX 338754 366842 1 > CSMME07BZX 687965 747694 1 > CSMME03BHW 703269 747623 1 > CSMME12BZX 705856 764656 1 > CSMME16BHW 1962293142173 1 >MMENE_NAME SUCC ATT ATT/SUCC > CSMME15BZX 336981 368366 0 > CSMME32BZX 338754 366842 0 > CSMME07BZX 687965 747694 0 > CSMME03BHW 703269 747623 0 > CSMME12BZX 705856 764656 0 > CSMME16BHW 1962293142173 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (KYLIN-1690) always returning 0 or 1 for sum(a)/sum(b) for integer type a and b
[ https://issues.apache.org/jira/browse/KYLIN-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284046#comment-15284046 ] Dong Li edited comment on KYLIN-1690 at 5/16/16 1:20 AM: - There is a duplicated JIRA KYLIN-1630, which is resolved with workaround. Link them for tracking. was (Author: lidong_sjtu): This is a duplicated JIRA KYLIN-1630, which is resolved with workaround. Link them for tracking. > always returning 0 or 1 for sum(a)/sum(b) for integer type a and b > -- > > Key: KYLIN-1690 > URL: https://issues.apache.org/jira/browse/KYLIN-1690 > Project: Kylin > Issue Type: Bug >Reporter: hongbin ma >Assignee: hongbin ma > > I want to get a value which is defined as sum(a)/sum(b), how can I do > this kind of anlysis. > Now I build a cube which have sum(a) and sum(b), when I execute “select > sum(a)/sum(b) from table1 group by c” ,the result is wrong. sum(a)/sum(b) the > result is all 0 and sum(b)/sum(a) result is all 1. > MMENE_NAMESUCC ATTSUCC/ATT > CSMME15BZX 336981 368366 1 > CSMME32BZX 338754 366842 1 > CSMME07BZX 687965 747694 1 > CSMME03BHW 703269 747623 1 > CSMME12BZX 705856 764656 1 > CSMME16BHW 1962293142173 1 >MMENE_NAME SUCC ATT ATT/SUCC > CSMME15BZX 336981 368366 0 > CSMME32BZX 338754 366842 0 > CSMME07BZX 687965 747694 0 > CSMME03BHW 703269 747623 0 > CSMME12BZX 705856 764656 0 > CSMME16BHW 1962293142173 0 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1672) support kylin on cdh 5.7
[ https://issues.apache.org/jira/browse/KYLIN-1672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lingyan Jiang updated KYLIN-1672: - Attachment: (was: 0001-KYLIN-1672-support-for-kylin-on-cdh5.7.0.patch) > support kylin on cdh 5.7 > > > Key: KYLIN-1672 > URL: https://issues.apache.org/jira/browse/KYLIN-1672 > Project: Kylin > Issue Type: New Feature > Components: Environment >Reporter: Dong Li >Assignee: Lingyan Jiang >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1664) rest api '/kylin/api/admin/config' without security check
[ https://issues.apache.org/jira/browse/KYLIN-1664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284084#comment-15284084 ] Hanhui LI commented on KYLIN-1664: -- Thanks a lot It is not a good solution to set APIs to be authentication-free just for CLI tool's convenience. Are there any sensitive info for these APIs? I am not sure whether kylin configs are sensitive. But I am sure someone will think it is. :) > rest api '/kylin/api/admin/config' without security check > - > > Key: KYLIN-1664 > URL: https://issues.apache.org/jira/browse/KYLIN-1664 > Project: Kylin > Issue Type: Bug > Components: REST Service >Affects Versions: v1.5.1 > Environment: Ubuntu 14.4 > Jdk 1.7.0 > Kylin 1.5.1 binary >Reporter: Hanhui LI >Assignee: hongbin ma > Labels: test > Original Estimate: 24h > Remaining Estimate: 24h > > rest api '/kylin/api/admin/config' without security check. > Please check the follwoing: > === > GET Request: > http://127.0.0.1:7070/kylin/api/admin/config > Response: > 
{"config":"kylin.hbase.region.cut.large=50\nkylin.hbase.default.compression.codec=snappy\ndeploy.env=QA\nacl.adminRole=ROLE_ADMIN\nkylin.sandbox=true\nkylin.hdfs.working.dir=/kylin\nldap.user.searchBase=\nkylin.job.concurrent.max.limit=10\nkylin.job.remote.cli.password=\nsaml.metadata.file=classpath:sso_metadata.xml\nkylin.job.yarn.app.rest.check.interval.seconds=10\nmail.sender=\nmail.password=\nkylin.job.remote.cli.username=\nmail.username=\nsaml.context.serverPort=443\nkylin.web.help.length=4\nkylin.job.run.as.remote.cmd=false\nldap.service.searchPattern=\nkylin.web.contact_mail=\nldap.user.groupSearchBase=\nkylin.hbase.region.cut.small=5\nkylin.web.hive.limit=20\nkylin.job.mapreduce.default.reduce.input.mb=500\nkylin.job.hive.database.for.intermediatetable=default\nkylin.metadata.url=kylin_metadata@hbase\nldap.password=\nldap.username=\nkylin.storage.url=hbase\nganglia.port=8664\nldap.user.searchPattern=\nkylin.job.status.with.kerberos=false\nganglia.group=\nkylin.hbase.cluster.fs=\nacl.defaultRole=ROLE_ANALYST,ROLE_MODELER\nsaml.context.contextPath=/kylin\nmail.host=\nkylin.job.remote.cli.working.dir=/tmp/kylin\nkylin.web.diagnostic=\nsaml.context.scheme=https\nkylin.job.cubing.inmem.sampling.percent=100\nldap.service.groupSearchBase=\nsaml.metadata.entityBaseURL=https://hostname/kylin\nkylin.hbase.hfile.size.gb=5\nldap.service.searchBase=\nkylin.owner=who...@kylin.apache.org\nmail.enabled=false\nkylin.rest.servers=localhost:7070\nkylin.security.profile=testing\nkylin.job.retry=0\nsaml.context.serverName=hostname\nldap.server=ldap://ldap_server:389\nkylin.job.remote.cli.hostname=\nkylin.query.security.enabled=true\nkylin.server.mode=all\nkylin.web.help.3=onboard|Cube > Design Tutorial|\nkylin.web.help.2=tableau|Tableau > Guide|\nkylin.web.help.1=odbc|ODBC > Driver|\nkylin.hbase.region.cut.medium=10\nkylin.web.help.0=start|Getting > Started|\nkylin.web.hadoop=\nkylin.web.streaming.guide=http://kylin.apache.org/\n"} -- This message was sent by Atlassian JIRA 
(v6.3.4#6332)
[jira] [Created] (KYLIN-1691) can not load project info from hbase when startup.
Hanhui LI created KYLIN-1691: Summary: can not load project info from hbase when startup. Key: KYLIN-1691 URL: https://issues.apache.org/jira/browse/KYLIN-1691 Project: Kylin Issue Type: Bug Components: Environment Affects Versions: v1.5.1 Environment: Ubuntu 14 JDK 1.7 kylin 1.5.1 Reporter: Hanhui LI Assignee: hongbin ma can not load project info from hbase when startup if directory kylin_metadata@hbase is created in $KYLIN_HOME -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1692) kylin server hanged during startup if hbase is not up.
Hanhui LI created KYLIN-1692: Summary: kylin server hanged during startup if hbase is not up. Key: KYLIN-1692 URL: https://issues.apache.org/jira/browse/KYLIN-1692 Project: Kylin Issue Type: Bug Affects Versions: v1.5.1 Environment: Ubuntu 14 JDK 1.7 kylin 1.5.1 Reporter: Hanhui LI kylin server hanged during startup if hbase is not up. kylin can not re-connect to hbase even hbase is up later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1691) can not load project info from hbase when startup.
[ https://issues.apache.org/jira/browse/KYLIN-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanhui LI updated KYLIN-1691: - Environment: Ubuntu 14 JDK 1.7 kylin 1.5.1 - Binary Package (for running on HBase 1.x) was: Ubuntu 14 JDK 1.7 kylin 1.5.1 > can not load project info from hbase when startup. > -- > > Key: KYLIN-1691 > URL: https://issues.apache.org/jira/browse/KYLIN-1691 > Project: Kylin > Issue Type: Bug > Components: Environment >Affects Versions: v1.5.1 > Environment: Ubuntu 14 > JDK 1.7 > kylin 1.5.1 - Binary Package (for running on HBase 1.x) >Reporter: Hanhui LI >Assignee: hongbin ma > > can not load project info from hbase when startup if directory > kylin_metadata@hbase is created in $KYLIN_HOME -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1692) kylin server hanged during startup if hbase is not up.
[ https://issues.apache.org/jira/browse/KYLIN-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanhui LI updated KYLIN-1692: - Environment: Ubuntu 14 JDK 1.7 kylin 1.5.1 - Binary Package (for running on HBase 1.x) was: Ubuntu 14 JDK 1.7 kylin 1.5.1 > kylin server hanged during startup if hbase is not up. > -- > > Key: KYLIN-1692 > URL: https://issues.apache.org/jira/browse/KYLIN-1692 > Project: Kylin > Issue Type: Bug >Affects Versions: v1.5.1 > Environment: Ubuntu 14 > JDK 1.7 > kylin 1.5.1 - Binary Package (for running on HBase 1.x) >Reporter: Hanhui LI > > kylin server hanged during startup if hbase is not up. > kylin can not re-connect to hbase even hbase is up later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1693) Support multiple group-by columns for TOP_N measure
JunAn Chen created KYLIN-1693:
---
Summary: Support multiple group-by columns for TOP_N measure
Key: KYLIN-1693
URL: https://issues.apache.org/jira/browse/KYLIN-1693
Project: Kylin
Issue Type: New Feature
Components: Query Engine
Affects Versions: v1.5.1
Reporter: JunAn Chen
Assignee: liyang

For this case:
table name: "tbl"
columns: (dim_city, dim_industry, keyword, pv)
The "keyword" column has a large cardinality, about ten million. Currently I can build "top100 pv" in (dim_city) and (dim_industry), but I also want to build "top100 pv" in (dim_city, dim_industry), and "top100 pv of keyword" in (dim_city), (dim_industry) and (dim_city, dim_industry).
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-1694) make multiply coefficient configurable when estimating cuboid size
kangkaisen created KYLIN-1694:
---
Summary: make multiply coefficient configurable when estimating cuboid size
Key: KYLIN-1694
URL: https://issues.apache.org/jira/browse/KYLIN-1694
Project: Kylin
Issue Type: Bug
Components: Job Engine
Affects Versions: v1.5.1, v1.5.0
Reporter: kangkaisen
Assignee: Dong Li

In the current version of the MRv2 build engine, when estimating cuboid size in CubeStatsReader, the method is: if the cube is memory hungry, multiply the storage size estimation by 0.05; otherwise, multiply it by 0.25. This has one major problem: the default multiply coefficient is too small, which makes the estimated cuboid size much less than the actual cuboid size, so both the number of HBase regions and the number of CubeHFileJob reducers end up too small. Obviously, this makes the CubeHFileJob step much slower. After we removed the default multiply coefficient, CubeHFileJob became much faster. We'd better make the multiply coefficient configurable; that would be more friendly for users.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
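A minimal sketch of what "make the coefficient configurable" could look like. The property names and class below are hypothetical illustrations, not Kylin's actual configuration keys; the default values match the hard-coded 0.25 / 0.05 described in the report.

```java
import java.util.Properties;

public class CuboidSizeEstimator {
    // Hypothetical property names; Kylin's real keys may differ.
    static final String RATIO_KEY = "kylin.job.cuboid.size.ratio";
    static final String MEMHUNGRY_RATIO_KEY = "kylin.job.cuboid.size.memhungry.ratio";

    private final double ratio;          // default 0.25 (normal cubes)
    private final double memHungryRatio; // default 0.05 (memory-hungry cubes)

    CuboidSizeEstimator(Properties conf) {
        this.ratio = Double.parseDouble(conf.getProperty(RATIO_KEY, "0.25"));
        this.memHungryRatio = Double.parseDouble(conf.getProperty(MEMHUNGRY_RATIO_KEY, "0.05"));
    }

    // Estimated on-disk size = raw size * coefficient; a larger coefficient
    // yields more HBase regions and more CubeHFileJob reducers.
    double estimate(double rawSizeMB, boolean memoryHungry) {
        return rawSizeMB * (memoryHungry ? memHungryRatio : ratio);
    }
}
```

A user whose estimates run too low could then raise the ratio (even to 1.0, effectively removing the coefficient as the reporter did) without patching code.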
[jira] [Created] (KYLIN-1695) disable cardinality calculation job when loading hive table
kangkaisen created KYLIN-1695: - Summary: disable cardinality calculation job when loading hive table Key: KYLIN-1695 URL: https://issues.apache.org/jira/browse/KYLIN-1695 Project: Kylin Issue Type: Bug Components: Job Engine Affects Versions: v1.5.1 Reporter: kangkaisen Assignee: Dong Li When a user loads/reloads hive tables from the web console, Kylin submits an MR job asynchronously to calculate column cardinalities. This has four major problems: # the calculated cardinality is stored in table metadata, but never used in cubing/querying # the table may change after loading, so the cardinality doesn't necessarily reflect the actual value # the current `HiveColumnCardinalityJob` has many limitations, e.g., it doesn't support views # the `HiveColumnCardinalityJob` may use lots of resources when computing the cardinality of a partitioned table Due to these problems, we should disable it by default and (maybe) remove it in future releases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1319) Find a better way to check hadoop job status
[ https://issues.apache.org/jira/browse/KYLIN-1319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284137#comment-15284137 ] Zhong Yanghong commented on KYLIN-1319: --- I have tried the CI test. It seems to work well for cube building. However, tests in "ITMassInQueryTest" always fail due to other reasons. I'll fix that before merging. > Find a better way to check hadoop job status > > > Key: KYLIN-1319 > URL: https://issues.apache.org/jira/browse/KYLIN-1319 > Project: Kylin > Issue Type: Improvement >Reporter: liyang >Assignee: Zhong Yanghong > Labels: newbie > Attachments: > Find_better_way_of_checking_hadoop_job_status_by_YarnClient_master.patch, > Find_better_way_of_checking_hadoop_job_status_via_job_API_master.patch > > > Currently Kylin retrieves job status via a resource manager web service like > {code}https://:/ws/v1/cluster/apps/${job_id}?anonymous=true{code} > It is not the most robust approach. Some users do not have > "yarn.resourcemanager.webapp.address" set in yarn-site.xml, so getting the status > fails out of the box. They have to set the Kylin property > "kylin.job.yarn.app.rest.check.status.url" to work around it, which is not user > friendly. > Kerberos authentication might cause problems too if security is enabled. > Is there a more robust way to check job status? Via Job API? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
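For reference, the REST check the description criticizes amounts to building a URL of the following shape and polling it. This is only a sketch of the current behavior; the host, port, and application id below are placeholder values, not defaults from any real cluster.

```java
// Sketch of the ResourceManager REST URL Kylin currently builds to poll job status.
// rmHost/rmPort would come from yarn.resourcemanager.webapp.address (or the
// kylin.job.yarn.app.rest.check.status.url override); values here are placeholders.
public class RmStatusUrl {
    static String statusUrl(String rmHost, int rmPort, String appId) {
        return "https://" + rmHost + ":" + rmPort
                + "/ws/v1/cluster/apps/" + appId + "?anonymous=true";
    }

    public static void main(String[] args) {
        System.out.println(statusUrl("rm.example.com", 8088, "application_1463000000000_0001"));
    }
}
```

The fragility is visible in the sketch: if the webapp address is unset or the endpoint needs Kerberos, the poll fails even though the job itself is fine, which is why the attached patches explore YarnClient and the Job API instead.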
[jira] [Created] (KYLIN-1696) Have caught exception when connection issue occurs for some Broker
Zhong Yanghong created KYLIN-1696: - Summary: Have caught exception when connection issue occurs for some Broker Key: KYLIN-1696 URL: https://issues.apache.org/jira/browse/KYLIN-1696 Project: Kylin Issue Type: Bug Components: streaming Reporter: Zhong Yanghong Assignee: Zhong Yanghong 2016-05-16 01:50:24,711 ERROR [main StreamingCLI:109]: error start streaming java.lang.RuntimeException: error when get StreamingMessages at org.apache.kylin.source.kafka.KafkaStreamingInput.getBatchWithTimeWindow(KafkaStreamingInput.java:93) at org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:72) at org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:129) at org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:103) Caused by: java.nio.channels.UnresolvedAddressException at sun.nio.ch.Net.checkAddress(Net.java:127) at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644) at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57) at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44) at kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142) at kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69) at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:93) at kafka.javaapi.consumer.SimpleConsumer.send(SimpleConsumer.scala:68) at org.apache.kylin.source.kafka.util.KafkaRequester.getPartitionMetadata(KafkaRequester.java:132) at org.apache.kylin.source.kafka.util.KafkaUtils.getLeadBroker(KafkaUtils.java:53) at org.apache.kylin.source.kafka.util.KafkaUtils.getFirstAndLastOffset(KafkaUtils.java:113) at org.apache.kylin.source.kafka.util.KafkaUtils.findClosestOffsetWithDataTimestamp(KafkaUtils.java:102) at org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:141) at 
org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:104) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (KYLIN-1696) Have caught exception when connection issue occurs for some Broker
[ https://issues.apache.org/jira/browse/KYLIN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhong Yanghong updated KYLIN-1696: -- Affects Version/s: v1.4.0 v1.5.0 > Have caught exception when connection issue occurs for some Broker > -- > > Key: KYLIN-1696 > URL: https://issues.apache.org/jira/browse/KYLIN-1696 > Project: Kylin > Issue Type: Bug > Components: streaming >Affects Versions: v1.5.0, v1.4.0 >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > > 2016-05-16 01:50:24,711 ERROR [main StreamingCLI:109]: error start streaming > java.lang.RuntimeException: error when get StreamingMessages > at > org.apache.kylin.source.kafka.KafkaStreamingInput.getBatchWithTimeWindow(KafkaStreamingInput.java:93) > at > org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:72) > at > org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:129) > at > org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:103) > Caused by: java.nio.channels.UnresolvedAddressException > at sun.nio.ch.Net.checkAddress(Net.java:127) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644) > at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57) > at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44) > at > kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142) > at > kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69) > at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:93) > at kafka.javaapi.consumer.SimpleConsumer.send(SimpleConsumer.scala:68) > at > org.apache.kylin.source.kafka.util.KafkaRequester.getPartitionMetadata(KafkaRequester.java:132) > at > org.apache.kylin.source.kafka.util.KafkaUtils.getLeadBroker(KafkaUtils.java:53) > at > org.apache.kylin.source.kafka.util.KafkaUtils.getFirstAndLastOffset(KafkaUtils.java:113) > at > 
org.apache.kylin.source.kafka.util.KafkaUtils.findClosestOffsetWithDataTimestamp(KafkaUtils.java:102) > at > org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:141) > at > org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:104) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (KYLIN-1684) query on table "kylin_sales" return empty resultset after cube "kylin_sales_cube" which generated by sample.sh is ready
[ https://issues.apache.org/jira/browse/KYLIN-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284141#comment-15284141 ] wangxianbin commented on KYLIN-1684: Hi hongbin! In your commit for "KYLIN-1465", I noticed that NotEnoughGTInfoException is the only exception you try to catch in CubeStorageQuery.search, regardless of the segment record count, and CubeGridTable.newGTInfo is the only place that throws NotEnoughGTInfoException, namely when there is a dictionary mismatch between CubeManager and Cuboid (in which case dict == null). However, it seems you have since refactored this: when a dictionary is not found (dict == null) in CubeDimEncMap, FixedLenDimEnc is used instead, so I just removed the check. If there is some other runtime exception I should worry about, I may have missed it; anyway, a test is always a better choice. > query on table "kylin_sales" return empty resultset after cube > "kylin_sales_cube" which generated by sample.sh is ready > --- > > Key: KYLIN-1684 > URL: https://issues.apache.org/jira/browse/KYLIN-1684 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v1.5.1 > Environment: cluster: > hadoop-2.6.0 > hbase-0.98.8 > hive-0.14.0 >Reporter: wangxianbin >Assignee: wangxianbin > Attachments: > 1.5.1-release-hotfix-KYLIN-1684-query-on-table-kylin_sales-return-empty-r.patch, > log for Build Base Cuboid Data.png, log when run query.png > > > there is a check for "InputRecords" in CubeStorageQuery search method which > seem like unnecessary, as follow: > List scanners = Lists.newArrayList(); > for (CubeSegment cubeSeg : > cubeInstance.getSegments(SegmentStatusEnum.READY)) { > CubeSegmentScanner scanner; > if (cubeSeg.getInputRecords() == 0) { > logger.info("Skip cube segment {} because its input record is > 0", cubeSeg); > continue; > } > scanner = new CubeSegmentScanner(cubeSeg, cuboid, dimensionsD, > groupsD, metrics, filterD, !isExactAggregation); > scanners.add(scanner); > } > if 
(scanners.isEmpty()) > return ITupleIterator.EMPTY_TUPLE_ITERATOR; > return new SequentialCubeTupleIterator(scanners, cuboid, dimensionsD, > metrics, returnTupleInfo, context); > this check will cause query return empty resultset, even there is data in > storage engine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
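The effect of the questioned check can be illustrated with a minimal sketch. The classes below are simplified stand-ins, not the real CubeStorageQuery/CubeSegment API: a segment whose recorded inputRecords is 0 is silently dropped from scanning, so a query can return an empty result set even though the segment actually holds data (for example, when the statistic was never populated).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-in for the scanner-selection logic under discussion.
public class SegmentScanDemo {
    static class Segment {
        final String name;
        final long inputRecords;
        Segment(String name, long inputRecords) { this.name = name; this.inputRecords = inputRecords; }
    }

    // With skipZeroInput == true (the questioned check), segments whose
    // recorded inputRecords is 0 never get a scanner and contribute no rows.
    static List<String> pickScanners(List<Segment> segments, boolean skipZeroInput) {
        List<String> scanners = new ArrayList<>();
        for (Segment seg : segments) {
            if (skipZeroInput && seg.inputRecords == 0) continue;
            scanners.add(seg.name);
        }
        return scanners;
    }

    public static void main(String[] args) {
        List<Segment> segs = Arrays.asList(new Segment("seg1", 0), new Segment("seg2", 100));
        System.out.println(pickScanners(segs, true));   // seg1 is silently dropped
        System.out.println(pickScanners(segs, false));  // both segments are scanned
    }
}
```

Removing the check (the patch attached to the issue) corresponds to the skipZeroInput == false path: every READY segment is scanned, and an empty result comes only from genuinely empty storage.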
[jira] [Commented] (KYLIN-1696) Have caught exception when connection issue occurs for some Broker
[ https://issues.apache.org/jira/browse/KYLIN-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284142#comment-15284142 ] Zhong Yanghong commented on KYLIN-1696: --- To get the leader broker, every broker in the cluster needs to be visited. However, exceptions have been caught for those brokers with issues. > Have caught exception when connection issue occurs for some Broker > -- > > Key: KYLIN-1696 > URL: https://issues.apache.org/jira/browse/KYLIN-1696 > Project: Kylin > Issue Type: Bug > Components: streaming >Affects Versions: v1.5.0, v1.4.0 >Reporter: Zhong Yanghong >Assignee: Zhong Yanghong > > 2016-05-16 01:50:24,711 ERROR [main StreamingCLI:109]: error start streaming > java.lang.RuntimeException: error when get StreamingMessages > at > org.apache.kylin.source.kafka.KafkaStreamingInput.getBatchWithTimeWindow(KafkaStreamingInput.java:93) > at > org.apache.kylin.engine.streaming.OneOffStreamingBuilder$1.run(OneOffStreamingBuilder.java:72) > at > org.apache.kylin.engine.streaming.cli.StreamingCLI.startOneOffCubeStreaming(StreamingCLI.java:129) > at > org.apache.kylin.engine.streaming.cli.StreamingCLI.main(StreamingCLI.java:103) > Caused by: java.nio.channels.UnresolvedAddressException > at sun.nio.ch.Net.checkAddress(Net.java:127) > at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:644) > at kafka.network.BlockingChannel.connect(BlockingChannel.scala:57) > at kafka.consumer.SimpleConsumer.connect(SimpleConsumer.scala:44) > at > kafka.consumer.SimpleConsumer.getOrMakeConnection(SimpleConsumer.scala:142) > at > kafka.consumer.SimpleConsumer.kafka$consumer$SimpleConsumer$$sendRequest(SimpleConsumer.scala:69) > at kafka.consumer.SimpleConsumer.send(SimpleConsumer.scala:93) > at kafka.javaapi.consumer.SimpleConsumer.send(SimpleConsumer.scala:68) > at > org.apache.kylin.source.kafka.util.KafkaRequester.getPartitionMetadata(KafkaRequester.java:132) > at > 
org.apache.kylin.source.kafka.util.KafkaUtils.getLeadBroker(KafkaUtils.java:53) > at > org.apache.kylin.source.kafka.util.KafkaUtils.getFirstAndLastOffset(KafkaUtils.java:113) > at > org.apache.kylin.source.kafka.util.KafkaUtils.findClosestOffsetWithDataTimestamp(KafkaUtils.java:102) > at > org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:141) > at > org.apache.kylin.source.kafka.KafkaStreamingInput$StreamingMessageProducer.call(KafkaStreamingInput.java:104) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
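The pattern the comment describes, visiting every broker and treating a per-broker connection failure as non-fatal, can be sketched like this. LeadBrokerLookup and BrokerProbe are illustrative names for the sketch, not the real KafkaUtils/KafkaRequester API.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch: try each broker in turn, swallow per-broker connection
// failures (e.g. UnresolvedAddressException), and keep looking for a leader.
public class LeadBrokerLookup {
    interface BrokerProbe {
        String findLeader(String broker) throws Exception;
    }

    static String getLeadBroker(List<String> brokers, BrokerProbe probe) {
        for (String broker : brokers) {
            try {
                String leader = probe.findLeader(broker);
                if (leader != null) return leader;
            } catch (Exception e) {
                // a broker with connection issues is skipped, not fatal:
                // the lookup continues with the remaining brokers
            }
        }
        return null; // no reachable broker knew a leader
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("bad-host:9092", "good-host:9092");
        String leader = getLeadBroker(brokers, b -> {
            if (b.startsWith("bad")) throw new java.nio.channels.UnresolvedAddressException();
            return b;
        });
        System.out.println(leader); // the unreachable broker was skipped
    }
}
```

Without the per-broker catch, the first unresolvable hostname would abort the whole lookup with the UnresolvedAddressException shown in the stack trace above.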
[jira] [Commented] (KYLIN-1684) query on table "kylin_sales" return empty resultset after cube "kylin_sales_cube" which generated by sample.sh is ready
[ https://issues.apache.org/jira/browse/KYLIN-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15284177#comment-15284177 ] wangxianbin commented on KYLIN-1684: Hi! [~mahongbin], I have tried it on a 0-record segment; it works well! > query on table "kylin_sales" return empty resultset after cube > "kylin_sales_cube" which generated by sample.sh is ready > --- > > Key: KYLIN-1684 > URL: https://issues.apache.org/jira/browse/KYLIN-1684 > Project: Kylin > Issue Type: Bug > Components: Query Engine >Affects Versions: v1.5.1 > Environment: cluster: > hadoop-2.6.0 > hbase-0.98.8 > hive-0.14.0 >Reporter: wangxianbin >Assignee: wangxianbin > Attachments: > 1.5.1-release-hotfix-KYLIN-1684-query-on-table-kylin_sales-return-empty-r.patch, > log for Build Base Cuboid Data.png, log when run query.png > > > there is a check for "InputRecords" in CubeStorageQuery search method which > seem like unnecessary, as follow: > List scanners = Lists.newArrayList(); > for (CubeSegment cubeSeg : > cubeInstance.getSegments(SegmentStatusEnum.READY)) { > CubeSegmentScanner scanner; > if (cubeSeg.getInputRecords() == 0) { > logger.info("Skip cube segment {} because its input record is > 0", cubeSeg); > continue; > } > scanner = new CubeSegmentScanner(cubeSeg, cuboid, dimensionsD, > groupsD, metrics, filterD, !isExactAggregation); > scanners.add(scanner); > } > if (scanners.isEmpty()) > return ITupleIterator.EMPTY_TUPLE_ITERATOR; > return new SequentialCubeTupleIterator(scanners, cuboid, dimensionsD, > metrics, returnTupleInfo, context); > this check will cause query return empty resultset, even there is data in > storage engine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)