[jira] [Assigned] (KYLIN-4372) Docker entrypoint delete file too later cause ZK started by HBase crash

2020-02-10 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4372:
-

Assignee: weibin0516

> Docker entrypoint delete file too later cause ZK started by HBase crash
> ---
>
> Key: KYLIN-4372
> URL: https://issues.apache.org/jira/browse/KYLIN-4372
> Project: Kylin
>  Issue Type: Bug
>  Components: Others
>Affects Versions: v3.0.0-alpha2
>Reporter: Yue Zhang
>Assignee: weibin0516
>Priority: Critical
>
> In docker/entrypoint.sh
>  
> {code:bash}
> # start hbase
> $HBASE_HOME/bin/start-hbase.sh
> # start kafka
> rm -rf /tmp/kafka-logs
> rm -rf /data/zookeeper/*
> nohup $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties &
> {code}
> rm -rf /data/zookeeper/* should run before starting HBase, not before starting Kafka.
> Executing it after HBase has started causes the ZooKeeper instance started by HBase to crash.
> The crash log from /home/admin/hbase-1.1.2/logs/hbase--master-9aef5f427eb6.log:
> {code}
> 2020-02-10 09:25:56,402 INFO [SyncThread:0] persistence.FileTxnLog: Creating new log file: log.1
> 2020-02-10 09:25:56,402 ERROR [SyncThread:0] server.SyncRequestProcessor: Severe unrecoverable error, exiting
> java.io.FileNotFoundException: /data/zookeeper/zookeeper_0/version-2/log.1 (No such file or directory)
>     at java.io.FileOutputStream.open0(Native Method)
>     at java.io.FileOutputStream.open(FileOutputStream.java:270)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
>     at org.apache.zookeeper.server.persistence.FileTxnLog.append(FileTxnLog.java:205)
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.append(FileTxnSnapLog.java:314)
>     at org.apache.zookeeper.server.ZKDatabase.append(ZKDatabase.java:476)
>     at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:140)
> {code}
> I think the shell script should look like this:
> {code:bash}
> # start hbase
> rm -rf /data/zookeeper/*
> $HBASE_HOME/bin/start-hbase.sh
> # start kafka
> rm -rf /tmp/kafka-logs
> nohup $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties &
> {code}
>  
>  
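The failure mode quoted above is easy to reproduce outside Docker and HBase: once a process has created its transaction-log directory, deleting that directory makes the next file creation fail with the same FileNotFoundException. A minimal, self-contained Java sketch (the temp directory is a stand-in for ZooKeeper's version-2 data directory; the method name is illustrative):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;

public class DeleteAfterStart {
    // Try to create a transaction-log file inside dir, the way ZK's
    // SyncRequestProcessor creates log.1; report whether the create failed.
    static boolean writeFails(File dir) {
        File log = new File(dir, "log.1");
        try (FileOutputStream out = new FileOutputStream(log)) {
            // created fine; nothing to write for this demo
        } catch (IOException e) {
            return true; // java.io.FileNotFoundException: parent dir is gone
        }
        log.delete(); // clean up so the directory can be removed below
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for /data/zookeeper/zookeeper_0/version-2
        File dir = Files.createTempDirectory("version-2").toFile();
        System.out.println(writeFails(dir)); // false: directory exists

        dir.delete(); // the entrypoint's late rm -rf, after "ZK" has started
        System.out.println(writeFails(dir)); // true: same failure as the HBase log
    }
}
```

This is why the cleanup must run before start-hbase.sh: ZooKeeper creates its data directory at startup and assumes it stays in place afterwards.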



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-4372) Docker entrypoint delete file too later cause ZK started by HBase crash

2020-02-10 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033561#comment-17033561
 ] 

weibin0516 commented on KYLIN-4372:
---

[~cijianzy], OK. If you don't mind, I can submit a PR to fix this bug.



[jira] [Commented] (KYLIN-4372) Docker entrypoint delete file too later cause ZK started by HBase crash

2020-02-10 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17033479#comment-17033479
 ] 

weibin0516 commented on KYLIN-4372:
---

Hi [~cijianzy], did you encounter this error after restarting the container?



[jira] [Updated] (KYLIN-4340) Fix bug of get value of isSparkFactDistinctEnable for cube not correct

2020-02-08 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4340:
--
Summary: Fix bug of get value of isSparkFactDistinctEnable for cube not 
correct  (was: Cube Configuration Overwrites not effective)

> Fix bug of get value of isSparkFactDistinctEnable for cube not correct
> --
>
> Key: KYLIN-4340
> URL: https://issues.apache.org/jira/browse/KYLIN-4340
> Project: Kylin
>  Issue Type: Bug
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Attachments: image-2020-01-13-23-20-23-476.png
>
>
> In kylin.properties, 
> {code:java}
> kylin.engine.spark-fact-distinct=true
> {code}
>  !image-2020-01-13-23-20-23-476.png! 
> Setting this config to false at the cube level is not effective when building the cube.
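If the root cause is that the engine reads only the global kylin.properties value, the intended lookup order can be sketched as below. The method and map names are hypothetical stand-ins, not Kylin's actual API; the point is that a cube-level override must take precedence over kylin.properties:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigOverride {
    // Hypothetical lookup: cube-level override wins, then the global
    // kylin.properties value, then a hard default of false.
    static boolean isSparkFactDistinctEnabled(Map<String, String> globalProps,
                                              Map<String, String> cubeOverrides) {
        String key = "kylin.engine.spark-fact-distinct";
        String v = cubeOverrides.getOrDefault(key, globalProps.getOrDefault(key, "false"));
        return Boolean.parseBoolean(v);
    }

    public static void main(String[] args) {
        Map<String, String> global = new HashMap<>();
        global.put("kylin.engine.spark-fact-distinct", "true");
        Map<String, String> cube = new HashMap<>();
        cube.put("kylin.engine.spark-fact-distinct", "false");
        // The reported bug: reading only the global config ignores the
        // cube-level "false". With the override honored, this prints false.
        System.out.println(isSparkFactDistinctEnabled(global, cube));
    }
}
```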





[jira] [Commented] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-29 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025963#comment-17025963
 ] 

weibin0516 commented on KYLIN-4361:
---

OK. If there are any results, please let me know.

> Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with 
> Sqoop.
> ---
>
> Key: KYLIN-4361
> URL: https://issues.apache.org/jira/browse/KYLIN-4361
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
> Environment: HDP3.1
>Reporter: Sonu Singh
>Assignee: weibin0516
>Priority: Blocker
> Fix For: v3.0.0
>
> Attachments: image-2020-01-28-11-39-25-860.png
>
>
> I am trying to submit a job with JDBC data sources and am getting a NullPointerException because of the code below:
> File Path - 
> \kylin\source-jdbc\src\main\java\org\apache\kylin\source\jdbc\JdbcHiveInputBase.java
> method - createSqoopToFlatHiveStep
> {code:java}
> String partCol = null;
> if (partitionDesc.isPartitioned()) {
>     partCol = partitionDesc.getPartitionDateColumn(); // tablename.colname
> }
> {code}
> For non-partitioned cubes, the value of partCol will always be null, causing an exception in the method below:
> {code:java}
> static String quoteIdentifier(String identifier, SourceDialect dialect) {
>     if (KylinConfig.getInstanceFromEnv().enableHiveDdlQuote()) {
>         String[] identifierArray = identifier.split("\\.");
> {code}
> Environment Detail -
> HDP3.1
>  
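One plausible guard is to return early when the identifier is null (the non-partitioned case) instead of calling split() on it. The sketch below is an illustrative standalone method, not the actual Kylin fix; the dialect parameter is omitted and the backtick quoting is an assumption:

```java
public class QuoteIdentifierGuard {
    // Quote a dotted identifier like tablename.colname as `tablename`.`colname`.
    // A null identifier (non-partitioned cube: no partition column) is passed
    // through untouched rather than dereferenced, avoiding the reported NPE.
    static String quoteIdentifier(String identifier) {
        if (identifier == null || identifier.isEmpty()) {
            return identifier; // nothing to quote
        }
        String[] parts = identifier.split("\\.");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) sb.append('.');
            sb.append('`').append(parts[i]).append('`');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(quoteIdentifier("tablename.colname")); // `tablename`.`colname`
        System.out.println(quoteIdentifier(null));                // null, no NPE
    }
}
```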





[jira] [Commented] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-29 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025670#comment-17025670
 ] 

weibin0516 commented on KYLIN-4361:
---

OK. What is your JDBC database, MySQL or something else? Please provide the DDL of the table (using show create table {tableName}) for verification.



[jira] [Commented] (KYLIN-4362) Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.

2020-01-28 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025616#comment-17025616
 ] 

weibin0516 commented on KYLIN-4362:
---

OK ~

> Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
> --
>
> Key: KYLIN-4362
> URL: https://issues.apache.org/jira/browse/KYLIN-4362
> Project: Kylin
>  Issue Type: Bug
>Reporter: Sonu Singh
>Priority: Blocker
> Fix For: v3.0.0
>
> Attachments: image-2020-01-28-11-39-59-098.png
>
>
> MR and Spark jobs are failing on HDP3.1 with the error below:
> {code}
> -00 execute finished with exception
> java.io.IOException: OS command error exit with return code: 1, error message: Warning: /usr/hdp/3.0.1.0-187/accumulo does not exist! Accumulo imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 20/01/27 17:09:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7.3.0.1.0-187
> Missing argument for option: split-by
> The command is:
> /usr/hdp/current/sqoop-client/bin/sqoop import -Dorg.apache.sqoop.splitter.allow_text_splitter=true -Dmapreduce.job.queuename=default --connect "jdbc:vdb://XX.XX.XX.XX:XX/X" --driver com..XX.jdbc.Driver --username X --password "XXX" --query "SELECT \`sales\`.\`locationdimensionid\` as \`SALES_LOCATIONDIMENSIONID\` ,\`sales\`.\`storeitemdimensionid\` as \`SALES_STOREITEMDIMENSIONID\` ,\`sales\`.\`basecostperunit\` as \`SALES_BASECOSTPERUNIT\` ,\`sales\`.\`createdby\` as \`SALES_CREATEDBY\` ,\`sales\`.\`updateddate\` as \`SALES_UPDATEDDATE\` FROM \`\`.\`sales\` \`sales\` WHERE 1=1 AND \$CONDITIONS" --target-dir hdfs://XX-master:8020/apps/XXX/XXX/kylin-4f367799-4993-bb67-da69-a9a147c62a1e/kylin_intermediate_cube_11_27012020_1d0a2dfd_bd66_d3e3_304b_9cd7f2018dbc --split-by --boundary-query "SELECT min(\`\`), max(\`\`) FROM \`XX\`.\`sales\` " --null-string '\\N' --null-non-string '\\N' --fields-terminated-by '|' --num-mappers 4
> at org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:88)
> at org.apache.kylin.source.jdbc.CmdStep.sqoopFlatHiveTable(CmdStep.java:43)
> at org.apache.kylin.source.jdbc.CmdStep.doWork(CmdStep.java:54)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:62)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:106)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2020-01-27 17:09:19,362 INFO [Scheduler 1642300543 Job 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.ExecutableManager:466 : job id:4f367799-4993-bb67-da69-a9a147c62a1e-00 from RUNNING to ERROR
> 2020-01-27 17:09:19,365 ERROR [Scheduler 1642300543 Job 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.AbstractExecutable:173 : error running Executable: CubingJob{id=4f367799-4993-bb67-da69-a9a147c62a1e, name=BUILD CUBE - cube_11_27012020 - FULL_BUILD - UTC 2020-01-27 17:09:00, state=RUNNING}
> 2020-01-27 17:09:19,372 DEBUG [pool-7-thread-1] cachesync.Broadcaster:111 : Servers in the cluster: [localhost:7070]
> 2020-01-27 17:09:19,373 DEBUG [pool-7-thread-1] cachesync.Broadcaster:121 : Announcing new bro
> {code}
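The command above shows --split-by immediately followed by --boundary-query, i.e. the split column was empty when the arguments were assembled, which is what triggers "Missing argument for option: split-by". A hedged sketch of defensive argument construction (class and method names are hypothetical, not Kylin's actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class SqoopArgs {
    // Only append --split-by when a split column exists; emitting the flag
    // with no value reproduces "Missing argument for option: split-by".
    static List<String> importArgs(String query, String targetDir, String splitBy) {
        List<String> args = new ArrayList<>();
        args.add("import");
        args.add("--query");      args.add(query);
        args.add("--target-dir"); args.add(targetDir);
        if (splitBy != null && !splitBy.isEmpty()) {
            args.add("--split-by"); args.add(splitBy);
        } else {
            // No split column (e.g. a non-partitioned cube): fall back to a
            // single mapper so Sqoop does not need to split at all.
            args.add("--num-mappers"); args.add("1");
        }
        return args;
    }

    public static void main(String[] args) {
        System.out.println(importArgs("SELECT ...", "hdfs://...", null));
    }
}
```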





[jira] [Assigned] (KYLIN-4362) Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.

2020-01-28 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4362:
-

Assignee: weibin0516



[jira] [Commented] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-28 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025606#comment-17025606
 ] 

weibin0516 commented on KYLIN-4361:
---

Hi Singh, this is indeed a bug in the old version, but the offending code has changed in the latest code on the master branch. Can you try testing with the latest master code? From the code logic, I think it will still report an error.




[jira] [Issue Comment Deleted] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-28 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4361:
--
Comment: was deleted

(was: I think this is a bug. When the jdbc table is not partitioned, the 
corresponding partition identifier is null, but splitting the null partition 
identifier will cause this exception. I will mention a pr to fix the bug.)



[jira] [Commented] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-28 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17025588#comment-17025588
 ] 

weibin0516 commented on KYLIN-4361:
---

I think this is a bug. When the JDBC table is not partitioned, the corresponding partition identifier is null, and splitting the null partition identifier causes this exception. I will submit a PR to fix the bug.



[jira] [Assigned] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-28 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4361:
-

Assignee: weibin0516



[jira] [Commented] (KYLIN-4361) Kylin 3.0.0 Release - Not able to submit job with JDBC Data Sources with Sqoop.

2020-01-27 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024834#comment-17024834
 ] 

weibin0516 commented on KYLIN-4361:
---

Can you show the full error stack trace? That will help find the cause of the error.



[jira] [Commented] (KYLIN-4362) Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.

2020-01-27 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024833#comment-17024833
 ] 

weibin0516 commented on KYLIN-4362:
---

Hi, the solution in this link may solve your problem: 
https://community.cloudera.com/t5/Support-Questions/Warning-usr-lib-sqoop-accumulo-does-not-exist-Accumulo/td-p/22304

> Kylin 3.0.0 Release: MR & Spark Job is failing with JDBC connection and Sqoop.
> --
>
> Key: KYLIN-4362
> URL: https://issues.apache.org/jira/browse/KYLIN-4362
> Project: Kylin
>  Issue Type: Bug
>Reporter: Sonu Singh
>Priority: Blocker
> Fix For: v3.0.0
>
>
> MR and SPark job are failing on HDP3.1 with below error:
> -00 execute finished with exception
> java.io.IOException: OS command error exit with return code: 1, error 
> message: Warning: /usr/hdp/3.0.1.0-187/accumulo does not exist! Accumulo 
> imports will fail.
> Please set $ACCUMULO_HOME to the root of your Accumulo installation.
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.0.1.0-187/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.0.1.0-187/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 20/01/27 17:09:19 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7.3.0.1.0-187
> Missing argument for option: split-by
> The command is:
> /usr/hdp/current/sqoop-client/bin/sqoop import 
> -Dorg.apache.sqoop.splitter.allow_text_splitter=true 
> -Dmapreduce.job.queuename=default --connect "jdbc:vdb:/ 
> /XX.XX.XX.XX:XX/X" --driver com..XX.jdbc.Driver --username X 
> --password "XXX" --query "SELECT \`sales\`.\`locationdim ensionid\` as 
> \`SALES_LOCATIONDIMENSIONID\` ,\`sales\`.\`storeitemdimensionid\` as 
> \`SALES_STOREITEMDIMENSIONID\` ,\`sales\`.\`basecostperunit\` as \`SALES_ 
> BASECOSTPERUNIT\` ,\`sales\`.\`createdby\` as \`SALES_CREATEDBY\` 
> ,\`sales\`.\`updateddate\` as \`SALES_UPDATEDDATE\` FROM \`\`.\`sales\` 
> \`sale s\` WHERE 1=1 AND \$CONDITIONS" --target-dir 
> hdfs://XX-master:8020/apps/XXX/XXX/kylin-4f367799-4993-bb67-da69-a9a147c62a1e/kylin_intermediate_cube_11_2701
>  2020_1d0a2dfd_bd66_d3e3_304b_9cd7f2018dbc --split-by --boundary-query 
> "SELECT min(\`\`), max(\`\`) FROM \`XX\`.\`sales\` " --null-string '\\N' 
> --n ull-non-string '\\N' --fields-terminated-by '|' --num-mappers 4
> at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:88)
> at org.apache.kylin.source.jdbc.CmdStep.sqoopFlatHiveTable(CmdStep.java:43)
> at org.apache.kylin.source.jdbc.CmdStep.doWork(CmdStep.java:54)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171)
> at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:62)
> at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:171)
> at 
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:106)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2020-01-27 17:09:19,362 INFO [Scheduler 1642300543 Job 
> 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.ExecutableManager:466 : 
> job id:4f367799-4993-bb67-da69-a9a147c62a1e-00 from RUNNING to ERROR
> 2020-01-27 17:09:19,365 ERROR [Scheduler 1642300543 Job 
> 4f367799-4993-bb67-da69-a9a147c62a1e-160] execution.AbstractExecutable:173 : 
> error running Executable: 
> CubingJob\{id=4f367799-4993-bb67-da69-a9a147c62a1e, name=BUILD CUBE - 
> cube_11_27012020 - FULL_BUILD - UTC 2020-01-27 17:09:00, state=RUNNING}
> 2020-01-27 17:09:19,372 DEBUG [pool-7-thread-1] cachesync.Broadcaster:111 : 
> Servers in the cluster: [localhost:7070]
> 2020-01-27 17:09:19,373 DEBUG [pool-7-thread-1] cachesync.Broadcaster:121 : 
> Announcing new bro
>  
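The Sqoop failure above ("Missing argument for option: split-by") comes from a `--split-by` flag that is immediately followed by the next option (`--boundary-query`) instead of a column name, because the split-by column resolved to an empty string. A minimal shell sketch of a guard against emitting such a command — the function and column names here are hypothetical, not taken from Kylin's actual job builder:

```shell
#!/bin/sh
# Build the "--split-by <column>" fragment of a sqoop command line,
# refusing an empty column. The column name below is a hypothetical
# stand-in; the real one was elided in the log above.
build_split_by() {
  col="$1"
  if [ -z "$col" ]; then
    # An empty column reproduces the failure mode above, where
    # "--split-by" consumed "--boundary-query" as its argument.
    echo "ERROR: split-by column is empty" >&2
    return 1
  fi
  echo "--split-by $col"
}

build_split_by "locationdimensionid"
```

A real fix would validate the column inside the command builder before invoking sqoop at all, rather than letting the CLI fail late.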



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-4350) Pushdown improperly rewrites the query causing it to fail

2020-01-19 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019162#comment-17019162
 ] 

weibin0516 commented on KYLIN-4350:
---

I verified with v3.0.0 and found no such problem.

> Pushdown improperly rewrites the query causing it to fail
> -
>
> Key: KYLIN-4350
> URL: https://issues.apache.org/jira/browse/KYLIN-4350
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.4
> Environment: HDP 2.6.5, Kylin 2.6.4, CentOS 7.6
>Reporter: Vsevolod Ostapenko
>Priority: Major
>
> A query that uses a WITH clause and is subject to pushdown to Hive (or 
> Impala) for execution is incorrectly rewritten before being submitted to the 
> execution engine. Table aliases are prefixed with the database name, which 
> makes the query invalid.
> Sample log excerpts are below:
>  
> {quote}2020-01-17 12:12:21,997 INFO [Query 
> e844b846-c589-4729-5a04-483f6d73c834-31163] service.QueryService:404 : The 
> original query: with
> t as
> (
> SELECT ZETTICSDW.A_VL_HOURLY_V.IMSIID "ZETTICSDW_A_VL_HOURLY_V_IMSIID",
>  ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID 
> "ZETTICSDW_A_VL_HOURLY_V_MEDIA_GAP_CALL_ID",
>  count(*) cnt
> FROM ZETTICSDW.A_VL_HOURLY_V
> WHERE ((ZETTICSDW.A_VL_HOURLY_V.THEDATE = '20200117')
>  AND ((ZETTICSDW.A_VL_HOURLY_V.THEHOUR >= '10')
>  AND (ZETTICSDW.A_VL_HOURLY_V.THEHOUR <= '10')))
> GROUP BY ZETTICSDW.A_VL_HOURLY_V.IMSIID, 
> ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
> )
> select t.ZETTICSDW_A_VL_HOURLY_V_IMSIID,
>  count(*) "vl_aggs_model___CD_MEDIA_GAP_CALL_ID"
> *from t*
> group by t.ZETTICSDW_A_VL_HOURLY_V_IMSIID
> ORDER BY "vl_aggs_model___CD_MEDIA_GAP_CALL_ID" desc
> LIMIT 500
> 
> 2020-01-17 12:12:22,073 INFO [Query 
> e844b846-c589-4729-5a04-483f6d73c834-31163] 
> adhocquery.AbstractPushdownRunner:37 : the query is converted to with
> t as
> (
> SELECT ZETTICSDW.A_VL_HOURLY_V.IMSIID `ZETTICSDW_A_VL_HOURLY_V_IMSIID`,
>  ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID 
> `ZETTICSDW_A_VL_HOURLY_V_MEDIA_GAP_CALL_ID`,
>  count(*) cnt
> FROM ZETTICSDW.A_VL_HOURLY_V
> WHERE ((ZETTICSDW.A_VL_HOURLY_V.THEDATE = '20200117')
>  AND ((ZETTICSDW.A_VL_HOURLY_V.THEHOUR >= '10')
>  AND (ZETTICSDW.A_VL_HOURLY_V.THEHOUR <= '10')))
> GROUP BY ZETTICSDW.A_VL_HOURLY_V.IMSIID, 
> ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
> )
> select t.ZETTICSDW_A_VL_HOURLY_V_IMSIID,
>  count(*) `vl_aggs_model___CD_MEDIA_GAP_CALL_ID`
> *{color:#FF}from ZETTICSDW.t{color}*
> group by t.ZETTICSDW_A_VL_HOURLY_V_IMSIID
> ORDER BY `vl_aggs_model___CD_MEDIA_GAP_CALL_ID` desc
> LIMIT 500 after applying converter 
> org.apache.kylin.source.adhocquery.HivePushDownConverter
> 2020-01-17 12:12:22,108 ERROR [Query 
> e844b846-c589-4729-5a04-483f6d73c834-31163] service.QueryService:989 : 
> pushdown engine failed current query too
> org.apache.hive.service.cli.HiveSQLException: AnalysisException: Could not 
> resolve table reference: '*zetticsdw.t*'
> {quote}
> The pushdown query should be submitted to the query engine exactly as written 
> by the user. As a best effort, the Kylin pushdown executor should issue "use " 
> over the same JDBC connection right before submitting the query.





[jira] [Commented] (KYLIN-4350) Pushdown improperly rewrites the query causing it to fail

2020-01-19 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019161#comment-17019161
 ] 

weibin0516 commented on KYLIN-4350:
---

Hi [~seva_ostapenko], not all databases support "use database"; PostgreSQL, for example, does not.
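The two comments above can be combined into a dialect-aware variant of the suggestion: prefix a USE statement only for engines that support it, and never rewrite the user's query. A plain-Java sketch — the class and method names are hypothetical, not Kylin's actual pushdown API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PushdownStatements {
    // Returns the statements to run, in order, over one JDBC connection:
    // for dialects that support USE, switch the default database first and
    // then submit the user's query untouched; otherwise submit the query alone
    // (PostgreSQL, for instance, selects the database at connection time).
    static List<String> statementsFor(String dialect, String database, String userQuery) {
        List<String> stmts = new ArrayList<>();
        if (Arrays.asList("hive", "impala", "mysql").contains(dialect)) {
            stmts.add("USE " + database);
        }
        stmts.add(userQuery); // never rewritten
        return stmts;
    }

    public static void main(String[] args) {
        System.out.println(statementsFor("hive", "zetticsdw", "select 1"));
        System.out.println(statementsFor("postgresql", "zetticsdw", "select 1"));
    }
}
```

Both statements must go over the same connection, since USE only affects the session that issued it.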

> Pushdown improperly rewrites the query causing it to fail
> -
>
> Key: KYLIN-4350
> URL: https://issues.apache.org/jira/browse/KYLIN-4350
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.4
> Environment: HDP 2.6.5, Kylin 2.6.4, CentOS 7.6
>Reporter: Vsevolod Ostapenko
>Priority: Major
>
> A query that uses a WITH clause and is subject to pushdown to Hive (or 
> Impala) for execution is incorrectly rewritten before being submitted to the 
> execution engine. Table aliases are prefixed with the database name, which 
> makes the query invalid.
> Sample log excerpts are below:
>  
> {quote}2020-01-17 12:12:21,997 INFO [Query 
> e844b846-c589-4729-5a04-483f6d73c834-31163] service.QueryService:404 : The 
> original query: with
> t as
> (
> SELECT ZETTICSDW.A_VL_HOURLY_V.IMSIID "ZETTICSDW_A_VL_HOURLY_V_IMSIID",
>  ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID 
> "ZETTICSDW_A_VL_HOURLY_V_MEDIA_GAP_CALL_ID",
>  count(*) cnt
> FROM ZETTICSDW.A_VL_HOURLY_V
> WHERE ((ZETTICSDW.A_VL_HOURLY_V.THEDATE = '20200117')
>  AND ((ZETTICSDW.A_VL_HOURLY_V.THEHOUR >= '10')
>  AND (ZETTICSDW.A_VL_HOURLY_V.THEHOUR <= '10')))
> GROUP BY ZETTICSDW.A_VL_HOURLY_V.IMSIID, 
> ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
> )
> select t.ZETTICSDW_A_VL_HOURLY_V_IMSIID,
>  count(*) "vl_aggs_model___CD_MEDIA_GAP_CALL_ID"
> *from t*
> group by t.ZETTICSDW_A_VL_HOURLY_V_IMSIID
> ORDER BY "vl_aggs_model___CD_MEDIA_GAP_CALL_ID" desc
> LIMIT 500
> 
> 2020-01-17 12:12:22,073 INFO [Query 
> e844b846-c589-4729-5a04-483f6d73c834-31163] 
> adhocquery.AbstractPushdownRunner:37 : the query is converted to with
> t as
> (
> SELECT ZETTICSDW.A_VL_HOURLY_V.IMSIID `ZETTICSDW_A_VL_HOURLY_V_IMSIID`,
>  ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID 
> `ZETTICSDW_A_VL_HOURLY_V_MEDIA_GAP_CALL_ID`,
>  count(*) cnt
> FROM ZETTICSDW.A_VL_HOURLY_V
> WHERE ((ZETTICSDW.A_VL_HOURLY_V.THEDATE = '20200117')
>  AND ((ZETTICSDW.A_VL_HOURLY_V.THEHOUR >= '10')
>  AND (ZETTICSDW.A_VL_HOURLY_V.THEHOUR <= '10')))
> GROUP BY ZETTICSDW.A_VL_HOURLY_V.IMSIID, 
> ZETTICSDW.A_VL_HOURLY_V.MEDIA_GAP_CALL_ID
> )
> select t.ZETTICSDW_A_VL_HOURLY_V_IMSIID,
>  count(*) `vl_aggs_model___CD_MEDIA_GAP_CALL_ID`
> *{color:#FF}from ZETTICSDW.t{color}*
> group by t.ZETTICSDW_A_VL_HOURLY_V_IMSIID
> ORDER BY `vl_aggs_model___CD_MEDIA_GAP_CALL_ID` desc
> LIMIT 500 after applying converter 
> org.apache.kylin.source.adhocquery.HivePushDownConverter
> 2020-01-17 12:12:22,108 ERROR [Query 
> e844b846-c589-4729-5a04-483f6d73c834-31163] service.QueryService:989 : 
> pushdown engine failed current query too
> org.apache.hive.service.cli.HiveSQLException: AnalysisException: Could not 
> resolve table reference: '*zetticsdw.t*'
> {quote}
> The pushdown query should be submitted to the query engine exactly as written 
> by the user. As a best effort, the Kylin pushdown executor should issue "use " 
> over the same JDBC connection right before submitting the query.





[jira] [Updated] (KYLIN-4349) Close InputStream in RowRecordReader.initReaders()

2020-01-15 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4349:
--
 Attachment: image-2020-01-16-10-44-40-119.png
Description: 
Some InputStreams are not closed properly.
 !image-2020-01-16-10-44-40-119.png! 

> Close InputStream in RowRecordReader.initReaders()
> --
>
> Key: KYLIN-4349
> URL: https://issues.apache.org/jira/browse/KYLIN-4349
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Attachments: image-2020-01-16-10-44-40-119.png
>
>
> Some InputStreams are not closed properly.
>  !image-2020-01-16-10-44-40-119.png! 





[jira] [Created] (KYLIN-4349) Close InputStream in RowRecordReader.initReaders()

2020-01-15 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4349:
-

 Summary: Close InputStream in RowRecordReader.initReaders()
 Key: KYLIN-4349
 URL: https://issues.apache.org/jira/browse/KYLIN-4349
 Project: Kylin
  Issue Type: Bug
Affects Versions: v3.0.0
Reporter: weibin0516
Assignee: weibin0516








[jira] [Closed] (KYLIN-4339) Extract Fact Table Distinct Columns fail due to no kylin installed on worker node

2020-01-13 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 closed KYLIN-4339.
-
Resolution: Duplicate

> Extract Fact Table Distinct Columns fail due to no kylin installed on worker 
> node
> -
>
> Key: KYLIN-4339
> URL: https://issues.apache.org/jira/browse/KYLIN-4339
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> After setting kylin.engine.spark-fact-distinct to true, building a cube with 
> the Spark engine fails; the error message is as follows:
> {code:java}
> 2020-01-13 22:19:23 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
> 2020-01-13 22:19:23 INFO  
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - 
> OutputCommitCoordinator stopped!
> 2020-01-13 22:19:23 INFO  SparkContext:54 - Successfully stopped SparkContext
> Exception in thread "main" java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Job aborted due 
> to stage failure: Task 9 in stage 1.0 failed 4 times, most recent failure: 
> Lost task 9.3 in stage 1.0 (TID 32, 
> sql-gateway-eu95-17.gz00c.test.alipay.net, executor 7): 
> org.apache.kylin.common.KylinConfigCannotInitException: Didn't find 
> KYLIN_CONF or KYLIN_HOME, please set one of them
>   at 
> org.apache.kylin.common.KylinConfig.getSitePropertiesFile(KylinConfig.java:336)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteOrderedProps(KylinConfig.java:378)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteProperties(KylinConfig.java:358)
>   at 
> org.apache.kylin.common.KylinConfig.getInstanceFromEnv(KylinConfig.java:137)
>   at 
> org.apache.kylin.dict.CacheDictionary.enableCache(CacheDictionary.java:105)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.initForestCache(TrieDictionaryForest.java:394)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.init(TrieDictionaryForest.java:77)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.(TrieDictionaryForest.java:67)
>   at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.build(TrieDictionaryForestBuilder.java:114)
>   at 
> org.apache.kylin.dict.DictionaryGenerator$NumberTrieDictForestBuilder.build(DictionaryGenerator.java:312)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:774)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:650)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:109)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:756)
> {code}
> We should put kylin.properties into the execution environment of the Spark 
> application (via --files) to fix this problem.





[jira] [Commented] (KYLIN-4339) Extract Fact Table Distinct Columns fail due to no kylin installed on worker node

2020-01-13 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014893#comment-17014893
 ] 

weibin0516 commented on KYLIN-4339:
---

OK, I will close the Jira.

> Extract Fact Table Distinct Columns fail due to no kylin installed on worker 
> node
> -
>
> Key: KYLIN-4339
> URL: https://issues.apache.org/jira/browse/KYLIN-4339
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> After setting kylin.engine.spark-fact-distinct to true, building a cube with 
> the Spark engine fails; the error message is as follows:
> {code:java}
> 2020-01-13 22:19:23 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
> 2020-01-13 22:19:23 INFO  
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - 
> OutputCommitCoordinator stopped!
> 2020-01-13 22:19:23 INFO  SparkContext:54 - Successfully stopped SparkContext
> Exception in thread "main" java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Job aborted due 
> to stage failure: Task 9 in stage 1.0 failed 4 times, most recent failure: 
> Lost task 9.3 in stage 1.0 (TID 32, 
> sql-gateway-eu95-17.gz00c.test.alipay.net, executor 7): 
> org.apache.kylin.common.KylinConfigCannotInitException: Didn't find 
> KYLIN_CONF or KYLIN_HOME, please set one of them
>   at 
> org.apache.kylin.common.KylinConfig.getSitePropertiesFile(KylinConfig.java:336)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteOrderedProps(KylinConfig.java:378)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteProperties(KylinConfig.java:358)
>   at 
> org.apache.kylin.common.KylinConfig.getInstanceFromEnv(KylinConfig.java:137)
>   at 
> org.apache.kylin.dict.CacheDictionary.enableCache(CacheDictionary.java:105)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.initForestCache(TrieDictionaryForest.java:394)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.init(TrieDictionaryForest.java:77)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.(TrieDictionaryForest.java:67)
>   at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.build(TrieDictionaryForestBuilder.java:114)
>   at 
> org.apache.kylin.dict.DictionaryGenerator$NumberTrieDictForestBuilder.build(DictionaryGenerator.java:312)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:774)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:650)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:109)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:756)
> {code}
> We should put kylin.properties into the execution environment of the Spark 
> application (via --files) to fix this problem.





[jira] [Created] (KYLIN-4340) Cube Configuration Overwrites not effective

2020-01-13 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4340:
-

 Summary: Cube Configuration Overwrites not effective
 Key: KYLIN-4340
 URL: https://issues.apache.org/jira/browse/KYLIN-4340
 Project: Kylin
  Issue Type: Bug
Reporter: weibin0516
Assignee: weibin0516
 Attachments: image-2020-01-13-23-20-23-476.png

In kylin.properties, 


{code:java}
kylin.engine.spark-fact-distinct=true
{code}

 !image-2020-01-13-23-20-23-476.png! 
Setting this config to false in the cube's Configuration Overwrites is not effective when building the cube.






[jira] [Commented] (KYLIN-4339) Extract Fact Table Distinct Columns fail due to no kylin installed on worker node

2020-01-13 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014393#comment-17014393
 ] 

weibin0516 commented on KYLIN-4339:
---

cc [~temple.zhou], who met the same problem.

> Extract Fact Table Distinct Columns fail due to no kylin installed on worker 
> node
> -
>
> Key: KYLIN-4339
> URL: https://issues.apache.org/jira/browse/KYLIN-4339
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> After setting kylin.engine.spark-fact-distinct to true, building a cube with 
> the Spark engine fails; the error message is as follows:
> {code:java}
> 2020-01-13 22:19:23 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
> 2020-01-13 22:19:23 INFO  
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - 
> OutputCommitCoordinator stopped!
> 2020-01-13 22:19:23 INFO  SparkContext:54 - Successfully stopped SparkContext
> Exception in thread "main" java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Job aborted due 
> to stage failure: Task 9 in stage 1.0 failed 4 times, most recent failure: 
> Lost task 9.3 in stage 1.0 (TID 32, 
> sql-gateway-eu95-17.gz00c.test.alipay.net, executor 7): 
> org.apache.kylin.common.KylinConfigCannotInitException: Didn't find 
> KYLIN_CONF or KYLIN_HOME, please set one of them
>   at 
> org.apache.kylin.common.KylinConfig.getSitePropertiesFile(KylinConfig.java:336)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteOrderedProps(KylinConfig.java:378)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteProperties(KylinConfig.java:358)
>   at 
> org.apache.kylin.common.KylinConfig.getInstanceFromEnv(KylinConfig.java:137)
>   at 
> org.apache.kylin.dict.CacheDictionary.enableCache(CacheDictionary.java:105)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.initForestCache(TrieDictionaryForest.java:394)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.init(TrieDictionaryForest.java:77)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.(TrieDictionaryForest.java:67)
>   at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.build(TrieDictionaryForestBuilder.java:114)
>   at 
> org.apache.kylin.dict.DictionaryGenerator$NumberTrieDictForestBuilder.build(DictionaryGenerator.java:312)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:774)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:650)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:109)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:756)
> {code}
> We should put kylin.properties into the execution environment of the Spark 
> application (via --files) to fix this problem.





[jira] [Commented] (KYLIN-4339) Extract Fact Table Distinct Columns fail due to no kylin installed on worker node

2020-01-13 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014390#comment-17014390
 ] 

weibin0516 commented on KYLIN-4339:
---

This only appears in a cluster environment; there is no problem in the Docker trial.

> Extract Fact Table Distinct Columns fail due to no kylin installed on worker 
> node
> -
>
> Key: KYLIN-4339
> URL: https://issues.apache.org/jira/browse/KYLIN-4339
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> After setting kylin.engine.spark-fact-distinct to true, building a cube with 
> the Spark engine fails; the error message is as follows:
> {code:java}
> 2020-01-13 22:19:23 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
> 2020-01-13 22:19:23 INFO  
> OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - 
> OutputCommitCoordinator stopped!
> 2020-01-13 22:19:23 INFO  SparkContext:54 - Successfully stopped SparkContext
> Exception in thread "main" java.lang.RuntimeException: error execute 
> org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Job aborted due 
> to stage failure: Task 9 in stage 1.0 failed 4 times, most recent failure: 
> Lost task 9.3 in stage 1.0 (TID 32, 
> sql-gateway-eu95-17.gz00c.test.alipay.net, executor 7): 
> org.apache.kylin.common.KylinConfigCannotInitException: Didn't find 
> KYLIN_CONF or KYLIN_HOME, please set one of them
>   at 
> org.apache.kylin.common.KylinConfig.getSitePropertiesFile(KylinConfig.java:336)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteOrderedProps(KylinConfig.java:378)
>   at 
> org.apache.kylin.common.KylinConfig.buildSiteProperties(KylinConfig.java:358)
>   at 
> org.apache.kylin.common.KylinConfig.getInstanceFromEnv(KylinConfig.java:137)
>   at 
> org.apache.kylin.dict.CacheDictionary.enableCache(CacheDictionary.java:105)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.initForestCache(TrieDictionaryForest.java:394)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.init(TrieDictionaryForest.java:77)
>   at 
> org.apache.kylin.dict.TrieDictionaryForest.(TrieDictionaryForest.java:67)
>   at 
> org.apache.kylin.dict.TrieDictionaryForestBuilder.build(TrieDictionaryForestBuilder.java:114)
>   at 
> org.apache.kylin.dict.DictionaryGenerator$NumberTrieDictForestBuilder.build(DictionaryGenerator.java:312)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:774)
>   at 
> org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:650)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>   at org.apache.spark.scheduler.Task.run(Task.scala:109)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:756)
> {code}
> We should put kylin.properties into the execution environment of the Spark 
> application (via --files) to fix this problem.





[jira] [Created] (KYLIN-4339) Extract Fact Table Distinct Columns fail due to no kylin installed on worker node

2020-01-13 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4339:
-

 Summary: Extract Fact Table Distinct Columns fail due to no kylin 
installed on worker node
 Key: KYLIN-4339
 URL: https://issues.apache.org/jira/browse/KYLIN-4339
 Project: Kylin
  Issue Type: Bug
Affects Versions: v3.0.0
Reporter: weibin0516
Assignee: weibin0516


After setting kylin.engine.spark-fact-distinct to true, building a cube with the 
Spark engine fails; the error message is as follows:


{code:java}
2020-01-13 22:19:23 INFO  BlockManagerMaster:54 - BlockManagerMaster stopped
2020-01-13 22:19:23 INFO  
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - 
OutputCommitCoordinator stopped!
2020-01-13 22:19:23 INFO  SparkContext:54 - Successfully stopped SparkContext
Exception in thread "main" java.lang.RuntimeException: error execute 
org.apache.kylin.engine.spark.SparkFactDistinct. Root cause: Job aborted due to 
stage failure: Task 9 in stage 1.0 failed 4 times, most recent failure: Lost 
task 9.3 in stage 1.0 (TID 32, sql-gateway-eu95-17.gz00c.test.alipay.net, 
executor 7): org.apache.kylin.common.KylinConfigCannotInitException: Didn't 
find KYLIN_CONF or KYLIN_HOME, please set one of them
at 
org.apache.kylin.common.KylinConfig.getSitePropertiesFile(KylinConfig.java:336)
at 
org.apache.kylin.common.KylinConfig.buildSiteOrderedProps(KylinConfig.java:378)
at 
org.apache.kylin.common.KylinConfig.buildSiteProperties(KylinConfig.java:358)
at 
org.apache.kylin.common.KylinConfig.getInstanceFromEnv(KylinConfig.java:137)
at 
org.apache.kylin.dict.CacheDictionary.enableCache(CacheDictionary.java:105)
at 
org.apache.kylin.dict.TrieDictionaryForest.initForestCache(TrieDictionaryForest.java:394)
at 
org.apache.kylin.dict.TrieDictionaryForest.init(TrieDictionaryForest.java:77)
at 
org.apache.kylin.dict.TrieDictionaryForest.(TrieDictionaryForest.java:67)
at 
org.apache.kylin.dict.TrieDictionaryForestBuilder.build(TrieDictionaryForestBuilder.java:114)
at 
org.apache.kylin.dict.DictionaryGenerator$NumberTrieDictForestBuilder.build(DictionaryGenerator.java:312)
at 
org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:774)
at 
org.apache.kylin.engine.spark.SparkFactDistinct$MultiOutputFunction.call(SparkFactDistinct.java:650)
at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
at 
org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
at 
org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:800)
at 
org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:756)


{code}

We should put kylin.properties into the execution environment of the Spark 
application (via --files) to fix this problem.
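The `--files` approach might look like the spark-submit sketch below. All paths, the master setting, and the jar name are hypothetical placeholders, not taken from an actual Kylin deployment; the command is printed rather than executed so it can be inspected first:

```shell
#!/bin/sh
# Sketch: ship kylin.properties to every executor's working directory via
# --files, so KylinConfig can initialize on worker nodes that have no Kylin
# installed. Paths below are hypothetical placeholders.
KYLIN_PROPS=/usr/local/kylin/conf/kylin.properties

build_submit_cmd() {
  echo "spark-submit --master yarn --deploy-mode cluster" \
       "--files $KYLIN_PROPS" \
       "--class org.apache.kylin.engine.spark.SparkFactDistinct kylin-job.jar"
}

# Print the command; a real deployment would execute it instead.
build_submit_cmd
```

With `--files`, Spark copies the listed file into each executor's working directory, which is exactly where a worker-side KYLIN_CONF lookup could find it.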





[jira] [Comment Edited] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2020-01-07 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010215#comment-17010215
 ] 

weibin0516 edited comment on KYLIN-4321 at 1/8/20 12:49 AM:


Past experience and a large amount of test data show that Spark's performance 
is significantly better than Hive (MapReduce).

The following pictures are the test result of spark and hive on tpc-ds
 !screenshot-2.png! 
 !screenshot-1.png! 

Currently, when the cube is built with the Spark engine, the `Create fact 
distinct columns` step uses MapReduce by default. Here we want to use the Spark 
engine to perform this step by default, that is, modify the 
`kylin.engine.spark-fact-distinct` value to true.


was (Author: codingforfun):
Past experience and a large amount of test data show that Spark's performance 
is significantly better than Hive(MapReduce).
 !screenshot-2.png! 
 !screenshot-1.png! 
Currently, when the cube is built with the spark engine, the `Create fact 
distinct columns` step uses mapreduce by default. Here we want to use the spark 
engine to perform this step by default, that is, modify the` 
kylin.engine.spark-fact-distinct` value to true.

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>






[jira] [Comment Edited] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2020-01-07 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010215#comment-17010215
 ] 

weibin0516 edited comment on KYLIN-4321 at 1/8/20 12:46 AM:


Past experience and a large amount of test data show that Spark's performance 
is significantly better than Hive (MapReduce).
 !screenshot-2.png! 
 !screenshot-1.png! 
Currently, when the cube is built with the Spark engine, the `Create fact 
distinct columns` step uses MapReduce by default. Here we want to use the Spark 
engine to perform this step by default, that is, modify the 
`kylin.engine.spark-fact-distinct` value to true.


was (Author: codingforfun):
Past experience and a large amount of test data show that Spark's performance 
is significantly better than MapReduce.
 !screenshot-1.png! 
 !screenshot-2.png! 
Currently, when the cube is built with the Spark engine, the `Create fact 
distinct columns` step uses MapReduce by default. Here we want to use the Spark 
engine to perform this step by default as well, that is, change the default 
value of `kylin.engine.spark-fact-distinct` to true.

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>






[jira] [Updated] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2020-01-07 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4321:
--
Attachment: screenshot-2.png

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>






[jira] [Commented] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2020-01-07 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010215#comment-17010215
 ] 

weibin0516 commented on KYLIN-4321:
---

Past experience and a large amount of test data show that Spark's performance 
is significantly better than MapReduce.
 !screenshot-1.png! 
 !screenshot-2.png! 
Currently, when the cube is built with the Spark engine, the `Create fact 
distinct columns` step uses MapReduce by default. Here we want to use the Spark 
engine to perform this step by default as well, that is, change the default 
value of `kylin.engine.spark-fact-distinct` to true.

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>






[jira] [Updated] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2020-01-07 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4321:
--
Attachment: screenshot-1.png

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.1.0
>
> Attachments: screenshot-1.png, screenshot-2.png
>
>






[jira] [Commented] (KYLIN-4324) User query returns Unknown error

2020-01-01 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006567#comment-17006567
 ] 

weibin0516 commented on KYLIN-4324:
---

Hi [~bai], can you describe how this error occurred?

> User query returns Unknown error
> 
>
> Key: KYLIN-4324
> URL: https://issues.apache.org/jira/browse/KYLIN-4324
> Project: Kylin
>  Issue Type: Bug
>Reporter: 白云松
>Priority: Major
> Attachments: 1577935826(1).png
>
>
> !1577935826(1).png!





[jira] [Commented] (KYLIN-4104) Support multi jdbc pushdown runners to execute query/update

2019-12-30 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17005896#comment-17005896
 ] 

weibin0516 commented on KYLIN-4104:
---

Hi [~shaofengshi], please see the previous discussion: 
http://apache-kylin.74782.x6.nabble.com/DISCUSS-Support-multiple-pushdown-query-engines-td13454.html

> Support multi jdbc pushdown runners to execute query/update
> ---
>
> Key: KYLIN-4104
> URL: https://issues.apache.org/jira/browse/KYLIN-4104
> Project: Kylin
>  Issue Type: New Feature
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown 
> query engine. In some users' scenarios, queries need to be pushed down to 
> MySQL, Spark SQL, Hive, etc.
> I think Kylin needs to support multiple pushdown engines.
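For illustration only: the single-runner property below exists today, while the plural form is purely hypothetical and just shows the shape this request asks for (both the plural name and the second class name are assumptions, not confirmed here):

```properties
# Today: exactly one pushdown runner can be configured.
kylin.query.pushdown.runner-class-name=org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl
# Hypothetical multi-runner form proposed by this issue (name is an assumption):
#kylin.query.pushdown.runner-class-names=org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl,com.example.PushDownRunnerSparkSqlImpl
```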





[jira] [Updated] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

2019-12-30 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4321:
--
Summary: Create fact distinct columns using spark by default when build 
engine is spark  (was: Create fact distinct columns by spark when build engine 
is spark)

> Create fact distinct columns using spark by default when build engine is spark
> --
>
> Key: KYLIN-4321
> URL: https://issues.apache.org/jira/browse/KYLIN-4321
> Project: Kylin
>  Issue Type: Improvement
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>






[jira] [Created] (KYLIN-4321) Create fact distinct columns by spark when build engine is spark

2019-12-30 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4321:
-

 Summary: Create fact distinct columns by spark when build engine 
is spark
 Key: KYLIN-4321
 URL: https://issues.apache.org/jira/browse/KYLIN-4321
 Project: Kylin
  Issue Type: Improvement
Reporter: weibin0516
Assignee: weibin0516








[jira] [Created] (KYLIN-4317) Update doc for KYLIN-4104

2019-12-27 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4317:
-

 Summary: Update doc for KYLIN-4104
 Key: KYLIN-4317
 URL: https://issues.apache.org/jira/browse/KYLIN-4317
 Project: Kylin
  Issue Type: Improvement
Reporter: weibin0516
Assignee: weibin0516








[jira] [Assigned] (KYLIN-4309) One user's mailbox is not suffixed and other messages cannot be sent

2019-12-19 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4309:
-

Assignee: weibin0516

> One user's mailbox is not suffixed and other messages cannot be sent
> 
>
> Key: KYLIN-4309
> URL: https://issues.apache.org/jira/browse/KYLIN-4309
> Project: Kylin
>  Issue Type: Bug
>Reporter: 白云松
>Assignee: weibin0516
>Priority: Major
> Attachments: 1576811897(1).png
>
>
>  !1576811897(1).png! 
> One user's mailbox is not suffixed and other messages cannot be sent
> {code}
> org.apache.commons.mail.EmailException: javax.mail.internet.AddressException: 
> Missing final '@domain' in string ``baiyunsong''
>   at org.apache.commons.mail.Email.createInternetAddress(Email.java:1974)
>   at org.apache.commons.mail.Email.addTo(Email.java:846)
>   at org.apache.commons.mail.Email.addTo(Email.java:829)
>   at org.apache.commons.mail.Email.addTo(Email.java:778)
>   at 
> org.apache.kylin.common.util.MailService.sendMailInternal(MailService.java:136)
>   at 
> org.apache.kylin.common.util.MailService.sendMail(MailService.java:107)
>   at 
> org.apache.kylin.common.util.MailService.sendMail(MailService.java:76)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.doSendMail(AbstractExecutable.java:381)
> {code}





[jira] [Comment Edited] (KYLIN-4309) One user's mailbox is not suffixed and other messages cannot be sent

2019-12-19 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000596#comment-17000596
 ] 

weibin0516 edited comment on KYLIN-4309 at 12/20/19 3:49 AM:
-

Hi [~bai], this is indeed a bug. We should either validate addresses when 
creating the model/cube, or skip the invalid address when sending (and record 
it in the error log).

I prefer the latter; if you don't mind, I can fix this bug.
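The skip-and-log option can be sketched as follows; this is an illustrative shell sketch only (the real fix would live in Kylin's Java MailService), and the helper name is made up:

```shell
# Illustrative only: keep recipients that look like real addresses,
# log and skip the rest instead of failing the whole mail send.
filter_recipients() {
  for addr in "$@"; do
    case "$addr" in
      *@*) printf '%s\n' "$addr" ;;                       # keep
      *)   echo "skipping invalid mailbox: $addr" >&2 ;;  # log and skip
    esac
  done
}
filter_recipients baiyunsong user@example.com   # keeps only user@example.com
```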


was (Author: codingforfun):
Hi [~bai], this is indeed a bug. We should either validate addresses when 
creating the model/cube, or skip the invalid address when sending (and record 
it in the error log)

> One user's mailbox is not suffixed and other messages cannot be sent
> 
>
> Key: KYLIN-4309
> URL: https://issues.apache.org/jira/browse/KYLIN-4309
> Project: Kylin
>  Issue Type: Bug
>Reporter: 白云松
>Priority: Major
> Attachments: 1576811897(1).png
>
>
>  !1576811897(1).png! 
> One user's mailbox is not suffixed and other messages cannot be sent
> {code}
> org.apache.commons.mail.EmailException: javax.mail.internet.AddressException: 
> Missing final '@domain' in string ``baiyunsong''
>   at org.apache.commons.mail.Email.createInternetAddress(Email.java:1974)
>   at org.apache.commons.mail.Email.addTo(Email.java:846)
>   at org.apache.commons.mail.Email.addTo(Email.java:829)
>   at org.apache.commons.mail.Email.addTo(Email.java:778)
>   at 
> org.apache.kylin.common.util.MailService.sendMailInternal(MailService.java:136)
>   at 
> org.apache.kylin.common.util.MailService.sendMail(MailService.java:107)
>   at 
> org.apache.kylin.common.util.MailService.sendMail(MailService.java:76)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.doSendMail(AbstractExecutable.java:381)
> {code}





[jira] [Commented] (KYLIN-4309) One user's mailbox is not suffixed and other messages cannot be sent

2019-12-19 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000596#comment-17000596
 ] 

weibin0516 commented on KYLIN-4309:
---

Hi [~bai], this is indeed a bug. We should either validate addresses when 
creating the model/cube, or skip the invalid address when sending (and record 
it in the error log)

> One user's mailbox is not suffixed and other messages cannot be sent
> 
>
> Key: KYLIN-4309
> URL: https://issues.apache.org/jira/browse/KYLIN-4309
> Project: Kylin
>  Issue Type: Bug
>Reporter: 白云松
>Priority: Major
> Attachments: 1576811897(1).png
>
>
>  !1576811897(1).png! 
> One user's mailbox is not suffixed and other messages cannot be sent
> {code}
> org.apache.commons.mail.EmailException: javax.mail.internet.AddressException: 
> Missing final '@domain' in string ``baiyunsong''
>   at org.apache.commons.mail.Email.createInternetAddress(Email.java:1974)
>   at org.apache.commons.mail.Email.addTo(Email.java:846)
>   at org.apache.commons.mail.Email.addTo(Email.java:829)
>   at org.apache.commons.mail.Email.addTo(Email.java:778)
>   at 
> org.apache.kylin.common.util.MailService.sendMailInternal(MailService.java:136)
>   at 
> org.apache.kylin.common.util.MailService.sendMail(MailService.java:107)
>   at 
> org.apache.kylin.common.util.MailService.sendMail(MailService.java:76)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.doSendMail(AbstractExecutable.java:381)
> {code}





[jira] [Created] (KYLIN-4308) Make kylin.sh tips clearer and more explicit

2019-12-19 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4308:
-

 Summary: Make kylin.sh tips clearer and more explicit
 Key: KYLIN-4308
 URL: https://issues.apache.org/jira/browse/KYLIN-4308
 Project: Kylin
  Issue Type: Improvement
Reporter: weibin0516
Assignee: weibin0516


When a streaming receiver process is already running and another one is 
started, the error message is not clear enough:

{code:java}
[root@ca301c60c3d6 bin]# ./kylin.sh streaming start
Kylin is running, stop it first
{code}

When stopping the streaming receiver process, `Stopping Kylin...` should not be 
displayed; the output should state explicitly that the streaming receiver was stopped:

{code:java}
[root@ca301c60c3d6 bin]# ./kylin.sh streaming stop
stopping streaming:20404
Stopping Kylin: 20404
Kylin with pid 20404 has been stopped.
{code}
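A sketch of how kylin.sh could print mode-specific messages; the function and wording are illustrative, not the actual script:

```shell
# Illustrative sketch: emit messages that name the streaming receiver
# instead of reusing the generic "Stopping Kylin" text.
stop_message() {
  mode="$1"; pid="$2"
  if [ "$mode" = "streaming" ]; then
    echo "Stopping streaming receiver: $pid"
    echo "Streaming receiver with pid $pid has been stopped."
  else
    echo "Stopping Kylin: $pid"
    echo "Kylin with pid $pid has been stopped."
  fi
}
stop_message streaming 20404
```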






[jira] [Created] (KYLIN-4303) Fix the bug that HBaseAdmin is not closed properly

2019-12-17 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4303:
-

 Summary: Fix the bug that HBaseAdmin is not closed properly
 Key: KYLIN-4303
 URL: https://issues.apache.org/jira/browse/KYLIN-4303
 Project: Kylin
  Issue Type: Bug
Reporter: weibin0516
Assignee: weibin0516








[jira] [Created] (KYLIN-4302) Fix the bug that InputStream is not closed properly

2019-12-17 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4302:
-

 Summary: Fix the bug that InputStream is not closed properly
 Key: KYLIN-4302
 URL: https://issues.apache.org/jira/browse/KYLIN-4302
 Project: Kylin
  Issue Type: Bug
Reporter: weibin0516
Assignee: weibin0516








[jira] [Closed] (KYLIN-4078) Fix DefaultSchedulerTest.testMetaStoreRecover unit test fail

2019-12-13 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 closed KYLIN-4078.
-
Resolution: Invalid

> Fix DefaultSchedulerTest.testMetaStoreRecover unit test fail
> 
>
> Key: KYLIN-4078
> URL: https://issues.apache.org/jira/browse/KYLIN-4078
> Project: Kylin
>  Issue Type: Test
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Attachments: error.png
>
>
> When running `mvn clean test`, the following error occurred:
> {code:java}
> [INFO]
> [ERROR] Errors:
> [ERROR]   
> DefaultSchedulerTest.testMetaStoreRecover:189->BaseSchedulerTest.waitForJobFinish:107
>  » Runtime
> [INFO]
> [ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 2
> [INFO]
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Apache Kylin 3.0.0-SNAPSHOT  SUCCESS [  4.856 
> s]
> [INFO] Apache Kylin - Core Common . SUCCESS [ 32.858 
> s]
> [INFO] Apache Kylin - Core Metadata ... SUCCESS [ 59.055 
> s]
> [INFO] Apache Kylin - Core Dictionary . SUCCESS [03:55 
> min]
> [INFO] Apache Kylin - Core Cube ... SUCCESS [02:34 
> min]
> [INFO] Apache Kylin - Core Metrics  SUCCESS [  2.071 
> s]
> [INFO] Apache Kylin - Core Job  FAILURE [02:33 
> min]
> [INFO] Apache Kylin - Core Storage  SKIPPED
> [INFO] Apache Kylin - Stream Core . SKIPPED
> [INFO] Apache Kylin - MapReduce Engine  SKIPPED
> [INFO] Apache Kylin - Spark Engine  SKIPPED
> [INFO] Apache Kylin - Hive Source . SKIPPED
> [INFO] Apache Kylin - DataSource SDK .. SKIPPED
> [INFO] Apache Kylin - Jdbc Source . SKIPPED
> [INFO] Apache Kylin - Kafka Source  SKIPPED
> [INFO] Apache Kylin - Cache ... SKIPPED
> [INFO] Apache Kylin - HBase Storage ... SKIPPED
> [INFO] Apache Kylin - Query ... SKIPPED
> [INFO] Apache Kylin - Metrics Reporter Hive ... SKIPPED
> [INFO] Apache Kylin - Metrics Reporter Kafka .. SKIPPED
> [INFO] Apache Kylin - Stream Source Kafka . SKIPPED
> [INFO] Apache Kylin - Stream Coordinator .. SKIPPED
> [INFO] Apache Kylin - Stream Receiver . SKIPPED
> [INFO] Apache Kylin - Stream Storage .. SKIPPED
> [INFO] Apache Kylin - REST Server Base  SKIPPED
> [INFO] Apache Kylin - REST Server . SKIPPED
> [INFO] Apache Kylin - JDBC Driver . SKIPPED
> [INFO] Apache Kylin - Assembly  SKIPPED
> [INFO] Apache Kylin - Tool  SKIPPED
> [INFO] Apache Kylin - Tool Assembly ... SKIPPED
> [INFO] Apache Kylin - Integration Test  SKIPPED
> [INFO] Apache Kylin - Tomcat Extension 3.0.0-SNAPSHOT . SKIPPED
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 10:42 min
> [INFO] Finished at: 2019-07-12T08:59:26+08:00
> [INFO] 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
> project kylin-core-job: There are test failures.
> [ERROR]
> [ERROR] Please refer to 
> /Users/zhuweibin/ant_code/OpenSource/kylin/core-job/../target/surefire-reports
>  for the individual test results.
> [ERROR] Please refer to dump files (if any exist) [date]-jvmRun[N].dump, 
> [date].dumpstream and [date]-jvmRun[N].dumpstream.
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn  -rf :kylin-core-job
> {code}





[jira] [Created] (KYLIN-4296) When an exception occurs, record detailed stack information to the log

2019-12-11 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4296:
-

 Summary: When an exception occurs, record detailed stack 
information to the log
 Key: KYLIN-4296
 URL: https://issues.apache.org/jira/browse/KYLIN-4296
 Project: Kylin
  Issue Type: Improvement
Reporter: weibin0516
Assignee: weibin0516








[jira] [Assigned] (KYLIN-4272) problems of docker/build_image.sh

2019-12-08 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4272:
-

Assignee: (was: weibin0516)

> problems of docker/build_image.sh
> -
>
> Key: KYLIN-4272
> URL: https://issues.apache.org/jira/browse/KYLIN-4272
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Priority: Major
>
> This script can only be executed from inside the "docker" subdirectory; if you 
> try to run it as follows, errors occur:
> {code:java}
> bash docker/build_image.sh
> {code}
> Also, the source code's directory must be named "kylin"; no other name can be used.
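One common remedy, shown here only as a sketch under the assumption that the script uses paths relative to its own location, is to resolve the script's directory before doing anything else:

```shell
# Sketch: make the script location-independent by resolving its own
# directory (the usual $0 idiom) instead of assuming the caller's cwd.
# The subshell keeps the caller's working directory unchanged.
script_dir() {
  ( cd "$(dirname "$1")" && pwd )
}
# At the top of build_image.sh this would be:  cd "$(script_dir "$0")"
mkdir -p demo/docker                      # stand-in layout for the demo
script_dir demo/docker/build_image.sh     # prints the absolute demo/docker path
rm -r demo
```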





[jira] [Issue Comment Deleted] (KYLIN-4272) problems of docker/build_image.sh

2019-12-08 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4272:
--
Comment: was deleted

(was: Hi, [~zhoukangcn], thanks for the feedback, I will fix this bug.)

> problems of docker/build_image.sh
> -
>
> Key: KYLIN-4272
> URL: https://issues.apache.org/jira/browse/KYLIN-4272
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Assignee: weibin0516
>Priority: Major
>
> This script can only be executed from inside the "docker" subdirectory; if you 
> try to run it as follows, errors occur:
> {code:java}
> bash docker/build_image.sh
> {code}
> Also, the source code's directory must be named "kylin"; no other name can be used.





[jira] [Commented] (KYLIN-4272) problems of docker/build_image.sh

2019-11-27 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983275#comment-16983275
 ] 

weibin0516 commented on KYLIN-4272:
---

Hi, [~zhoukangcn], thanks for the feedback, I will fix this bug.

> problems of docker/build_image.sh
> -
>
> Key: KYLIN-4272
> URL: https://issues.apache.org/jira/browse/KYLIN-4272
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Assignee: weibin0516
>Priority: Major
>
> This script can only be executed from inside the "docker" subdirectory; if you 
> try to run it as follows, errors occur:
> {code:java}
> bash docker/build_image.sh
> {code}
> Also, the source code's directory must be named "kylin"; no other name can be used.





[jira] [Assigned] (KYLIN-4272) problems of docker/build_image.sh

2019-11-27 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4272:
-

Assignee: weibin0516

> problems of docker/build_image.sh
> -
>
> Key: KYLIN-4272
> URL: https://issues.apache.org/jira/browse/KYLIN-4272
> Project: Kylin
>  Issue Type: Bug
>Reporter: ZhouKang
>Assignee: weibin0516
>Priority: Major
>
> This script can only be executed from inside the "docker" subdirectory; if you 
> try to run it as follows, errors occur:
> {code:java}
> bash docker/build_image.sh
> {code}
> Also, the source code's directory must be named "kylin"; no other name can be used.





[jira] [Assigned] (KYLIN-4008) Real-time Streaming submit streaming job failed for spark engine

2019-11-21 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4008:
-

Assignee: weibin0516

> Real-time Streaming submit streaming job failed for spark engine
> 
>
> Key: KYLIN-4008
> URL: https://issues.apache.org/jira/browse/KYLIN-4008
> Project: Kylin
>  Issue Type: New Feature
>  Components: Real-time Streaming
>Affects Versions: v3.0.0-alpha
>Reporter: zengrui
>Assignee: weibin0516
>Priority: Minor
> Attachments: error.bmp
>
>
> Create a Real-time Streaming cube with Spark as the cube engine: when the 
> coordinator node receives a remoteStoreCompelete request and some segments 
> are ready to build, the streaming job submission fails.





[jira] [Updated] (KYLIN-4255) Display detailed error message when using livy build error

2019-11-14 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4255:
--
Description: 
Currently, when a build using Livy fails, the error message does not show the 
detailed cause, which makes troubleshooting difficult. The following is an 
example from a failed build job submission:

Current error message:

{code:java}
java.lang.RuntimeException: livy execute failed. 
livy get status failed. state is dead
at 
org.apache.kylin.common.livy.LivyRestExecutor.execute(LivyRestExecutor.java:76)
at 
org.apache.kylin.source.hive.MRHiveDictUtil.runLivySqlJob(MRHiveDictUtil.java:144)
at 
org.apache.kylin.source.hive.CreateFlatHiveTableByLivyStep.createFlatHiveTable(CreateFlatHiveTableByLivyStep.java:44)
at 
org.apache.kylin.source.hive.CreateFlatHiveTableByLivyStep.doWork(CreateFlatHiveTableByLivyStep.java:51)
at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
at 
org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
at 
org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
at 
org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

Actual reason for the error:

{code:java}
2019-11-13 07:40:02 WARN  NativeCodeLoader:62 - Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.io.FileNotFoundException: File 
hdfs://localhost:9000/kylin/livy/hbase-client-$HBASE_VERSION.jar does not exist.
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:697)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:105)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:755)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:751)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:751)
at org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:727)
at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:695)
at 
org.apache.spark.deploy.DependencyUtils$.downloadFile(DependencyUtils.scala:135)
at 
org.apache.spark.deploy.DependencyUtils$$anonfun$downloadFileList$2.apply(DependencyUtils.scala:102)
at 
org.apache.spark.deploy.DependencyUtils$$anonfun$downloadFileList$2.apply(DependencyUtils.scala:102)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at 
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
{code}
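Livy's REST API exposes per-batch logs (GET /batches/{batchId}/log with from/size paging), which is where the real cause above surfaces; a sketch of how the executor could fetch them on failure, where the helper name, host, and port are assumptions for illustration:

```shell
# Hypothetical helper: build the Livy REST URL for a batch's log, which
# contains the real failure (e.g. the FileNotFoundException shown above).
livy_log_url() {
  host="$1"; batch_id="$2"
  printf 'http://%s/batches/%s/log?from=0&size=100' "$host" "$batch_id"
}
# On failure the executor could run:  curl -s "$(livy_log_url localhost:8998 42)"
livy_log_url localhost:8998 42
```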





> Display detailed error message when using livy build error
> --
>
> Key: KYLIN-4255
> URL: https://issues.apache.org/jira/browse/KYLIN-4255
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Currently, when a build using Livy fails, the error message does not show the 
> detailed cause, which makes troubleshooting difficult. The following is an 
> example from a failed build job submission:
> Current error message:
> {code:java}
> java.lang.RuntimeException: livy execute failed. 
> livy get status failed. state is dead
>   at 
> org.apache.kylin.common.livy.LivyRestExecutor.execute(LivyRestExecutor.java:76)
>   at 
> org.apache.kylin.source.hive.MRHiveDictUtil.runLivySqlJob(MRHiveDictUtil.java:144)
>   at 
> org.apache.kylin.source.hive.CreateFlatHiveTableByLivyStep.createFlatHiveTable(CreateFlatHiveTableByLivyStep.java:44)
>   at 
> org.apache.kylin.source.hive.CreateFlatHiveTableByLivyStep.doWork(CreateFlatHiveTableByLivyStep.java:51)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>   at 
> org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
>   at 
> org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:179)
>   at 
> org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Wor

[jira] [Updated] (KYLIN-4255) Display detailed error message when using livy build error

2019-11-14 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4255:
--
Attachment: (was: screenshot-1.png)

> Display detailed error message when using livy build error
> --
>
> Key: KYLIN-4255
> URL: https://issues.apache.org/jira/browse/KYLIN-4255
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>






[jira] [Updated] (KYLIN-4255) Display detailed error message when using livy build error

2019-11-14 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4255:
--
Attachment: screenshot-1.png

> Display detailed error message when using livy build error
> --
>
> Key: KYLIN-4255
> URL: https://issues.apache.org/jira/browse/KYLIN-4255
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Attachments: screenshot-1.png
>
>






[jira] [Updated] (KYLIN-4255) Display detailed error message when using livy build error

2019-11-14 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4255:
--
Description: (was: Currently, when a build using Livy fails, the error 
message does not show the detailed cause, which makes troubleshooting 
difficult. Here are two examples:

### submit build job failed
### )

> Display detailed error message when using livy build error
> --
>
> Key: KYLIN-4255
> URL: https://issues.apache.org/jira/browse/KYLIN-4255
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>






[jira] [Updated] (KYLIN-4255) Display detailed error message when using livy build error

2019-11-14 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4255:
--
Description: 
Currently, when a build using Livy fails, the error message does not show the 
detailed cause, which makes troubleshooting difficult. Here are two 
examples:

### submit build job failed
### 

> Display detailed error message when using livy build error
> --
>
> Key: KYLIN-4255
> URL: https://issues.apache.org/jira/browse/KYLIN-4255
> Project: Kylin
>  Issue Type: Improvement
>  Components: Spark Engine
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Currently, when a build using Livy fails, the error message does not show the 
> detailed cause, which makes troubleshooting difficult. Here are 
> two examples:
> ### submit build job failed
> ### 





[jira] [Created] (KYLIN-4255) Display detailed error message when using livy build error

2019-11-14 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4255:
-

 Summary: Display detailed error message when using livy build error
 Key: KYLIN-4255
 URL: https://issues.apache.org/jira/browse/KYLIN-4255
 Project: Kylin
  Issue Type: Improvement
  Components: Spark Engine
Reporter: weibin0516
Assignee: weibin0516








[jira] [Created] (KYLIN-4251) Add livy to docker

2019-11-12 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4251:
-

 Summary: Add livy to docker
 Key: KYLIN-4251
 URL: https://issues.apache.org/jira/browse/KYLIN-4251
 Project: Kylin
  Issue Type: Improvement
  Components: Environment 
Reporter: weibin0516
Assignee: weibin0516








[jira] [Commented] (KYLIN-4224) Create flat table with spark sql

2019-11-11 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972006#comment-16972006
 ] 

weibin0516 commented on KYLIN-4224:
---

Hi [~shaofengshi], thanks for the reminder. This feature supports reading the 
Spark SQL datasource to build a flat table, which is different from what you 
mentioned.

> Create flat table with spark sql
> 
>
> Key: KYLIN-4224
> URL: https://issues.apache.org/jira/browse/KYLIN-4224
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741.
> Currently, flat tables are created with Hive, but Hive cannot read Spark 
> datasource data. We need to support creating the flat table with Spark SQL, 
> because it can read both Hive and Spark datasource data when creating the 
> flat table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KYLIN-4224) Create flat table with spark sql

2019-11-11 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4224:
-

Assignee: weibin0516  (was: hailin.huang)

> Create flat table with spark sql
> 
>
> Key: KYLIN-4224
> URL: https://issues.apache.org/jira/browse/KYLIN-4224
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741.
> Currently, flat tables are created with Hive, but Hive cannot read Spark 
> datasource data. We need to support creating the flat table with Spark SQL, 
> because it can read both Hive and Spark datasource data when creating the 
> flat table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KYLIN-4224) Create flat table with spark sql

2019-11-11 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4224:
--
Description: 
Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741.

Currently, flat tables are created with Hive, but Hive cannot read Spark 
datasource data. We need to support creating the flat table with Spark SQL, 
because it can read both Hive and Spark datasource data when creating the flat table.

> Create flat table with spark sql
> 
>
> Key: KYLIN-4224
> URL: https://issues.apache.org/jira/browse/KYLIN-4224
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: weibin0516
>Assignee: hailin.huang
>Priority: Major
>
> Spark SQL datasource jira is https://issues.apache.org/jira/browse/KYLIN-741.
> Currently, flat tables are created with Hive, but Hive cannot read Spark 
> datasource data. We need to support creating the flat table with Spark SQL, 
> because it can read both Hive and Spark datasource data when creating the 
> flat table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (KYLIN-4104) Support multi jdbc pushdown runners to execute query/update

2019-11-10 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4104:
--
Summary: Support multi jdbc pushdown runners to execute query/update  (was: 
Support multi kinds of pushdown query engines)

> Support multi jdbc pushdown runners to execute query/update
> ---
>
> Key: KYLIN-4104
> URL: https://issues.apache.org/jira/browse/KYLIN-4104
> Project: Kylin
>  Issue Type: New Feature
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown 
> query engine. In some users' scenarios, queries need to be pushed down to 
> MySQL, Spark SQL, Hive, etc.
> I think Kylin needs to support multiple pushdown runners.
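The dispatch idea described above can be sketched as follows; the class, 
engine, and URL names are illustrative, not Kylin's actual API:

```python
# Sketch of multi-runner pushdown dispatch. Class, engine, and URL
# names are illustrative, not Kylin's actual API.
class JdbcPushDownRunner:
    def __init__(self, name, jdbc_url):
        self.name = name
        self.jdbc_url = jdbc_url

    def execute_query(self, sql):
        # A real runner would open a JDBC connection; here we just tag the SQL.
        return f"[{self.name}] {sql}"

# Runners are tried in configuration order until one succeeds.
runners = [
    JdbcPushDownRunner("mysql", "jdbc:mysql://host:3306/db"),
    JdbcPushDownRunner("sparksql", "jdbc:hive2://host:10000/default"),
]

def push_down(sql):
    last_error = None
    for runner in runners:
        try:
            return runner.execute_query(sql)
        except Exception as exc:
            last_error = exc
    raise RuntimeError("all pushdown runners failed") from last_error

print(push_down("select 1"))  # handled by the first configured runner
```

A real implementation would likely pick the runner from configuration rather 
than trying each in turn, but the fallback loop shows the multi-runner shape.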



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KYLIN-4217) Calcite rel to Spark plan

2019-11-08 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4217:
-

Assignee: (was: weibin0516)

> Calcite rel to Spark plan
> -
>
> Key: KYLIN-4217
> URL: https://issues.apache.org/jira/browse/KYLIN-4217
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Query Engine
>Reporter: yiming.xu
>Priority: Major
>
> Transform calcite rel to spark plan to implement distributed computing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-4224) Create flat table with spark sql

2019-11-07 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969819#comment-16969819
 ] 

weibin0516 commented on KYLIN-4224:
---

Hi [~aahi], the spark sql datasource jira is 
https://issues.apache.org/jira/browse/KYLIN-741. I have basically completed the 
development and will push a PR to GitHub in the next couple of days; reviews are welcome.

> Create flat table with spark sql
> 
>
> Key: KYLIN-4224
> URL: https://issues.apache.org/jira/browse/KYLIN-4224
> Project: Kylin
>  Issue Type: Sub-task
>Reporter: weibin0516
>Assignee: hailin.huang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KYLIN-4224) Create flat table with spark sql

2019-10-29 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4224:
-

 Summary: Create flat table with spark sql
 Key: KYLIN-4224
 URL: https://issues.apache.org/jira/browse/KYLIN-4224
 Project: Kylin
  Issue Type: Sub-task
Reporter: weibin0516
Assignee: weibin0516






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-4213) The new build engine with Spark-SQL

2019-10-29 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962071#comment-16962071
 ] 

weibin0516 commented on KYLIN-4213:
---

Great feature, this will bring significant performance improvements.

> The new build engine with Spark-SQL
> ---
>
> Key: KYLIN-4213
> URL: https://issues.apache.org/jira/browse/KYLIN-4213
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine
>Affects Versions: Future
>Reporter: yiming.xu
>Assignee: yiming.xu
>Priority: Major
>
> 1. Use Spark-SQL to compute cuboids: building cuboid A, B, C, Sum(D) is the 
> SQL "select A, B, C, Sum(D) from table group by A, B, C".
> 2. To avoid memory errors and other exceptions, we can automatically set the 
> Spark conf for the build job, e.g. use adaptive execution.
> 3. The snapshot table will be saved in Parquet format.
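Point 1 above, computing a cuboid as a plain GROUP BY, can be sketched with an 
in-memory SQL engine; the table name, columns, and data are illustrative:

```python
import sqlite3

# Computing the cuboid (A, B, C, SUM(D)) as a plain SQL GROUP BY,
# as point 1 describes; the table and data are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, B TEXT, C TEXT, D INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [("a1", "b1", "c1", 1), ("a1", "b1", "c1", 2), ("a2", "b1", "c1", 5)],
)
# Each distinct (A, B, C) group collapses to one row with the D measure summed.
cuboid = sorted(conn.execute(
    "SELECT A, B, C, SUM(D) FROM t GROUP BY A, B, C").fetchall())
print(cuboid)  # [('a1', 'b1', 'c1', 3), ('a2', 'b1', 'c1', 5)]
```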



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-4217) Calcite rel to Spark plan

2019-10-29 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16962058#comment-16962058
 ] 

weibin0516 commented on KYLIN-4217:
---

I have done similar things in our project and I am interested in doing this 
feature.

> Calcite rel to Spark plan
> -
>
> Key: KYLIN-4217
> URL: https://issues.apache.org/jira/browse/KYLIN-4217
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Query Engine
>Reporter: yiming.xu
>Assignee: weibin0516
>Priority: Major
>
> Transform calcite rel to spark plan to implement distributed computing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (KYLIN-4217) Calcite rel to Spark plan

2019-10-29 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4217:
-

Assignee: weibin0516

> Calcite rel to Spark plan
> -
>
> Key: KYLIN-4217
> URL: https://issues.apache.org/jira/browse/KYLIN-4217
> Project: Kylin
>  Issue Type: Sub-task
>  Components: Query Engine
>Reporter: yiming.xu
>Assignee: weibin0516
>Priority: Major
>
> Transform calcite rel to spark plan to implement distributed computing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (KYLIN-741) Read data from SparkSQL

2019-09-10 Thread weibin0516 (Jira)


[ 
https://issues.apache.org/jira/browse/KYLIN-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927218#comment-16927218
 ] 

weibin0516 commented on KYLIN-741:
--

Sounds useful. Spark SQL already supports a wide variety of data sources and is 
highly extensible, which would help Kylin read various data sources to build 
cubes. I will try to implement this feature.

> Read data from SparkSQL
> ---
>
> Key: KYLIN-741
> URL: https://issues.apache.org/jira/browse/KYLIN-741
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine, Spark Engine
>Reporter: Luke Han
>Assignee: weibin0516
>Priority: Major
>  Labels: scope
> Fix For: Backlog
>
>
> Read data from SparkSQL directly.
> Some deployments expose a SparkSQL interface for data consumption; it would 
> be great if Kylin could read data directly from SparkSQL.
> This feature does not require the Spark cube build engine to be ready. It 
> could continue to leverage the existing MR cube build engine, process data on 
> the Hadoop cluster, and then persist the cube to HBase.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Assigned] (KYLIN-741) Read data from SparkSQL

2019-09-10 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-741:


Assignee: weibin0516  (was: Dong Li)

> Read data from SparkSQL
> ---
>
> Key: KYLIN-741
> URL: https://issues.apache.org/jira/browse/KYLIN-741
> Project: Kylin
>  Issue Type: New Feature
>  Components: Job Engine, Spark Engine
>Reporter: Luke Han
>Assignee: weibin0516
>Priority: Major
>  Labels: scope
> Fix For: Backlog
>
>
> Read data from SparkSQL directly.
> Some deployments expose a SparkSQL interface for data consumption; it would 
> be great if Kylin could read data directly from SparkSQL.
> This feature does not require the Spark cube build engine to be ready. It 
> could continue to leverage the existing MR cube build engine, process data on 
> the Hadoop cluster, and then persist the cube to HBase.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (KYLIN-4068) Automatically add limit has bug

2019-09-08 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 resolved KYLIN-4068.
---
Resolution: Fixed

> Automatically add limit has bug
> ---
>
> Key: KYLIN-4068
> URL: https://issues.apache.org/jira/browse/KYLIN-4068
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> {code:sql}
> SELECT E_Name FROM Employees_China
> UNION
> SELECT E_Name FROM Employees_USA
> {code}
> will convert to 
> {code:sql}
> SELECT E_Name FROM Employees_China
> UNION
> SELECT E_Name FROM Employees_USA
> LIMIT 5
> {code}
> This limit does not apply to the result of the union, but only to SELECT 
> E_Name FROM Employees_USA.
> We should use a safer way to apply the limit.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (KYLIN-4068) Automatically add limit has bug

2019-09-08 Thread weibin0516 (Jira)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4068:
--
Fix Version/s: v3.0.0-alpha2
  Description: 
{code:sql}
SELECT E_Name FROM Employees_China
UNION
SELECT E_Name FROM Employees_USA

{code}

will convert to 


{code:sql}
SELECT E_Name FROM Employees_China
UNION
SELECT E_Name FROM Employees_USA
LIMIT 5
{code}


This limit does not apply to the result of the union, but only to SELECT E_Name 
FROM Employees_USA.
We should use a safer way to apply the limit.

  was:

{code:sql}
SELECT E_Name FROM Employees_China
UNION
SELECT E_Name FROM Employees_USA

{code}

will convert to 


{code:sql}
SELECT E_Name FROM Employees_China
UNION
SELECT E_Name FROM Employees_USA
LIMIT 5
{code}


This limit does not apply to the result of the union, but only to SELECT E_Name 
FROM Employees_USA.
We should use a safer way to apply the limit.


> Automatically add limit has bug
> ---
>
> Key: KYLIN-4068
> URL: https://issues.apache.org/jira/browse/KYLIN-4068
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.6.2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-alpha2
>
>
> {code:sql}
> SELECT E_Name FROM Employees_China
> UNION
> SELECT E_Name FROM Employees_USA
> {code}
> will convert to 
> {code:sql}
> SELECT E_Name FROM Employees_China
> UNION
> SELECT E_Name FROM Employees_USA
> LIMIT 5
> {code}
> This limit does not apply to the result of the union, but only to SELECT 
> E_Name FROM Employees_USA.
> We should use a safer way to apply the limit.
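One safer rewrite, using the illustrative tables from the description, is to 
apply the limit to a subquery that wraps the whole union, so the limit cannot 
attach to the last SELECT alone:

{code:sql}
SELECT * FROM (
  SELECT E_Name FROM Employees_China
  UNION
  SELECT E_Name FROM Employees_USA
) T
LIMIT 5
{code}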



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KYLIN-4150) Improve docker for kylin instructions

2019-08-27 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4150:
-

 Summary: Improve docker for kylin instructions
 Key: KYLIN-4150
 URL: https://issues.apache.org/jira/browse/KYLIN-4150
 Project: Kylin
  Issue Type: Improvement
Reporter: weibin0516
Assignee: weibin0516






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (KYLIN-4146) Add doc for KYLIN-4114

2019-08-21 Thread weibin0516 (Jira)
weibin0516 created KYLIN-4146:
-

 Summary: Add doc for KYLIN-4114
 Key: KYLIN-4146
 URL: https://issues.apache.org/jira/browse/KYLIN-4146
 Project: Kylin
  Issue Type: Improvement
Reporter: weibin0516
Assignee: weibin0516






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (KYLIN-4129) Remove useless code

2019-08-09 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4129:
--
Summary: Remove useless code  (was: Remove never useless code)

> Remove useless code
> ---
>
> Key: KYLIN-4129
> URL: https://issues.apache.org/jira/browse/KYLIN-4129
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v3.0.0-alpha2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KYLIN-4129) Remove never useless code

2019-08-09 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4129:
-

 Summary: Remove never useless code
 Key: KYLIN-4129
 URL: https://issues.apache.org/jira/browse/KYLIN-4129
 Project: Kylin
  Issue Type: Improvement
Affects Versions: v3.0.0-alpha2
Reporter: weibin0516
Assignee: weibin0516






--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KYLIN-4127) Remove never called classes

2019-08-09 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4127:
--
Description: 
Using the FindBugs plugin, I found some classes that are never called; we 
should remove them to keep the code clean.



  was:
Class 
{code:java}
org.apache.kylin.stream.core.storage.StreamingCubeSegment.SegmentInfo
{code}
 is never used.  We should remove it.

Summary: Remove never called classes  (was: Delete unused class 
org.apache.kylin.stream.core.storage.StreamingCubeSegment.SegmentInfo)

> Remove never called classes
> ---
>
> Key: KYLIN-4127
> URL: https://issues.apache.org/jira/browse/KYLIN-4127
> Project: Kylin
>  Issue Type: Improvement
>Affects Versions: v3.0.0-alpha2
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Minor
>
> Using the FindBugs plugin, I found some classes that are never called; we 
> should remove them to keep the code clean.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KYLIN-4128) Remove never called methods

2019-08-09 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4128:
-

 Summary: Remove never called methods
 Key: KYLIN-4128
 URL: https://issues.apache.org/jira/browse/KYLIN-4128
 Project: Kylin
  Issue Type: Improvement
Affects Versions: v3.0.0-alpha2
Reporter: weibin0516
Assignee: weibin0516


Using the *FindBugs* plugin, I found some methods that are never called; we 
should remove them to keep the code clean.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KYLIN-4127) Delete unused class org.apache.kylin.stream.core.storage.StreamingCubeSegment.SegmentInfo

2019-08-09 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4127:
-

 Summary: Delete unused class 
org.apache.kylin.stream.core.storage.StreamingCubeSegment.SegmentInfo
 Key: KYLIN-4127
 URL: https://issues.apache.org/jira/browse/KYLIN-4127
 Project: Kylin
  Issue Type: Improvement
Affects Versions: v3.0.0-alpha2
Reporter: weibin0516
Assignee: weibin0516


Class 
{code:java}
org.apache.kylin.stream.core.storage.StreamingCubeSegment.SegmentInfo
{code}
 is never used.  We should remove it.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KYLIN-4116) Package fail due to lack of dependency objenesis

2019-07-28 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4116:
--
Description: 
Executing the command 
{code:java}
build/script/package.sh
{code}
fails with the error below because the objenesis dependency is missing:


{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile (default-compile) 
on project kylin-core-metadata: Compilation failure: Compilation failure:
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[24,29]
 error: package org.objenesis.strategy does not exist
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,82]
 error: cannot find symbol
[ERROR]   symbol:   class StdInstantiatorStrategy
[ERROR]   location: class KryoUtils
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,16]
 error: cannot access InstantiatorStrategy
[ERROR] -> [Help 1]
{code}



  was:
Executing the command 
{code:shell}
build/script/package.sh
{code}
fails with the error below because the objenesis dependency is missing:


{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile (default-compile) 
on project kylin-core-metadata: Compilation failure: Compilation failure:
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[24,29]
 error: package org.objenesis.strategy does not exist
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,82]
 error: cannot find symbol
[ERROR]   symbol:   class StdInstantiatorStrategy
[ERROR]   location: class KryoUtils
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,16]
 error: cannot access InstantiatorStrategy
[ERROR] -> [Help 1]
{code}




> Package fail due to lack of dependency objenesis
> 
>
> Key: KYLIN-4116
> URL: https://issues.apache.org/jira/browse/KYLIN-4116
> Project: Kylin
>  Issue Type: Bug
>Affects Versions: v3.0.0
>Reporter: weibin0516
>Assignee: weibin0516
>Priority: Major
>
> Executing the command 
> {code:java}
> build/script/package.sh
> {code}
> fails with the error below because the objenesis dependency is missing:
> {code:java}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile 
> (default-compile) on project kylin-core-metadata: Compilation failure: 
> Compilation failure:
> [ERROR] 
> /home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[24,29]
>  error: package org.objenesis.strategy does not exist
> [ERROR] 
> /home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,82]
>  error: cannot find symbol
> [ERROR]   symbol:   class StdInstantiatorStrategy
> [ERROR]   location: class KryoUtils
> [ERROR] 
> /home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,16]
>  error: cannot access InstantiatorStrategy
> [ERROR] -> [Help 1]
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KYLIN-4116) Package fail due to lack of dependency objenesis

2019-07-28 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4116:
-

 Summary: Package fail due to lack of dependency objenesis
 Key: KYLIN-4116
 URL: https://issues.apache.org/jira/browse/KYLIN-4116
 Project: Kylin
  Issue Type: Bug
Affects Versions: v3.0.0
Reporter: weibin0516
Assignee: weibin0516


Executing the command 
{code:shell}
build/script/package.sh
{code}
fails with the error below because the objenesis dependency is missing:


{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile (default-compile) 
on project kylin-core-metadata: Compilation failure: Compilation failure:
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[24,29]
 error: package org.objenesis.strategy does not exist
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,82]
 error: cannot find symbol
[ERROR]   symbol:   class StdInstantiatorStrategy
[ERROR]   location: class KryoUtils
[ERROR] 
/home/admin/kylin_sourcecode/core-metadata/src/main/java/org/apache/kylin/util/KryoUtils.java:[59,16]
 error: cannot access InstantiatorStrategy
[ERROR] -> [Help 1]
{code}
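A likely fix, sketched below, is to declare the missing dependency in the 
affected module's pom.xml; the version shown is an assumption, not necessarily 
the one the project settled on:

{code:xml}
<!-- Illustrative: add the missing objenesis dependency to the
     kylin-core-metadata pom.xml. The version is an assumption. -->
<dependency>
  <groupId>org.objenesis</groupId>
  <artifactId>objenesis</artifactId>
  <version>2.6</version>
</dependency>
{code}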





--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (KYLIN-4107) StorageCleanupJob fails to delete Hive tables with "Argument list too long" error

2019-07-27 Thread weibin0516 (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16894630#comment-16894630
 ] 

weibin0516 commented on KYLIN-4107:
---

pr: https://github.com/apache/kylin/pull/776

> StorageCleanupJob fails to delete Hive tables with "Argument list too long" 
> error
> -
>
> Key: KYLIN-4107
> URL: https://issues.apache.org/jira/browse/KYLIN-4107
> Project: Kylin
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: v2.6.2
> Environment: CentOS 7.6, HDP 2.6.5, Kylin 2.6.3
>Reporter: Vsevolod Ostapenko
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> On a system where multiple Kylin developers experiment with cube design and 
> frequently (re)build or drop cube segments, intermediate Hive tables and 
> leftover HBase tables accumulate very quickly.
> After a certain point, storage cleanup can no longer be executed with the 
> suggested method:
> {{${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete 
> true}}
> Apparently, the storage cleanup job composes a single shell command to drop 
> all Hive tables, which fails to execute because the command line is too long. 
> For example:
> {quote}
> 2019-07-23 17:47:31,611 ERROR [main] job.StorageCleanupJob:377 : Error during 
> deleting Hive tables
> java.io.IOException: Cannot run program "/bin/bash": error=7, Argument list 
> too long
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.runNativeCommand(CliCommandExecutor.java:133)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:89)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:83)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.deleteHiveTables(StorageCleanupJob.java:409)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanUnusedIntermediateHiveTableInternal(StorageCleanupJob.java:375)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:278)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanup(StorageCleanupJob.java:151)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.execute(StorageCleanupJob.java:145)
>  at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>  at org.apache.kylin.tool.StorageCleanupJob.main(StorageCleanupJob.java:27)
> Caused by: java.io.IOException: error=7, Argument list too long
>  at java.lang.UNIXProcess.forkAndExec(Native Method)
>  at java.lang.UNIXProcess.(UNIXProcess.java:247)
>  at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>  ... 10 more 
> {quote}
> Instead of composing one long command, the storage cleanup job needs to 
> generate a script and feed it to beeline or the hive CLI.
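The script-based approach suggested above can be sketched as follows; the 
table names and the commented-out hive/beeline invocations are illustrative:

```shell
# Write the DROP statements to a temp script instead of passing them all
# on one command line; the table names here are illustrative.
script=$(mktemp)
for t in kylin_intermediate_cube_a kylin_intermediate_cube_b; do
  printf 'DROP TABLE IF EXISTS %s;\n' "$t" >> "$script"
done
# Feed the script file to the CLI (commented out: requires a Hive install):
# hive -f "$script"          # or: beeline -u <jdbc-url> -f "$script"
cat "$script"
```

Because the table names travel inside a file rather than on the command line, 
the kernel's argument-length limit no longer applies.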



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (KYLIN-4107) StorageCleanupJob fails to delete Hive tables with "Argument list too long" error

2019-07-24 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-4107:
-

Assignee: weibin0516

> StorageCleanupJob fails to delete Hive tables with "Argument list too long" 
> error
> -
>
> Key: KYLIN-4107
> URL: https://issues.apache.org/jira/browse/KYLIN-4107
> Project: Kylin
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: v2.6.2
> Environment: CentOS 7.6, HDP 2.6.5, Kylin 2.6.3
>Reporter: Vsevolod Ostapenko
>Assignee: weibin0516
>Priority: Major
> Fix For: v3.0.0-beta
>
>
> On a system where multiple Kylin developers experiment with cube design and 
> frequently (re)build or drop cube segments, intermediate Hive tables and 
> leftover HBase tables accumulate very quickly.
> After a certain point, storage cleanup can no longer be executed with the 
> suggested method:
> {{${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete 
> true}}
> Apparently, the storage cleanup job composes a single shell command to drop 
> all Hive tables, which fails to execute because the command line is too long. 
> For example:
> {quote}
> 2019-07-23 17:47:31,611 ERROR [main] job.StorageCleanupJob:377 : Error during 
> deleting Hive tables
> java.io.IOException: Cannot run program "/bin/bash": error=7, Argument list 
> too long
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.runNativeCommand(CliCommandExecutor.java:133)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:89)
>  at 
> org.apache.kylin.common.util.CliCommandExecutor.execute(CliCommandExecutor.java:83)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.deleteHiveTables(StorageCleanupJob.java:409)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanUnusedIntermediateHiveTableInternal(StorageCleanupJob.java:375)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanUnusedIntermediateHiveTable(StorageCleanupJob.java:278)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.cleanup(StorageCleanupJob.java:151)
>  at 
> org.apache.kylin.rest.job.StorageCleanupJob.execute(StorageCleanupJob.java:145)
>  at 
> org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
>  at org.apache.kylin.tool.StorageCleanupJob.main(StorageCleanupJob.java:27)
> Caused by: java.io.IOException: error=7, Argument list too long
>  at java.lang.UNIXProcess.forkAndExec(Native Method)
>  at java.lang.UNIXProcess.(UNIXProcess.java:247)
>  at java.lang.ProcessImpl.start(ProcessImpl.java:134)
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
>  ... 10 more 
> {quote}
> Instead of composing one long command, the storage cleanup job needs to 
> generate a script and feed it to beeline or the hive CLI.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KYLIN-4104) Support multi kinds of pushdown query engines

2019-07-19 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4104:
-

 Summary: Support multi kinds of pushdown query engines
 Key: KYLIN-4104
 URL: https://issues.apache.org/jira/browse/KYLIN-4104
 Project: Kylin
  Issue Type: New Feature
Reporter: weibin0516
Assignee: weibin0516


Currently (version 3.0.0-SNAPSHOT), Kylin supports only one kind of pushdown 
query engine. In some users' scenarios, queries need to be pushed down to 
MySQL, Spark SQL, Hive, etc.
I think Kylin needs to support multiple pushdown engines.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KYLIN-4010) TimeZone is hard-coded in function makeSegmentName for class CubeSegment

2019-07-15 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4010:
--
Sprint: Sprint 52

> TimeZone is hard-coded in function makeSegmentName for class CubeSegment
> 
>
> Key: KYLIN-4010
> URL: https://issues.apache.org/jira/browse/KYLIN-4010
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Affects Versions: v3.0.0-alpha
>Reporter: zengrui
>Assignee: Xiaoxiang Yu
>Priority: Minor
> Attachments: image-2019-07-15-17-15-31-209.png, 
> image-2019-07-15-17-17-04-029.png, image-2019-07-15-17-17-39-568.png
>
>
> In a Real-Time Streaming cube, when I send records to a Kafka topic with the 
> timestamp 2019-01-01 00:00:00.000, Kylin creates a segment named 
> 2018123116_2018123117.
> Then I found that the TimeZone is hard-coded to "GMT" in the makeSegmentName 
> function of class CubeSegment. I think it should be configurable in 
> kylin.properties.
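A minimal sketch of making the segment-name time zone configurable; the name 
format mirrors the 2018123116_2018123117 style from the description, and the 
offset parameter is illustrative, not Kylin's actual configuration:

```python
from datetime import datetime, timedelta, timezone

# Segment name derived from start/end epoch millis with a configurable
# UTC offset instead of a hard-coded GMT. Parameter names are illustrative.
def make_segment_name(start_ms, end_ms, utc_offset_hours=0):
    tz = timezone(timedelta(hours=utc_offset_hours))
    fmt = "%Y%m%d%H"
    start = datetime.fromtimestamp(start_ms / 1000, tz).strftime(fmt)
    end = datetime.fromtimestamp(end_ms / 1000, tz).strftime(fmt)
    return f"{start}_{end}"

# 2019-01-01 00:00-01:00 UTC: offset 0 keeps the 2019-01-01 date, while
# offset -8 shifts the name back to 2018-12-31, matching the reported
# 2018123116_2018123117 segment name.
print(make_segment_name(1546300800000, 1546304400000, 0))
```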



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Updated] (KYLIN-4010) TimeZone is hard-coded in function makeSegmentName for class CubeSegment

2019-07-15 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4010:
--
Sprint:   (was: Sprint 52)

> TimeZone is hard-coded in function makeSegmentName for class CubeSegment
> 
>
> Key: KYLIN-4010
> URL: https://issues.apache.org/jira/browse/KYLIN-4010
> Project: Kylin
>  Issue Type: Improvement
>  Components: Others
>Affects Versions: v3.0.0-alpha
>Reporter: zengrui
>Assignee: Xiaoxiang Yu
>Priority: Minor
> Attachments: image-2019-07-15-17-15-31-209.png, 
> image-2019-07-15-17-17-04-029.png, image-2019-07-15-17-17-39-568.png
>
>
> In a Real-Time Streaming cube, when I send records to a Kafka topic with the 
> timestamp 2019-01-01 00:00:00.000, Kylin creates a segment named 
> 2018123116_2018123117.
> Then I found that the TimeZone is hard-coded to "GMT" in the makeSegmentName 
> function of class CubeSegment. I think it should be configurable in 
> kylin.properties.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Created] (KYLIN-4078) Fix DefaultSchedulerTest.testMetaStoreRecover unit test fail

2019-07-11 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4078:
-

 Summary: Fix DefaultSchedulerTest.testMetaStoreRecover unit test 
fail
 Key: KYLIN-4078
 URL: https://issues.apache.org/jira/browse/KYLIN-4078
 Project: Kylin
  Issue Type: Test
Affects Versions: v3.0.0
Reporter: weibin0516
Assignee: weibin0516
 Attachments: error.png

Running `mvn clean test` fails with the following error:

{code:java}

[INFO]
[ERROR] Errors:
[ERROR]   
DefaultSchedulerTest.testMetaStoreRecover:189->BaseSchedulerTest.waitForJobFinish:107
 » Runtime
[INFO]
[ERROR] Tests run: 28, Failures: 0, Errors: 1, Skipped: 2
[INFO]
[INFO] 
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Kylin 3.0.0-SNAPSHOT  SUCCESS [  4.856 s]
[INFO] Apache Kylin - Core Common . SUCCESS [ 32.858 s]
[INFO] Apache Kylin - Core Metadata ... SUCCESS [ 59.055 s]
[INFO] Apache Kylin - Core Dictionary . SUCCESS [03:55 min]
[INFO] Apache Kylin - Core Cube ... SUCCESS [02:34 min]
[INFO] Apache Kylin - Core Metrics  SUCCESS [  2.071 s]
[INFO] Apache Kylin - Core Job  FAILURE [02:33 min]
[INFO] Apache Kylin - Core Storage  SKIPPED
[INFO] Apache Kylin - Stream Core . SKIPPED
[INFO] Apache Kylin - MapReduce Engine  SKIPPED
[INFO] Apache Kylin - Spark Engine  SKIPPED
[INFO] Apache Kylin - Hive Source . SKIPPED
[INFO] Apache Kylin - DataSource SDK .. SKIPPED
[INFO] Apache Kylin - Jdbc Source . SKIPPED
[INFO] Apache Kylin - Kafka Source  SKIPPED
[INFO] Apache Kylin - Cache ... SKIPPED
[INFO] Apache Kylin - HBase Storage ... SKIPPED
[INFO] Apache Kylin - Query ... SKIPPED
[INFO] Apache Kylin - Metrics Reporter Hive ... SKIPPED
[INFO] Apache Kylin - Metrics Reporter Kafka .. SKIPPED
[INFO] Apache Kylin - Stream Source Kafka . SKIPPED
[INFO] Apache Kylin - Stream Coordinator .. SKIPPED
[INFO] Apache Kylin - Stream Receiver . SKIPPED
[INFO] Apache Kylin - Stream Storage .. SKIPPED
[INFO] Apache Kylin - REST Server Base  SKIPPED
[INFO] Apache Kylin - REST Server . SKIPPED
[INFO] Apache Kylin - JDBC Driver . SKIPPED
[INFO] Apache Kylin - Assembly  SKIPPED
[INFO] Apache Kylin - Tool  SKIPPED
[INFO] Apache Kylin - Tool Assembly ... SKIPPED
[INFO] Apache Kylin - Integration Test  SKIPPED
[INFO] Apache Kylin - Tomcat Extension 3.0.0-SNAPSHOT . SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 10:42 min
[INFO] Finished at: 2019-07-12T08:59:26+08:00
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.21.0:test (default-test) on 
project kylin-core-job: There are test failures.
[ERROR]
[ERROR] Please refer to 
/Users/zhuweibin/ant_code/OpenSource/kylin/core-job/../target/surefire-reports 
for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date]-jvmRun[N].dump, 
[date].dumpstream and [date]-jvmRun[N].dumpstream.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn  -rf :kylin-core-job
{code}




--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Assigned] (KYLIN-2517) Upgrade hbase dependency to 1.4.7

2019-07-03 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-2517:
-

Assignee: weibin0516

> Upgrade hbase dependency to 1.4.7
> -
>
> Key: KYLIN-2517
> URL: https://issues.apache.org/jira/browse/KYLIN-2517
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: weibin0516
>Priority: Major
>
> There have been major enhancements and bug fixes since the HBase 1.1.1 
> release.
> This issue is to upgrade to the 1.4.7 release.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KYLIN-3519) Upgrade Jacoco version to 0.8.2

2019-07-03 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 reassigned KYLIN-3519:
-

Assignee: weibin0516

> Upgrade Jacoco version to 0.8.2
> ---
>
> Key: KYLIN-3519
> URL: https://issues.apache.org/jira/browse/KYLIN-3519
> Project: Kylin
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: weibin0516
>Priority: Minor
>
> Jacoco 0.8.2 adds Java 11 support:
>https://github.com/jacoco/jacoco/releases/tag/v0.8.2
> Java 11 RC1 is out.
> We should consider upgrading Jacoco.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-4069) HivePushDownConverter.doConvert will change sql semantics in some scenarios

2019-07-02 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4069:
-

 Summary: HivePushDownConverter.doConvert will change sql semantics 
in some scenarios
 Key: KYLIN-4069
 URL: https://issues.apache.org/jira/browse/KYLIN-4069
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: v2.6.2
Reporter: weibin0516
Assignee: weibin0516



The source code of HivePushDownConverter.doConvert is as follows:


{code:java}
public static String doConvert(String originStr, boolean isPrepare) {
// Step1.Replace " with `
String convertedSql = replaceString(originStr, "\"", "`");

// Step2.Replace extract functions
convertedSql = extractReplace(convertedSql);

// Step3.Replace cast type string
convertedSql = castReplace(convertedSql);

// Step4.Replace sub query
convertedSql = subqueryReplace(convertedSql);

// Step5.Replace char_length with length
convertedSql = replaceString(convertedSql, "CHAR_LENGTH", "LENGTH");
convertedSql = replaceString(convertedSql, "char_length", "length");

// Step6.Replace "||" with concat
convertedSql = concatReplace(convertedSql);

// Step7.Add quote for interval in timestampadd
convertedSql = timestampAddDiffReplace(convertedSql);

// Step8.Replace integer with int
convertedSql = replaceString(convertedSql, "INTEGER", "INT");
convertedSql = replaceString(convertedSql, "integer", "int");

// Step9.Add limit 1 for prepare select sql to speed up
if (isPrepare) {
convertedSql = addLimit(convertedSql);
}

return convertedSql;
}
{code}


Directly replacing text in the SQL string like this is not safe. The following 
example shows a valid query being converted into a different, incorrect one:

{code:sql}
SELECT "CHAR_LENGTH" FROM datasource.a
{code}

is converted to 

{code:sql}
SELECT `LENGTH` FROM datasource.a
{code}


Every use of replaceString in doConvert can cause problems like this.
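One possible direction (a sketch with hypothetical names, not Kylin's actual fix) is to make the replacement quote-aware, so occurrences inside quoted identifiers or string literals are left untouched:

```java
// Hypothetical sketch: replace a keyword only when it appears OUTSIDE quoted
// regions, so SELECT "CHAR_LENGTH" FROM t keeps its quoted identifier.
// Simplifications: no escaped-quote handling, no word-boundary check.
public class SafeReplace {
    static String replaceOutsideQuotes(String sql, String from, String to) {
        StringBuilder out = new StringBuilder();
        char quote = 0;  // currently open quote character, 0 if none
        for (int i = 0; i < sql.length(); ) {
            char c = sql.charAt(i);
            if (quote != 0) {
                // Inside a quoted region: copy characters verbatim.
                out.append(c);
                if (c == quote) quote = 0;
                i++;
            } else if (c == '"' || c == '\'' || c == '`') {
                quote = c;
                out.append(c);
                i++;
            } else if (sql.regionMatches(true, i, from, 0, from.length())) {
                // Unquoted, case-insensitive occurrence: rewrite it.
                out.append(to);
                i += from.length();
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }
}
```

With this, `SELECT "CHAR_LENGTH" FROM datasource.a` survives unchanged while an unquoted `CHAR_LENGTH(...)` call is still rewritten to `LENGTH(...)`.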



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-4068) Automatically add limit has bug

2019-07-02 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4068:
-

 Summary: Automatically add limit has bug
 Key: KYLIN-4068
 URL: https://issues.apache.org/jira/browse/KYLIN-4068
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: v2.6.2
Reporter: weibin0516
Assignee: weibin0516



{code:sql}
SELECT E_Name FROM Employees_China
UNION
SELECT E_Name FROM Employees_USA

{code}

is converted to 


{code:sql}
SELECT E_Name FROM Employees_China
UNION
SELECT E_Name FROM Employees_USA
LIMIT 5
{code}


This LIMIT does not apply to the result of the UNION, but only to SELECT E_Name 
FROM Employees_USA.
We should use a safer way to apply the limit.
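A safer approach could be sketched like this (hypothetical method name and alias, not the actual patch): wrap the whole statement in a derived table before appending LIMIT, so the limit always applies to the full result, UNION included.

```java
// Hypothetical sketch of a safe addLimit: instead of appending "LIMIT n" to the
// raw text (which binds to the last SELECT of a UNION), wrap the statement in a
// subquery so the limit applies to the complete result set.
public class SafeLimit {
    static String addLimit(String sql, int n) {
        String trimmed = sql.trim();
        // Drop a trailing semicolon so the statement can be nested.
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        return "SELECT * FROM (" + trimmed + ") limited_query LIMIT " + n;
    }
}
```

For the UNION above this produces `SELECT * FROM (SELECT E_Name FROM Employees_China UNION SELECT E_Name FROM Employees_USA) limited_query LIMIT 5`, which limits the union result as intended.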



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3832) Kylin pushdown to support postgresql

2019-06-30 Thread weibin0516 (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875792#comment-16875792
 ] 

weibin0516 commented on KYLIN-3832:
---

[~Shaofengshi], thanks for assigning this to me, I will try to implement it.

> Kylin pushdown to support postgresql
> 
>
> Key: KYLIN-3832
> URL: https://issues.apache.org/jira/browse/KYLIN-3832
> Project: Kylin
>  Issue Type: New Feature
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: hailin.huang
>Assignee: weibin0516
>Priority: Major
>
> When I run pushdown queries against PostgreSQL in my environment, I encounter 
> the exception below.
> It seems Kylin needs to support more JDBC drivers; 
> PushDownRunnerJdbcImpl.class should be more general.
> 2019-02-26 16:12:53,168 ERROR [Query 207dcf77-7c14-8078-ea8b-79644a0c576d-48] 
> service.QueryService:989 : pushdown engine failed current query too
> java.sql.SQLException: Unrecognized column type: int8
>   at 
> org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.toSqlType(PushDownRunnerJdbcImpl.java:260)
>   at 
> org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.extractColumnMeta(PushDownRunnerJdbcImpl.java:192)
>   at 
> org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.executeQuery(PushDownRunnerJdbcImpl.java:68)
>   at 
> org.apache.kylin.query.util.PushDownUtil.tryPushDownQuery(PushDownUtil.java:122)
>   at 
> org.apache.kylin.query.util.PushDownUtil.tryPushDownSelectQuery(PushDownUtil.java:69)
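One way to generalize (a sketch, not the actual Kylin patch) is to map columns by the driver-independent java.sql.Types code returned by ResultSetMetaData.getColumnType, instead of the driver-specific type name: "int8" is simply PostgreSQL's name for BIGINT, and the integer type code is the same across drivers.

```java
import java.sql.Types;

// Hypothetical sketch: resolve a column's SQL type from the JDBC type code,
// which is driver-independent, rather than from the driver's type name string.
public class TypeMapper {
    static String toSqlTypeName(int jdbcType) {
        switch (jdbcType) {
            case Types.BIGINT:    return "bigint";    // PostgreSQL reports this name as "int8"
            case Types.INTEGER:   return "int";       // PostgreSQL: "int4"
            case Types.SMALLINT:  return "smallint";  // PostgreSQL: "int2"
            case Types.VARCHAR:   return "varchar";
            case Types.DOUBLE:    return "double";
            case Types.TIMESTAMP: return "timestamp";
            default:              return "any";       // fall back instead of throwing
        }
    }
}
```

Falling back to a generic type rather than throwing SQLException would keep pushdown working even for type codes the mapper does not know.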



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-4061) Swap inner join's left side, right side table will get different result when query

2019-06-30 Thread weibin0516 (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875789#comment-16875789
 ] 

weibin0516 commented on KYLIN-4061:
---

[~Shaofengshi], thanks for explaining.

> Swap inner join's left side, right side table will get different result when 
> query
> --
>
> Key: KYLIN-4061
> URL: https://issues.apache.org/jira/browse/KYLIN-4061
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: weibin0516
>Priority: Major
> Attachments: failed.png, succeed.png
>
>
> When the left side table of the inner join is the fact table and the right 
> side table is a lookup table, the query hits the cube and returns the correct 
> result. The SQL is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_SALES
>  INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
>  
> However, when the left and right side tables of the inner join are swapped, 
> the query fails because no realization is found. The SQL is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_ACCOUNT
>  INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
> The two SQL statements above are semantically equivalent and should return 
> the same result.
> I looked at the source code: Kylin uses context.firstTableScan (assigned in 
> OLAPTableScan.implementOLAP) as the fact table, whether it actually is or 
> not. The fact table is later the key evidence for choosing a realization, so 
> in the second SQL a lookup table is treated as the fact table and no 
> corresponding realization can be found.
> Is this a bug, do we need to fix it?
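The idea could be sketched like this (hypothetical method and names, not Kylin's real API): rather than trusting the first table scan, search all scanned tables for the model's declared fact table, and only fall back to the first scan when none matches.

```java
import java.util.List;

// Hypothetical sketch: choose the fact table by matching the model's declared
// fact table against every scanned table, instead of blindly taking the first
// table scan. Table identities are simplified to name strings here.
public class FactTableCheck {
    static String chooseFactTable(List<String> scannedTables, String modelFactTable) {
        for (String t : scannedTables) {
            if (t.equalsIgnoreCase(modelFactTable)) {
                return t;  // the true fact table, regardless of join order
            }
        }
        // No scanned table matches the model: keep the old first-scan behaviour.
        return scannedTables.get(0);
    }
}
```

With this, both join orders of KYLIN_SALES and KYLIN_ACCOUNT would resolve to KYLIN_SALES as the fact table, so the same realization could be chosen.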



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-4061) Swap inner join's left side, right side table will get different result when query

2019-06-29 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4061:
--
Description: 
When the left side table of the inner join is the fact table and the right side 
table is a lookup table, the query hits the cube and returns the correct 
result. The SQL is as follows.
{code:java}
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
 FROM KYLIN_SALES
 INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
 WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
 GROUP BY KYLIN_SALES.TRANS_ID
 ORDER BY TRANS_ID
 LIMIT 10;{code}
 

However, when the left and right side tables of the inner join are swapped, the 
query fails because no realization is found. The SQL is as follows.
{code:java}
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
 FROM KYLIN_ACCOUNT
 INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
 WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
 GROUP BY KYLIN_SALES.TRANS_ID
 ORDER BY TRANS_ID
 LIMIT 10;{code}
The two SQL statements above are semantically equivalent and should return the 
same result.
I looked at the source code: Kylin uses context.firstTableScan (assigned in 
OLAPTableScan.implementOLAP) as the fact table, whether it actually is or not. 
The fact table is later the key evidence for choosing a realization, so in the 
second SQL a lookup table is treated as the fact table and no corresponding 
realization can be found.

Is this a bug, do we need to fix it?

  was:
When the left side table of the inner join is the fact table and the right side 
table is a lookup table, the query hits the cube and returns the correct 
result. The SQL is as follows.
{code:java}
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
 FROM KYLIN_SALES
 INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
 WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
 GROUP BY KYLIN_SALES.TRANS_ID
 ORDER BY TRANS_ID
 LIMIT 10;{code}
 

However, when the left and right side tables of the inner join are swapped, the 
query fails because no realization is found. The SQL is as follows.
{code:java}
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
 FROM KYLIN_ACCOUNT
 INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
 WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
 GROUP BY KYLIN_SALES.TRANS_ID
 ORDER BY TRANS_ID
 LIMIT 10;{code}
The two SQL statements above are semantically equivalent and should return the 
same result.
I looked at the source code: Kylin uses context.firstTableScan (assigned in 
OLAPTableScan.implementOLAP) as the fact table, whether it actually is or not.

Is this a bug, do we need to fix it?


> Swap inner join's left side, right side table will get different result when 
> query
> --
>
> Key: KYLIN-4061
> URL: https://issues.apache.org/jira/browse/KYLIN-4061
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: weibin0516
>Priority: Major
> Attachments: failed.png, succeed.png
>
>
> When the left side table of the inner join is the fact table and the right 
> side table is a lookup table, the query hits the cube and returns the correct 
> result. The SQL is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_SALES
>  INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
>  
> However, when the left and right side tables of the inner join are swapped, 
> the query fails because no realization is found. The SQL is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_ACCOUNT
>  INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
> The two SQL statements above are semantically equivalent and should return 
> the same result.
> I looked at the source code: Kylin uses context.firstTableScan (assigned in 
> OLAPTableScan.implementOLAP) as the fact table, whether it actually is or 
> not. The fact table is later the key evidence for choosing a realization, so 
> in the second SQL a lookup table is treated as the fact table and no 
> corresponding realization can be found.
> Is this a bug, do we need to fix it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-4061) Swap inner join's left side, right side table will get different result when query

2019-06-29 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4061:
--
Description: 
When the left side table of the inner join is the fact table and the right side 
table is a lookup table, the query hits the cube and returns the correct 
result. The SQL is as follows.
{code:java}
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
 FROM KYLIN_SALES
 INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
 WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
 GROUP BY KYLIN_SALES.TRANS_ID
 ORDER BY TRANS_ID
 LIMIT 10;{code}
 

However, when the left and right side tables of the inner join are swapped, the 
query fails because no realization is found. The SQL is as follows.
{code:java}
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
 FROM KYLIN_ACCOUNT
 INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
 WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
 GROUP BY KYLIN_SALES.TRANS_ID
 ORDER BY TRANS_ID
 LIMIT 10;{code}
The two SQL statements above are semantically equivalent and should return the 
same result.
I looked at the source code: Kylin uses context.firstTableScan (assigned in 
OLAPTableScan.implementOLAP) as the fact table, whether it actually is or not.

Is this a bug, do we need to fix it?

  was:
When the left side table of the inner join is the fact table and the right side 
table is a lookup table, the query hits the cube and returns the correct 
result. The SQL is as follows.

```
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
FROM KYLIN_SALES
INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
GROUP BY KYLIN_SALES.TRANS_ID
ORDER BY TRANS_ID
LIMIT 10;
```

 

However, when the left and right side tables of the inner join are swapped, the 
query fails because no realization is found. The SQL is as follows.

```
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
FROM KYLIN_ACCOUNT
INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
GROUP BY KYLIN_SALES.TRANS_ID
ORDER BY TRANS_ID
LIMIT 10;
```

The two SQL statements above are semantically equivalent and should return the 
same result.
I looked at the source code: Kylin uses context.firstTableScan (assigned in 
OLAPTableScan.implementOLAP) as the fact table, whether it actually is or not.

Is this a bug, do we need to fix it?


> Swap inner join's left side, right side table will get different result when 
> query
> --
>
> Key: KYLIN-4061
> URL: https://issues.apache.org/jira/browse/KYLIN-4061
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: weibin0516
>Priority: Major
> Attachments: failed.png, succeed.png
>
>
> When the left side table of inner join is a fact table and the right side 
> table is a lookup table, will query cube and get correct result. Sql is as 
> follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_SALES
>  INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
>  
> However,when swap the left and right side tables of the inner join will 
> failed due to no realization found. Sql is as follows.
> {code:java}
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
>  FROM KYLIN_ACCOUNT
>  INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
>  WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
>  GROUP BY KYLIN_SALES.TRANS_ID
>  ORDER BY TRANS_ID
>  LIMIT 10;{code}
> We know that the above two sql semantics are consistent and should return the 
> same result. 
>  I looked at the source code, kylin will use context.firstTableScan(assigned 
> in OLAPTableScan.implementOLAP) as the fact table, whether it is or not.
> Is this a bug, do we need to fix it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KYLIN-4061) Swap inner join's left side, right side table will get different result when query

2019-06-29 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-4061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-4061:
--
Attachment: succeed.png
failed.png

> Swap inner join's left side, right side table will get different result when 
> query
> --
>
> Key: KYLIN-4061
> URL: https://issues.apache.org/jira/browse/KYLIN-4061
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: weibin0516
>Priority: Major
> Attachments: failed.png, succeed.png
>
>
> When the left side table of the inner join is the fact table and the right 
> side table is a lookup table, the query hits the cube and returns the correct 
> result. The SQL is as follows.
> ```
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
> FROM KYLIN_SALES
> INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
> WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
> GROUP BY KYLIN_SALES.TRANS_ID
> ORDER BY TRANS_ID
> LIMIT 10;
> ```
>  
> However, when the left and right side tables of the inner join are swapped, 
> the query fails because no realization is found. The SQL is as follows.
> ```
> SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
> COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
> FROM KYLIN_ACCOUNT
> INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
> WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
> GROUP BY KYLIN_SALES.TRANS_ID
> ORDER BY TRANS_ID
> LIMIT 10;
> ```
> The two SQL statements above are semantically equivalent and should return 
> the same result.
> I looked at the source code: Kylin uses context.firstTableScan (assigned in 
> OLAPTableScan.implementOLAP) as the fact table, whether it actually is or 
> not.
> Is this a bug, do we need to fix it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KYLIN-4061) Swap inner join's left side, right side table will get different result when query

2019-06-29 Thread weibin0516 (JIRA)
weibin0516 created KYLIN-4061:
-

 Summary: Swap inner join's left side, right side table will get 
different result when query
 Key: KYLIN-4061
 URL: https://issues.apache.org/jira/browse/KYLIN-4061
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: v2.5.2
Reporter: weibin0516


When the left side table of the inner join is the fact table and the right side 
table is a lookup table, the query hits the cube and returns the correct 
result. The SQL is as follows.

```
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
FROM KYLIN_SALES
INNER JOIN KYLIN_ACCOUNT ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
GROUP BY KYLIN_SALES.TRANS_ID
ORDER BY TRANS_ID
LIMIT 10;
```

 

However, when the left and right side tables of the inner join are swapped, the 
query fails because no realization is found. The SQL is as follows.

```
SELECT KYLIN_SALES.TRANS_ID, SUM(KYLIN_SALES.PRICE), 
COUNT(KYLIN_ACCOUNT.ACCOUNT_ID)
FROM KYLIN_ACCOUNT
INNER JOIN KYLIN_SALES ON KYLIN_SALES.BUYER_ID = KYLIN_ACCOUNT.ACCOUNT_ID
WHERE KYLIN_SALES.LSTG_SITE_ID != 1000
GROUP BY KYLIN_SALES.TRANS_ID
ORDER BY TRANS_ID
LIMIT 10;
```

The two SQL statements above are semantically equivalent and should return the 
same result.
I looked at the source code: Kylin uses context.firstTableScan (assigned in 
OLAPTableScan.implementOLAP) as the fact table, whether it actually is or not.

Is this a bug, do we need to fix it?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KYLIN-3679) Fetch Kafka topic with Spark streaming

2019-06-29 Thread weibin0516 (JIRA)


[ 
https://issues.apache.org/jira/browse/KYLIN-3679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16875486#comment-16875486
 ] 

weibin0516 commented on KYLIN-3679:
---

Hi, [~Shaofengshi], i hope to implement this feature, please assign it to me. 
Thanks~

> Fetch Kafka topic with Spark streaming
> --
>
> Key: KYLIN-3679
> URL: https://issues.apache.org/jira/browse/KYLIN-3679
> Project: Kylin
>  Issue Type: New Feature
>  Components: Spark Engine
>Reporter: Shaofeng SHI
>Priority: Major
>
> Now Kylin uses an MR job to fetch Kafka messages in parallel and then 
> persists them to HDFS for subsequent processing. If the user selects the 
> Spark engine, we can use the Spark streaming API to do this. Spark streaming 
> can read the Kafka messages in a given offset range as an RDD, which is then 
> easy to process:
> https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html 
> With Spark streaming, Kylin can also easily connect with other data sources 
> like Kinesis, Flume, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Issue Comment Deleted] (KYLIN-3832) Kylin Pushdown query not support postgresql

2019-06-29 Thread weibin0516 (JIRA)


 [ 
https://issues.apache.org/jira/browse/KYLIN-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

weibin0516 updated KYLIN-3832:
--
Comment: was deleted

(was: I'd like to implement the PostgreSQL data source adapter.
Please assign it to me.)

> Kylin Pushdown query not support postgresql
> ---
>
> Key: KYLIN-3832
> URL: https://issues.apache.org/jira/browse/KYLIN-3832
> Project: Kylin
>  Issue Type: Bug
>  Components: Query Engine
>Affects Versions: v2.5.2
>Reporter: hailin.huang
>Priority: Major
> Fix For: Future
>
>
> When I run pushdown queries against PostgreSQL in my environment, I encounter 
> the exception below.
> It seems Kylin needs to support more JDBC drivers; 
> PushDownRunnerJdbcImpl.class should be more general.
> 2019-02-26 16:12:53,168 ERROR [Query 207dcf77-7c14-8078-ea8b-79644a0c576d-48] 
> service.QueryService:989 : pushdown engine failed current query too
> java.sql.SQLException: Unrecognized column type: int8
>   at 
> org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.toSqlType(PushDownRunnerJdbcImpl.java:260)
>   at 
> org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.extractColumnMeta(PushDownRunnerJdbcImpl.java:192)
>   at 
> org.apache.kylin.query.adhoc.PushDownRunnerJdbcImpl.executeQuery(PushDownRunnerJdbcImpl.java:68)
>   at 
> org.apache.kylin.query.util.PushDownUtil.tryPushDownQuery(PushDownUtil.java:122)
>   at 
> org.apache.kylin.query.util.PushDownUtil.tryPushDownSelectQuery(PushDownUtil.java:69)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)