[jira] [Created] (KYLIN-5655) The index details in the optimization suggestions are recommended to increase the dimension base
Laura Xia created KYLIN-5655: Summary: The index details in the optimization suggestions are recommended to increase the dimension base Key: KYLIN-5655 URL: https://issues.apache.org/jira/browse/KYLIN-5655 Project: Kylin Issue Type: Bug Reporter: Laura Xia The index details page in the optimization suggestions should also display the cardinality of each dimension, as it does for regular indexes. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5654) Diagnostic package API (/kylin/api/system/diag? host= ip:port) The current design has security risks
Laura Xia created KYLIN-5654: Summary: Diagnostic package API (/kylin/api/system/diag? host= ip:port) The current design has security risks Key: KYLIN-5654 URL: https://issues.apache.org/jira/browse/KYLIN-5654 Project: Kylin Issue Type: Bug Reporter: Laura Xia As the title says.
[jira] [Created] (KYLIN-5653) Enable Spark Parquet Page Index by default to improve query performance
Zhiting Guo created KYLIN-5653: Summary: Enable Spark Parquet Page Index by default to improve query performance Key: KYLIN-5653 URL: https://issues.apache.org/jira/browse/KYLIN-5653 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Spark now officially supports the Parquet PageIndex, so Kylin can enable the PageIndex feature by default.
[jira] [Created] (KYLIN-5652) Network anomalies or metabase anomalies cause Project Epoch to change frequently, which may cause the job in pending status for a long time and no longer to be executed
Zhiting Guo created KYLIN-5652: Summary: Network anomalies or metabase anomalies cause Project Epoch to change frequently, which may cause the job in pending status for a long time and no longer to be executed Key: KYLIN-5652 URL: https://issues.apache.org/jira/browse/KYLIN-5652 Project: Kylin Issue Type: Bug Components: Metadata Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: Network anomalies or metabase anomalies cause Project Epoch to change frequently, which may cause the job in pending status for a long time and no longer to be executed.pdf While a shutdown is in progress, the project's scheduler has not yet been stopped. If a start operation is triggered at that moment, its if check finds the scheduler still in the started state and returns without doing anything. The shutdown then completes and turns job scheduling off, so the project's jobs stay in the pending state from then on.
[jira] [Created] (KYLIN-5651) supports obtaining table comment from Hive
Zhiting Guo created KYLIN-5651: Summary: supports obtaining table comment from Hive Key: KYLIN-5651 URL: https://issues.apache.org/jira/browse/KYLIN-5651 Project: Kylin Issue Type: Bug Components: REST Service Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: supports obtaining table comment from Hive.pdf The API for fetching tables does not return the table comment: GET /kylin/api/tables
[jira] [Created] (KYLIN-5650) In the cloud environment, there is a probability that the dictionary metadata file will be read abnormally during building job, resulting in incorrect query results.
Zhiting Guo created KYLIN-5650: Summary: In the cloud environment, there is a probability that the dictionary metadata file will be read abnormally during building job, resulting in incorrect query results. Key: KYLIN-5650 URL: https://issues.apache.org/jira/browse/KYLIN-5650 Project: Kylin Issue Type: Bug Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: In the cloud environment, there is a probability that the dictionary metadata file will be read abnormally during building job, resulting in incorrect query results..pdf We checked the dictionary and found no duplicate values. We checked the execution plan of the dictionary build step and found no problem. We then checked the flat table build steps and found a problem in the step that encodes the flat table with the dictionary: the encoding is not performed after a repartition by the dictionary column. As shown in the figure, the plan contains no repartition, yet the encoded column appears in it. The logs also contain the following:
{code:java}
2023-03-26T20:26:30,868 INFO [logger-thread-0] dict.NGlobalDictHDFSStore : Commit from s3a://datalake-kc-s3-prd-bj/kylin/kcprodYcHG_kylin/datalake_kylin/dict/global_dict/GDT.GDT_CMPLYA_FCT_DIST_RESLT/IS_STAT/working to s3a://datalake-kc-s3-prd-bj/kylin/kcprodYcHG_kylin/datalake_kylin/dict/global_dict/GDT.GDT_CMPLYA_FCT_DIST_RESLT/IS_STAT/version_1679862387539
2023-03-26T20:31:14,501 INFO [logger-thread-0] dict.NGlobalDictionaryV2 : getMetaInfo versions.length is 12
2023-03-26T20:31:14,547 INFO [logger-thread-0] dict.NGlobalDictHDFSStore : because metaFiles.length is 0, metaInfo is null
2023-03-26T20:31:14,547 INFO [logger-thread-0] dict.NGlobalDictionaryV2 : getMetaInfo metadata is null : [true]
{code}
This happened on S3: after the dictionary directory was renamed, no metadata file was found. Even so, it is not reasonable for the code to report no error when the metadata cannot be obtained and then encode directly without repartitioning. In short, the encoding of the dictionary column on the flat table fails.
[jira] [Created] (KYLIN-5649) When query contains computed columns, fail to guarantee the priority of using the aggregate index to answer the aggregate query
Zhiting Guo created KYLIN-5649: Summary: When query contains computed columns, fail to guarantee the priority of using the aggregate index to answer the aggregate query Key: KYLIN-5649 URL: https://issues.apache.org/jira/browse/KYLIN-5649 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: When query contains computed columns, fail to guarantee the priority of using the aggregate index to answer the aggregate query.pdf After setting both "kylin.query.use-tableindex-answer-non-raw-query = true" and "kylin.query.layout.prefer-aggindex = true", an aggregate query can match both the aggregate index and the basic detail index, but the index finally hit is the basic detail index. Why is the aggregate query answered with the basic detail index?
[jira] [Created] (KYLIN-5648) Add log for sparder init user
Zhiting Guo created KYLIN-5648: Summary: Add log for sparder init user Key: KYLIN-5648 URL: https://issues.apache.org/jira/browse/KYLIN-5648 Project: Kylin Issue Type: Improvement Components: Others Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha The user name used when Spark is initialized needs to be printed explicitly in the log. *dev design* Add a log line when sparder starts: {code:java} logInfo(s"sparder init user:${UserGroupInformation.getCurrentUser.getUserName}"){code}
[jira] [Created] (KYLIN-5647) upload hive_1_2_2 jars to HDFS before kylin start
Zhiting Guo created KYLIN-5647: Summary: upload hive_1_2_2 jars to HDFS before kylin start Key: KYLIN-5647 URL: https://issues.apache.org/jira/browse/KYLIN-5647 Project: Kylin Issue Type: Improvement Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: Auto upload hive jars.pdf
[jira] [Created] (KYLIN-5646) The build job reports an error at the step of detecting time partition columns in the Yarn Cluster mode
Zhiting Guo created KYLIN-5646: Summary: The build job reports an error at the step of detecting time partition columns in the Yarn Cluster mode Key: KYLIN-5646 URL: https://issues.apache.org/jira/browse/KYLIN-5646 Project: Kylin Issue Type: Bug Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha When a build runs in Spark YARN-Cluster mode, initializing KylinConfig while detecting the incremental time partition column fails with the error: Didn't find KYLIN_CONF or KYLIN_HOME *Reproduce method* Incrementally build a partitioned-table model in Spark YARN-Cluster mode with kylin.engine.check-partition-col-enabled=true (the default value is true). *Root Cause* [KYLIN-5571] changed autoSetShufflePartitions for pushdown queries. Before that change, autoSetShufflePartitions did not need to run when a build task detected the format of the incremental time column (only the pushdown query was executed). After the change, autoSetShufflePartitions runs asynchronously, and the following two methods obtain KylinConfig through KylinConfig.getInstanceFromEnv: * ResourceDetectUtils.getResourceSizeWithTimeoutByConcurrency * ResourceDetectUtils.getResourceSizBySerial The newly started asynchronous thread cannot use the KylinConfig that has already been built, so it initializes a new one. However, the build task JVM and the KE main process do not run on the same machine, so KYLIN_CONF and KYLIN_HOME cannot be found and the build task fails. *fix design* Code that runs in newly started threads must not call KylinConfig.getInstanceFromEnv() when it needs a KylinConfig. Instead, the config is obtained once by the outer thread and passed to wherever it is needed.
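The fix design above can be illustrated with a minimal sketch. It is not Kylin code: Config is a hypothetical stand-in for KylinConfig, and resourceSize, totalSize, and the property name demo.bytes-per-char are made up for the example. The only point is that the outer thread resolves the config once and hands it to the worker tasks, which never look it up from the environment themselves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PassConfigToWorkers {
    /** Hypothetical stand-in for KylinConfig: an immutable set of properties. */
    static final class Config {
        private final Map<String, String> props;

        Config(Map<String, String> props) {
            this.props = Map.copyOf(props);
        }

        String get(String key) {
            return props.get(key);
        }
    }

    /** Worker logic: takes the config as a parameter instead of resolving it itself. */
    static long resourceSize(Config config, String path) {
        long bytesPerChar = Long.parseLong(config.get("demo.bytes-per-char"));
        return bytesPerChar * path.length();
    }

    /** The caller's thread owns the config and passes it into every submitted task. */
    static long totalSize(Config config, List<String> paths) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (String path : paths) {
                Callable<Long> task = () -> resourceSize(config, path);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Config config = new Config(Map.of("demo.bytes-per-char", "2"));
        System.out.println(totalSize(config, List.of("ab", "cde")));
    }
}
```

Because the config is captured by the task, the worker threads do not depend on environment variables such as KYLIN_HOME being visible in the JVM where they run.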
[jira] [Created] (KYLIN-5645) add response params for model list api
Zhiting Guo created KYLIN-5645: Summary: add response params for model list api Key: KYLIN-5645 URL: https://issues.apache.org/jira/browse/KYLIN-5645 Project: Kylin Issue Type: Bug Reporter: Zhiting Guo For the API GET /kylin/api/models, if lite=false is set, the response does not contain partition_column_in_dims and empty_model, which the front end expects. *fix design* Regardless of the value of lite, add partition_column_in_dims and empty_model to the response.
[jira] [Created] (KYLIN-5644) fix diag api security, encryption changed from base64 to AES
Zhiting Guo created KYLIN-5644: Summary: fix diag api security, encryption changed from base64 to AES Key: KYLIN-5644 URL: https://issues.apache.org/jira/browse/KYLIN-5644 Project: Kylin Issue Type: Bug Components: REST Service, Security Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha *dev design* Keep the existing logic, but replace Base64 encoding with AES encryption, reusing the encryption and decryption methods that are already implemented: Encryption: org.apache.kylin.common.util.EncryptUtil#encrypt(String strToEncrypt) Decryption: org.apache.kylin.common.util.EncryptUtil#decrypt(String strToDecrypt) The AES-encrypted string can contain special characters such as +, which is interpreted as a space when passed as an API parameter and causes errors downstream. The scheme is therefore adjusted: to encrypt, first apply EncryptUtil#encrypt and then apply Base64 as a second pass; to decrypt, reverse the steps: first decode with Base64 and then apply EncryptUtil#decrypt.
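A rough sketch of the encrypt-then-encode idea follows. It does not use Kylin's EncryptUtil. AES-GCM from the JDK stands in for the encryption step, and the class DiagTokenCodec, the key, and the sample host string are all assumptions made for illustration. One further assumption: the standard Base64 alphabet itself contains + and /, so the sketch uses the JDK's URL-safe Base64 variant without padding to keep the result safe in a query string. The issue only says "Base64".

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class DiagTokenCodec {
    private static final int IV_BYTES = 12;   // recommended nonce size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encrypts with AES-GCM, then encodes IV + ciphertext with URL-safe Base64. */
    static String encode(String plain, byte[] key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, iv));
            byte[] cipherText = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[IV_BYTES + cipherText.length];
            System.arraycopy(iv, 0, out, 0, IV_BYTES);
            System.arraycopy(cipherText, 0, out, IV_BYTES, cipherText.length);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode: Base64-decodes first, then decrypts with the embedded IV. */
    static String decode(String token, byte[] key) {
        try {
            byte[] in = Base64.getUrlDecoder().decode(token);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Demo-only 16-byte key; a real key must come from secure configuration.
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        String token = encode("host=10.0.0.1:7070", key);
        System.out.println(token);
        System.out.println(decode(token, key));
    }
}
```

The token contains only letters, digits, - and _, so it survives being passed as a URL parameter without further escaping.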
[jira] [Created] (KYLIN-5643) Add public api for batch delete index
Zhiting Guo created KYLIN-5643: Summary: Add public api for batch delete index Key: KYLIN-5643 URL: https://issues.apache.org/jira/browse/KYLIN-5643 Project: Kylin Issue Type: Improvement Components: Metadata Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Move the "kylin/api/index_plans/index" DELETE API from NIndexPlanController to OpenIndexPlanController as a public api. demo: curl --location --request DELETE 'http://127.0.0.1:9099/kylin/api/index_plans/index?project=project1_name=abc_ids=201' \ --header 'Accept: application/vnd.apache.kylin-v4-public+json' \ --header 'Authorization: Basic YWRtaW46S1lMSU4='
[jira] [Created] (KYLIN-5642) Align the default value of the parameter kylin.metadata.audit-log.max-size with the product manual
Zhiting Guo created KYLIN-5642: Summary: Align the default value of the parameter kylin.metadata.audit-log.max-size with the product manual Key: KYLIN-5642 URL: https://issues.apache.org/jira/browse/KYLIN-5642 Project: Kylin Issue Type: Bug Components: Documentation, Metadata Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Set the default value to 50.
[jira] [Created] (KYLIN-5641) fix set spark conf in serverless mod
Zhiting Guo created KYLIN-5641: Summary: fix set spark conf in serverless mod Key: KYLIN-5641 URL: https://issues.apache.org/jira/browse/KYLIN-5641 Project: Kylin Issue Type: Bug Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha In serverless mode, building a model throws a NoSuchMethodException. To fix it, remove the code that sets spark.sql.sources.repartitionWritingDataSource.
[jira] [Created] (KYLIN-5640) Support to automatically adjust the Bloom Filter based on data distribution
Zhiting Guo created KYLIN-5640: Summary: Support to automatically adjust the Bloom Filter based on data distribution Key: KYLIN-5640 URL: https://issues.apache.org/jira/browse/KYLIN-5640 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha h3. Why are the changes needed? Currently, a Bloom filter is built by specifying the NDV (number of distinct values) up front. In general, the number of distinct values is not known in advance. If the Bloom filter could be sized automatically from the data, the file size could be reduced and read efficiency improved. h3. What changes were proposed in this pull request? {{DynamicBlockBloomFilter}} holds multiple {{BlockSplitBloomFilter}} candidates and inserts each value into all of them at the same time. It uses the largest Bloom filter as an approximate distinct-value counter and removes candidates that are too small as data is inserted.
[jira] [Created] (KYLIN-5639) Refine kylin-it dependency
Zhiting Guo created KYLIN-5639: Summary: Refine kylin-it dependency Key: KYLIN-5639 URL: https://issues.apache.org/jira/browse/KYLIN-5639 Project: Kylin Issue Type: Improvement Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha
[jira] [Created] (KYLIN-5638) kylin create spark history dir auto in SparkApplication
Zhiting Guo created KYLIN-5638: Summary: kylin create spark history dir auto in SparkApplication Key: KYLIN-5638 URL: https://issues.apache.org/jira/browse/KYLIN-5638 Project: Kylin Issue Type: Improvement Reporter: Zhiting Guo *dev design* Before creating the Spark session for a build job, read the configured event log directory and check whether it exists. If the directory does not exist, create it. This will prevent different spark history directories from being configured for different projects.
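The check-then-create step in the dev design can be sketched as below. This is only an illustration on the local filesystem using java.nio. The issue does not show the real code, which would presumably go through the Hadoop FileSystem API for HDFS or S3 paths. EventLogDir and ensureExists are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class EventLogDir {
    /**
     * Makes sure the event log directory exists before a session is started.
     * Returns true if the directory had to be created, false if it was already there.
     */
    static boolean ensureExists(Path dir) {
        if (Files.isDirectory(dir)) {
            return false;
        }
        try {
            // Creates missing parent directories as well.
            Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    public static void main(String[] args) {
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"),
                "kylin-sketch-" + System.nanoTime(), "spark-history");
        System.out.println(ensureExists(dir)); // first call creates the directory
        System.out.println(ensureExists(dir)); // second call finds it in place
    }
}
```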
[jira] [Created] (KYLIN-5637) minor fix get delta table ddl
Zhiting Guo created KYLIN-5637: Summary: minor fix get delta table ddl Key: KYLIN-5637 URL: https://issues.apache.org/jira/browse/KYLIN-5637 Project: Kylin Issue Type: Improvement Components: RDBMS Source Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha The Delta data source does not support operations such as msck partition and show create table, so these need special handling.
[jira] [Created] (KYLIN-5636) automatically clean up dependent files after the build task
Zhiting Guo created KYLIN-5636: Summary: automatically clean up dependent files after the build task Key: KYLIN-5636 URL: https://issues.apache.org/jira/browse/KYLIN-5636 Project: Kylin Issue Type: Improvement Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha *question:* Files uploaded under spark.kubernetes.file.upload.path are not deleted automatically. 1: When Spark creates a driver pod, it uploads dependencies to the specified path. Build tasks run in cluster mode and need a driver pod, so running build tasks repeatedly makes the contents of that path grow large. 2: The upload.path we currently configure (s3a://kylin/spark-on-k8s) is a fixed path. Spark creates a spark-upload-uuid subdirectory under it and stores the dependencies there. *dev design* Core idea: add a dynamic subdirectory under the original upload.path and delete the whole subdirectory when the task ends. Build task: upload.path + jobId (e.g. s3a://kylin/spark-on-k8s/uuid). Delete the dependency directory when the build task finishes. The directory is only removed if the delete function is actually called; a kill -9 prevents it from being called, so a garbage cleanup fallback policy is needed, for example automatically deleting directories older than three months.
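The two rules in the dev design (a per-job subdirectory, plus an age-based fallback cleanup) can be sketched as pure functions. The names UploadPathPolicy, uploadPathFor, and isExpired are hypothetical, and "three months" is approximated as 90 days. The actual deletion on S3 or HDFS is left out.

```java
import java.time.Duration;
import java.time.Instant;

final class UploadPathPolicy {
    /** Assumed retention for the fallback cleanup: "three months" taken as 90 days. */
    static final Duration RETENTION = Duration.ofDays(90);

    /** Builds the per-job upload directory: upload.path + "/" + jobId. */
    static String uploadPathFor(String basePath, String jobId) {
        String base = basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        return base + "/" + jobId;
    }

    /** Fallback rule: a leftover directory is garbage once it is older than RETENTION. */
    static boolean isExpired(Instant lastModified, Instant now) {
        return Duration.between(lastModified, now).compareTo(RETENTION) > 0;
    }

    public static void main(String[] args) {
        System.out.println(uploadPathFor("s3a://kylin/spark-on-k8s", "job-42"));
        Instant now = Instant.parse("2023-07-01T00:00:00Z");
        System.out.println(isExpired(now.minus(Duration.ofDays(91)), now));
        System.out.println(isExpired(now.minus(Duration.ofDays(1)), now));
    }
}
```

A periodic cleanup job could list the subdirectories of upload.path and delete those for which isExpired returns true, which covers jobs that were killed before their own cleanup ran.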
[jira] [Created] (KYLIN-5635) Adapt for delta table
Zhiting Guo created KYLIN-5635: Summary: Adapt for delta table Key: KYLIN-5635 URL: https://issues.apache.org/jira/browse/KYLIN-5635 Project: Kylin Issue Type: Improvement Components: RDBMS Source Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Delta column information is not stored in the catalog (because DeltaCatalog is used), so it cannot be read from the catalog directly. We first determine whether the table is a Delta table, using a function provided by the Delta SDK. If it is, we read the table through spark.table to get the schema; Spark scans the metadata under the table's path to obtain the schema information. Also because of DeltaCatalog, Delta tables do not support the show create table statement: DeltaCatalog performs some checks and rejects this SQL. So we check in advance whether the table is a Delta table and, if it is, assemble a DDL directly from the location and the table and return it. Limitation: partition columns are not handled here, so they are not recognized. Delta does not manage its partitions through the catalog; they are obtained in real time by scanning the metadata under the table's path, so reading data is not affected. The only feature affected is building snapshots by partition.
[jira] [Created] (KYLIN-5634) Support query executor expansion and contraction
Zhiting Guo created KYLIN-5634: Summary: Support query executor expansion and contraction Key: KYLIN-5634 URL: https://issues.apache.org/jira/browse/KYLIN-5634 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: Support query executor expansion and contraction.pdf
[jira] [Created] (KYLIN-5633) The query can answered with the existing data of the current model (accepting index or segment data is not uniform), and there should be no query failure or push-down
Zhiting Guo created KYLIN-5633: Summary: The query can answered with the existing data of the current model (accepting index or segment data is not uniform), and there should be no query failure or push-down Key: KYLIN-5633 URL: https://issues.apache.org/jira/browse/KYLIN-5633 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha Attachments: Segment heterogeneous query behavior (1).pdf [^Segment heterogeneous query behavior (1).pdf]
[jira] [Created] (KYLIN-5632) Optimize and clean up some useless code in the query
Zhiting Guo created KYLIN-5632: Summary: Optimize and clean up some useless code in the query Key: KYLIN-5632 URL: https://issues.apache.org/jira/browse/KYLIN-5632 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: 5.0-alpha Reporter: Zhiting Guo Fix For: 5.0-alpha This issue includes: 1. Refactor the selection logic of realizations 2. Rename, move to another package, or drop some useless classes 3. Fix some unstable UTs, add some ignored UTs, and move some UTs to the kylin-it module 4. Move index matchers to ChooserContext 5. Move the candidate sorting method to QueryRouter
[jira] [Created] (KYLIN-5631) Logical view some issues
Laura Xia created KYLIN-5631: Summary: Logical view some issues Key: KYLIN-5631 URL: https://issues.apache.org/jira/browse/KYLIN-5631 Project: Kylin Issue Type: Bug Reporter: Laura Xia
[jira] [Created] (KYLIN-5630) Query history some css issues
Laura Xia created KYLIN-5630: Summary: Query history some css issues Key: KYLIN-5630 URL: https://issues.apache.org/jira/browse/KYLIN-5630 Project: Kylin Issue Type: Bug Reporter: Laura Xia
[jira] [Created] (KYLIN-5629) After the multi-level partition model is modified to a full build model, the build job fails
Laura Xia created KYLIN-5629: Summary: After the multi-level partition model is modified to a full build model, the build job fails Key: KYLIN-5629 URL: https://issues.apache.org/jira/browse/KYLIN-5629 Project: Kylin Issue Type: Bug Reporter: Laura Xia
[jira] [Created] (KYLIN-5628) The configuration items of kylin.source.ddl.logical-view.database are inconsistent
Laura Xia created KYLIN-5628: Summary: The configuration items of kylin.source.ddl.logical-view.database are inconsistent Key: KYLIN-5628 URL: https://issues.apache.org/jira/browse/KYLIN-5628 Project: Kylin Issue Type: Bug Reporter: Laura Xia Parameter: kylin.source.ddl.logical-view-database=DB_logical_view
[jira] [Created] (KYLIN-5627) After the logical view is created successfully, the front end does not display an entry prompting for loading
Laura Xia created KYLIN-5627: Summary: After the logical view is created successfully, the front end does not display an entry prompting for loading Key: KYLIN-5627 URL: https://issues.apache.org/jira/browse/KYLIN-5627 Project: Kylin Issue Type: Bug Reporter: Laura Xia As the title says.
[jira] [Created] (KYLIN-5626) Copy the successfully queried SQL from the query history, because KE rearranged the SQL format with more spaces, the query fails when you query again
Laura Xia created KYLIN-5626: Summary: Copy the successfully queried SQL from the query history, because KE rearranged the SQL format with more spaces, the query fails when you query again Key: KYLIN-5626 URL: https://issues.apache.org/jira/browse/KYLIN-5626 Project: Kylin Issue Type: Bug Reporter: Laura Xia When a successfully executed SQL statement for a flexible report issued by FanRuan is copied from the query history page, there is an extra space between N and ' ', which causes the SQL query to fail.
[jira] [Created] (KYLIN-5625) Edit the custom table index, delete the ShardBy column, and build the index failed
Laura Xia created KYLIN-5625: Summary: Edit the custom table index, delete the ShardBy column, and build the index failed Key: KYLIN-5625 URL: https://issues.apache.org/jira/browse/KYLIN-5625 Project: Kylin Issue Type: Bug Reporter: Laura Xia When the backend handles a request to manually add a table index, it does not check that every {{shard_by_columns}} column is also present in the {{col_order}} field, so a new index can be added with a column missing from col_order. This is not the expected behavior and also causes errors at build time. In a local test, the following request succeeded but stored incorrect metadata: {{curl -X POST \ http://10.1.2.168:7068/kylin/api/index_plans/table_index \ -H 'accept: application/vnd.apache.kylin-v4+json' \ -H 'accept-language: en' \ -H 'authorization: Basic QURNSU46S1lMSU4=' \ -H 'cache-control: no-cache' \ -H 'content-type: application/json' \ -H 'postman-token: fb63f31f-d8d2-b728-3f3b-553e38b02cb2' \ -d '\{"id":"","col_order":["TF_FACTS_LEADS_DAY.NEW_INTENTIONS"],"sort_by_columns":[], \ "shard_by_columns":["TF_FACTS_LEADS_DAY.NEW_LEADS"],"load_data":false,"index_range":"EMPTY","project":"SFM","model_id":"290c552c-c38f-ea8f-8c18-e7850d7419cc"}'}} The front end has a bug as well: if you check a column, mark it as the shard by column, then uncheck it and send the request, the column is not included in col_order but is included in shard_by_columns, which has the same effect as the API request above.
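The missing backend check described above amounts to verifying that shard_by_columns is a subset of col_order. A minimal sketch of such a check, assuming hypothetical names (TableIndexRequestCheck, missingShardByColumns, validate) that are not taken from the Kylin code base:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class TableIndexRequestCheck {
    /** Returns the shard-by columns that do not appear in col_order, in request order. */
    static List<String> missingShardByColumns(List<String> colOrder, List<String> shardByColumns) {
        Set<String> known = new HashSet<>(colOrder);
        List<String> missing = new ArrayList<>();
        for (String column : shardByColumns) {
            if (!known.contains(column)) {
                missing.add(column);
            }
        }
        return missing;
    }

    /** Rejects the request before any metadata is stored if a shard-by column is missing. */
    static void validate(List<String> colOrder, List<String> shardByColumns) {
        List<String> missing = missingShardByColumns(colOrder, shardByColumns);
        if (!missing.isEmpty()) {
            throw new IllegalArgumentException(
                    "shard_by_columns not present in col_order: " + missing);
        }
    }

    public static void main(String[] args) {
        // Same column lists as in the request from the issue.
        List<String> colOrder = List.of("TF_FACTS_LEADS_DAY.NEW_INTENTIONS");
        List<String> shardBy = List.of("TF_FACTS_LEADS_DAY.NEW_LEADS");
        System.out.println(missingShardByColumns(colOrder, shardBy));
    }
}
```

With the request from the issue, validate would throw instead of letting the inconsistent index definition reach the metadata store.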
[jira] [Created] (KYLIN-5624) Text recognition has been added to the hierarchy dimension, the union dimension, and the detail index
Laura Xia created KYLIN-5624: Summary: Text recognition has been added to the hierarchy dimension, the union dimension, and the detail index Key: KYLIN-5624 URL: https://issues.apache.org/jira/browse/KYLIN-5624 Project: Kylin Issue Type: Improvement Reporter: Laura Xia
[jira] [Created] (KYLIN-5623) Enter the snapshot page immediately after deleting the source table, and an error is reported
Laura Xia created KYLIN-5623: Summary: Enter the snapshot page immediately after deleting the source table, and an error is reported Key: KYLIN-5623 URL: https://issues.apache.org/jira/browse/KYLIN-5623 Project: Kylin Issue Type: Bug Reporter: Laura Xia Add a snapshot on the snapshot page, go to the data source page and delete the table, then immediately open the snapshot list page. An error is reported.
[jira] [Created] (KYLIN-5622) Support to create logical views to improve the data reprocessing ability of data developers
Laura Xia created KYLIN-5622: Summary: Support to create logical views to improve the data reprocessing ability of data developers Key: KYLIN-5622 URL: https://issues.apache.org/jira/browse/KYLIN-5622 Project: Kylin Issue Type: Task Reporter: Laura Xia
[jira] [Created] (KYLIN-5621) The order of ascending and descending order by start time or end time is wrong on the index completion page of incrementally built models
Laura Xia created KYLIN-5621: Summary: The order of ascending and descending order by start time or end time is wrong on the index completion page of incrementally built models Key: KYLIN-5621 URL: https://issues.apache.org/jira/browse/KYLIN-5621 Project: Kylin Issue Type: Bug Reporter: Laura Xia As the title says.
[jira] [Created] (KYLIN-5620) Unified the Chinese and English copywriting on the model setting page
Laura Xia created KYLIN-5620: Summary: Unified the Chinese and English copywriting on the model setting page Key: KYLIN-5620 URL: https://issues.apache.org/jira/browse/KYLIN-5620 Project: Kylin Issue Type: Bug Reporter: Laura Xia Attachments: image-2023-07-12-11-39-43-725.png !image-2023-07-12-11-39-43-725.png!
[jira] [Created] (KYLIN-5619) Unify the English copywriting of the "Save and Build" button
Laura Xia created KYLIN-5619: Summary: Unify the English copywriting of the "Save and Build" button Key: KYLIN-5619 URL: https://issues.apache.org/jira/browse/KYLIN-5619 Project: Kylin Issue Type: Bug Reporter: Laura Xia Attachments: image-2023-07-12-11-38-15-181.png !image-2023-07-12-11-38-15-181.png!
[jira] [Created] (KYLIN-5618) Unify the copywriting of the ShardBy column on the Web GUI
Laura Xia created KYLIN-5618: Summary: Unify the copywriting of the ShardBy column on the Web GUI Key: KYLIN-5618 URL: https://issues.apache.org/jira/browse/KYLIN-5618 Project: Kylin Issue Type: Bug Reporter: Laura Xia Attachments: image-2023-07-12-11-35-54-602.png, image-2023-07-12-11-36-02-682.png !image-2023-07-12-11-35-54-602.png! !image-2023-07-12-11-36-02-682.png!
[jira] [Created] (KYLIN-5617) Semi-cumulative measurement Time dimension field types are not filtered
Laura Xia created KYLIN-5617: Summary: Semi-cumulative measurement Time dimension field types are not filtered Key: KYLIN-5617 URL: https://issues.apache.org/jira/browse/KYLIN-5617 Project: Kylin Issue Type: Bug Reporter: Laura Xia The semi-cumulative measure SUM_LC does not filter out the boolean, float, and double types for the time dimension.
[jira] [Created] (KYLIN-5616) model failed to display description information
Laura Xia created KYLIN-5616: Summary: model failed to display description information Key: KYLIN-5616 URL: https://issues.apache.org/jira/browse/KYLIN-5616 Project: Kylin Issue Type: Bug Reporter: Laura Xia A description entered after creating a model is not displayed correctly.
[jira] [Created] (KYLIN-5615) The front end limits the semi-cumulative metric field from being reused
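Laura Xia created KYLIN-5615: Summary: The front end limits the semi-cumulative metric field from being reused Key: KYLIN-5615 URL: https://issues.apache.org/jira/browse/KYLIN-5615 Project: Kylin Issue Type: Bug Reporter: Laura Xia When adding a semi-cumulative measure whose calculation field and time field are already used by another semi-cumulative measure, the front end blocks it. By design, there should be no restriction against reusing these fields. In addition, KE's current measure logic does not allow duplicate columns when measures are added one at a time, but imposes no such restriction when they are added in batch.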
[jira] [Created] (KYLIN-5614) When creating a model through sql, click to edit the model, and the dimension table display will merge from the expanded state
Laura Xia created KYLIN-5614: Summary: When creating a model through sql, click to edit the model, and the dimension table display will merge from the expanded state Key: KYLIN-5614 URL: https://issues.apache.org/jira/browse/KYLIN-5614 Project: Kylin Issue Type: Bug Reporter: Laura Xia As the title says.
[jira] [Created] (KYLIN-5613) The time precision of the query is inconsistent
Laura Xia created KYLIN-5613: Summary: The time precision of the query is inconsistent Key: KYLIN-5613 URL: https://issues.apache.org/jira/browse/KYLIN-5613 Project: Kylin Issue Type: Improvement Reporter: Laura Xia The time precision of the query is inconsistent.
[jira] [Created] (KYLIN-5612) WAINING State Model Click to see details Front-end error
Laura Xia created KYLIN-5612: Summary: WAINING State Model Click to see details Front-end error Key: KYLIN-5612 URL: https://issues.apache.org/jira/browse/KYLIN-5612 Project: Kylin Issue Type: Bug Reporter: Laura Xia Attachments: image-2023-07-05-16-54-00-291.png !image-2023-07-05-16-54-00-291.png!
[jira] [Created] (KYLIN-5611) Failed to query history details
Laura Xia created KYLIN-5611: Summary: Failed to query history details Key: KYLIN-5611 URL: https://issues.apache.org/jira/browse/KYLIN-5611 Project: Kylin Issue Type: Bug Reporter: Laura Xia On the query history page, expanding the details either does nothing or shows a blank panel.
[jira] [Created] (KYLIN-5610) Should automatically copy some of the necessary jars from kylin/server/jar into kylin/spark/jars
huangsheng created KYLIN-5610: Summary: Should automatically copy some of the necessary jars from kylin/server/jar into kylin/spark/jars Key: KYLIN-5610 URL: https://issues.apache.org/jira/browse/KYLIN-5610 Project: Kylin Issue Type: Bug Components: Tools, Build and Test Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-beta When soft-affinity is enabled, Spark depends on some special jar packages, so these jars need to be copied to spark/jars automatically after Spark is downloaded.
[jira] [Created] (KYLIN-5609) Fix security vulnerabilities, upgrade spark version to 3.2.0-kylin-4.6.8.0
huangsheng created KYLIN-5609: - Summary: Fix security vulnerabilities, upgrade spark version to 3.2.0-kylin-4.6.8.0 Key: KYLIN-5609 URL: https://issues.apache.org/jira/browse/KYLIN-5609 Project: Kylin Issue Type: Bug Components: Spark Engine Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha Due to the CVE-2023-24998 security vulnerability in spark 3.2.0-kylin-4.6.7, upgrade spark to 3.2.0-kylin-4.6.8.0 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5608) [Aggregation Group API] Calling the newly added index of the aggregation group API, the order of dimensions displayed on the page is inconsistent with the order passed in
huangsheng created KYLIN-5608: - Summary: [Aggregation Group API] Calling the newly added index of the aggregation group API, the order of dimensions displayed on the page is inconsistent with the order passed in by the api Key: KYLIN-5608 URL: https://issues.apache.org/jira/browse/KYLIN-5608 Project: Kylin Issue Type: Bug Components: REST Service Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha [Aggregation Group API] Calling the newly added index of the aggregation group API, the order of dimensions displayed on the page is inconsistent with the order passed in by the api -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5607) When querying with high concurrency, the query may report an error (the thread is not safe when ACL permission multi-threaded concurrent access)
huangsheng created KYLIN-5607: - Summary: When querying with high concurrency, the query may report an error (the thread is not safe when ACL permission multi-threaded concurrent access) Key: KYLIN-5607 URL: https://issues.apache.org/jira/browse/KYLIN-5607 Project: Kylin Issue Type: Bug Components: Metadata Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha ACL records are taken out of the cache, and the internal array is modified by sorting in the init method. Since the metadata in Kylin is shared, there will be multi-threaded access problems, so there will be concurrent modification problems here. -- This message was sent by Atlassian Jira (v8.20.10#820010)
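The race described in KYLIN-5607 comes from sorting a shared, cached array in place. A minimal sketch of one way to avoid it is to sort a private copy instead. The class and method names below are hypothetical and are not taken from the Kylin code base; the actual fix may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not Kylin's actual ACL code.
final class AclSortSketch {
    // Returns a sorted, read-only copy; the shared cached list is never mutated,
    // so concurrent readers of the cache cannot observe a half-sorted state.
    static <T> List<T> sortedCopy(List<T> cachedEntries, Comparator<? super T> order) {
        List<T> copy = new ArrayList<>(cachedEntries); // defensive copy per caller
        copy.sort(order);                              // sort only the private copy
        return Collections.unmodifiableList(copy);
    }
}
```

Each thread pays for one copy per call, which is usually acceptable for small ACL lists; sorting once when the record is first put into the cache would be an alternative.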
[jira] [Created] (KYLIN-5606) [Aggregation Group API] The required item aggregate_group is not passed, and the prompt information is unclear
huangsheng created KYLIN-5606: - Summary: [Aggregation Group API] The required item aggregate_group is not passed, and the prompt information is unclear Key: KYLIN-5606 URL: https://issues.apache.org/jira/browse/KYLIN-5606 Project: Kylin Issue Type: Bug Components: REST Service Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha Attachments: image-2023-07-03-19-29-42-051.png [Aggregation Group API] The required item aggregate_group is not passed, and the prompt information is unclear !image-2023-07-03-19-29-42-051.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5605) When starting kylin in the FI environment, if the Kerberos configuration KAP_KERBEROS_ENABLED is empty, the jar package replacement will not be performed, resulting in st
huangsheng created KYLIN-5605: - Summary: When starting kylin in the FI environment, if the Kerberos configuration KAP_KERBEROS_ENABLED is empty, the jar package replacement will not be performed, resulting in startup failure Key: KYLIN-5605 URL: https://issues.apache.org/jira/browse/KYLIN-5605 Project: Kylin Issue Type: Bug Components: Environment Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha When starting kylin in the FI environment, if the Kerberos configuration KAP_KERBEROS_ENABLED is empty, the jar package replacement will not be performed, resulting in startup failure -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5604) Open API for adding aggregation group function
huangsheng created KYLIN-5604: - Summary: Open API for adding aggregation group function Key: KYLIN-5604 URL: https://issues.apache.org/jira/browse/KYLIN-5604 Project: Kylin Issue Type: New Feature Components: REST Service Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha Since the latest version of Kylin does not open API interfaces for functions such as model editing, task management-task deletion, adding detailed indexes, and adding aggregation groups, etc. We are currently working on an indicator platform, and we need to use these interfaces to nest into our own programs for secondary development, so we hope to open up these APIs -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5603) After the V2 dictionary is automatically upgraded to the V3 dictionary, the dictionary data is abnormal, resulting in incorrect query results
huangsheng created KYLIN-5603: - Summary: After the V2 dictionary is automatically upgraded to the V3 dictionary, the dictionary data is abnormal, resulting in incorrect query results Key: KYLIN-5603 URL: https://issues.apache.org/jira/browse/KYLIN-5603 Project: Kylin Issue Type: Bug Components: Job Engine Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha After v2 upgrades the v3 dictionary, if v2 and v3 are mixed, the query result is incorrect Root Cause The project opens the v3 dictionary construction, but when the v3 dictionary construction reads the v2 dictionary, the db name is not included in the path, so the v2 dictionary will not be read, causing the v2 upgrade and the v3 dictionary conversion step to be skipped directly, and v3 is equivalent to recoding. Therefore, the v2 upgrade v3 dictionary must not be available, and the encoding is disordered. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5602) Added logs related to Segment Pruning for derived dimension queries
huangsheng created KYLIN-5602: - Summary: Added logs related to Segment Pruning for derived dimension queries Key: KYLIN-5602 URL: https://issues.apache.org/jira/browse/KYLIN-5602 Project: Kylin Issue Type: New Feature Components: Query Engine Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-alpha Supplement Derived segment pruning related logs # Query whether Derived Segment Pruning is used, and filter a few # Add Metric to quantify the effect of Derived Segment Pruning, such as: counting the filtered segment size and counting the time of Bloom Filter -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5601) V2 dictionary workaround fixes
huangsheng created KYLIN-5601: - Summary: V2 dictionary workaround fixes Key: KYLIN-5601 URL: https://issues.apache.org/jira/browse/KYLIN-5601 Project: Kylin Issue Type: Bug Components: Job Engine Affects Versions: 5.0-alpha Reporter: huangsheng Fix For: 5.0-alpha In some scenarios, AQE optimizes the step of writing the dictionary from repartition in the execution plan, skips the encoding of some partitioned dictionaries, and there is no abnormal prompt in the task status, which eventually leads to an error in the calculation of the count distinct metric. So consider turning off AQE during dictionary construction -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5600) LDAP DN is not case sensitive, resulting in user login failure
huangsheng created KYLIN-5600: - Summary: LDAP DN is not case sensitive, resulting in user login failure Key: KYLIN-5600 URL: https://issues.apache.org/jira/browse/KYLIN-5600 Project: Kylin Issue Type: Bug Components: REST Service, Security Affects Versions: 5.0-alpha Reporter: huangsheng Fix For: 5.0-alpha In some user scenarios, uppercase and lowercase logins to LDAP fail. Root Cause: When all users are obtained from ldapUserService in the code, the attribute names in the recorded dn contain uppercase letters, but the DN attribute names passed in by customers when they log in to ldap are lowercase, resulting in inconsistent capitalization and login failure. Customers here CN =xxx,DU=xxx,DC=xxx, but ldap here is cn=xxx,du=xxx,dc=xxx A point where later maintenance can be optimized: When troubleshooting LDAP problems, there are often strange problems that the user names cannot be matched. It is very laborious to troubleshoot. You need to add this information to the log instead of printing it all the time. You can consider printing it after polling for a number of times, and printing it when it is loaded for the first time. and so on -- This message was sent by Atlassian Jira (v8.20.10#820010)
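For the DN mismatch in KYLIN-5600, one possible approach is to compare parsed DNs rather than raw strings. The sketch below uses the JDK's `javax.naming.ldap.LdapName`, whose `equals` compares RDN attribute types and string values case-insensitively. The class `DnMatchSketch` is hypothetical, and whether Kylin's fix uses this API is an assumption.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;

// Hypothetical helper, not Kylin's actual LDAP code.
final class DnMatchSketch {
    // True when both strings parse as DNs and are equal RDN by RDN.
    // LdapName/Rdn equality ignores the case of attribute types and string values.
    static boolean sameDn(String left, String right) {
        try {
            return new LdapName(left).equals(new LdapName(right));
        } catch (InvalidNameException e) {
            return false; // a malformed DN never matches
        }
    }
}
```

With this, `sameDn("CN=xxx,DU=xxx,DC=xxx", "cn=xxx,du=xxx,dc=xxx")` holds even though the raw strings differ.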
[jira] [Created] (KYLIN-5599) In the scenario where there are many users in the user group, the performance of assigning users to the user group is extremely poor
huangsheng created KYLIN-5599: - Summary: In the scenario where there are many users in the user group, the performance of assigning users to the user group is extremely poor Key: KYLIN-5599 URL: https://issues.apache.org/jira/browse/KYLIN-5599 Project: Kylin Issue Type: New Feature Components: REST Service Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-beta, 5.0-alpha Currently, there are 20,000+ users in a user group in my development environment, and 10,000+ users in the production environment. When a user group contains this many users, the performance of assigning users to the user group is extremely poor. The goal is that, when the user group has more than 20,000 users, assigning users to it responds within a reasonable time (such as 10 seconds). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5598) Support Kylin to use zk to use username + password (no kerberos required)
huangsheng created KYLIN-5598: - Summary: Support Kylin to use zk to use username + password (no kerberos required) Key: KYLIN-5598 URL: https://issues.apache.org/jira/browse/KYLIN-5598 Project: Kylin Issue Type: New Feature Components: Environment Affects Versions: 5.0-alpha Reporter: huangsheng Assignee: huangsheng Fix For: 5.0-beta, 5.0-alpha Support Kylin to authenticate zk to use username + password (kerberos is not required). In some user-defined scenarios, there is no kerberos, but the user name and password are used to authenticate zk. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5597) When the /etc/hosts file of the KE node does not configure the IP corresponding to the cluster domain name in the hadoop conf configuration file, KE will fail to start
huangsheng created KYLIN-5597: - Summary: When the /etc/hosts file of the KE node does not configure the IP corresponding to the cluster domain name in the hadoop conf configuration file, KE will fail to start Key: KYLIN-5597 URL: https://issues.apache.org/jira/browse/KYLIN-5597 Project: Kylin Issue Type: Bug Components: Environment , Security Affects Versions: 5.0-alpha Reporter: huangsheng Fix For: 5.0-alpha If the IP corresponding to the cluster domain name in the hadoop conf configuration file is not configured in the /etc/hosts file of the newly added KE node, the command {code:java} get-properties.sh kylin.kerberos.enabled{code} will fail when the service starts, and throw java.net.UnknownHostException -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5596) kylin5 dashboard test units error issue
Li Can created KYLIN-5596: - Summary: kylin5 dashboard test units error issue Key: KYLIN-5596 URL: https://issues.apache.org/jira/browse/KYLIN-5596 Project: Kylin Issue Type: Test Affects Versions: 5.0-beta Reporter: Li Can Assignee: Li Can Fix For: 5.0-beta The main branch code was updated, but some of the dashboard code is no longer consistent with the main branch, so the unit tests need to be fixed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Final Reminder: Community Over Code call for presentations closing soon
[Note: You're receiving this email because you are subscribed to one or more project dev@ mailing lists at the Apache Software Foundation.] This is your final reminder that the Call for Presentations for Community Over Code (formerly known as ApacheCon) is closing soon - on Thursday, 13 July 2023 at 23:59:59 GMT. https://communityovercode.org/call-for-presentations/ We are looking for talk proposals on all topics related to ASF projects and open source software. The event will be held in Halifax, Nova Scotia, October 7th through 10th. More details about the event may be found on the event website at https://communityovercode.org/ Rich, for the event planners
[jira] [Created] (KYLIN-5595) [kylin 5.0] Launch Job Node not initialize spark session issue
Li Can created KYLIN-5595: - Summary: [kylin 5.0] Launch Job Node not initialize spark session issue Key: KYLIN-5595 URL: https://issues.apache.org/jira/browse/KYLIN-5595 Project: Kylin Issue Type: Improvement Components: Job Engine Affects Versions: 5.0-alpha Reporter: Li Can Assignee: Li Can Fix For: 5.0-alpha Attachments: image (87).png, image (88).png Saving a model executes the 'checkFlatTableSql' method on the job node, and this step is not skipped by default. If the job node has just started, 'checkFlatTableSql' has to initialize the Spark session first, and getting the Spark session takes too long. Image 87 shows that getting the Spark session takes more than 63s, while checking the SQL itself takes more than 2s. This is unfriendly for the first model save after the node is launched, and it is also unreasonable. I therefore suggest that the job node initialize the Spark session the same way the query node does, that is, when the node starts. The Spark session is a singleton, so it only needs to be initialized once, as image 88 shows. -- This message was sent by Atlassian Jira (v8.20.10#820010)
TAC Applications for Community Over Code North America and Asia now open
Hi All, (This email goes out to all our user and dev project mailing lists, so you may receive this email more than once.) The Travel Assistance Committee has opened up applications to help get people to the following events: *Community Over Code Asia 2023 - * *August 18th to August 20th in Beijing , China* Applications for this event close on the 6th July so time is short, please apply as soon as possible. TAC is prioritising applications from the Asia and Oceania regions. More details on this event can be found at: https://apachecon.com/acasia2023/ For more information on how to apply, please read: https://tac.apache.org/ *Community Over Code North America - * *October 7th to October 10th in Halifax, Canada* Applications for this event close on the 22nd July. We expect many applications so please do apply as soon as you can. TAC is prioritising applications from the North and South America regions. More details on this event can be found at: https://communityovercode.org/ For more information on how to apply, please read: https://tac.apache.org/ *Have you applied to be a Speaker?* If you have applied or intend to apply as a Speaker at either of these events, and think you may require assistance for Travel and/or Accommodation - TAC advises that you do not wait until you have been notified of your speaker status and to apply early. Should you not be accepted as a speaker and still wish to attend you can amend your application to include Conference fees, or, you may withdraw your application. The call for presentations for Halifax is here: https://communityovercode.org/call-for-presentations/ and you have until the 13th of July to apply. The call for presentations for Beijing is here: https://apachecon.com/acasia2023/cfp.html and you have until the 18th June to apply. 
*IMPORTANT Note on Visas:* It is important that you apply for a Visa as soon as possible - do not wait until you know if you have been accepted for Travel Assistance or not, as due to current wait times for Interviews in some Countries, waiting that long may be too late, so please do apply for a Visa right away. Contact tac-ap...@tac.apache.org if you need any more information or assistance in this area. *Spread the Word!!* TAC encourages you to spread the word about Travel Assistance to get to these events, so feel free to repost as you see fit on Social Media, at work, schools, universities etc etc... Thank You and hope to see you all soon Gavin McDonald on behalf of the ASF Travel Assistance Committee.
[jira] [Created] (KYLIN-5594) Support separation of data permissions and management permissions
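Laura Xia created KYLIN-5594: Summary: Support separation of data permissions and management permissions Key: KYLIN-5594 URL: https://issues.apache.org/jira/browse/KYLIN-5594 Project: Kylin Issue Type: New Feature Reporter: Laura Xia *Original requirement* In the PLG and Managed Service models, we should be the party that operates and maintains the service. However, some operations features, such as the diagnostic package, are available only to the ADMIN user. For operations staff, the ADMIN user's permissions are too broad and should be split further. *Discussion outcome* 1. Managed operations would like users at or above a certain permission level (for example, an operations role) to be able to generate diagnostic packages, ideally as a separately granted permission. 2. The further permission requirements of Managed Service need to be considered. -- This message was sent by Atlassian Jira (v8.20.10#820010)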
[jira] [Created] (KYLIN-5593) After the user is turned off the data permission, the sampled data can still be seen
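Laura Xia created KYLIN-5593: Summary: After the user is turned off the data permission, the sampled data can still be seen Key: KYLIN-5593 URL: https://issues.apache.org/jira/browse/KYLIN-5593 Project: Kylin Issue Type: Bug Reporter: Laura Xia Attachments: image-2023-06-15-17-43-50-814.png, image-2023-06-15-17-44-18-775.png Regardless of the data source, once the data query permission is turned off, the user must not be able to view detailed data, but may still view the table structure. For example, for a salary detail table: # The dimension cardinality and the minimum and maximum values of all fields, obtained through sampling, do not identify any specific employee and should be viewable (see Figure 1). # The sampled data, however, shows the monthly salaries of 10 employees, which of course must not be allowed (see Figure 2). Figure 1 !image-2023-06-15-17-43-50-814.png! Figure 2 !image-2023-06-15-17-44-18-775.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)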
[jira] [Created] (KYLIN-5592) Click to delete the connection between the fact table and the dimension table, cancel the operation, return to the query analysis screen, click Delete, will pop up to del
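Laura Xia created KYLIN-5592: Summary: Click to delete the connection between the fact table and the dimension table, cancel the operation, return to the query analysis screen, click Delete, will pop up to delete the model connection Key: KYLIN-5592 URL: https://issues.apache.org/jira/browse/KYLIN-5592 Project: Kylin Issue Type: Bug Reporter: Laura Xia Step 1: Run any SQL query on the query analysis page: SELECT "KYLIN_SALES"."TRANS_ID" FROM "DEFAULT"."KYLIN_SALES" as "KYLIN_SALES" LIMIT 500 Step 2: Press the Delete key in the query analysis window; the content is deleted normally. Step 3: Click a model (one that has a join relationship, with the fact table joined to a dimension table). Step 4: Click Edit Model, then click the x button that deletes the join relationship. Step 5: When the prompt for deleting the join relationship pops up, click Cancel, then cancel editing the model and return to the query analysis page. Step 6: Repeat Step 2. The prompt for deleting the join relationship pops up, and after the dialog is dismissed, no content can be entered. -- This message was sent by Atlassian Jira (v8.20.10#820010)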
[jira] [Created] (KYLIN-5591) Dev Env Setting Docs
unical1988 created KYLIN-5591: - Summary: Dev Env Setting Docs Key: KYLIN-5591 URL: https://issues.apache.org/jira/browse/KYLIN-5591 Project: Kylin Issue Type: Improvement Reporter: unical1988 I am setting the development environment for Kylin in Windows, and I am following their docs to do so :([https://kylin.apache.org/development40/dev_env.html]) The docs state that If using IntellJ Idea>17 then there's a need to modify “server/kylin-server.iml” file, replace all “PROVIDED” to “COMPILE”, otherwise an {{“java.lang.NoClassDefFoundError: org/apache/catalina/LifecycleListener”}} error may be thrown.. I am at this point and I can't find the mentioned file in the server/kylin-server.iml under kylin/server in the code cloned through git clone [https://github.com/apache/kylin.git] Any clues what is this file kylin-server.iml ? and what is meant by replace all provided by compile? Thanks -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5590) spark cube job supports priority, add job execution limit
Chuang Lee created KYLIN-5590: - Summary: spark cube job supports priority, add job execution limit Key: KYLIN-5590 URL: https://issues.apache.org/jira/browse/KYLIN-5590 Project: Kylin Issue Type: Improvement Components: Job Engine Affects Versions: v4.0.1 Reporter: Chuang Lee spark cube job supports priority, add job execution limit parallelism limit and execution time period limit to prevent excessive cluster resources -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5589) Supports semi-additive measure
Laura Xia created KYLIN-5589: Summary: Supports semi-additive measure Key: KYLIN-5589 URL: https://issues.apache.org/jira/browse/KYLIN-5589 Project: Kylin Issue Type: New Feature Reporter: Laura Xia When creating a model, allow semi-additive measures to be added, and save the semi-additive measure metadata correctly in the model. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5588) Update spark version to 3.2.0-kylin-4.6.7.0
Guangyuan Feng created KYLIN-5588: - Summary: Update spark version to 3.2.0-kylin-4.6.7.0 Key: KYLIN-5588 URL: https://issues.apache.org/jira/browse/KYLIN-5588 Project: Kylin Issue Type: Bug Components: Spark Engine Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5587) Upgrade spring-webmvc to 5.3.26 to fix the vulnerability
Guangyuan Feng created KYLIN-5587: - Summary: Upgrade spring-webmvc to 5.3.26 to fix the vulnerability Key: KYLIN-5587 URL: https://issues.apache.org/jira/browse/KYLIN-5587 Project: Kylin Issue Type: Bug Components: Others, Security Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5586) Upgrade json-smart from 2.4.7 to 2.4.9, to eliminate the vulnerability
Guangyuan Feng created KYLIN-5586: - Summary: Upgrade json-smart from 2.4.7 to 2.4.9, to eliminate the vulnerability Key: KYLIN-5586 URL: https://issues.apache.org/jira/browse/KYLIN-5586 Project: Kylin Issue Type: Bug Components: Security Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Assignee: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5585) Bug fix for loading tables, to add the corresponding message of the failure into http response
Guangyuan Feng created KYLIN-5585: - Summary: Bug fix for loading tables, to add the corresponding message of the failure into http response Key: KYLIN-5585 URL: https://issues.apache.org/jira/browse/KYLIN-5585 Project: Kylin Issue Type: Bug Components: RDBMS Source, REST Service Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5584) To fix sonar checked errors
Guangyuan Feng created KYLIN-5584: - Summary: To fix sonar checked errors Key: KYLIN-5584 URL: https://issues.apache.org/jira/browse/KYLIN-5584 Project: Kylin Issue Type: Bug Components: Others Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5583) Minor bug fix for returning the wrong result by query with grouping sets
Guangyuan Feng created KYLIN-5583: - Summary: Minor bug fix for returning the wrong result by query with grouping sets Key: KYLIN-5583 URL: https://issues.apache.org/jira/browse/KYLIN-5583 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha The issue is brought in by https://issues.apache.org/jira/browse/KYLIN-5577. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5582) Minor fix for query collectors in BloomFilter
Guangyuan Feng created KYLIN-5582: - Summary: Minor fix for query collectors in BloomFilter Key: KYLIN-5582 URL: https://issues.apache.org/jira/browse/KYLIN-5582 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Assignee: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5581) Can't using metadata to respond to min/max aggregations on date/timestamp columns
Guangyuan Feng created KYLIN-5581: - Summary: Can't using metadata to respond to min/max aggregations on date/timestamp columns Key: KYLIN-5581 URL: https://issues.apache.org/jira/browse/KYLIN-5581 Project: Kylin Issue Type: Bug Components: Query Engine Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha The root cause is that Kylin parses Date/Timestamp values as Integer internally, so when _kylin.query.try-route-to-metadata-enabled_ is enabled, the current code mismatches the real data type. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5580) Refactor multi-tenant to make resource separable
Guangyuan Feng created KYLIN-5580: - Summary: Refactor multi-tenant to make resource separable Key: KYLIN-5580 URL: https://issues.apache.org/jira/browse/KYLIN-5580 Project: Kylin Issue Type: Improvement Components: REST Service Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5579) Remove wrong exclusions
Guangyuan Feng created KYLIN-5579: - Summary: Remove wrong exclusions Key: KYLIN-5579 URL: https://issues.apache.org/jira/browse/KYLIN-5579 Project: Kylin Issue Type: Bug Components: Others Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Assignee: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5578) Using metadata to respond to min/max aggregation queries
Guangyuan Feng created KYLIN-5578: - Summary: Using metadata to respond to min/max aggregation queries Key: KYLIN-5578 URL: https://issues.apache.org/jira/browse/KYLIN-5578 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Assignee: Guangyuan Feng Fix For: 5.0-alpha KE's segments contain enough metrics, in particular the existing min/max statistics, so it is possible to answer min/max aggregations from metadata under certain conditions and thus improve KE's query speed. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5577) Should prohibit creating a new model of which name is equal to the existed ignoring letter case
Guangyuan Feng created KYLIN-5577: - Summary: Should prohibit creating a new model of which name is equal to the existed ignoring letter case Key: KYLIN-5577 URL: https://issues.apache.org/jira/browse/KYLIN-5577 Project: Kylin Issue Type: Bug Components: REST Service Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5576) Listing files using WHERE conditions with subquery on partition columns will lead to the failure of building model
Guangyuan Feng created KYLIN-5576: - Summary: Listing files using WHERE conditions with subquery on partition columns will lead to the failure of building model Key: KYLIN-5576 URL: https://issues.apache.org/jira/browse/KYLIN-5576 Project: Kylin Issue Type: Bug Components: Modeling Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Assignee: Guangyuan Feng Fix For: 5.0-alpha KE should not use filters that contain subqueries on partition columns to detect resources. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5575) OPERATION role has no privilege to build the index if a model contains the base index.
Guangyuan Feng created KYLIN-5575: - Summary: OPERATION role has no privilege to build the index if a model contains the base index. Key: KYLIN-5575 URL: https://issues.apache.org/jira/browse/KYLIN-5575 Project: Kylin Issue Type: Bug Components: Modeling Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha OPERATION role should also have the privilege to modify the base index. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5574) Build model failed if any of the source table schema had been changed, saying deleted unrelated columns.
Guangyuan Feng created KYLIN-5574: - Summary: Build model failed if any of the source table schema had been changed, saying deleted unrelated columns. Key: KYLIN-5574 URL: https://issues.apache.org/jira/browse/KYLIN-5574 Project: Kylin Issue Type: Bug Components: Modeling Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha How did the error occur? While building a model, KE constructs a new SELECT statement from KE's table metadata to read data from the source tables. If the user has deleted some unused columns from any source table, KE's metadata differs from Hive's metastore; as a consequence, `SparkSession::sql` throws validation exceptions because of the mismatched columns in the SELECT statement. How to solve it? Use the intersection of KE's columns and the Hive metastore's columns. -- This message was sent by Atlassian Jira (v8.20.10#820010)
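The intersection proposed in KYLIN-5574 can be sketched as follows. The class and method names are hypothetical, and matching column names case-insensitively is an assumption made for illustration; the issue does not say how Kylin compares names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not Kylin's actual build code.
final class ColumnIntersectionSketch {
    // Keeps only the model columns that still exist in the source table,
    // preserving the model's column order for the generated SELECT.
    static List<String> selectableColumns(List<String> modelColumns, List<String> sourceColumns) {
        Set<String> present = new HashSet<>();
        for (String column : sourceColumns) {
            present.add(column.toUpperCase(Locale.ROOT)); // assumption: case-insensitive names
        }
        List<String> result = new ArrayList<>();
        for (String column : modelColumns) {
            if (present.contains(column.toUpperCase(Locale.ROOT))) {
                result.add(column);
            }
        }
        return result;
    }
}
```

Columns dropped from the source table simply fall out of the SELECT list, so the statement no longer references names the metastore does not know.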
[jira] [Created] (KYLIN-5573) Refine the response message when loading more than 1000 tables
Guangyuan Feng created KYLIN-5573: - Summary: Refine the response message when loading more than 1000 tables Key: KYLIN-5573 URL: https://issues.apache.org/jira/browse/KYLIN-5573 Project: Kylin Issue Type: Bug Components: Others Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha Chinese tips: From:一次最多可加载 1000 张表,请修改后重新提交。 To:一次最多可加载 1000 张表,请修改后重试。 English tips: From:Up to 1000 tables could be loaded per time, please modify and resubmit。 To:Up to 1000 tables could be loaded per time, please modify and try again. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5572) Should provide a REST API to allow users to build specific indexes
Guangyuan Feng created KYLIN-5572: - Summary: Should provide a REST API to allow users to build specific indexes Key: KYLIN-5572 URL: https://issues.apache.org/jira/browse/KYLIN-5572 Project: Kylin Issue Type: New Feature Components: Modeling, REST Service Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Fix For: 5.0-alpha Our product (KC 4.5.22.2) currently doesn’t support building specific indexes through API. Users can only refresh the whole segments to rebuild some of the indexes in the segment which is very inefficient. However, the same capability is available in the UI. Therefore, UBS wants to have a REST API to provide: # The ability to build specific indexes instead of all indexes in segments # The indexes to be built can be specified by *indexes ID or(and) status.* Status can be any valid status of indexes, for example, NO BUILD, ONLINE, etc. # The response of this API should return the job ID for further tracking Why should we provide this API? (Business Impact) Without this API, users can only rely on a *Kyligence ADMIN* to operate index building for each team from UI or simply rebuild the whole segment. In UBS, all operations on production have to be automated. No manual operations are allowed. To fully utilize the capability of AI recommendations, providing such an API will help UBS make the whole workflow more smooth and increase Kyligence usage. Meanwhile, the maintenance effort and total infra cost can be significantly reduced also. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (KYLIN-5571) It takes too much time to calculate the data size during pushing down queries, which will lead to the queries un-stoppable.
Guangyuan Feng created KYLIN-5571: - Summary: It takes too much time to calculate the data size during pushing down queries, which will lead to the queries un-stoppable. Key: KYLIN-5571 URL: https://issues.apache.org/jira/browse/KYLIN-5571 Project: Kylin Issue Type: Improvement Components: Query Engine Affects Versions: 5.0-alpha Reporter: Guangyuan Feng Assignee: Guangyuan Feng Fix For: 5.0-alpha While pushing down a query, KE tries to calculate the size of the data involved in order to set the number of Spark partitions, but if there are too many files on HDFS, this takes a long time to complete. To improve this situation, the following will be done: # Use a limited thread pool to calculate the data size # Add a timeout for the calculation, so that the query can be stopped as soon as possible After these changes, we can expect the query to complete within a fixed duration. -- This message was sent by Atlassian Jira (v8.20.10#820010)
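The two measures listed in KYLIN-5571 (a limited thread pool plus a timeout) can be sketched with the JDK's `ExecutorService.invokeAll`, which cancels tasks that have not finished by the deadline. The class `DataSizeSketch`, its parameters, and the choice to return an empty result on timeout are assumptions for illustration, not Kylin's implementation.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not Kylin's actual push-down code.
final class DataSizeSketch {
    // Sums per-path sizes on a bounded pool; returns empty if the deadline passes
    // or any task fails, so the caller can stop the query instead of waiting.
    static OptionalLong totalSize(List<Callable<Long>> sizeTasks, int threads, long timeoutMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Tasks still running at the deadline are cancelled by invokeAll.
            List<Future<Long>> futures = pool.invokeAll(sizeTasks, timeoutMillis, TimeUnit.MILLISECONDS);
            long total = 0L;
            for (Future<Long> future : futures) {
                if (future.isCancelled()) {
                    return OptionalLong.empty(); // timed out
                }
                total += future.get();
            }
            return OptionalLong.of(total);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return OptionalLong.empty();
        } catch (ExecutionException e) {
            return OptionalLong.empty(); // a size task failed
        } finally {
            pool.shutdownNow(); // interrupt anything still running
        }
    }
}
```

Creating a pool per call keeps the sketch self-contained; a long-lived shared pool would avoid the thread start-up cost on every query.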
[jira] [Created] (KYLIN-5570) Incorrect result produced by the query with grouping sets
Guangyuan Feng created KYLIN-5570:

Summary: Incorrect result produced by the query with grouping sets
Key: KYLIN-5570
URL: https://issues.apache.org/jira/browse/KYLIN-5570
Project: Kylin
Issue Type: Bug
Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha

The following SQL pattern produces a wrong result when it hits a model. Assuming each group contains 2 rows, KE returns 10 rows for this query instead of the expected 18 (9 grouping sets × 2 rows each).
{code:sql}
select
  GROUPING(gr1) as gr1,
  GROUPING(gr2) as gr2,
  GROUPING(gr3) as gr3,
  GROUPING(gr4) as gr4,
  GROUPING(gr5) as gr5,
  GROUPING(gr6) as gr6,
  GROUPING(gr7) as gr7,
  GROUPING(gr8) as gr8,
  GROUPING(gr9) as gr9,
  count(distinct case when 1=1 then LO_ORDERKEY else null end) as goal_group
from (
  select
    case when LO_ORDERPRIOTITY = '1-URGENT' then 'ship immediately' else 'ship later' end gr1,
    case when LO_SHIPMODE = 'AIR' then 'air' else 'sea' end gr2,
    case when LO_LINENUMBER = 1 then '1' else '0' end gr3,
    case when LO_CUSTKEY = 1 then '1' else '0' end gr4,
    case when LO_PARTKEY = 1 then '1' else '0' end gr5,
    case when LO_SUPPKEY = 1 then '1' else '0' end gr6,
    case when LO_QUANTITY = 1 then '1' else '0' end gr7,
    case when LO_EXTENDEDPRICE = 90400 then '1' else '0' end gr8,
    case when LO_TAX = 0 then '1' else '0' end gr9,
    LO_ORDERKEY
  from SSB.LINEORDER
)
group by GROUPING SETS (
  (gr1), (gr2), (gr3), (gr4), (gr5), (gr6), (gr7), (gr8), (gr9)
)
order by 1,2,3,4,5,6,7,8,9
LIMIT 500
{code}
A simple resolution is to add the *groupByColumns* ids to the *digest* field of the *OLAPProjectRel* node.
[jira] [Created] (KYLIN-5569) Support manually setting ssh encrypted password when enable Job multi-live
Guangyuan Feng created KYLIN-5569:

Summary: Support manually setting ssh encrypted password when enable Job multi-live
Key: KYLIN-5569
URL: https://issues.apache.org/jira/browse/KYLIN-5569
Project: Kylin
Issue Type: Improvement
Components: Job Engine, Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Fix For: 5.0-alpha

In some scenarios, users want the password set in the properties file to be encrypted, to avoid potential security issues, so it would be good for Kylin to support this. After the change, the password can be configured as follows:
{code:java}
Encrypted configuration item:
kylin.job.ssh-password=ENC('${encrypted_password}')

Example:
Input: ${KYLIN_HOME}/bin/kylin.sh io.kyligence.kap.tool.general.CryptTool -e AES -s kylin
Output: AES encrypted password is: YeqVr9MakSFbgxEec9sBwg==

// Sample
kylin.server.leader-race.heart-beat-timeout=60
kylin.server.leader-race.heart-beat-interval=30
kylin.job.ssh-username=quard
kylin.job.ssh-password=ENC('k7lRPO1yqWRgtR09uG+F2w==')
#kylin.job.ssh-password=PlainTextPassWord
{code}
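To illustrate how a reader of the properties file might tell an encrypted value from a plain-text one, here is a minimal sketch that only recognizes the ENC('...') wrapper shown in the sample. The class and method names are hypothetical, and decryption is deliberately left out, since the issue does not describe how Kylin derives the AES key.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: detects the ENC('...') wrapper; it does not decrypt anything.
final class EncryptedProperty {

    // Matches ENC('<ciphertext>') as written in the sample configuration.
    private static final Pattern ENC = Pattern.compile("^ENC\\('(.*)'\\)$");

    /** Returns the ciphertext inside ENC('...'), or empty if the value is plain text. */
    static Optional<String> cipherText(String value) {
        Matcher matcher = ENC.matcher(value.trim());
        return matcher.matches() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(cipherText("ENC('k7lRPO1yqWRgtR09uG+F2w==')"));
        System.out.println(cipherText("PlainTextPassWord"));
    }
}
```

A value that does not match the wrapper is treated as plain text, which mirrors the commented-out alternative in the sample.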
[jira] [Created] (KYLIN-5568) Some JDBC datasources will fail to query or load parts of tables, like GaussDB, responding to KE
Guangyuan Feng created KYLIN-5568:

Summary: Some JDBC datasources will fail to query or load parts of tables, like GaussDB, responding to KE
Key: KYLIN-5568
URL: https://issues.apache.org/jira/browse/KYLIN-5568
Project: Kylin
Issue Type: Bug
Components: Driver - JDBC
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha

Currently KE keeps all table metadata in upper case, so it fails to communicate with JDBC datasources that are case-sensitive, such as GaussDB. To solve this issue, a new boolean property *kylin.source.jdbc.convert-to-lowercase* will be introduced, false by default, which converts KE metadata to lowercase.
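The effect of the switch can be pictured with the small sketch below. It is only an illustration of the described behaviour: the class and method names are hypothetical, and where exactly KE applies the conversion is not stated in the issue.

```java
import java.util.Locale;

// Hypothetical illustration of the convert-to-lowercase switch.
final class JdbcIdentifierCase {

    /** Upper case by default; lower case when the new property is enabled. */
    static String normalize(String identifier, boolean convertToLowercase) {
        return convertToLowercase
                ? identifier.toLowerCase(Locale.ROOT)
                : identifier.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Lineorder", false));
        System.out.println(normalize("Lineorder", true));
    }
}
```

Locale.ROOT is used so that the conversion does not depend on the JVM's default locale.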
[jira] [Created] (KYLIN-5567) Make ke-external module obey the open source specs
Guangyuan Feng created KYLIN-5567:

Summary: Make ke-external module obey the open source specs
Key: KYLIN-5567
URL: https://issues.apache.org/jira/browse/KYLIN-5567
Project: Kylin
Issue Type: Bug
Components: Others
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Yifan Zhang
Fix For: 5.0-alpha

To solve this issue, the following things need to be done:
# Replace *kap-external:4.5.9* with *kylin-external:5.0.0*
# Change the package path *org.apache.kylin.guava20.shaded.** to *org.apache.kylin.guava30.shaded.**
# Publish the *kylin-external-guava30* lib to the public nexus repo
# Add checkstyle rules to prohibit references to the official guava
[jira] [Created] (KYLIN-5566) To fix the error of verifying the model alias
Guangyuan Feng created KYLIN-5566:

Summary: To fix the error of verifying the model alias
Key: KYLIN-5566
URL: https://issues.apache.org/jira/browse/KYLIN-5566
Project: Kylin
Issue Type: Bug
Components: REST Service
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Fix For: 5.0-alpha

When verifying a requested model by its alias, the alias should be converted to lowercase first.
[jira] [Created] (KYLIN-5565) Upgrade Netty to 4.1.89 to fix the security vulnerabilities
Guangyuan Feng created KYLIN-5565:

Summary: Upgrade Netty to 4.1.89 to fix the security vulnerabilities
Key: KYLIN-5565
URL: https://issues.apache.org/jira/browse/KYLIN-5565
Project: Kylin
Issue Type: Bug
Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha
[jira] [Created] (KYLIN-5564) Introduce Bloom Filter to optimize data scanning based on Spark
Guangyuan Feng created KYLIN-5564:

Summary: Introduce Bloom Filter to optimize data scanning based on Spark
Key: KYLIN-5564
URL: https://issues.apache.org/jira/browse/KYLIN-5564
Project: Kylin
Issue Type: Improvement
Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha

Currently, all data generated by Kylin is saved as Parquet files through Spark, but Kylin does not make full use of Parquet's features when scanning data. Among them, the Bloom filter deserves emphasis, because it is a common tool for helping readers skip useless data. Therefore, we introduce an approach that builds Bloom filters automatically, conditionally and smartly when constructing segments, on the desired columns, chosen in particular according to the query history. With Bloom filters in place, Spark should see a good performance improvement in most cases.
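For readers unfamiliar with the data structure, the sketch below shows why a Bloom filter lets a reader skip data: a negative answer is always correct, so a block whose filter rejects the queried value need not be read. This is a generic, simplified filter with hypothetical names, not Parquet's or Spark's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Generic Bloom filter sketch; Parquet's on-disk filter format is different.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Records a value by setting one bit per probe position. */
    void add(String value) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(value, i));
        }
    }

    /** False means "definitely absent"; true means "possibly present". */
    boolean mightContain(String value) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th probe position from two base hashes.
    private int index(String value, int i) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int h1 = Arrays.hashCode(bytes);
        int h2 = fnv1a(bytes);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a hash.
    private static int fnv1a(byte[] bytes) {
        int hash = 0x811C9DC5;
        for (byte b : bytes) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("1-URGENT");
        System.out.println(filter.mightContain("1-URGENT"));
    }
}
```

A scan would consult the filter of each block for the predicate value and read only the blocks for which mightContain returns true.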
[jira] [Created] (KYLIN-5563) Enable operation role has the privilege to manage the model indexes
Guangyuan Feng created KYLIN-5563:

Summary: Enable operation role has the privilege to manage the model indexes
Key: KYLIN-5563
URL: https://issues.apache.org/jira/browse/KYLIN-5563
Project: Kylin
Issue Type: Improvement
Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha

In daily work, we cooperate with others to manage a shared model. The model is generally private, but we would like to let other roles, especially the _*OPERATION*_ role, manage the indexes on it. Kylin lacks an easy way to enable this, so we propose a new property `kylin.index.enable-operator-design`. When it is true, users with the OPERATION role can create, read, update and delete the indexes, but cannot modify the model. To introduce this switch, the default permission checks of the related APIs have to be changed.
[jira] [Created] (KYLIN-5562) Building job will be scheduled and executed repeatedly
Guangyuan Feng created KYLIN-5562:

Summary: Building job will be scheduled and executed repeatedly
Key: KYLIN-5562
URL: https://issues.apache.org/jira/browse/KYLIN-5562
Project: Kylin
Issue Type: Bug
Components: Job Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: sibing.zhang
Fix For: 5.0-alpha

This issue occurs in recent versions because of earlier changes to the logic for appending *RUNNING* jobs, so the related code needs to be reverted.
[jira] [Created] (KYLIN-5561) Optimize the build performance for models containing semi-additive measure
Guangyuan Feng created KYLIN-5561:

Summary: Optimize the build performance for models containing semi-additive measure
Key: KYLIN-5561
URL: https://issues.apache.org/jira/browse/KYLIN-5561
Project: Kylin
Issue Type: Bug
Components: Modeling
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Yaguang Jia
Fix For: 5.0-alpha

When building a model with the aggregate function `sum_lc`, the calculation takes too long to complete even on a small dataset. After digging into its implementation, we found the root cause: the `serialize` method always allocates a new array of `1024 * 1024` bytes as the temporary place to store the serialized value of a `SumLCCounter`. In fact, only a decimal and a long value of a `SumLCCounter` object need to be serialized, and the serialized data size is generally about `8 + 8` bytes on a 64-bit platform, so the temporary array is far larger than needed. After reducing the initial size of the temporary array, for example to 32 bytes, the total time to complete the calculation of `sum_lc` on a 10 GB dataset dropped from 16 min to 4 min.

Here are the benchmark results:
{code:java}
// After optimization
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: io.kyligence.pe.JmhSumLCApplication.dynamicLength

# Run progress: 0.00% complete, ETA 00:04:00
# Fork: 1 of 2
# Warmup Iteration 1: 39082.864 ops/ms
Iteration 1: 41760.550 ops/ms
Iteration 2: 47911.634 ops/ms
Iteration 3: 47353.936 ops/ms
Iteration 4: 46888.688 ops/ms
Iteration 5: 48378.075 ops/ms

# Run progress: 25.00% complete, ETA 00:03:02
# Fork: 2 of 2
# Warmup Iteration 1: 39479.279 ops/ms
Iteration 1: 42066.415 ops/ms
Iteration 2: 48499.974 ops/ms
Iteration 3: 48524.844 ops/ms
Iteration 4: 48431.830 ops/ms
Iteration 5: 48451.256 ops/ms

Result "io.kyligence.pe.JmhSumLCApplication.dynamicLength":
  46826.720 ±(99.9%) 4002.887 ops/ms [Average]
  (min, avg, max) = (41760.550, 46826.720, 48524.844), stdev = 2647.662
  CI (99.9%): [42823.833, 50829.607] (assumes normal distribution)

// Before optimization
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: io.kyligence.pe.JmhSumLCApplication.fixLength

# Run progress: 50.00% complete, ETA 00:02:01
# Fork: 1 of 2
# Warmup Iteration 1: 22.364 ops/ms
Iteration 1: 25.354 ops/ms
Iteration 2: 25.252 ops/ms
Iteration 3: 20.566 ops/ms
Iteration 4: 20.668 ops/ms
Iteration 5: 21.585 ops/ms

# Run progress: 75.00% complete, ETA 00:01:00
# Fork: 2 of 2
# Warmup Iteration 1: 22.953 ops/ms
Iteration 1: 25.362 ops/ms
Iteration 2: 24.041 ops/ms
Iteration 3: 21.774 ops/ms
Iteration 4: 25.131 ops/ms
Iteration 5: 25.594 ops/ms

Result "io.kyligence.pe.JmhSumLCApplication.fixLength":
  23.533 ±(99.9%) 3.210 ops/ms [Average]
  (min, avg, max) = (20.566, 23.533, 25.594), stdev = 2.123
  CI (99.9%): [20.323, 26.743] (assumes normal distribution)

# Run complete. Total time: 00:04:03

REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
experiments, perform baseline and negative tests that provide experimental control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                           Mode  Cnt      Score      Error   Units
JmhSumLCApplication.dynamicLength  thrpt   10  46826.720 ± 4002.887  ops/ms
JmhSumLCApplication.fixLength      thrpt   10     23.533 ±    3.210  ops/ms
{code}
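The buffer-sizing point can be illustrated with the sketch below. It is not the `SumLCCounter` code: the class and method names are hypothetical, and a double stands in for the decimal value to keep the example on the standard library. The serialized result is the same 16 bytes regardless of the scratch buffer's capacity, so a 1 MiB scratch allocation per call is presumably pure overhead.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-in for serializing a counter made of one numeric value and one long.
final class CounterSerializer {

    /** Serializes both fields using a scratch buffer of the given capacity. */
    static byte[] serialize(double sum, long longValue, int scratchCapacity) {
        // The scratch buffer is allocated (and zero-filled by the JVM) on every call.
        ByteBuffer scratch = ByteBuffer.allocate(scratchCapacity);
        scratch.putDouble(sum);     // 8 bytes
        scratch.putLong(longValue); // 8 bytes
        byte[] out = new byte[scratch.position()];
        scratch.flip();
        scratch.get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] small = serialize(1.5, 7L, 32);
        byte[] large = serialize(1.5, 7L, 1024 * 1024);
        System.out.println(small.length + " bytes, identical output: " + Arrays.equals(small, large));
    }
}
```

A growable buffer that starts small would keep the fast path cheap while still handling an unusually large decimal; whether the actual fix grows the buffer dynamically is only suggested by the benchmark name `dynamicLength`.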
[jira] [Created] (KYLIN-5560) To improving Kylin logging abilities
Guangyuan Feng created KYLIN-5560:

Summary: To improving Kylin logging abilities
Key: KYLIN-5560
URL: https://issues.apache.org/jira/browse/KYLIN-5560
Project: Kylin
Issue Type: Improvement
Components: Integration
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Fix For: 5.0-alpha

Currently most work, such as querying, relies on the REST API, but Kylin lacks a way to trace the whole life cycle of a request, including access, handling, etc. Users have to spend considerable time working out how a request relates to the corresponding procedures. In practice, we have added a new attribute `traceId` to each HTTP request (*HttpServletRequest*) and to *org.apache.kylin.rest.interceptor.KEFilter*, using the log4j MDC (Mapped Diagnostic Context) mechanism to improve the situation. We hope this work will be useful for improving Kylin.
[jira] [Created] (KYLIN-5559) The vulnerability in Apache Avro version <= 1.10.2 will result in security issues
Guangyuan Feng created KYLIN-5559:

Summary: The vulnerability in Apache Avro version <= 1.10.2 will result in security issues
Key: KYLIN-5559
URL: https://issues.apache.org/jira/browse/KYLIN-5559
Project: Kylin
Issue Type: Bug
Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha

We need to upgrade Apache Avro to 1.11.1 to avoid the potential issue. For more details, see: https://security.snyk.io/vuln/SNYK-DOTNET-APACHEAVRO-2331660
[jira] [Created] (KYLIN-5558) The file for recording the child process to execute the asynchronous query in QUERY NODE dose not exists
Guangyuan Feng created KYLIN-5558:

Summary: The file for recording the child process to execute the asynchronous query in QUERY NODE dose not exists
Key: KYLIN-5558
URL: https://issues.apache.org/jira/browse/KYLIN-5558
Project: Kylin
Issue Type: Bug
Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Fix For: 5.0-alpha

As the title says, without the recording file the QUERY NODE cannot kill the child process, so the server loses control of it.