[jira] [Created] (KYLIN-5655) The index details in the optimization suggestions are recommended to increase the dimension base

2023-07-19 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5655:


 Summary: The index details in the optimization suggestions are 
recommended to increase the dimension base
 Key: KYLIN-5655
 URL: https://issues.apache.org/jira/browse/KYLIN-5655
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


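The index details page for optimization suggestions should also display the 
cardinality of each dimension, as the page for regular indexes does.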



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5654) Diagnostic package API (/kylin/api/system/diag? host= ip:port) The current design has security risks

2023-07-19 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5654:


 Summary: Diagnostic package API (/kylin/api/system/diag? host= 
ip:port) The current design has security risks
 Key: KYLIN-5654
 URL: https://issues.apache.org/jira/browse/KYLIN-5654
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


As stated in the summary.





[jira] [Created] (KYLIN-5653) Enable Spark Parquet Page Index by default to improve query performance

2023-07-18 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5653:
--

 Summary: Enable Spark Parquet Page Index by default to improve 
query performance
 Key: KYLIN-5653
 URL: https://issues.apache.org/jira/browse/KYLIN-5653
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


Spark now officially supports the Parquet page index, so Kylin can enable the 
page index feature by default.





[jira] [Created] (KYLIN-5652) Network anomalies or metabase anomalies cause Project Epoch to change frequently, which may cause the job in pending status for a long time and no longer to be executed

2023-07-18 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5652:
--

 Summary: Network anomalies or metabase anomalies cause Project 
Epoch to change frequently, which may cause the job in pending status for a 
long time and no longer to be executed
 Key: KYLIN-5652
 URL: https://issues.apache.org/jira/browse/KYLIN-5652
 Project: Kylin
  Issue Type: Bug
  Components: Metadata
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: Network anomalies or metabase anomalies cause Project 
Epoch to change frequently, which may cause the job in pending status for a 
long time and no longer to be executed.pdf

While a project's scheduler is still shutting down, a start operation is 
triggered. The start path checks the scheduler state in an if statement, finds 
that the scheduler is still in the started state, and returns without doing 
anything.

The shutdown then completes and turns job scheduling off, so the project's 
jobs remain in the pending state from then on.
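
One way to close this window is to make start and shutdown mutually exclusive, 
so that a start issued during a shutdown waits and then sees the stopped state. 
The sketch below is a hypothetical illustration only: ProjectSchedulerSketch and 
its methods are assumed names, not Kylin classes, and the issue does not say 
which fix was chosen.

```java
/** Hypothetical stand-in for a per-project job scheduler; not a Kylin class. */
final class ProjectSchedulerSketch {
    private boolean started;

    /** Starts scheduling unless it is already started. Returns true if this call started it. */
    synchronized boolean start() {
        if (started) {
            return false; // genuinely running: nothing to do
        }
        started = true;
        return true;
    }

    /** Stops scheduling. Holding the lock for the whole shutdown keeps start() from reading a stale state. */
    synchronized void shutdown() {
        // ... stop worker threads and release resources here ...
        started = false;
    }

    synchronized boolean isStarted() {
        return started;
    }
}
```

Because both methods share one lock, a start call cannot observe the started 
flag while a shutdown is halfway through.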





[jira] [Created] (KYLIN-5651) supports obtaining table comment from Hive

2023-07-18 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5651:
--

 Summary: supports obtaining table comment from Hive
 Key: KYLIN-5651
 URL: https://issues.apache.org/jira/browse/KYLIN-5651
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: supports obtaining table comment from Hive.pdf

The API that lists tables does not return the table comment:

GET /kylin/api/tables





[jira] [Created] (KYLIN-5650) In the cloud environment, there is a probability that the dictionary metadata file will be read abnormally during building job, resulting in incorrect query results.

2023-07-18 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5650:
--

 Summary: In the cloud environment, there is a probability that the 
dictionary metadata file will be read abnormally during building job, resulting 
in incorrect query results.
 Key: KYLIN-5650
 URL: https://issues.apache.org/jira/browse/KYLIN-5650
 Project: Kylin
  Issue Type: Bug
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: In the cloud environment, there is a probability that the 
dictionary metadata file will be read abnormally during building job, resulting 
in incorrect query results..pdf

The dictionary was checked and contains no duplicate values. The execution 
plan of the build-dictionary step was checked and shows no problem. Checking 
the steps that build the flat table revealed a problem in the step that encodes 
the flat table with the dictionary.

The error occurs because encoding is not performed after a repartition on the 
dictionary column. As shown in the figure, the plan contains the encoded column 
but no repartition.

There are also the following logs:
{code:java}
2023-03-26T20:26:30,868 INFO  [logger-thread-0] dict.NGlobalDictHDFSStore : Commit from s3a://datalake-kc-s3-prd-bj/kylin/kcprodYcHG_kylin/datalake_kylin/dict/global_dict/GDT.GDT_CMPLYA_FCT_DIST_RESLT/IS_STAT/working to s3a://datalake-kc-s3-prd-bj/kylin/kcprodYcHG_kylin/datalake_kylin/dict/global_dict/GDT.GDT_CMPLYA_FCT_DIST_RESLT/IS_STAT/version_1679862387539

2023-03-26T20:31:14,501 INFO  [logger-thread-0] dict.NGlobalDictionaryV2 : getMetaInfo versions.length is 12
2023-03-26T20:31:14,547 INFO  [logger-thread-0] dict.NGlobalDictHDFSStore : because metaFiles.length is 0, metaInfo is null
2023-03-26T20:31:14,547 INFO  [logger-thread-0] dict.NGlobalDictionaryV2 : getMetaInfo metadata is null : [true]{code}
This happens on S3: after the dictionary directory is renamed, no metadata file 
is found. However, when the code fails to obtain the metadata and reports no 
error, it is not reasonable to go on and encode directly without a repartition. 
In short, the result is that encoding the dictionary column on the flat table 
fails.





[jira] [Created] (KYLIN-5649) When query contains computed columns, fail to guarantee the priority of using the aggregate index to answer the aggregate query

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5649:
--

 Summary: When query contains computed columns, fail to guarantee 
the priority of using the aggregate index to answer the aggregate query
 Key: KYLIN-5649
 URL: https://issues.apache.org/jira/browse/KYLIN-5649
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: When query contains computed columns, fail to guarantee 
the priority of using the aggregate index to answer the aggregate query.pdf

With "kylin.query.use-tableindex-answer-non-raw-query = true" and 
"kylin.query.layout.prefer-aggindex = true" set, an aggregate query can match 
both the aggregate index and the base table index, but the index finally hit is 
the base table index. It is unclear why the aggregate query is answered by the 
base table index.





[jira] [Created] (KYLIN-5648) Add log for sparder init user

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5648:
--

 Summary: Add log for sparder init user
 Key: KYLIN-5648
 URL: https://issues.apache.org/jira/browse/KYLIN-5648
 Project: Kylin
  Issue Type: Improvement
  Components: Others
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


The user name used during Spark initialization needs to be printed explicitly 
in the log.

*dev design*

Add a log line when Sparder starts:
{code:java}
logInfo(s"sparder init user:${UserGroupInformation.getCurrentUser.getUserName}"){code}





[jira] [Created] (KYLIN-5647) upload hive_1_2_2 jars to HDFS before kylin start

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5647:
--

 Summary: upload hive_1_2_2 jars to HDFS before kylin start
 Key: KYLIN-5647
 URL: https://issues.apache.org/jira/browse/KYLIN-5647
 Project: Kylin
  Issue Type: Improvement
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: Auto upload hive jars.pdf







[jira] [Created] (KYLIN-5646) The build job reports an error at the step of detecting time partition columns in the Yarn Cluster mode

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5646:
--

 Summary: The build job reports an error at the step of detecting 
time partition columns in the Yarn Cluster mode
 Key: KYLIN-5646
 URL: https://issues.apache.org/jira/browse/KYLIN-5646
 Project: Kylin
  Issue Type: Bug
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


When a build job runs in Spark YARN cluster mode, the step that detects the 
incremental time partition column initializes KylinConfig and fails with the 
error "Didn't find KYLIN_CONF or KYLIN_HOME".

*Reproduce method*

Build a partitioned model incrementally in Spark YARN cluster mode with 
kylin.engine.check-partition-col-enabled=true (the default value is true).

*Root Cause*

[KYLIN-5571] modified autoSetShufflePartitions for pushdown queries. Before 
that change, autoSetShufflePartitions did not need to run when a build job 
detected the format of the incremental time column (only the pushdown query 
itself was executed).

After the change, autoSetShufflePartitions runs asynchronously, and the 
following two methods obtain KylinConfig through 
KylinConfig.getInstanceFromEnv:
 * ResourceDetectUtils.getResourceSizeWithTimeoutByConcurrency
 * ResourceDetectUtils.getResourceSizBySerial

The new thread used for the asynchronous execution cannot use the KylinConfig 
that was already built, so it initializes a new one.

However, the build job's JVM and the KE main process do not run on the same 
machine, so KYLIN_CONF and KYLIN_HOME cannot be found and the build job fails.

*fix design*

In all logic that runs in newly created threads, do not call 
KylinConfig.getInstanceFromEnv() to obtain KylinConfig. Instead, obtain it once 
in the outer thread and pass it to the places that need it.
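
The fix design can be illustrated with a minimal sketch. It is hypothetical: a 
plain Map stands in for KylinConfig, and ConfigPassingSketch and readInWorker 
are assumed names that do not exist in Kylin. The point is only that the caller 
resolves the config and the worker thread receives it as an argument.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Hypothetical illustration; Map<String, String> stands in for KylinConfig. */
final class ConfigPassingSketch {
    /**
     * Runs the lookup on another thread. The config is resolved by the caller
     * and captured by the lambda, so the worker never consults the environment.
     */
    static String readInWorker(Map<String, String> config, String key) {
        return CompletableFuture
                .supplyAsync(() -> config.get(key)) // uses the passed-in config only
                .join();
    }
}
```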





[jira] [Created] (KYLIN-5645) add response params for model list api

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5645:
--

 Summary: add response params for model list api
 Key: KYLIN-5645
 URL: https://issues.apache.org/jira/browse/KYLIN-5645
 Project: Kylin
  Issue Type: Bug
Reporter: Zhiting Guo


For the API GET /kylin/api/models, if lite=false is set, the response does not 
contain partition_column_in_dims and empty_model, which the frontend expects.

*fix design*

Regardless of the value of lite, add partition_column_in_dims and empty_model 
to the response.





[jira] [Created] (KYLIN-5644) fix diag api security, encryption changed from base64 to AES

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5644:
--

 Summary: fix diag api security, encryption changed from base64 to 
AES
 Key: KYLIN-5644
 URL: https://issues.apache.org/jira/browse/KYLIN-5644
 Project: Kylin
  Issue Type: Bug
  Components: REST Service, Security
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


*dev design*

Continue along the existing logic and replace the Base64 encoding with AES 
encryption, reusing the existing encryption and decryption implementations:

  Encryption: org.apache.kylin.common.util.EncryptUtil#encrypt(String 
strToEncrypt)
  Decryption: org.apache.kylin.common.util.EncryptUtil#decrypt(String 
strToDecrypt)

AES-encrypted output can contain special characters such as +. When such a 
value is passed as an API parameter, the + is interpreted as a space, which 
causes errors later on.
The algorithm is therefore adjusted. To encrypt, first encrypt with 
EncryptUtil#encrypt and then encode the result with Base64. Decryption is the 
reverse: first decode with Base64 and then decrypt with EncryptUtil#decrypt.
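
A minimal sketch of the encrypt-then-encode scheme follows. It does not 
reproduce EncryptUtil, whose internals are not shown in this issue: 
DiagHostCodecSketch is a hypothetical class, and the choice of AES-GCM with a 
random IV and of the URL-safe Base64 alphabet are assumptions made for 
illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical stand-in for "EncryptUtil#encrypt, then Base64"; not Kylin code. */
final class DiagHostCodecSketch {
    private static final int IV_LEN = 12;
    private static final SecureRandom RANDOM = new SecureRandom();

    /** AES-encrypts the text, then Base64-encodes it so it is safe as a query parameter. */
    static String encode(String plain, byte[] key) {
        try {
            byte[] iv = new byte[IV_LEN];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, iv));
            byte[] sealed = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            // Prepend the IV so the decoder can recover it.
            byte[] out = new byte[IV_LEN + sealed.length];
            System.arraycopy(iv, 0, out, 0, IV_LEN);
            System.arraycopy(sealed, 0, out, IV_LEN, sealed.length);
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encode(): Base64-decodes first, then AES-decrypts. */
    static String decode(String token, byte[] key) {
        try {
            byte[] in = Base64.getUrlDecoder().decode(token);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, in, 0, IV_LEN));
            byte[] plain = cipher.doFinal(in, IV_LEN, in.length - IV_LEN);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The URL-safe alphabet replaces + and / with - and _, so the token cannot be 
altered by query-string decoding.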





[jira] [Created] (KYLIN-5643) Add public api for batch delete index

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5643:
--

 Summary: Add public api for batch delete index
 Key: KYLIN-5643
 URL: https://issues.apache.org/jira/browse/KYLIN-5643
 Project: Kylin
  Issue Type: Improvement
  Components: Metadata
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


Move the "kylin/api/index_plans/index" DELETE API from NIndexPlanController to 
OpenIndexPlanController as a public api.

demo:
curl --location --request DELETE \
  'http://127.0.0.1:9099/kylin/api/index_plans/index?project=project1_name=abc_ids=201' \
  --header 'Accept: application/vnd.apache.kylin-v4-public+json' \
  --header 'Authorization: Basic YWRtaW46S1lMSU4='
 





[jira] [Created] (KYLIN-5642) Align the default value of the parameter kylin.metadata.audit-log.max-size with the product manual

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5642:
--

 Summary: Align the default value of the parameter 
kylin.metadata.audit-log.max-size with the product manual
 Key: KYLIN-5642
 URL: https://issues.apache.org/jira/browse/KYLIN-5642
 Project: Kylin
  Issue Type: Bug
  Components: Documentation, Metadata
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


Set the default value to 50.





[jira] [Created] (KYLIN-5641) fix set spark conf in serverless mod

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5641:
--

 Summary: fix set spark conf in serverless mod
 Key: KYLIN-5641
 URL: https://issues.apache.org/jira/browse/KYLIN-5641
 Project: Kylin
  Issue Type: Bug
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


In serverless mode, building a model throws a NoSuchMethodException.

To fix it, stop setting spark.sql.sources.repartitionWritingDataSource.





[jira] [Created] (KYLIN-5640) Support to automatically adjust the Bloom Filter based on data distribution

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5640:
--

 Summary: Support to automatically adjust the Bloom Filter based on 
data distribution
 Key: KYLIN-5640
 URL: https://issues.apache.org/jira/browse/KYLIN-5640
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


h3. Why are the changes needed?

Currently, a Bloom filter is built by first specifying the NDV (number of 
distinct values). In general, the number of distinct values is not known in 
advance.
If the Bloom filter can be sized automatically from the data, the file size can 
be reduced and read efficiency can also be improved.
h3. What changes were proposed in this pull request?

{{DynamicBlockBloomFilter}} contains multiple {{BlockSplitBloomFilter}} as 
candidates and inserts values in the candidates at the same time. Use the 
largest bloom filter as an approximate deduplication counter, and then remove 
incapable bloom filter candidates during data insertion.





[jira] [Created] (KYLIN-5639) Refine kylin-it dependency

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5639:
--

 Summary: Refine kylin-it dependency
 Key: KYLIN-5639
 URL: https://issues.apache.org/jira/browse/KYLIN-5639
 Project: Kylin
  Issue Type: Improvement
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha








[jira] [Created] (KYLIN-5638) kylin create spark history dir auto in SparkApplication

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5638:
--

 Summary: kylin create spark history dir auto in SparkApplication
 Key: KYLIN-5638
 URL: https://issues.apache.org/jira/browse/KYLIN-5638
 Project: Kylin
  Issue Type: Improvement
Reporter: Zhiting Guo


*dev design*

Before creating a Spark session for a build job, read the configured event log 
directory and look it up. If the directory does not exist, create it. This 
guards against failures when different projects are configured with different 
Spark history directories.
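
The check-and-create step can be sketched as follows. This is an illustration 
under assumptions: EventLogDirSketch is a hypothetical class, and java.nio on 
the local file system stands in for whatever file system API the cluster uses 
(the event log directory is normally taken from Spark's spark.eventLog.dir 
setting).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper; local java.nio stands in for the cluster file system. */
final class EventLogDirSketch {
    /** Creates the directory and any missing parents; does nothing if it already exists. */
    static Path ensureExists(Path eventLogDir) {
        try {
            return Files.createDirectories(eventLogDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling ensureExists before the session is created is safe to repeat, because 
creating an existing directory is a no-op.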





[jira] [Created] (KYLIN-5637) minor fix get delta table ddl

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5637:
--

 Summary: minor fix get delta table ddl
 Key: KYLIN-5637
 URL: https://issues.apache.org/jira/browse/KYLIN-5637
 Project: Kylin
  Issue Type: Improvement
  Components: RDBMS Source
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


The Delta data source does not support operations such as msck partition and 
show create table, so these cases need special handling.





[jira] [Created] (KYLIN-5636) automatically clean up dependent files after the build task

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5636:
--

 Summary: automatically clean up dependent files after the build 
task
 Key: KYLIN-5636
 URL: https://issues.apache.org/jira/browse/KYLIN-5636
 Project: Kylin
  Issue Type: Improvement
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


*problem:*
The files uploaded under the path spark.kubernetes.file.upload.path are not 
deleted automatically.
1: When Spark creates a driver pod, it uploads dependencies to the specified 
path. Build jobs run in cluster mode and need to create a driver pod, so 
running build jobs many times leaves a large number of files under the path.
2: The upload.path we currently configure (s3a://kylin/spark-on-k8s) is a fixed 
path. Spark creates a subdirectory named spark-upload-uuid under it and stores 
the dependencies there.
*dev design*
The core idea is to add a dynamic subdirectory under the original upload.path 
and delete the entire subdirectory when the job is over.
Build job: upload.path + jobId (e.g. s3a://kylin/spark-on-k8s/uuid)
Delete the dependency directory when the build job finishes.
 
The dependencies are removed only if the delete function is called. If the 
process is killed with kill -9, the function is never called, so a fallback 
garbage-cleaning policy is needed, for example automatically deleting 
directories that are more than three months old.
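
The design above can be sketched as three small helpers. This is a hypothetical 
illustration: UploadDirCleanupSketch and its method names are assumptions, 
java.nio on the local file system stands in for the s3a path, and the retention 
period is a caller-supplied value rather than a setting taken from Kylin.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical helpers; local java.nio stands in for the s3a upload path. */
final class UploadDirCleanupSketch {
    /** Per-job upload directory: upload.path + jobId. */
    static Path jobDir(Path uploadPath, String jobId) {
        return uploadPath.resolve(jobId);
    }

    /** Deletes the job directory and everything under it; call when the build job finishes. */
    static void deleteJobDir(Path jobDir) {
        if (!Files.exists(jobDir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(jobDir)) {
            // Reverse order visits children before their parent directory.
            paths.sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fallback policy: a directory is garbage once it is older than the retention period. */
    static boolean isExpired(Instant lastModified, Instant now, Duration retention) {
        return lastModified.plus(retention).isBefore(now);
    }
}
```

A periodic cleaner could list the subdirectories of upload.path and delete 
those for which isExpired returns true, for instance with a retention of 
Duration.ofDays(90) as an approximation of three months.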





[jira] [Created] (KYLIN-5635) Adapt for delta table

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5635:
--

 Summary: Adapt for delta table
 Key: KYLIN-5635
 URL: https://issues.apache.org/jira/browse/KYLIN-5635
 Project: Kylin
  Issue Type: Improvement
  Components: RDBMS Source
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


When obtaining Delta column information, the columns cannot be read directly 
from the catalog, because the column information is not saved there (due to 
the use of DeltaCatalog). We first determine whether the table is a Delta 
table, using a function provided by the Delta SDK. If it is, we read the table 
through spark.table to get the schema. In this case Spark scans the metadata 
under the table's path to get the schema information.
 
Also due to the use of DeltaCatalog, a Delta table does not support the show 
create table statement: DeltaCatalog performs some checks and rejects this SQL. 
We therefore check in advance whether the table is a Delta table and, if so, 
assemble a DDL statement directly from the location and the table name and 
return it.
 
Limit: The partition column is not processed here, so the partition column is 
not recognized. Delta does not manage its partitions through the catalog; they 
are obtained in real time by scanning the metadata under the table's path, so 
reading data is not affected. The only feature affected is building snapshots 
by partition.
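
Assembling a DDL string from the table name and location might look like the 
sketch below. The exact statement Kylin returns is not given in this issue, so 
the CREATE TABLE ... USING DELTA LOCATION form, the class DeltaDdlSketch, and 
the method buildDdl are assumptions made for illustration.

```java
/** Hypothetical helper that assembles a DDL string for a Delta table; not Kylin code. */
final class DeltaDdlSketch {
    static String buildDdl(String tableName, String location) {
        // Escape single quotes so the location stays a valid SQL string literal.
        String escaped = location.replace("'", "\\'");
        return "CREATE TABLE " + tableName + " USING DELTA LOCATION '" + escaped + "'";
    }
}
```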





[jira] [Created] (KYLIN-5634) Support query executor expansion and contraction

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5634:
--

 Summary: Support query executor expansion and contraction
 Key: KYLIN-5634
 URL: https://issues.apache.org/jira/browse/KYLIN-5634
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: Support query executor expansion and contraction.pdf







[jira] [Created] (KYLIN-5633) The query can answered with the existing data of the current model (accepting index or segment data is not uniform), and there should be no query failure or push-down

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5633:
--

 Summary: The query can answered with the existing data of the 
current model (accepting index or segment data is not uniform), and there 
should be no query failure or push-down
 Key: KYLIN-5633
 URL: https://issues.apache.org/jira/browse/KYLIN-5633
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha
 Attachments: Segment heterogeneous query behavior (1).pdf

[^Segment heterogeneous query behavior (1).pdf]





[jira] [Created] (KYLIN-5632) Optimize and clean up some useless code in the query

2023-07-17 Thread Zhiting Guo (Jira)
Zhiting Guo created KYLIN-5632:
--

 Summary: Optimize and clean up some useless code in the query
 Key: KYLIN-5632
 URL: https://issues.apache.org/jira/browse/KYLIN-5632
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Zhiting Guo
 Fix For: 5.0-alpha


This issue includes:

 1. Refactor the selection logic of realizations

 2. Rename, move to another package, or drop some useless classes

 3. Fix some unstable unit tests, add some ignored unit tests, and move some 
unit tests to the kylin-it module

 4. Move index matchers to ChooserContext

 5. Move the candidate sorting method to QueryRouter




[jira] [Created] (KYLIN-5631) Logical view some issues

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5631:


 Summary: Logical view some issues
 Key: KYLIN-5631
 URL: https://issues.apache.org/jira/browse/KYLIN-5631
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia








[jira] [Created] (KYLIN-5630) Query history some css issues

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5630:


 Summary: Query history some css issues
 Key: KYLIN-5630
 URL: https://issues.apache.org/jira/browse/KYLIN-5630
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia








[jira] [Created] (KYLIN-5629) After the multi-level partition model is modified to a full build model, the build job fails

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5629:


 Summary: After the multi-level partition model is modified to a 
full build model, the build job fails
 Key: KYLIN-5629
 URL: https://issues.apache.org/jira/browse/KYLIN-5629
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia








[jira] [Created] (KYLIN-5628) The configuration items of kylin.source.ddl.logical-view.database are inconsistent

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5628:


 Summary: The configuration items of 
kylin.source.ddl.logical-view.database are inconsistent
 Key: KYLIN-5628
 URL: https://issues.apache.org/jira/browse/KYLIN-5628
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


Parameter: kylin.source.ddl.logical-view-database=DB_logical_view





[jira] [Created] (KYLIN-5627) After the logical view is created successfully, the front end does not display an entry prompting for loading

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5627:


 Summary: After the logical view is created successfully, the front 
end does not display an entry prompting for loading
 Key: KYLIN-5627
 URL: https://issues.apache.org/jira/browse/KYLIN-5627
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


As stated in the summary.





[jira] [Created] (KYLIN-5626) Copy the successfully queried SQL from the query history, because KE rearranged the SQL format with more spaces, the query fails when you query again

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5626:


 Summary: Copy the successfully queried SQL from the query history, 
because KE rearranged the SQL format with more spaces, the query fails when you 
query again
 Key: KYLIN-5626
 URL: https://issues.apache.org/jira/browse/KYLIN-5626
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


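When the SQL of a successful flexible-report query issued by FanRuan is copied 
from the query history page, an extra space between N and ' ' causes the SQL 
query to fail.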





[jira] [Created] (KYLIN-5625) Edit the custom table index, delete the ShardBy column, and build the index failed

2023-07-13 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5625:


 Summary: Edit the custom table index, delete the ShardBy column, 
and build the index failed
 Key: KYLIN-5625
 URL: https://issues.apache.org/jira/browse/KYLIN-5625
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


When the backend handles a request to manually add a table index, it does not 
check that every {{shard_by_columns}} column is also present in the 
{{col_order}} field. As a result, a new index can be added whose shard-by 
column is missing from col_order. This is not expected, and it also causes an 
error at build time. In a local test the following request succeeded, but it 
stored incorrect metadata:

curl -X POST \
  http://10.1.2.168:7068/kylin/api/index_plans/table_index \
  -H 'accept: application/vnd.apache.kylin-v4+json' \
  -H 'accept-language: en' \
  -H 'authorization: Basic QURNSU46S1lMSU4=' \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -H 'postman-token: fb63f31f-d8d2-b728-3f3b-553e38b02cb2' \
  -d '{"id":"","col_order":["TF_FACTS_LEADS_DAY.NEW_INTENTIONS"],"sort_by_columns":[],"shard_by_columns":["TF_FACTS_LEADS_DAY.NEW_LEADS"],"load_data":false,"index_range":"EMPTY","project":"SFM","model_id":"290c552c-c38f-ea8f-8c18-e7850d7419cc"}'


The frontend has a bug as well: if you check a column, mark it as a shard-by 
column, then uncheck it and send the request, the column is left out of 
col_order but is still included in shard_by_columns, which has the same effect 
as the API request above.
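
The missing backend check could be as small as the sketch below. It is a 
hypothetical illustration: TableIndexRequestCheckSketch and its method are 
assumed names, and where such a check would sit in Kylin's request handling is 
not stated in this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical validation helper; not Kylin code. */
final class TableIndexRequestCheckSketch {
    /** Returns the shard-by columns that are missing from col_order; an empty list means the request is valid. */
    static List<String> shardByMissingFromColOrder(List<String> colOrder, List<String> shardByColumns) {
        List<String> missing = new ArrayList<>();
        for (String column : shardByColumns) {
            if (!colOrder.contains(column)) {
                missing.add(column);
            }
        }
        return missing;
    }
}
```

For the request shown in this issue, the helper reports 
TF_FACTS_LEADS_DAY.NEW_LEADS as missing, so the request could be rejected 
before any metadata is stored.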





[jira] [Created] (KYLIN-5624) Text recognition has been added to the hierarchy dimension, the union dimension, and the detail index

2023-07-12 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5624:


 Summary: Text recognition has been added to the hierarchy 
dimension, the union dimension, and the detail index
 Key: KYLIN-5624
 URL: https://issues.apache.org/jira/browse/KYLIN-5624
 Project: Kylin
  Issue Type: Improvement
Reporter: Laura Xia








[jira] [Created] (KYLIN-5623) Enter the snapshot page immediately after deleting the source table, and an error is reported

2023-07-12 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5623:


 Summary: Enter the snapshot page immediately after deleting the 
source table, and an error is reported
 Key: KYLIN-5623
 URL: https://issues.apache.org/jira/browse/KYLIN-5623
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


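Add a snapshot on the snapshot page, go to the data source page and delete that 
table, then immediately open the snapshot list page: an error is reported.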





[jira] [Created] (KYLIN-5622) Support to create logical views to improve the data reprocessing ability of data developers

2023-07-12 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5622:


 Summary: Support to create logical views to improve the data 
reprocessing ability of data developers
 Key: KYLIN-5622
 URL: https://issues.apache.org/jira/browse/KYLIN-5622
 Project: Kylin
  Issue Type: Task
Reporter: Laura Xia








[jira] [Created] (KYLIN-5621) The order of ascending and descending order by start time or end time is wrong on the index completion page of incrementally built models

2023-07-12 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5621:


 Summary: The order of ascending and descending order by start time 
or end time is wrong on the index completion page of incrementally built models
 Key: KYLIN-5621
 URL: https://issues.apache.org/jira/browse/KYLIN-5621
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


As stated in the summary.





[jira] [Created] (KYLIN-5620) Unified the Chinese and English copywriting on the model setting page

2023-07-11 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5620:


 Summary: Unified the Chinese and English copywriting on the model 
setting page
 Key: KYLIN-5620
 URL: https://issues.apache.org/jira/browse/KYLIN-5620
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia
 Attachments: image-2023-07-12-11-39-43-725.png

!image-2023-07-12-11-39-43-725.png!





[jira] [Created] (KYLIN-5619) Unify the English copywriting of the "Save and Build" button

2023-07-11 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5619:


 Summary: Unify the English copywriting of the "Save and Build" 
button
 Key: KYLIN-5619
 URL: https://issues.apache.org/jira/browse/KYLIN-5619
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia
 Attachments: image-2023-07-12-11-38-15-181.png

!image-2023-07-12-11-38-15-181.png!





[jira] [Created] (KYLIN-5618) Unify the copywriting of the ShardBy column on the Web GUI

2023-07-11 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5618:


 Summary: Unify the copywriting of the ShardBy column on the Web GUI
 Key: KYLIN-5618
 URL: https://issues.apache.org/jira/browse/KYLIN-5618
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia
 Attachments: image-2023-07-12-11-35-54-602.png, 
image-2023-07-12-11-36-02-682.png

!image-2023-07-12-11-35-54-602.png!

!image-2023-07-12-11-36-02-682.png!





[jira] [Created] (KYLIN-5617) Semi-cumulative measurement Time dimension field types are not filtered

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5617:


 Summary: Semi-cumulative measurement Time dimension field types 
are not filtered
 Key: KYLIN-5617
 URL: https://issues.apache.org/jira/browse/KYLIN-5617
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


For the semi-additive measure SUM_LC, the time dimension is not filtered to 
exclude the boolean, float, and double types.
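
The intended filter can be expressed as a simple predicate. The sketch is 
hypothetical: the class and method names are assumptions, and the real check 
would live wherever the time dimension candidates are listed, which this issue 
does not specify.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical predicate for SUM_LC time column candidates; not Kylin code. */
final class SumLcTimeColumnFilterSketch {
    // Types named in this issue as ones that should be filtered out.
    private static final Set<String> EXCLUDED = new HashSet<>(Arrays.asList("boolean", "float", "double"));

    /** Returns true if a column of this data type may be offered as the time dimension. */
    static boolean isSelectableTimeColumnType(String dataType) {
        return !EXCLUDED.contains(dataType.toLowerCase(Locale.ROOT));
    }
}
```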





[jira] [Created] (KYLIN-5616) model failed to display description information

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5616:


 Summary: model failed to display description information
 Key: KYLIN-5616
 URL: https://issues.apache.org/jira/browse/KYLIN-5616
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


{{After a model is created and a description is entered, the description is not displayed correctly}}





[jira] [Created] (KYLIN-5615) The front end limits the semi-cumulative metric field from being reused

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5615:


 Summary: The front end limits the semi-cumulative metric field 
from being reused
 Key: KYLIN-5615
 URL: https://issues.apache.org/jira/browse/KYLIN-5615
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


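When adding a semi-additive measure, if the measure column and the time column
are already used by another semi-additive measure, the front end blocks the
operation. By design there should be no such restriction here; the columns
should be reusable.

In addition, the current KE measure logic disallows duplicate columns when a
measure is added individually, but applies no such restriction when measures
are added in batch.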



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5614) When creating a model through sql, click to edit the model, and the dimension table display will merge from the expanded state

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5614:


 Summary: When creating a model through sql, click to edit the 
model, and the dimension table display will merge from the expanded state
 Key: KYLIN-5614
 URL: https://issues.apache.org/jira/browse/KYLIN-5614
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


As described in the title.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5613) The time precision of the query is inconsistent

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5613:


 Summary: The time precision of the query is inconsistent
 Key: KYLIN-5613
 URL: https://issues.apache.org/jira/browse/KYLIN-5613
 Project: Kylin
  Issue Type: Improvement
Reporter: Laura Xia


The time precision of queries is inconsistent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5612) WAINING State Model Click to see details Front-end error

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5612:


 Summary: WAINING State Model Click to see details Front-end error
 Key: KYLIN-5612
 URL: https://issues.apache.org/jira/browse/KYLIN-5612
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia
 Attachments: image-2023-07-05-16-54-00-291.png

!image-2023-07-05-16-54-00-291.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5611) Failed to query history details

2023-07-05 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5611:


 Summary: Failed to query history details
 Key: KYLIN-5611
 URL: https://issues.apache.org/jira/browse/KYLIN-5611
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


On the query history page, expanding the details either does nothing or shows
a blank panel.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5610) Should automatically copy some of the necessary jars from kylin/server/jar into kylin/spark/jars

2023-07-04 Thread huangsheng (Jira)
huangsheng created KYLIN-5610:
-

 Summary: Should automatically copy some of the necessary jars from 
kylin/server/jar into kylin/spark/jars
 Key: KYLIN-5610
 URL: https://issues.apache.org/jira/browse/KYLIN-5610
 Project: Kylin
  Issue Type: Bug
  Components: Tools, Build and Test
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-beta


When soft-affinity is enabled, Spark depends on some special jar packages.
Therefore, these jar packages need to be copied to spark/jars automatically
after Spark is downloaded.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5609) Fix security vulnerabilities, upgrade spark version to 3.2.0-kylin-4.6.8.0

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5609:
-

 Summary: Fix security vulnerabilities, upgrade spark version to 
3.2.0-kylin-4.6.8.0 
 Key: KYLIN-5609
 URL: https://issues.apache.org/jira/browse/KYLIN-5609
 Project: Kylin
  Issue Type: Bug
  Components: Spark Engine
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


Due to the CVE-2023-24998 security vulnerability in spark 3.2.0-kylin-4.6.7, 
upgrade spark to 3.2.0-kylin-4.6.8.0 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5608) [Aggregation Group API] Calling the newly added index of the aggregation group API, the order of dimensions displayed on the page is inconsistent with the order passed in

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5608:
-

 Summary: [Aggregation Group API] Calling the newly added index of 
the aggregation group API, the order of dimensions displayed on the page is 
inconsistent with the order passed in by the api
 Key: KYLIN-5608
 URL: https://issues.apache.org/jira/browse/KYLIN-5608
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


[Aggregation Group API] Calling the newly added index of the aggregation group 
API, the order of dimensions displayed on the page is inconsistent with the 
order passed in by the api



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5607) When querying with high concurrency, the query may report an error (the thread is not safe when ACL permission multi-threaded concurrent access)

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5607:
-

 Summary: When querying with high concurrency, the query may report 
an error (the thread is not safe when ACL permission multi-threaded concurrent 
access)
 Key: KYLIN-5607
 URL: https://issues.apache.org/jira/browse/KYLIN-5607
 Project: Kylin
  Issue Type: Bug
  Components: Metadata
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


ACL records are taken from the cache, and the init method sorts their internal
array in place. Because metadata objects in Kylin are shared across threads,
concurrent queries can modify the same array at the same time, which causes
concurrent modification errors.
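
As an illustration of the pattern (not the actual Kylin code, which is Java),
the sketch below contrasts sorting a shared list in place with sorting a
private copy. The AclRecord class and its method names are hypothetical.

```python
import threading


class AclRecord:
    """Hypothetical stand-in for a cached ACL record shared by all threads."""

    def __init__(self, entries):
        self._entries = list(entries)

    def sorted_entries_unsafe(self):
        # Sorts the shared list in place: concurrent callers mutate the
        # same object, which is the pattern the report describes.
        self._entries.sort()
        return self._entries

    def sorted_entries(self):
        # Sorts a private copy and leaves the shared list untouched.
        return sorted(self._entries)


record = AclRecord(["user:c", "user:a", "user:b"])
results = []


def worker():
    results.append(record.sorted_entries())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results[0])
```

Sorting a copy costs one extra allocation per call but keeps the cached
object read-only, so no locking is needed on the read path.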



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5606) [Aggregation Group API] The required item aggregate_group is not passed, and the prompt information is unclear

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5606:
-

 Summary: [Aggregation Group API] The required item aggregate_group 
is not passed, and the prompt information is unclear
 Key: KYLIN-5606
 URL: https://issues.apache.org/jira/browse/KYLIN-5606
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha
 Attachments: image-2023-07-03-19-29-42-051.png

[Aggregation Group API] The required item aggregate_group is not passed, and 
the prompt information is unclear

!image-2023-07-03-19-29-42-051.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5605) When starting kylin in the FI environment, if the Kerberos configuration KAP_KERBEROS_ENABLED is empty, the jar package replacement will not be performed, resulting in st

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5605:
-

 Summary: When starting kylin in the FI environment, if the 
Kerberos configuration KAP_KERBEROS_ENABLED is empty, the jar package 
replacement will not be performed, resulting in startup failure
 Key: KYLIN-5605
 URL: https://issues.apache.org/jira/browse/KYLIN-5605
 Project: Kylin
  Issue Type: Bug
  Components: Environment 
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


When starting kylin in the FI environment, if the Kerberos configuration 
KAP_KERBEROS_ENABLED is empty, the jar package replacement will not be 
performed, resulting in startup failure



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5604) Open API for adding aggregation group function

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5604:
-

 Summary: Open API for adding aggregation group function
 Key: KYLIN-5604
 URL: https://issues.apache.org/jira/browse/KYLIN-5604
 Project: Kylin
  Issue Type: New Feature
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


The latest version of Kylin does not expose open API interfaces for functions
such as model editing, task deletion in task management, adding detailed
indexes, and adding aggregation groups.

We are currently working on a metrics platform and need to call these
interfaces from our own programs for secondary development, so we hope these
APIs can be opened up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5603) After the V2 dictionary is automatically upgraded to the V3 dictionary, the dictionary data is abnormal, resulting in incorrect query results

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5603:
-

 Summary: After the V2 dictionary is automatically upgraded to the 
V3 dictionary, the dictionary data is abnormal, resulting in incorrect query 
results
 Key: KYLIN-5603
 URL: https://issues.apache.org/jira/browse/KYLIN-5603
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


After the v2 dictionary is upgraded to the v3 dictionary, if v2 and v3 are
mixed, the query result is incorrect.

Root Cause

The project enables v3 dictionary construction, but when the v3 dictionary
construction reads the v2 dictionary, the db name is not included in the path,
so the v2 dictionary is not read. The step that converts the v2 dictionary to
v3 is therefore skipped, and v3 effectively re-encodes the values from
scratch. As a result, the upgraded v3 dictionary is unusable and the encoding
is inconsistent.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5602) Added logs related to Segment Pruning for derived dimension queries

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5602:
-

 Summary:  Added logs related to Segment Pruning for derived 
dimension queries
 Key: KYLIN-5602
 URL: https://issues.apache.org/jira/browse/KYLIN-5602
 Project: Kylin
  Issue Type: New Feature
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-alpha


Add logs related to derived segment pruning:
 # Log whether a query used Derived Segment Pruning and how many segments were
filtered out
 # Add metrics to quantify the effect of Derived Segment Pruning, such as the
size of the filtered segments and the time spent in the Bloom Filter



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5601) V2 dictionary workaround fixes

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5601:
-

 Summary: V2 dictionary workaround fixes
 Key: KYLIN-5601
 URL: https://issues.apache.org/jira/browse/KYLIN-5601
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: 5.0-alpha
Reporter: huangsheng
 Fix For: 5.0-alpha


In some scenarios, AQE (Spark adaptive query execution) optimizes the
repartition step that writes the dictionary in the execution plan, so the
encoding of some dictionary partitions is skipped. The task status shows no
error, and the count distinct metric is eventually calculated incorrectly.

So consider turning off AQE during dictionary construction.
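
A minimal sketch of the idea follows. The key spark.sql.adaptive.enabled is
the standard Spark setting for AQE; the helper name and the way the
configuration would be passed to the dictionary build step are assumptions,
since the report does not describe them.

```python
# Real Spark setting that toggles adaptive query execution (AQE).
AQE_KEY = "spark.sql.adaptive.enabled"


def conf_for_dictionary_build(base_conf):
    """Return a copy of `base_conf` with AQE switched off.

    Hypothetical helper: how Kylin wires this into the dictionary build
    step is not described in the report.
    """
    conf = dict(base_conf)
    conf[AQE_KEY] = "false"
    return conf


base = {"spark.sql.adaptive.enabled": "true", "spark.executor.memory": "4g"}
print(conf_for_dictionary_build(base))
```

Overriding the setting only for the dictionary step leaves AQE available to
the rest of the build.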



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5600) LDAP DN is not case sensitive, resulting in user login failure

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5600:
-

 Summary: LDAP DN is not case sensitive, resulting in user login 
failure
 Key: KYLIN-5600
 URL: https://issues.apache.org/jira/browse/KYLIN-5600
 Project: Kylin
  Issue Type: Bug
  Components: REST Service, Security
Affects Versions: 5.0-alpha
Reporter: huangsheng
 Fix For: 5.0-alpha


In some user scenarios, LDAP logins fail because of differences between
uppercase and lowercase.

Root Cause:

When all users are obtained from ldapUserService in the code, the attribute
names in the recorded DN contain uppercase letters, but the DN attribute names
passed in by customers when they log in to LDAP are lowercase. The
inconsistent capitalization causes the login to fail. On the customer side the
DN is CN=xxx,DU=xxx,DC=xxx, but on the LDAP side it is cn=xxx,du=xxx,dc=xxx
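
One way to make the comparison insensitive to the case of attribute names is
sketched below. This is an illustration, not the fix that was applied: the
function names are hypothetical, and the parsing is deliberately simplified
(it does not handle escaped commas or multi-valued RDNs).

```python
def normalize_dn(dn):
    """Lowercase the attribute names of a DN and trim stray spaces.

    Simplified on purpose: it splits on ',' and '=' and does not handle
    escaped commas or multi-valued RDNs.
    """
    parts = []
    for rdn in dn.split(","):
        name, _, value = rdn.partition("=")
        parts.append(f"{name.strip().lower()}={value.strip()}")
    return ",".join(parts)


def same_dn(left, right):
    # Compare two DNs after normalizing their attribute names.
    return normalize_dn(left) == normalize_dn(right)


print(same_dn("CN =xxx,DU=xxx,DC=xxx", "cn=xxx,du=xxx,dc=xxx"))
```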

A point that would make later maintenance easier:

When troubleshooting LDAP problems, user names often fail to match for no
obvious reason, and tracking this down is very laborious. This information
should be added to the log, but not printed on every poll. For example, it
could be printed when users are loaded for the first time and then once every
several polls.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5599) In the scenario where there are many users in the user group, the performance of assigning users to the user group is extremely poor

2023-07-03 Thread huangsheng (Jira)
huangsheng created KYLIN-5599:
-

 Summary: In the scenario where there are many users in the user 
group, the performance of assigning users to the user group is extremely poor
 Key: KYLIN-5599
 URL: https://issues.apache.org/jira/browse/KYLIN-5599
 Project: Kylin
  Issue Type: New Feature
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-beta, 5.0-alpha


Currently, there are more than 20,000 users in a user group in my development
environment, and more than 10,000 in the production environment. When a user
group contains this many users, the performance of assigning users to the
group is extremely poor.

So I want to ensure that when a user group has more than 20,000 users,
assigning users to the group responds within a reasonable time (such as 10
seconds).



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5598) Support Kylin to use zk to use username + password (no kerberos required)

2023-07-02 Thread huangsheng (Jira)
huangsheng created KYLIN-5598:
-

 Summary: Support Kylin to use zk to use username + password (no 
kerberos required)
 Key: KYLIN-5598
 URL: https://issues.apache.org/jira/browse/KYLIN-5598
 Project: Kylin
  Issue Type: New Feature
  Components: Environment 
Affects Versions: 5.0-alpha
Reporter: huangsheng
Assignee: huangsheng
 Fix For: 5.0-beta, 5.0-alpha


Support Kylin authenticating to zk with a username and password (kerberos is
not required). In some user-defined scenarios there is no kerberos, and a
username and password are used to authenticate to zk.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5597) When the /etc/hosts file of the KE node does not configure the IP corresponding to the cluster domain name in the hadoop conf configuration file, KE will fail to start

2023-07-02 Thread huangsheng (Jira)
huangsheng created KYLIN-5597:
-

 Summary: When the /etc/hosts file of the KE node does not 
configure the IP corresponding to the cluster domain name in the hadoop conf 
configuration file, KE will fail to start
 Key: KYLIN-5597
 URL: https://issues.apache.org/jira/browse/KYLIN-5597
 Project: Kylin
  Issue Type: Bug
  Components: Environment , Security
Affects Versions: 5.0-alpha
Reporter: huangsheng
 Fix For: 5.0-alpha


If the /etc/hosts file of a newly added KE node does not map the cluster
domain name used in the hadoop conf configuration file to an IP, the command
{code:java}
get-properties.sh kylin.kerberos.enabled{code}
will fail when the service starts, and throw java.net.UnknownHostException
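
A pre-flight check for this situation could look like the sketch below. It is
only an illustration: the function name and the example hostnames are
hypothetical, and a stub resolver is used so that the example runs without
network access.

```python
import socket


def unresolvable_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that `resolve` cannot map to an IP address."""
    missing = []
    for host in hostnames:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(host)
    return missing


# Stub resolver so the example runs without touching the network.
KNOWN = {"namenode1.example.internal": "10.0.0.11"}


def fake_resolve(host):
    if host not in KNOWN:
        raise socket.gaierror(f"cannot resolve {host}")
    return KNOWN[host]


print(unresolvable_hosts(
    ["namenode1.example.internal", "namenode2.example.internal"],
    resolve=fake_resolve,
))
```

Reporting the unresolved names before startup would give a clearer message
than the raw java.net.UnknownHostException.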



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5596) kylin5 dashboard test units error issue

2023-06-30 Thread Li Can (Jira)
Li Can created KYLIN-5596:
-

 Summary: kylin5 dashboard test units error issue
 Key: KYLIN-5596
 URL: https://issues.apache.org/jira/browse/KYLIN-5596
 Project: Kylin
  Issue Type: Test
Affects Versions: 5.0-beta
Reporter: Li Can
Assignee: Li Can
 Fix For: 5.0-beta


The main branch code has been updated, but some of the dashboard code is no
longer consistent with it, so the unit tests need to be fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Final Reminder: Community Over Code call for presentations closing soon

2023-06-28 Thread Rich Bowen
[Note: You're receiving this email because you are subscribed to one or
more project dev@ mailing lists at the Apache Software Foundation.]

This is your final reminder that the Call for Presentations for
Community Over Code (formerly known as ApacheCon) is closing soon - on
Thursday, 13 July 2023 at 23:59:59 GMT.

https://communityovercode.org/call-for-presentations/

We are looking for talk proposals on all topics related to ASF projects
and open source software.

The event will be held in Halifax, Nova Scotia, October 7th through
10th. More details about the event may be found on the event website at
https://communityovercode.org/

Rich, for the event planners


[jira] [Created] (KYLIN-5595) [kylin 5.0] Launch Job Node not initialize spark session issue

2023-06-28 Thread Li Can (Jira)
Li Can created KYLIN-5595:
-

 Summary: [kylin 5.0] Launch Job Node not initialize spark session 
issue
 Key: KYLIN-5595
 URL: https://issues.apache.org/jira/browse/KYLIN-5595
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Affects Versions: 5.0-alpha
Reporter: Li Can
Assignee: Li Can
 Fix For: 5.0-alpha
 Attachments: image (87).png, image (88).png

  Saving a model executes the 'checkFlatTableSql' method on the job node, and
this step is not skipped by default. If the job node has just started,
'checkFlatTableSql' first has to initialize the spark session, and getting the
spark session takes too much time.

  Pic 87 shows that getting the spark session takes more than 63s, while
executing the sql check takes more than 2s. This is unfriendly for the first
model save after a node is launched, and it is also unreasonable.

  So I suggest that the job node initialize the spark session the same way the
query node does, that is, when the node starts. The spark session is a
singleton and only needs to be initialized once, as pic 88 shows.
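
The suggestion amounts to eager initialization of a singleton. A minimal
sketch of that pattern is shown below; SessionHolder, on_node_start and the
sleep that stands in for the slow startup are hypothetical and do not reflect
the actual Kylin classes.

```python
import threading
import time


class SessionHolder:
    """Hypothetical process-wide holder for one expensive session object."""

    _lock = threading.Lock()
    _session = None
    init_count = 0

    @classmethod
    def get(cls):
        # Double-checked locking: create the session at most once.
        if cls._session is None:
            with cls._lock:
                if cls._session is None:
                    cls._session = cls._create()
        return cls._session

    @classmethod
    def _create(cls):
        cls.init_count += 1
        time.sleep(0.05)  # stands in for the slow session startup
        return object()


def on_node_start():
    # Eager initialization: pay the startup cost here, not on the
    # first model save.
    SessionHolder.get()


def check_flat_table_sql():
    # Later callers reuse the already initialized session.
    return SessionHolder.get()


on_node_start()
first = check_flat_table_sql()
second = check_flat_table_sql()
print(first is second, SessionHolder.init_count)
```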



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


TAC Applications for Community Over Code North America and Asia now open

2023-06-16 Thread Gavin McDonald
Hi All,

(This email goes out to all our user and dev project mailing lists, so you
may receive this email more than once.)

The Travel Assistance Committee has opened up applications to help get
people to the following events:


*Community Over Code Asia 2023 - *
*August 18th to August 20th in Beijing, China*

Applications for this event close on the 6th July so time is short, please
apply as soon as possible. TAC is prioritising applications from the Asia
and Oceania regions.

More details on this event can be found at:
https://apachecon.com/acasia2023/

For more information on how to apply, please read: https://tac.apache.org/


*Community Over Code North America - *
*October 7th to October 10th in Halifax, Canada*

Applications for this event close on the 22nd July. We expect many
applications so please do apply as soon as you can. TAC is prioritising
applications from the North and South America regions.

More details on this event can be found at: https://communityovercode.org/

For more information on how to apply, please read: https://tac.apache.org/


*Have you applied to be a Speaker?*

If you have applied or intend to apply as a Speaker at either of these
events, and think you may require assistance for Travel and/or
Accommodation - TAC advises that you apply early and do not wait until you
have been notified of your speaker status. Should you not be accepted as a
speaker and still wish to attend you can amend your application to include
Conference fees, or, you may withdraw your application.

The call for presentations for Halifax is here:
https://communityovercode.org/call-for-presentations/
and you have until the 13th of July to apply.

The call for presentations for Beijing is here:
https://apachecon.com/acasia2023/cfp.html
and you have until the 18th June to apply.

*IMPORTANT Note on Visas:*

It is important that you apply for a Visa as soon as possible - do not wait
until you know if you have been accepted for Travel Assistance or not, as
due to current wait times for Interviews in some Countries, waiting that
long may be too late, so please do apply for a Visa right away. Contact
tac-ap...@tac.apache.org if you need any more information or assistance in
this area.

*Spread the Word!!*

TAC encourages you to spread the word about Travel Assistance to get to
these events, so feel free to repost as you see fit on Social Media, at
work, schools, universities etc.

Thank You and hope to see you all soon

Gavin McDonald on behalf of the ASF Travel Assistance Committee.


[jira] [Created] (KYLIN-5594) Support separation of data permissions and management permissions

2023-06-15 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5594:


 Summary: Support separation of data permissions and management 
permissions
 Key: KYLIN-5594
 URL: https://issues.apache.org/jira/browse/KYLIN-5594
 Project: Kylin
  Issue Type: New Feature
Reporter: Laura Xia


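*Original requirement*

In the PLG and Managed Service models, we act as the operator of the service.
However, some operations features, such as the diagnostic package, are
available only to the ADMIN user. For operations staff, the ADMIN user's
permissions are too broad and should be split further.

*Discussion outcome*

1. Managed-service operations staff would like to be able to generate
diagnostic packages at or above a certain permission level (for example, an
operations role), ideally granted as a separate permission.

2. Further permission requirements of Managed Service need to be considered.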



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5593) After the user is turned off the data permission, the sampled data can still be seen

2023-06-15 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5593:


 Summary: After the user is turned off the data permission, the 
sampled data can still be seen
 Key: KYLIN-5593
 URL: https://issues.apache.org/jira/browse/KYLIN-5593
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia
 Attachments: image-2023-06-15-17-43-50-814.png, 
image-2023-06-15-17-44-18-775.png

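Regardless of the data source, once data query permission is turned off, the
user must not be able to view detail data, but can still view the table
structure.
Take a salary detail table as an example:
 # The dimension cardinality obtained by sampling, and the minimum and maximum
values of each column, do not identify any specific employee, so viewing them
should be allowed (see Figure 1).
 # The sampled data, however, shows the monthly salary of 10 employees, which
of course must not be allowed (see Figure 2).

Figure 1

!image-2023-06-15-17-43-50-814.png!

Figure 2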

!image-2023-06-15-17-44-18-775.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5592) Click to delete the connection between the fact table and the dimension table, cancel the operation, return to the query analysis screen, click Delete, will pop up to del

2023-06-15 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5592:


 Summary: Click to delete the connection between the fact table and 
the dimension table, cancel the operation, return to the query analysis screen, 
click Delete, will pop up to delete the model connection
 Key: KYLIN-5592
 URL: https://issues.apache.org/jira/browse/KYLIN-5592
 Project: Kylin
  Issue Type: Bug
Reporter: Laura Xia


Step 1: Run any SQL query on the query analysis page

SELECT "KYLIN_SALES"."TRANS_ID"
FROM
"DEFAULT"."KYLIN_SALES" as "KYLIN_SALES"
LIMIT 500

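Step 2: Press the Delete key in the query analysis window; the content is
deleted normally

Step 3: Click a model (one that has a join relationship, with the fact table
joined to a dimension table)

Step 4: Click Edit Model, then click the x button that deletes the join
relationship

Step 5: When the prompt to delete the join relationship pops up, click Cancel,
then cancel editing the model and return to the query analysis page

Step 6: Repeat Step 2; the prompt to delete the join relationship pops up, and
after the dialog is dismissed no content can be typed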



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5591) Dev Env Setting Docs

2023-06-10 Thread unical1988 (Jira)
unical1988 created KYLIN-5591:
-

 Summary: Dev Env Setting Docs
 Key: KYLIN-5591
 URL: https://issues.apache.org/jira/browse/KYLIN-5591
 Project: Kylin
  Issue Type: Improvement
Reporter: unical1988


I am setting up the development environment for Kylin on Windows, and I am
following the docs to do so
([https://kylin.apache.org/development40/dev_env.html]).

The docs state that if using IntelliJ IDEA > 17, there's a need to modify the
“server/kylin-server.iml” file and replace all “PROVIDED” with “COMPILE”,
otherwise a {{“java.lang.NoClassDefFoundError:
org/apache/catalina/LifecycleListener”}} error may be thrown.

I am at this point and I can't find the mentioned file
server/kylin-server.iml under kylin/server in the code cloned through git
clone [https://github.com/apache/kylin.git]

Any clues what this file kylin-server.iml is, and what is meant by replacing
all “PROVIDED” with “COMPILE”?

Thanks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5590) spark cube job supports priority, add job execution limit

2023-06-09 Thread Chuang Lee (Jira)
Chuang Lee created KYLIN-5590:
-

 Summary: spark cube job supports priority, add job execution limit
 Key: KYLIN-5590
 URL: https://issues.apache.org/jira/browse/KYLIN-5590
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine
Affects Versions: v4.0.1
Reporter: Chuang Lee


Support priority for spark cube jobs, and add job execution limits (a
parallelism limit and an execution time period limit) to prevent jobs from
consuming excessive cluster resources.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5589) Supports semi-additive measure

2023-06-09 Thread Laura Xia (Jira)
Laura Xia created KYLIN-5589:


 Summary: Supports semi-additive measure
 Key: KYLIN-5589
 URL: https://issues.apache.org/jira/browse/KYLIN-5589
 Project: Kylin
  Issue Type: New Feature
Reporter: Laura Xia


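Support adding semi-additive measures when creating a model, and save the
metadata of the semi-additive measures correctly in the model.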



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5588) Update spark version to 3.2.0-kylin-4.6.7.0

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5588:
-

 Summary: Update spark version to 3.2.0-kylin-4.6.7.0
 Key: KYLIN-5588
 URL: https://issues.apache.org/jira/browse/KYLIN-5588
 Project: Kylin
  Issue Type: Bug
  Components: Spark Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5587) Upgrade spring-webmvc to 5.3.26 to fix the vulnerability

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5587:
-

 Summary: Upgrade spring-webmvc to 5.3.26 to fix the vulnerability
 Key: KYLIN-5587
 URL: https://issues.apache.org/jira/browse/KYLIN-5587
 Project: Kylin
  Issue Type: Bug
  Components: Others, Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5586) Upgrade json-smart from 2.4.7 to 2.4.9, to eliminate the vulnerability

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5586:
-

 Summary: Upgrade json-smart from 2.4.7 to 2.4.9, to eliminate the 
vulnerability
 Key: KYLIN-5586
 URL: https://issues.apache.org/jira/browse/KYLIN-5586
 Project: Kylin
  Issue Type: Bug
  Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5585) Bug fix for loading tables, to add the corresponding message of the failure into http response

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5585:
-

 Summary: Bug fix for loading tables, to add the corresponding 
message of the failure into http response
 Key: KYLIN-5585
 URL: https://issues.apache.org/jira/browse/KYLIN-5585
 Project: Kylin
  Issue Type: Bug
  Components: RDBMS Source, REST Service
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5584) To fix sonar checked errors

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5584:
-

 Summary: To fix sonar checked errors
 Key: KYLIN-5584
 URL: https://issues.apache.org/jira/browse/KYLIN-5584
 Project: Kylin
  Issue Type: Bug
  Components: Others
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5583) Minor bug fix for returning the wrong result by query with grouping sets

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5583:
-

 Summary: Minor bug fix for returning the wrong result by query 
with grouping sets
 Key: KYLIN-5583
 URL: https://issues.apache.org/jira/browse/KYLIN-5583
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


The issue is brought in by https://issues.apache.org/jira/browse/KYLIN-5577.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5582) Minor fix for query collectors in BloomFilter

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5582:
-

 Summary: Minor fix for query collectors in BloomFilter
 Key: KYLIN-5582
 URL: https://issues.apache.org/jira/browse/KYLIN-5582
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5581) Can't using metadata to respond to min/max aggregations on date/timestamp columns

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5581:
-

 Summary: Can't using metadata to respond to min/max aggregations 
on date/timestamp columns
 Key: KYLIN-5581
 URL: https://issues.apache.org/jira/browse/KYLIN-5581
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


The root cause is that Kylin parses Date/Timestamp values as Integer
internally. When _kylin.query.try-route-to-metadata-enabled_ is enabled, the
current code mismatches the real data type.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5580) Refactor multi-tenant to make resource separable

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5580:
-

 Summary: Refactor multi-tenant to make resource separable
 Key: KYLIN-5580
 URL: https://issues.apache.org/jira/browse/KYLIN-5580
 Project: Kylin
  Issue Type: Improvement
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5579) Remove wrong exclusions

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5579:
-

 Summary: Remove wrong exclusions
 Key: KYLIN-5579
 URL: https://issues.apache.org/jira/browse/KYLIN-5579
 Project: Kylin
  Issue Type: Bug
  Components: Others
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5578) Using metadata to respond to min/max aggregation queries

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5578:
-

 Summary: Using metadata to respond to min/max aggregation queries
 Key: KYLIN-5578
 URL: https://issues.apache.org/jira/browse/KYLIN-5578
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


KE's segments contain enough statistics, in particular existing min/max
values, so under certain conditions min/max aggregations can be answered from
them directly, which improves KE's query speed.
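
The idea can be sketched as follows. This is an assumption-laden
illustration: SegmentStats and min_max_from_metadata are hypothetical names,
and the sketch ignores the conditions (such as query filters) under which the
shortcut would not be valid.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Hypothetical per-segment column statistics kept in metadata."""
    segment: str
    min_value: int
    max_value: int


def min_max_from_metadata(stats):
    """Combine per-segment min/max into the global min/max.

    Returns None when there are no segments, so the caller can fall
    back to a normal scan.
    """
    if not stats:
        return None
    return (
        min(s.min_value for s in stats),
        max(s.max_value for s in stats),
    )


stats = [
    SegmentStats("seg_2023_05", 3, 120),
    SegmentStats("seg_2023_06", 1, 95),
    SegmentStats("seg_2023_07", 7, 240),
]
print(min_max_from_metadata(stats))
```

The cost is proportional to the number of segments rather than the number of
rows, which is where the speedup would come from.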



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5577) Should prohibit creating a new model of which name is equal to the existed ignoring letter case

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5577:
-

 Summary: Should prohibit creating a new model of which name is 
equal to the existed ignoring letter case
 Key: KYLIN-5577
 URL: https://issues.apache.org/jira/browse/KYLIN-5577
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KYLIN-5576) Listing files using WHERE conditions with subquery on partition columns will lead to the failure of building model

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5576:
-

 Summary: Listing files using WHERE conditions with subquery on 
partition columns will lead to the failure of building model
 Key: KYLIN-5576
 URL: https://issues.apache.org/jira/browse/KYLIN-5576
 Project: Kylin
  Issue Type: Bug
  Components: Modeling
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


KE should not use filters that contain subqueries on partition columns to 
detect resources.

 





[jira] [Created] (KYLIN-5575) OPERATION role has no privilege to build the index if a model contains the base index.

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5575:
-

 Summary: OPERATION role has no privilege to build the index if a 
model contains the base index.
 Key: KYLIN-5575
 URL: https://issues.apache.org/jira/browse/KYLIN-5575
 Project: Kylin
  Issue Type: Bug
  Components: Modeling
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


OPERATION role should also have the privilege to modify the base index.





[jira] [Created] (KYLIN-5574) Build model failed if any of the source table schema had been changed, saying deleted unrelated columns.

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5574:
-

 Summary: Build model failed if any of the source table schema had 
been changed, saying deleted unrelated columns. 
 Key: KYLIN-5574
 URL: https://issues.apache.org/jira/browse/KYLIN-5574
 Project: Kylin
  Issue Type: Bug
  Components: Modeling
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


How does the error occur?

When building a model, KE constructs a new SELECT statement from its own table 
metadata in order to read data from the source tables. If the user has deleted 
some unused columns from any source table, KE's metadata no longer matches the 
Hive metastore. As a consequence, `SparkSession::sql` throws validation 
exceptions because of the mismatched columns in the SELECT statement.

How to solve it?

Use the intersection of KE's columns and the Hive metastore's columns.
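
The intersection approach can be sketched as below. This is only an 
illustration, not Kylin's actual code: the class `ColumnIntersection`, its 
method names, and the case-insensitive matching are assumptions made for this 
example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of Kylin): keeps only the columns that exist
// both in KE's table metadata and in the Hive metastore.
final class ColumnIntersection {

    // Returns KE's columns, in KE's order, that are still present in the source table.
    // Matching ignores letter case, which is an assumption of this sketch.
    static List<String> intersect(List<String> keColumns, List<String> hiveColumns) {
        Set<String> present = new HashSet<>();
        for (String column : hiveColumns) {
            present.add(column.toUpperCase(Locale.ROOT));
        }
        List<String> result = new ArrayList<>();
        for (String column : keColumns) {
            if (present.contains(column.toUpperCase(Locale.ROOT))) {
                result.add(column);
            }
        }
        return result;
    }

    // Builds the SELECT statement from the surviving columns only.
    static String buildSelect(String table, List<String> keColumns, List<String> hiveColumns) {
        List<String> columns = intersect(keColumns, hiveColumns);
        if (columns.isEmpty()) {
            throw new IllegalStateException("No common columns for table " + table);
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }
}
```

With KE columns `LO_ORDERKEY, LO_TAX, LO_UNUSED` and Hive columns 
`lo_orderkey, lo_tax`, the generated statement no longer references the 
deleted column, so the validation step has nothing to reject.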





[jira] [Created] (KYLIN-5573) Refine the response message when loading more than 1000 tables

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5573:
-

 Summary: Refine the response message when loading more than 1000 
tables
 Key: KYLIN-5573
 URL: https://issues.apache.org/jira/browse/KYLIN-5573
 Project: Kylin
  Issue Type: Bug
  Components: Others
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


Chinese tips:

From:一次最多可加载 1000 张表,请修改后重新提交。

To:一次最多可加载 1000 张表,请修改后重试。

English tips:

From:Up to 1000 tables could be loaded per time, please modify and resubmit。

To:Up to 1000 tables could be loaded per time, please modify and try again.





[jira] [Created] (KYLIN-5572) Should provide a REST API to allow users to build specific indexes

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5572:
-

 Summary: Should provide a REST API to allow users to build 
specific indexes
 Key: KYLIN-5572
 URL: https://issues.apache.org/jira/browse/KYLIN-5572
 Project: Kylin
  Issue Type: New Feature
  Components: Modeling, REST Service
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


Our product (KC 4.5.22.2) currently doesn’t support building specific indexes 
through an API. Users can only refresh whole segments to rebuild some of the 
indexes in a segment, which is very inefficient. However, the same capability 
is available in the UI. Therefore, UBS wants a REST API that provides:
 # The ability to build specific indexes instead of all indexes in segments

 # The indexes to be built can be specified by *index ID and/or status.* 
Status can be any valid index status, for example, NO BUILD, ONLINE, etc.

 # The response of this API should return the job ID for further tracking

Why should we provide this API? (Business Impact)

Without this API, users have to rely on a *Kyligence ADMIN* to build indexes 
for each team from the UI, or simply rebuild the whole segment. At UBS, all 
operations on production have to be automated. No manual operations are 
allowed. To fully utilize the capability of AI recommendations, providing such 
an API will help UBS make the whole workflow smoother and increase Kyligence 
usage. Meanwhile, the maintenance effort and total infrastructure cost can also 
be significantly reduced.





[jira] [Created] (KYLIN-5571) It takes too much time to calculate the data size during pushing down queries, which will lead to the queries un-stoppable.

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5571:
-

 Summary: It takes too much time to calculate the data size during 
pushing down queries, which will lead to the queries un-stoppable. 
 Key: KYLIN-5571
 URL: https://issues.apache.org/jira/browse/KYLIN-5571
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


When pushing down a query, KE tries to calculate the size of the data involved 
in order to set the Spark partitions, but if there are too many files on HDFS, 
this calculation takes a long time to complete.

To improve this situation, the following things will be done:
 # Use a bounded thread pool to calculate the data size
 # Add a timeout to the calculation, so that the query can be stopped as soon 
as possible

After these changes, we can expect the query to complete within a bounded 
duration.
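
The two measures can be sketched together as follows. This is a minimal 
sketch under stated assumptions, not the Kylin implementation: the class 
`BoundedSizeCalculator` and its parameters are hypothetical, and each 
`Callable` stands in for whatever code measures one file or directory.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not part of Kylin): sums data sizes on a fixed-size
// thread pool and gives up once the overall timeout has elapsed.
final class BoundedSizeCalculator {

    static long totalSize(List<Callable<Long>> sizeTasks, int maxThreads, long timeoutMillis)
            throws TimeoutException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            // invokeAll with a timeout cancels every task that has not finished in time.
            List<Future<Long>> futures =
                    pool.invokeAll(sizeTasks, timeoutMillis, TimeUnit.MILLISECONDS);
            long total = 0L;
            for (Future<Long> future : futures) {
                if (future.isCancelled()) {
                    throw new TimeoutException(
                            "Data size calculation exceeded " + timeoutMillis + " ms");
                }
                total += future.get();
            }
            return total;
        } finally {
            // Release the worker threads whether the calculation succeeded or not.
            pool.shutdownNow();
        }
    }
}
```

A caller that catches the `TimeoutException` can stop the query, or fall back 
to a default partition count, instead of waiting for the listing to finish.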





[jira] [Created] (KYLIN-5570) Incorrect result produced by the query with grouping sets

2023-06-08 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5570:
-

 Summary: Incorrect result produced by the query with grouping sets
 Key: KYLIN-5570
 URL: https://issues.apache.org/jira/browse/KYLIN-5570
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


The following SQL pattern produces a wrong result when it hits a model. 
In such a case, assuming each group contains 2 rows, KE returns 10 rows for 
this query rather than the expected 18 rows.
{code:java}
select
GROUPING(gr1) as gr1,
GROUPING(gr2) as gr2,
GROUPING(gr3) as gr3,
GROUPING(gr4) as gr4,
GROUPING(gr5) as gr5,
GROUPING(gr6) as gr6,
GROUPING(gr7) as gr7,
GROUPING(gr8) as gr8,
GROUPING(gr9) as gr9,
count(distinct case when 1=1 then LO_ORDERKEY else null end) as goal_group
from(
select
case when LO_ORDERPRIOTITY = '1-URGENT' then '立刻发出' else '延后发出' end gr1,
case when LO_SHIPMODE = 'AIR' then '空运' else '海运' end gr2,
case when LO_LINENUMBER = 1 then '1' else '0' end gr3,
case when LO_CUSTKEY = 1 then '1' else '0' end gr4,
case when LO_PARTKEY = 1 then '1' else '0' end gr5,
case when LO_SUPPKEY = 1 then '1' else '0' end gr6,
case when LO_QUANTITY = 1 then '1' else '0' end gr7,
case when LO_EXTENDEDPRICE = 90400 then '1' else '0' end gr8,
case when LO_TAX = 0 then '1' else '0' end gr9,
LO_ORDERKEY
from
SSB.LINEORDER
)
group by
GROUPING SETS
(
(gr1),
(gr2),
(gr3),
(gr4),
(gr5),
(gr6),
(gr7),
(gr8),
(gr9)
)
order by 1,2,3,4,5,6,7,8,9
LIMIT 500{code}
 

A simple fix is to add the *groupByColumns* IDs to the *digest* field of the 
*OLAPProjectRel* node.





[jira] [Created] (KYLIN-5569) Support manually setting ssh encrypted password when enable Job multi-live

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5569:
-

 Summary:  Support manually setting ssh encrypted password when 
enable Job multi-live
 Key: KYLIN-5569
 URL: https://issues.apache.org/jira/browse/KYLIN-5569
 Project: Kylin
  Issue Type: Improvement
  Components: Job Engine, Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


In some scenarios, users want the password set in the properties file to be 
encrypted, to avoid potential security issues.

Kylin should therefore support this. After the change, the password can be 
configured as follows:
{code:java}
Encrypted configuration item: kylin.job.ssh-password=ENC('${encrypted_password}')
Example:
Input: ${KYLIN_HOME}/bin/kylin.sh io.kyligence.kap.tool.general.CryptTool -e AES -s kylin
Output: AES encrypted password is:
YeqVr9MakSFbgxEec9sBwg==

// Sample
kylin.server.leader-race.heart-beat-timeout=60
kylin.server.leader-race.heart-beat-interval=30
kylin.job.ssh-username=quard
kylin.job.ssh-password=ENC('k7lRPO1yqWRgtR09uG+F2w==')
#kylin.job.ssh-password=PlainTextPassWord {code}
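
How a reader of the properties file might tell an `ENC('...')` value from a 
plain one can be sketched as below. The class `EncryptedProperty` is 
hypothetical, and the actual decryption is passed in as a function because 
the cipher and key handling used by Kylin are not described here.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Kylin): unwraps ENC('...') property values.
final class EncryptedProperty {

    // Matches a value of the form ENC('<cipher text>').
    private static final Pattern ENC = Pattern.compile("^ENC\\('(.*)'\\)$");

    static boolean isEncrypted(String value) {
        return ENC.matcher(value.trim()).matches();
    }

    // Returns the decrypted payload for ENC('...') values and the trimmed
    // value itself for anything else. The decryptor is supplied by the caller.
    static String resolve(String value, Function<String, String> decryptor) {
        String trimmed = value.trim();
        Matcher matcher = ENC.matcher(trimmed);
        return matcher.matches() ? decryptor.apply(matcher.group(1)) : trimmed;
    }
}
```

Plain-text values pass through unchanged, so an existing configuration keeps 
working when only some passwords have been encrypted.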
 





[jira] [Created] (KYLIN-5568) Some JDBC datasources will fail to query or load parts of tables, like GaussDB, responding to KE

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5568:
-

 Summary: Some JDBC datasources will fail to query or load parts of 
tables, like GaussDB, responding to KE 
 Key: KYLIN-5568
 URL: https://issues.apache.org/jira/browse/KYLIN-5568
 Project: Kylin
  Issue Type: Bug
  Components: Driver - JDBC
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


Currently KE keeps all table metadata in upper case, so it fails to 
communicate with some JDBC data sources, such as GaussDB, that are 
case-sensitive.

To solve this issue, a new boolean property 
*kylin.source.jdbc.convert-to-lowercase*, false by default, will be introduced 
to convert KE metadata to lowercase.





[jira] [Created] (KYLIN-5567) Make ke-external module obey the open source specs

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5567:
-

 Summary: Make ke-external module obey the open source specs
 Key: KYLIN-5567
 URL: https://issues.apache.org/jira/browse/KYLIN-5567
 Project: Kylin
  Issue Type: Bug
  Components: Others
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Yifan Zhang
 Fix For: 5.0-alpha


In order to solve this issue, the following things need to be done:
 # Replace *kap-external:4.5.9* with *kylin-external:5.0.0*
 # Change package path *org.apache.kylin.guava20.shaded.** to 
*org.apache.kylin.guava30.shaded.**
 # Publish *kylin-external-guava30* lib to public nexus repo
 # Add checkstyle rules to prohibit the official guava references





[jira] [Created] (KYLIN-5566) To fix the error of verifying the model alias

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5566:
-

 Summary: To fix the error of verifying the model alias
 Key: KYLIN-5566
 URL: https://issues.apache.org/jira/browse/KYLIN-5566
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


When verifying the requested model by alias, the alias should be converted to 
lowercase first.
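
A case-insensitive alias check could look like the sketch below. The class 
`ModelAliasCheck` and its method are hypothetical names chosen for 
illustration, not Kylin's API.

```java
import java.util.Collection;
import java.util.Locale;

// Hypothetical helper (not part of Kylin): compares model aliases ignoring letter case.
final class ModelAliasCheck {

    // Returns true when an existing alias equals the requested one once both
    // have been converted to lowercase.
    static boolean aliasExists(Collection<String> existingAliases, String requestedAlias) {
        String key = requestedAlias.toLowerCase(Locale.ROOT);
        for (String alias : existingAliases) {
            if (alias.toLowerCase(Locale.ROOT).equals(key)) {
                return true;
            }
        }
        return false;
    }
}
```

`Locale.ROOT` is used so that the comparison does not depend on the default 
locale of the server.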





[jira] [Created] (KYLIN-5565) Upgrade Netty to 4.1.89 to fix the security vulnerabilities

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5565:
-

 Summary: Upgrade Netty to 4.1.89 to fix the security 
vulnerabilities
 Key: KYLIN-5565
 URL: https://issues.apache.org/jira/browse/KYLIN-5565
 Project: Kylin
  Issue Type: Bug
  Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha








[jira] [Created] (KYLIN-5564) Introduce Bloom Filter to optimize data scanning based on Spark

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5564:
-

 Summary: Introduce Bloom Filter to optimize data scanning based on 
Spark
 Key: KYLIN-5564
 URL: https://issues.apache.org/jira/browse/KYLIN-5564
 Project: Kylin
  Issue Type: Improvement
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


Currently, all the data generated by Kylin is saved as Parquet files through 
Spark, but Kylin does not make full use of Parquet's features when scanning 
data. Among them, BloomFilter deserves particular attention, because it is one 
of the most common tools for helping readers skip irrelevant data.

Therefore, we introduced an approach that builds BloomFilters automatically, 
conditionally and smartly on the desired columns when constructing segments, 
chosen in particular according to the query history.

With BloomFilter in place, Spark query performance should improve in most 
cases.
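
Why a Bloom filter lets a reader skip data can be seen in a toy version. The 
sketch below is a generic textbook Bloom filter, not the implementation used 
by Parquet or Spark; the class `SimpleBloomFilter` and its hashing scheme are 
assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Toy Bloom filter for illustration only (not Parquet's or Spark's implementation).
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Sets one bit per hash function for the given value.
    void add(String value) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(value, i));
        }
    }

    // false means the value was definitely never added, so the data can be skipped;
    // true means the value may be present (false positives are possible).
    boolean mightContain(String value) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int index(String value, int i) {
        int h1 = value.hashCode();
        int h2 = fnv1a(value);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a hash over the UTF-8 bytes of the value.
    private static int fnv1a(String value) {
        int hash = 0x811C9DC5;
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }
}
```

A reader evaluating a filter such as `LO_SHIPMODE = 'AIR'` would call 
`mightContain("AIR")` on the filter built for a block of data and skip that 
block whenever the answer is false.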





[jira] [Created] (KYLIN-5563) Enable operation role has the privilege to manage the model indexes

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5563:
-

 Summary: Enable operation role has the privilege to manage the 
model indexes
 Key: KYLIN-5563
 URL: https://issues.apache.org/jira/browse/KYLIN-5563
 Project: Kylin
  Issue Type: Improvement
  Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


In daily work, we cooperate with other people to manage a shared model. 
Generally the model is private, but we can let other roles, especially the 
_*OPERATION role,*_ have the privilege to manage the indexes on this model.

Kylin currently has no easy way to enable this, hence we propose a 
new property `kylin.index.enable-operator-design` to make it work. When it's 
true, OPERATION users can CRUD the indexes, but cannot modify the model.

Of course, in order to introduce this switch, we have to change the default 
checks for the related APIs.





[jira] [Created] (KYLIN-5562) Building job will be scheduled and executed repeatedly

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5562:
-

 Summary: Building job will be scheduled and executed repeatedly
 Key: KYLIN-5562
 URL: https://issues.apache.org/jira/browse/KYLIN-5562
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: sibing.zhang
 Fix For: 5.0-alpha


This issue occurs in recent versions because of earlier changes to the logic 
for appending *RUNNING* jobs, so the related code needs to be reverted.





[jira] [Created] (KYLIN-5561) Optimize the build performance for models containing semi-additive measure

2023-06-07 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5561:
-

 Summary: Optimize the build performance for models containing 
semi-additive measure
 Key: KYLIN-5561
 URL: https://issues.apache.org/jira/browse/KYLIN-5561
 Project: Kylin
  Issue Type: Bug
  Components: Modeling
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Yaguang Jia
 Fix For: 5.0-alpha


When building a model with the aggregate function `sum_lc`, it takes too much 
time to complete the calculation even on a small dataset. After digging into 
its implementation, we found that the root cause is that `serialize` always 
allocates a new array of `1024 * 1024` bytes as the temporary place to store 
the serialized value of `SumLCCounter`.

Actually, only a decimal and a long value of a `SumLCCounter` object need to be 
serialized. Generally the serialized data size is about `8 + 8` bytes on a 
64-bit platform, so the temporary array is far larger than the result requires.

After reducing the initial size of the temporary array, for example to 32 
bytes, the total time to complete the calculation of `sum_lc` on 10GB datasets 
was reduced from 16 min to 4 min.
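
The idea of sizing the temporary buffer to the data can be sketched as 
follows. This is not the actual `SumLCCounter` serializer: the class 
`SumLcSerializeSketch`, the byte layout, and the parameter names are 
assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Illustrative sketch (not Kylin's SumLCCounter serializer): writes one decimal
// and one long into a buffer that starts at 32 bytes instead of 1024 * 1024.
final class SumLcSerializeSketch {
    static final int INITIAL_CAPACITY = 32;

    // Assumed layout: scale (int), unscaled length (int), unscaled bytes, long value.
    static byte[] serialize(BigDecimal decimalValue, long longValue) {
        byte[] unscaled = decimalValue.unscaledValue().toByteArray();
        int needed = 2 * Integer.BYTES + unscaled.length + Long.BYTES;
        // Allocate the small default, growing only when the decimal is unusually large.
        ByteBuffer buffer = ByteBuffer.allocate(Math.max(INITIAL_CAPACITY, needed));
        buffer.putInt(decimalValue.scale());
        buffer.putInt(unscaled.length);
        buffer.put(unscaled);
        buffer.putLong(longValue);
        byte[] out = new byte[buffer.position()];
        buffer.flip();
        buffer.get(out);
        return out;
    }

    static BigDecimal readDecimal(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        int scale = buffer.getInt();
        byte[] unscaled = new byte[buffer.getInt()];
        buffer.get(unscaled);
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    static long readLong(byte[] data) {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        buffer.getInt();                          // skip the scale
        int unscaledLength = buffer.getInt();
        buffer.position(2 * Integer.BYTES + unscaledLength);
        return buffer.getLong();
    }
}
```

Allocating a 1 MB array zero-fills a full megabyte on every call, while the 
payload here is only a few dozen bytes, which is consistent with the large 
throughput gap reported in the benchmark.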

Here are the benchmark results:
{code:java}
// After optimization

# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: io.kyligence.pe.JmhSumLCApplication.dynamicLength

# Run progress: 0.00% complete, ETA 00:04:00
# Fork: 1 of 2
# Warmup Iteration   1: 39082.864 ops/ms
Iteration   1: 41760.550 ops/ms
Iteration   2: 47911.634 ops/ms
Iteration   3: 47353.936 ops/ms
Iteration   4: 46888.688 ops/ms
Iteration   5: 48378.075 ops/ms

# Run progress: 25.00% complete, ETA 00:03:02
# Fork: 2 of 2
# Warmup Iteration   1: 39479.279 ops/ms
Iteration   1: 42066.415 ops/ms
Iteration   2: 48499.974 ops/ms
Iteration   3: 48524.844 ops/ms
Iteration   4: 48431.830 ops/ms
Iteration   5: 48451.256 ops/ms


Result "io.kyligence.pe.JmhSumLCApplication.dynamicLength":
  46826.720 ±(99.9%) 4002.887 ops/ms [Average]
  (min, avg, max) = (41760.550, 46826.720, 48524.844), stdev = 2647.662
  CI (99.9%): [42823.833, 50829.607] (assumes normal distribution)


// Before optimization
# Warmup: 1 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: io.kyligence.pe.JmhSumLCApplication.fixLength

# Run progress: 50.00% complete, ETA 00:02:01
# Fork: 1 of 2
# Warmup Iteration   1: 22.364 ops/ms
Iteration   1: 25.354 ops/ms
Iteration   2: 25.252 ops/ms
Iteration   3: 20.566 ops/ms
Iteration   4: 20.668 ops/ms
Iteration   5: 21.585 ops/ms

# Run progress: 75.00% complete, ETA 00:01:00
# Fork: 2 of 2
# Warmup Iteration   1: 22.953 ops/ms
Iteration   1: 25.362 ops/ms
Iteration   2: 24.041 ops/ms
Iteration   3: 21.774 ops/ms
Iteration   4: 25.131 ops/ms
Iteration   5: 25.594 ops/ms


Result "io.kyligence.pe.JmhSumLCApplication.fixLength":
  23.533 ±(99.9%) 3.210 ops/ms [Average]
  (min, avg, max) = (20.566, 23.533, 25.594), stdev = 2.123
  CI (99.9%): [20.323, 26.743] (assumes normal distribution)


# Run complete. Total time: 00:04:03

REMEMBER: The numbers below are just data. To gain reusable insights, you need 
to follow up on
why the numbers are the way they are. Use profilers (see -prof, -lprof), design 
factorial
experiments, perform baseline and negative tests that provide experimental 
control, make sure
the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from 
the domain experts.
Do not assume the numbers tell you what you want them to tell.

Benchmark                           Mode  Cnt      Score      Error   Units
JmhSumLCApplication.dynamicLength  thrpt   10  46826.720 ± 4002.887  ops/ms
JmhSumLCApplication.fixLength      thrpt   10     23.533 ±    3.210  ops/ms
{code}
 

 





[jira] [Created] (KYLIN-5560) To improving Kylin logging abilities

2023-06-06 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5560:
-

 Summary: To improving Kylin logging abilities
 Key: KYLIN-5560
 URL: https://issues.apache.org/jira/browse/KYLIN-5560
 Project: Kylin
  Issue Type: Improvement
  Components: Integration
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


Currently most operations, such as querying, go through the REST API, but 
Kylin has no means of tracing the whole life cycle of a request, including 
access, handling, etc. Users have to spend a lot of time working out how a 
request relates to the corresponding processing steps.

In practice, we have added a new attribute `traceId` to each HTTP request 
(*HttpServletRequest*), and used log4j MDC (Mapped Diagnostic Context) in 
*org.apache.kylin.rest.interceptor.KEFilter* to improve the situation.
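
The pattern can be sketched without any servlet or log4j dependency. In the 
sketch below a `ThreadLocal` stands in for log4j MDC, and the class 
`TraceContext` with its methods is hypothetical; it only shows how a trace ID 
is bound to the thread for the duration of one request and then cleared.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Illustrative stand-in for an MDC-based filter (not Kylin's KEFilter):
// binds a trace ID to the current thread while a request is being handled.
final class TraceContext {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    static String currentTraceId() {
        return TRACE_ID.get();
    }

    // Uses the incoming trace ID when present, otherwise generates one, and
    // always clears it afterwards so pooled threads do not leak IDs.
    static <T> T withTraceId(String incomingTraceId, Supplier<T> handler) {
        String traceId = (incomingTraceId == null || incomingTraceId.isEmpty())
                ? UUID.randomUUID().toString()
                : incomingTraceId;
        TRACE_ID.set(traceId);
        try {
            return handler.get();
        } finally {
            TRACE_ID.remove();
        }
    }

    // Every log line written during the request carries the same trace ID.
    static String logLine(String message) {
        return "[traceId=" + TRACE_ID.get() + "] " + message;
    }
}
```

For example, `TraceContext.withTraceId("abc", () -> TraceContext.logLine("query accepted"))` 
yields `[traceId=abc] query accepted`, and all later log lines of the same 
request share that prefix, which makes them easy to correlate.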

We hope this work can help improve Kylin.





[jira] [Created] (KYLIN-5559) The vulnerability in Apache Avro version <= 1.10.2 will result in security issues

2023-06-06 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5559:
-

 Summary: The vulnerability in Apache Avro version <= 1.10.2 will 
result in security issues
 Key: KYLIN-5559
 URL: https://issues.apache.org/jira/browse/KYLIN-5559
 Project: Kylin
  Issue Type: Bug
  Components: Security
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
 Fix For: 5.0-alpha


We need to upgrade Apache Avro to 1.11.1 to avoid the potential issue.

For more details on the issue, see: 
https://security.snyk.io/vuln/SNYK-DOTNET-APACHEAVRO-2331660





[jira] [Created] (KYLIN-5558) The file for recording the child process to execute the asynchronous query in QUERY NODE dose not exists

2023-06-06 Thread Guangyuan Feng (Jira)
Guangyuan Feng created KYLIN-5558:
-

 Summary: The file for recording the child process  to execute the 
asynchronous query in QUERY NODE dose not exists
 Key: KYLIN-5558
 URL: https://issues.apache.org/jira/browse/KYLIN-5558
 Project: Kylin
  Issue Type: Bug
  Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
 Fix For: 5.0-alpha


As the title says, without the record file the QUERY NODE cannot kill the 
child process, so the server loses control of it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

