[jira] [Updated] (KYLIN-1986) CubeMigrationCLI: make global dictionary unique

2016-08-30 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1986:
--
Attachment: KYLIN-1986.patch

This is the patch

> CubeMigrationCLI: make global dictionary unique
> ---
>
> Key: KYLIN-1986
> URL: https://issues.apache.org/jira/browse/KYLIN-1986
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1986.patch
>
>
> The global dictionary is shared by all segments of one cube, so when we 
> migrate the global dictionary, we should copy the global dictionary file only 
> once.
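
The idea in the description can be sketched as follows. This is only a minimal
illustration, not the attached patch: the class name, the method name and the
example paths are hypothetical, and it assumes each segment simply lists the
dictionary paths it references. Collecting the paths into a set means a
dictionary shared by several segments is scheduled for copying only once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Kylin: builds the list of dictionary
// paths to copy so that a path shared by several segments appears once.
public class GlobalDictCopyPlan {

    /** Returns every dictionary path exactly once, in first-seen order. */
    public static List<String> uniquePaths(List<List<String>> dictPathsPerSegment) {
        Set<String> seen = new LinkedHashSet<>();
        for (List<String> segmentPaths : dictPathsPerSegment) {
            seen.addAll(segmentPaths); // a set silently ignores repeats
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Made-up paths: both segments reference the same global dictionary.
        List<List<String>> perSegment = Arrays.asList(
                Arrays.asList("/dict/global/USER_ID", "/dict/seg1/COUNTRY"),
                Arrays.asList("/dict/global/USER_ID", "/dict/seg2/COUNTRY"));
        for (String path : uniquePaths(perSegment)) {
            System.out.println("copy " + path);
        }
    }
}
```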



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KYLIN-1982) CubeMigrationCLI: associate model with project

2016-08-29 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446182#comment-15446182
 ] 

kangkaisen commented on KYLIN-1982:
---

Hi Shaofeng. I looked at the history of the related files. I think this bug may
have been exposed by `cubes.js` in KYLIN-1660.

> CubeMigrationCLI: associate  model with project
> ---
>
> Key: KYLIN-1982
> URL: https://issues.apache.org/jira/browse/KYLIN-1982
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1982.patch
>
>
> In the current `CubeMigrationCLI`, when we migrated the cube, the model 
> metadata has migrated indeed, but the model hasn't associated with the 
> project. 
> So, if we get model via `getModels` in `ModelController` with "modelName" and 
> "projectName",  we will get null.





[jira] [Commented] (KYLIN-1982) CubeMigrationCLI: associate model with project

2016-08-29 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15446170#comment-15446170
 ] 

kangkaisen commented on KYLIN-1982:
---

Hi Shaofeng. I looked at the history of the related files. I think this bug may
have been introduced by `cubes.js` in KYLIN-1660.

> CubeMigrationCLI: associate  model with project
> ---
>
> Key: KYLIN-1982
> URL: https://issues.apache.org/jira/browse/KYLIN-1982
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1982.patch
>
>
> In the current `CubeMigrationCLI`, when we migrated the cube, the model 
> metadata has migrated indeed, but the model hasn't associated with the 
> project. 
> So, if we get model via `getModels` in `ModelController` with "modelName" and 
> "projectName",  we will get null.





[jira] [Updated] (KYLIN-1982) CubeMigrationCLI: associate model with project

2016-08-29 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1982:
--
Attachment: KYLIN-1982.patch

This is the patch.

> CubeMigrationCLI: associate  model with project
> ---
>
> Key: KYLIN-1982
> URL: https://issues.apache.org/jira/browse/KYLIN-1982
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1982.patch
>
>
> In the current `CubeMigrationCLI`, when we migrated the cube, the model 
> metadata has migrated indeed, but the model hasn't associated with the 
> project. 
> So, if we get model via `getModels` in `ModelController` with "modelName" and 
> "projectName",  we will get null.





[jira] [Updated] (KYLIN-1982) CubeMigrationCLI: associate model with project

2016-08-29 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1982:
--
Summary: CubeMigrationCLI: associate  model with project  (was: 
CubeMigrationCLI: associate  model_name with project)

> CubeMigrationCLI: associate  model with project
> ---
>
> Key: KYLIN-1982
> URL: https://issues.apache.org/jira/browse/KYLIN-1982
> Project: Kylin
>  Issue Type: Bug
>  Components: Tools, Build and Test
>Affects Versions: v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
>
> In the current `CubeMigrationCLI`, when we migrated the cube, the model 
> metadata has migrated indeed, but the model hasn't associated with the 
> project. 
> So, if we get model via `getModels` in `ModelController` with "modelName" and 
> "projectName",  we will get null.





[jira] [Created] (KYLIN-1982) CubeMigrationCLI: associate model_name with project

2016-08-29 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1982:
-

 Summary: CubeMigrationCLI: associate  model_name with project
 Key: KYLIN-1982
 URL: https://issues.apache.org/jira/browse/KYLIN-1982
 Project: Kylin
  Issue Type: Bug
  Components: Tools, Build and Test
Affects Versions: v1.5.3
Reporter: kangkaisen
Assignee: kangkaisen


In the current `CubeMigrationCLI`, when we migrate a cube, the model metadata
is migrated, but the model is not associated with the project.
So, if we get the model via `getModels` in `ModelController` with "modelName"
and "projectName", we get null.





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-08-27 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15441196#comment-15441196
 ] 

kangkaisen commented on KYLIN-1908:
---

Hi [~yimingliu], I have written a document and created a PR against the Kylin
document branch. Please review it, thanks.

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Fix For: v1.5.4
>
> Attachments: KYLIN-1908.patch, QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-08-23 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432419#comment-15432419
 ] 

kangkaisen commented on KYLIN-1908:
---

OK, I see it. I will write a document.

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Fix For: v1.5.4
>
> Attachments: KYLIN-1908.patch, QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-08-23 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15432364#comment-15432364
 ] 

kangkaisen commented on KYLIN-1908:
---

Hi Yiming, do you mean how these metrics are collected, or what these metrics
mean?

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Fix For: v1.5.4
>
> Attachments: KYLIN-1908.patch, QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Updated] (KYLIN-1965) Check duplicated measure name

2016-08-21 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1965:
--
Attachment: KYLIN-1965.patch

Updated the patch.

> Check duplicated measure name
> -
>
> Key: KYLIN-1965
> URL: https://issues.apache.org/jira/browse/KYLIN-1965
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.5.2, v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1965.patch
>
>
> The duplicated measure's name will lead to query failed, so we should check 
> duplicated measure name.





[jira] [Updated] (KYLIN-1965) Check duplicated measure name

2016-08-21 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1965:
--
Attachment: (was: KYLIN-1965.patch)

> Check duplicated measure name
> -
>
> Key: KYLIN-1965
> URL: https://issues.apache.org/jira/browse/KYLIN-1965
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.5.2, v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1965.patch
>
>
> The duplicated measure's name will lead to query failed, so we should check 
> duplicated measure name.





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-08-21 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15429694#comment-15429694
 ] 

kangkaisen commented on KYLIN-1908:
---

I agree with you. 

Your refactoring is more appropriate, thank you.

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Fix For: v1.5.4
>
> Attachments: KYLIN-1908.patch, QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Updated] (KYLIN-1965) Check duplicated measure name

2016-08-19 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1965:
--
Attachment: KYLIN-1965.patch

This is the patch.

> Check duplicated measure name
> -
>
> Key: KYLIN-1965
> URL: https://issues.apache.org/jira/browse/KYLIN-1965
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.5.2, v1.5.3
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1965.patch
>
>
> The duplicated measure's name will lead to query failed, so we should check 
> duplicated measure name.





[jira] [Created] (KYLIN-1965) Check duplicated measure name

2016-08-19 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1965:
-

 Summary: Check duplicated measure name
 Key: KYLIN-1965
 URL: https://issues.apache.org/jira/browse/KYLIN-1965
 Project: Kylin
  Issue Type: Improvement
  Components: Metadata
Affects Versions: v1.5.2, v1.5.3
Reporter: kangkaisen
Assignee: kangkaisen


Duplicated measure names cause queries to fail, so we should check for
duplicated measure names.
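
A check of this kind can be sketched as below. This is a minimal illustration
under stated assumptions, not the attached patch: the class and method names
are hypothetical, measure names are compared as exact strings (whether Kylin
should compare them case-insensitively is not stated here), and the check is
assumed to run when the cube description is validated.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical validator, not part of Kylin: rejects a list of measure
// names if any name occurs more than once.
public class MeasureNameCheck {

    /** Returns the names that occur more than once, in first-repeat order. */
    public static Set<String> findDuplicates(List<String> measureNames) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String name : measureNames) {
            if (!seen.add(name)) { // add() returns false if already present
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    /** Throws if the list contains a duplicated measure name. */
    public static void validate(List<String> measureNames) {
        Set<String> duplicates = findDuplicates(measureNames);
        if (!duplicates.isEmpty()) {
            throw new IllegalStateException("Duplicated measure name(s): " + duplicates);
        }
    }
}
```

Failing early at save time, rather than at query time, is the design choice
this sketch assumes; the names reported in the exception are only the repeated
ones.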





[jira] [Updated] (KYLIN-1849) add basic search capability at model UI

2016-08-16 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1849:
--
Attachment: KYLIN-1849.patch

This patch adds support for searching cubes by name in the web UI.

> add basic search capability at model UI
> ---
>
> Key: KYLIN-1849
> URL: https://issues.apache.org/jira/browse/KYLIN-1849
> Project: Kylin
>  Issue Type: New Feature
>  Components: Web 
>Affects Versions: v1.5.2
>Reporter: Dayue Gao
>Assignee: kangkaisen
> Attachments: KYLIN-1849.patch
>
>
> In order to work with dozens of cubes, could we add a search box at "Model" 
> page? Just like the one at "Monitor" page.





[jira] [Assigned] (KYLIN-1849) add basic search capability at model UI

2016-08-16 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen reassigned KYLIN-1849:
-

Assignee: kangkaisen  (was: Zhong,Jason)

> add basic search capability at model UI
> ---
>
> Key: KYLIN-1849
> URL: https://issues.apache.org/jira/browse/KYLIN-1849
> Project: Kylin
>  Issue Type: New Feature
>  Components: Web 
>Affects Versions: v1.5.2
>Reporter: Dayue Gao
>Assignee: kangkaisen
>
> In order to work with dozens of cubes, could we add a search box at "Model" 
> page? Just like the one at "Monitor" page.





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-07-27 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15395655#comment-15395655
 ] 

kangkaisen commented on KYLIN-1908:
---

Hi, Yiming:

Firstly, `org.apache.hadoop.metrics2` is stable, reliable and easy-to-use.

Secondly, `org.apache.hadoop.metrics2` doesn't introduce other dependencies.

Thirdly, if one day Kylin no longer depends on `hadoop-common`, we could change
the code easily. But I think that is a long way off, because right now running
the Kylin server depends on HBase, and HBase depends on `hadoop-common`.

What do you think?

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1908.patch, QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Updated] (KYLIN-1908) Collect Metrics to JMX

2016-07-27 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1908:
--
Attachment: KYLIN-1908.patch

This patch adds query metrics at three levels: Server, Project and Cube.
The final ObjectNames are as shown below:

Hadoop:name=Server_Total,service=Kylin 
Hadoop:name=learn_kylin,service=Kylin.QueryCount  
Hadoop:name=learn_kylin,service=Kylin,sub=kylin_sales_cube.QueryCount

The first word of the ObjectName, "Hadoop", is hard-coded in
"org.apache.hadoop.metrics2", so I can't change it to "Kylin".

Each ObjectName has the following metrics:
QueryCount;
QueryFailCount;
QuerySuccessCount;
CacheHitCount, CacheHitCount60sNumOps, CacheHitCount300sNumOps, 
CacheHitCount3600sNumOps;

QueryLatency,QueryLatencyNumOps,QueryLatencyAvgTime,QueryLatencyMaxTime,
QueryLatencyMinTime,QueryLatency60s99thPercentile,QueryLatency300s99thPercentile,
QueryLatency3600s90thPercentile;

ResultRowCount and ScanRowCount are like QueryLatency.

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1908.patch, QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-07-21 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387471#comment-15387471
 ] 

kangkaisen commented on KYLIN-1908:
---

OK, I am working on this.

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Commented] (KYLIN-1908) Collect Metrics to JMX

2016-07-21 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387362#comment-15387362
 ] 

kangkaisen commented on KYLIN-1908:
---

JMX may not be the best solution, but it is a standard management and
monitoring technology in Java, and most Hadoop systems and monitoring systems
support it. If Kylin supports JMX, it will be easier for most enterprises to
set up their own monitoring systems, so I think Kylin should support JMX.

If you have a better solution, I am willing to implement it.

> Collect Metrics to JMX
> --
>
> Key: KYLIN-1908
> URL: https://issues.apache.org/jira/browse/KYLIN-1908
> Project: Kylin
>  Issue Type: New Feature
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: QueryMetrics.java
>
>
> As we all known, some performance metrics is important for enterprise 
> applications. so we should support to collect metrics to JMX in Kylin.
> The method I have done is As shown below:
> 1. use `org.apache.hadoop.metrics2` as the metrics collection framework.
> 2. define MBean Class for the metrics that we need to collect.
> 3. update metrics in right place.
> The questions I have:
> 1. can I depend on `org.apache.hadoop.metrics2` directly?
> 2. how do you think about my method?





[jira] [Created] (KYLIN-1908) Collect Metrics to JMX

2016-07-20 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1908:
-

 Summary: Collect Metrics to JMX
 Key: KYLIN-1908
 URL: https://issues.apache.org/jira/browse/KYLIN-1908
 Project: Kylin
  Issue Type: New Feature
  Components: Tools, Build and Test
Affects Versions: v1.5.2
Reporter: kangkaisen
Assignee: kangkaisen


As we all know, performance metrics are important for enterprise applications,
so Kylin should support collecting metrics to JMX.

My approach is as follows:

1. Use `org.apache.hadoop.metrics2` as the metrics collection framework.
2. Define an MBean class for the metrics that we need to collect.
3. Update the metrics in the right places.

My questions:
1. Can I depend on `org.apache.hadoop.metrics2` directly?
2. What do you think of my approach?
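
Steps 2 and 3 can be illustrated with a short sketch. Note that it deliberately
swaps in the JDK's plain `javax.management` API instead of
`org.apache.hadoop.metrics2`, so that it runs without any Hadoop dependency; it
is not the attached `QueryMetrics.java`. The class names, the `KylinSketch`
domain and the counters shown are assumptions made for illustration only.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical sketch using standard JMX (not hadoop metrics2).
public class JmxQueryMetricsSketch {

    // Step 2: a standard MBean interface; each getter becomes a JMX attribute.
    public interface QueryCounterMBean {
        long getQueryCount();
        long getQueryFailCount();
    }

    public static class QueryCounter implements QueryCounterMBean {
        private final AtomicLong queryCount = new AtomicLong();
        private final AtomicLong queryFailCount = new AtomicLong();

        // Step 3: called wherever a query finishes.
        public void record(boolean success) {
            queryCount.incrementAndGet();
            if (!success) {
                queryFailCount.incrementAndGet();
            }
        }

        @Override
        public long getQueryCount() {
            return queryCount.get();
        }

        @Override
        public long getQueryFailCount() {
            return queryFailCount.get();
        }
    }

    /** Registers a counter, records three queries, reads QueryCount via JMX. */
    public static long registerAndRead(String objectName) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(objectName);
        QueryCounter counter = new QueryCounter();
        server.registerMBean(counter, name);
        try {
            counter.record(true);
            counter.record(false);
            counter.record(true);
            return (Long) server.getAttribute(name, "QueryCount");
        } finally {
            server.unregisterMBean(name); // keep the platform server clean
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(registerAndRead("KylinSketch:name=Server_Total,service=Kylin"));
    }
}
```

With plain JMX the domain part of the ObjectName is chosen by the caller, which
is one visible difference from a framework that fixes the domain itself.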





[jira] [Commented] (KYLIN-1854) Allow deleting cube instance when its underlying cubedesc went wrong

2016-07-18 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383433#comment-15383433
 ] 

kangkaisen commented on KYLIN-1854:
---

OK,thanks.

> Allow deleting cube instance when its underlying cubedesc went wrong
> 
>
> Key: KYLIN-1854
> URL: https://issues.apache.org/jira/browse/KYLIN-1854
> Project: Kylin
>  Issue Type: Improvement
>Reporter: hongbin ma
>Assignee: hongbin ma
>






[jira] [Updated] (KYLIN-1896) JDBC support mybatis

2016-07-15 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1896:
--
Attachment: KYLIN-1896.patch

This is the patch.

> JDBC support mybatis
> 
>
> Key: KYLIN-1896
> URL: https://issues.apache.org/jira/browse/KYLIN-1896
> Project: Kylin
>  Issue Type: Bug
>  Components: Driver - JDBC
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1896.patch
>
>
> When our user used Mybatis, he found Mybatis need `columnClassType` in 
> `ColumnMetaData`. But in the current version of Kylin, when construct the 
> `ColumnMetaData`, the  last parameter `columnClassType` is null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KYLIN-1896) JDBC support mybatis

2016-07-15 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1896:
-

 Summary: JDBC support mybatis
 Key: KYLIN-1896
 URL: https://issues.apache.org/jira/browse/KYLIN-1896
 Project: Kylin
  Issue Type: Bug
  Components: Driver - JDBC
Affects Versions: v1.5.2
Reporter: kangkaisen
Assignee: kangkaisen


When one of our users used MyBatis, he found that MyBatis needs
`columnClassType` in `ColumnMetaData`. But in the current version of Kylin,
when constructing the `ColumnMetaData`, the last parameter `columnClassType`
is null.
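
One way to avoid a null class name is to derive it from the JDBC type code of
the column. The sketch below is an assumption about how such a mapping could
look, not the attached patch: the class and method names are hypothetical, only
a few `java.sql.Types` codes are covered, and the mapping follows the usual
JDBC type-to-class conventions.

```java
import java.math.BigDecimal;
import java.sql.Types;

// Hypothetical helper, not part of Kylin or Avatica: maps a JDBC type code
// to the fully qualified name of the Java class used for that column.
public class ColumnClassNameSketch {

    public static String classNameFor(int jdbcType) {
        switch (jdbcType) {
            case Types.CHAR:
            case Types.VARCHAR:
                return String.class.getName();
            case Types.BOOLEAN:
                return Boolean.class.getName();
            case Types.INTEGER:
                return Integer.class.getName();
            case Types.BIGINT:
                return Long.class.getName();
            case Types.FLOAT:
            case Types.DOUBLE:
                return Double.class.getName();
            case Types.DECIMAL:
                return BigDecimal.class.getName();
            case Types.DATE:
                return java.sql.Date.class.getName();
            case Types.TIMESTAMP:
                return java.sql.Timestamp.class.getName();
            default:
                // Unknown types fall back to Object instead of null.
                return Object.class.getName();
        }
    }
}
```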





[jira] [Commented] (KYLIN-1854) Allow deleting cube instance when its underlying cubedesc went wrong

2016-07-15 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15379051#comment-15379051
 ] 

kangkaisen commented on KYLIN-1854:
---

Hi Hongbin, does this feature support changing the column names of a Hive
table?

> Allow deleting cube instance when its underlying cubedesc went wrong
> 
>
> Key: KYLIN-1854
> URL: https://issues.apache.org/jira/browse/KYLIN-1854
> Project: Kylin
>  Issue Type: Improvement
>Reporter: hongbin ma
>Assignee: hongbin ma
>






[jira] [Closed] (KYLIN-1893) Upgrade spring-boot framework because of security vulnerabilities

2016-07-14 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen closed KYLIN-1893.
-
Resolution: Duplicate

> Upgrade spring-boot framework because of security vulnerabilities
> -
>
> Key: KYLIN-1893
> URL: https://issues.apache.org/jira/browse/KYLIN-1893
> Project: Kylin
>  Issue Type: Bug
>  Components: REST Service
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: Zhong,Jason
>Priority: Critical
>
> The Spring Boot Framework has a expression of SPEL type injection common 
> vulnerabilities, which affect versions is 1.1-1.3.0.
> we need upgrade to version 1.3.1 or later.
> https://www.chinacybersafety.com/tag/the-common-vulnerabilities-and-high-risk-vulnerabilities-early-warning-framework-spring-boot





[jira] [Commented] (KYLIN-1893) Upgrade spring-boot framework because of security vulnerabilities

2016-07-14 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376966#comment-15376966
 ] 

kangkaisen commented on KYLIN-1893:
---

OK, I see. Thanks!

> Upgrade spring-boot framework because of security vulnerabilities
> -
>
> Key: KYLIN-1893
> URL: https://issues.apache.org/jira/browse/KYLIN-1893
> Project: Kylin
>  Issue Type: Bug
>  Components: REST Service
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: Zhong,Jason
>Priority: Critical
>
> The Spring Boot Framework has a expression of SPEL type injection common 
> vulnerabilities, which affect versions is 1.1-1.3.0.
> we need upgrade to version 1.3.1 or later.
> https://www.chinacybersafety.com/tag/the-common-vulnerabilities-and-high-risk-vulnerabilities-early-warning-framework-spring-boot





[jira] [Created] (KYLIN-1893) Upgrade spring-boot framework because of security vulnerabilities

2016-07-14 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1893:
-

 Summary: Upgrade spring-boot framework because of security 
vulnerabilities
 Key: KYLIN-1893
 URL: https://issues.apache.org/jira/browse/KYLIN-1893
 Project: Kylin
  Issue Type: Bug
  Components: REST Service
Affects Versions: v1.5.2
Reporter: kangkaisen
Assignee: Zhong,Jason
Priority: Critical


The Spring Boot framework has a SpEL expression injection vulnerability that
affects versions 1.1 through 1.3.0.
We need to upgrade to version 1.3.1 or later.



https://www.chinacybersafety.com/tag/the-common-vulnerabilities-and-high-risk-vulnerabilities-early-warning-framework-spring-boot





[jira] [Updated] (KYLIN-1884) Reload metadata automatically after migrating cube

2016-07-13 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1884:
--
Attachment: KYLIN-1884.patch

this is the patch.

> Reload metadata automatically after migrating cube
> --
>
> Key: KYLIN-1884
> URL: https://issues.apache.org/jira/browse/KYLIN-1884
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-1884.patch
>
>
> in the current version of Kylin, after migrating cube we need reload metadata 
> manually.
> in our production environment, we have many restServers. 
> so, we hope to reload metadata automatically after migrating cube.





[jira] [Commented] (KYLIN-1884) Reload metadata automatically after migrating cube

2016-07-13 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374711#comment-15374711
 ] 

kangkaisen commented on KYLIN-1884:
---

I will do this work.

> Reload metadata automatically after migrating cube
> --
>
> Key: KYLIN-1884
> URL: https://issues.apache.org/jira/browse/KYLIN-1884
> Project: Kylin
>  Issue Type: Improvement
>  Components: Tools, Build and Test
>Affects Versions: v1.5.2
>Reporter: kangkaisen
>Assignee: kangkaisen
>
> in the current version of Kylin, after migrating cube we need reload metadata 
> manually.
> in our production environment, we have many restServers. 
> so, we hope to reload metadata automatically after migrating cube.





[jira] [Created] (KYLIN-1884) Reload metadata automatically after migrating cube

2016-07-13 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1884:
-

 Summary: Reload metadata automatically after migrating cube
 Key: KYLIN-1884
 URL: https://issues.apache.org/jira/browse/KYLIN-1884
 Project: Kylin
  Issue Type: Improvement
  Components: Tools, Build and Test
Affects Versions: v1.5.2
Reporter: kangkaisen
Assignee: kangkaisen


In the current version of Kylin, we need to reload metadata manually after
migrating a cube.
In our production environment we have many REST servers,
so we would like metadata to be reloaded automatically after migrating a cube.





[jira] [Updated] (KYLIN-1695) Skip cardinality calculation job when loading hive table

2016-05-21 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1695:
--
Attachment: KYLIN-1695.patch

This patch adds a checkbox to the "add hive table" web page that asks the user
whether to calculate the cardinality. The default value of the checkbox is true.

> Skip cardinality calculation job when loading hive table
> 
>
> Key: KYLIN-1695
> URL: https://issues.apache.org/jira/browse/KYLIN-1695
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.5.1
>Reporter: kangkaisen
>Assignee: Dong Li
> Attachments: KYLIN-1695.patch
>
>
> When user loads/reloads hive tables from web console, kylin will submit a mr 
> job asynchronously to calculate column cardinalities. This has four major 
> problems:
> # the calculated cardinality is stored in table metadata, but never used in 
> cubing/querying
> # table may change after loading, so the cardinality doesn't necessarily 
> reflect the actual value
> # the current `HiveColumnCardinalityJob` has many limitations, e.g., it 
> doesn't support views
> # the `HiveColumnCardinalityJob` may use lots of resources when computing 
> cardinality of partitioned table
> Due to these problems, we should disable it by default and (maybe) remove it 
> in future releases.





[jira] [Updated] (KYLIN-1695) Skip cardinality calculation job when loading hive table

2016-05-21 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1695:
--
Attachment: (was: KYLIN-1695.patch)

> Skip cardinality calculation job when loading hive table
> 
>
> Key: KYLIN-1695
> URL: https://issues.apache.org/jira/browse/KYLIN-1695
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.5.1
>Reporter: kangkaisen
>Assignee: Dong Li
>
> When a user loads/reloads hive tables from the web console, Kylin submits an MR 
> job asynchronously to calculate column cardinalities. This has four major 
> problems:
> # the calculated cardinality is stored in table metadata, but never used in 
> cubing/querying
> # the table may change after loading, so the cardinality doesn't necessarily 
> reflect the actual value
> # the current `HiveColumnCardinalityJob` has many limitations, e.g., it 
> doesn't support views
> # the `HiveColumnCardinalityJob` may use lots of resources when computing 
> the cardinality of a partitioned table
> Due to these problems, we should disable it by default and (maybe) remove it 
> in future releases.





[jira] [Commented] (KYLIN-1323) Improve performance of converting data to hfile

2016-05-19 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292640#comment-15292640
 ] 

kangkaisen commented on KYLIN-1323:
---

That's OK. I looked at your commit "a bug in hbase read-write separation mode" 
on the GitHub master branch. 
You may be missing another configuration in:

    Configuration conf = HadoopUtil.getCurrentConfiguration();
    SequenceFile.Writer hfilePartitionWriter = SequenceFile.createWriter(conf,
            SequenceFile.Writer.file(hfilePartitionFile),
            SequenceFile.Writer.keyClass(ImmutableBytesWritable.class),
            SequenceFile.Writer.valueClass(NullWritable.class));

> Improve performance of converting data to hfile
> ---
>
> Key: KYLIN-1323
> URL: https://issues.apache.org/jira/browse/KYLIN-1323
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v1.2
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: v1.4.0, v1.3.0, v1.5.2
>
> Attachments: KYLIN-1323-1.x-staging.2.patch, 
> KYLIN-1323-1.x-staging.patch, KYLIN-1323-2.x-staging.2.patch
>
>
> Suppose that we have 100GB of data after cuboid building, with a setting of 
> 10GB per region. Currently, 10 split keys are calculated, 10 regions are 
> created, and 10 reducers are used in the ‘convert to hfile’ step. 
> With the optimization, we could calculate 100 (or more) split keys and use all of 
> them in the ‘convert to hfile’ step, but sample 10 of those keys to create regions. 
> The result is still 10 regions created, but 100 reducers used in the ‘convert to 
> hfile’ step. Of course, 100 hfiles are created as well, and 10 files are loaded per 
> region. That should be fine and doesn’t affect query performance 
> dramatically.
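
The arithmetic in the description above can be sketched as follows. This is only an illustration of the idea, not the code from the KYLIN-1323 patches: the 1GB-per-hfile figure is an assumption chosen so that 100GB yields 100 pieces, the class and method names are hypothetical, and the sketch counts interior region boundaries, so 10 regions correspond to 9 boundary keys here.

```java
import java.util.ArrayList;
import java.util.List;

final class HFileSplitSketch {
    // Number of pieces needed to cover totalGb at pieceGb each (ceiling division).
    static int pieces(long totalGb, long pieceGb) {
        return (int) ((totalGb + pieceGb - 1) / pieceGb);
    }

    // Samples evenly spaced split indexes to serve as region boundaries.
    // regionCount regions need regionCount - 1 interior boundaries.
    static List<Integer> regionBoundaries(int splitCount, int regionCount) {
        List<Integer> picked = new ArrayList<>();
        for (int k = 1; k < regionCount; k++) {
            picked.add(k * splitCount / regionCount);
        }
        return picked;
    }

    public static void main(String[] args) {
        int regions = pieces(100, 10);  // 100GB at 10GB per region
        int reducers = pieces(100, 1);  // 100GB at an assumed 1GB per hfile
        System.out.println(regions + " regions, " + reducers + " reducers");
        System.out.println("region boundaries at split indexes "
                + regionBoundaries(reducers, regions));
    }
}
```

With these numbers each region ends up covering 10 of the 100 fine-grained splits, which matches the "load 10 files per region" outcome described in the issue.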





[jira] [Commented] (KYLIN-1323) Improve performance of converting data to hfile

2016-05-18 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290443#comment-15290443
 ] 

kangkaisen commented on KYLIN-1323:
---

Hi Shaofeng, there may be a small bug in "saveHFileSplits": when the HDFS 
cluster used by HBase is different from the HDFS cluster used for the MR job, the 
Configuration used in "saveHFileSplits" should be 
"HBaseConnection.getCurrentHBaseConfiguration()", not 
"HadoopUtil.getCurrentConfiguration()".

I have merged the KYLIN-1323 patch into our branch. After I changed the 
Configuration used in "saveHFileSplits", the cube was built successfully.

> Improve performance of converting data to hfile
> ---
>
> Key: KYLIN-1323
> URL: https://issues.apache.org/jira/browse/KYLIN-1323
> Project: Kylin
>  Issue Type: Improvement
>  Components: Job Engine
>Affects Versions: v1.2
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: v1.4.0, v1.3.0, v1.5.2
>
> Attachments: KYLIN-1323-1.x-staging.2.patch, 
> KYLIN-1323-1.x-staging.patch, KYLIN-1323-2.x-staging.2.patch
>
>
> Suppose that we have 100GB of data after cuboid building, with a setting of 
> 10GB per region. Currently, 10 split keys are calculated, 10 regions are 
> created, and 10 reducers are used in the ‘convert to hfile’ step. 
> With the optimization, we could calculate 100 (or more) split keys and use all of 
> them in the ‘convert to hfile’ step, but sample 10 of those keys to create regions. 
> The result is still 10 regions created, but 100 reducers used in the ‘convert to 
> hfile’ step. Of course, 100 hfiles are created as well, and 10 files are loaded per 
> region. That should be fine and doesn’t affect query performance 
> dramatically.





[jira] [Commented] (KYLIN-1695) disable cardinality calculation job when loading hive table

2016-05-17 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15286295#comment-15286295
 ] 

kangkaisen commented on KYLIN-1695:
---

OK, I will try to do it.

> disable cardinality calculation job when loading hive table
> ---
>
> Key: KYLIN-1695
> URL: https://issues.apache.org/jira/browse/KYLIN-1695
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.5.1
>Reporter: kangkaisen
>Assignee: Dong Li
> Attachments: KYLIN-1695.patch
>
>
> When a user loads/reloads hive tables from the web console, Kylin submits an MR 
> job asynchronously to calculate column cardinalities. This has four major 
> problems:
> # the calculated cardinality is stored in table metadata, but never used in 
> cubing/querying
> # the table may change after loading, so the cardinality doesn't necessarily 
> reflect the actual value
> # the current `HiveColumnCardinalityJob` has many limitations, e.g., it 
> doesn't support views
> # the `HiveColumnCardinalityJob` may use lots of resources when computing 
> the cardinality of a partitioned table
> Due to these problems, we should disable it by default and (maybe) remove it 
> in future releases.





[jira] [Commented] (KYLIN-1695) disable cardinality calculation job when loading hive table

2016-05-16 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285867#comment-15285867
 ] 

kangkaisen commented on KYLIN-1695:
---

Hi Shaofeng, OK, I know what you mean. I found the "Calculate Cardinality" 
button in "System"; I'll look at the relevant code.

> disable cardinality calculation job when loading hive table
> ---
>
> Key: KYLIN-1695
> URL: https://issues.apache.org/jira/browse/KYLIN-1695
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.5.1
>Reporter: kangkaisen
>Assignee: Dong Li
> Attachments: KYLIN-1695.patch
>
>
> When a user loads/reloads hive tables from the web console, Kylin submits an MR 
> job asynchronously to calculate column cardinalities. This has four major 
> problems:
> # the calculated cardinality is stored in table metadata, but never used in 
> cubing/querying
> # the table may change after loading, so the cardinality doesn't necessarily 
> reflect the actual value
> # the current `HiveColumnCardinalityJob` has many limitations, e.g., it 
> doesn't support views
> # the `HiveColumnCardinalityJob` may use lots of resources when computing 
> the cardinality of a partitioned table
> Due to these problems, we should disable it by default and (maybe) remove it 
> in future releases.





[jira] [Updated] (KYLIN-1695) disable cardinality calculation job when loading hive table

2016-05-16 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-1695:
--
Attachment: KYLIN-1695.patch

This patch adds one configuration property, "kylin.hive.calculate.cardinality.enabled"; 
the default is "false".
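
A minimal sketch of how such a flag might be read, assuming the value lives in a plain `java.util.Properties` object. The class and method names are hypothetical and this is not the code from the attached patch; only the property key and its "false" default come from the comment above.

```java
import java.util.Properties;

final class CardinalityConfigSketch {
    // Property key named in the comment above.
    static final String KEY = "kylin.hive.calculate.cardinality.enabled";

    // Returns the flag value, falling back to false when the key is absent.
    static boolean isCardinalityEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(KEY, "false").trim());
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Key not set: the cardinality job would be skipped.
        System.out.println(isCardinalityEnabled(props));

        // Explicit opt-in restores the previous behaviour.
        props.setProperty(KEY, "true");
        System.out.println(isCardinalityEnabled(props));
    }
}
```

Defaulting to "false" means existing deployments stop submitting the cardinality job unless they opt in explicitly.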

> disable cardinality calculation job when loading hive table
> ---
>
> Key: KYLIN-1695
> URL: https://issues.apache.org/jira/browse/KYLIN-1695
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.5.1
>Reporter: kangkaisen
>Assignee: Dong Li
> Attachments: KYLIN-1695.patch
>
>
> When a user loads/reloads hive tables from the web console, Kylin submits an MR 
> job asynchronously to calculate column cardinalities. This has four major 
> problems:
> # the calculated cardinality is stored in table metadata, but never used in 
> cubing/querying
> # the table may change after loading, so the cardinality doesn't necessarily 
> reflect the actual value
> # the current `HiveColumnCardinalityJob` has many limitations, e.g., it 
> doesn't support views
> # the `HiveColumnCardinalityJob` may use lots of resources when computing 
> the cardinality of a partitioned table
> Due to these problems, we should disable it by default and (maybe) remove it 
> in future releases.





[jira] [Commented] (KYLIN-1695) disable cardinality calculation job when loading hive table

2016-05-16 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15284239#comment-15284239
 ] 

kangkaisen commented on KYLIN-1695:
---

I would like to fix this bug.

> disable cardinality calculation job when loading hive table
> ---
>
> Key: KYLIN-1695
> URL: https://issues.apache.org/jira/browse/KYLIN-1695
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.5.1
>Reporter: kangkaisen
>Assignee: Dong Li
>
> When a user loads/reloads hive tables from the web console, Kylin submits an MR 
> job asynchronously to calculate column cardinalities. This has four major 
> problems:
> # the calculated cardinality is stored in table metadata, but never used in 
> cubing/querying
> # the table may change after loading, so the cardinality doesn't necessarily 
> reflect the actual value
> # the current `HiveColumnCardinalityJob` has many limitations, e.g., it 
> doesn't support views
> # the `HiveColumnCardinalityJob` may use lots of resources when computing 
> the cardinality of a partitioned table
> Due to these problems, we should disable it by default and (maybe) remove it 
> in future releases.





[jira] [Created] (KYLIN-1695) disable cardinality calculation job when loading hive table

2016-05-15 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1695:
-

 Summary: disable cardinality calculation job when loading hive 
table
 Key: KYLIN-1695
 URL: https://issues.apache.org/jira/browse/KYLIN-1695
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: v1.5.1
Reporter: kangkaisen
Assignee: Dong Li


When a user loads/reloads hive tables from the web console, Kylin submits an MR 
job asynchronously to calculate column cardinalities. This has four major 
problems:

# the calculated cardinality is stored in table metadata, but never used in 
cubing/querying
# the table may change after loading, so the cardinality doesn't necessarily 
reflect the actual value
# the current `HiveColumnCardinalityJob` has many limitations, e.g., it doesn't 
support views
# the `HiveColumnCardinalityJob` may use lots of resources when computing 
the cardinality of a partitioned table

Due to these problems, we should disable it by default and (maybe) remove it in 
future releases.





[jira] [Created] (KYLIN-1694) make multiply coefficient configurable when estimating cuboid size

2016-05-15 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-1694:
-

 Summary: make multiply coefficient configurable when estimating 
cuboid size
 Key: KYLIN-1694
 URL: https://issues.apache.org/jira/browse/KYLIN-1694
 Project: Kylin
  Issue Type: Bug
  Components: Job Engine
Affects Versions: v1.5.1, v1.5.0
Reporter: kangkaisen
Assignee: Dong Li


In the current version of the MRv2 build engine, CubeStatsReader estimates the 
cuboid size as follows: "cube is memory hungry, storage size 
estimation multiply 0.05" and "cube is not memory hungry, storage size 
estimation multiply 0.25".

This has one major problem: the default multiply coefficient is too small, which 
makes the estimated cuboid size much smaller than the actual 
cuboid size. As a result, both the number of HBase regions and the number of 
CubeHFileJob reducers are too small, which 
makes CubeHFileJob much slower.

After we removed the default multiply coefficient, CubeHFileJob 
became much faster.

We had better make the multiply coefficient configurable, which would be more 
user-friendly.
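
To make the effect concrete, here is a small sketch of how the coefficient changes the resulting region count. The 0.05 and 0.25 coefficients come from the description above; the 100GB raw estimate, the 10GB region size, and all class and method names are illustrative assumptions, not values or code from CubeStatsReader.

```java
final class CuboidSizeSketch {
    // Scales a raw size estimate (in MB) by the multiply coefficient.
    static double estimateMb(double rawMb, double coefficient) {
        return rawMb * coefficient;
    }

    // Regions needed to hold estimatedMb at regionCutMb per region (at least 1).
    static int regionCount(double estimatedMb, double regionCutMb) {
        return Math.max(1, (int) Math.ceil(estimatedMb / regionCutMb));
    }

    public static void main(String[] args) {
        double rawMb = 100 * 1024;       // assumed 100GB raw estimate
        double regionCutMb = 10 * 1024;  // assumed 10GB per region
        // 0.05 and 0.25 are the coefficients quoted in the issue; 1.0 means no scaling.
        for (double coefficient : new double[] {0.05, 0.25, 1.0}) {
            int regions = regionCount(estimateMb(rawMb, coefficient), regionCutMb);
            System.out.println("coefficient " + coefficient + " -> " + regions + " region(s)");
        }
    }
}
```

Under these assumed numbers the smallest coefficient collapses the estimate into a single region, while no scaling yields ten, which illustrates why a configurable coefficient would change the parallelism of CubeHFileJob.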


