[jira] [Updated] (KYLIN-2366) multi FactDistinctColumnsReducer task write the same output file

2017-01-07 Thread zheng zhifeng (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zheng zhifeng updated KYLIN-2366:
-
Description: 
  Multiple FactDistinctColumnsReducer tasks will write the 
distinct column to the same output file if the MapReduce
property mapreduce.reduce.speculative is true. This can
make the MapReduce job fail or leave the output file with redundant data.
  I think mapreduce.reduce.speculative should be set to false for the
Extract Fact Table Distinct Columns step.

  was:
  Multiple FactDistinctColumnsReducer tasks will write the 
distinct columns to the same output file if the MapReduce
property mapreduce.reduce.speculative is true. This can
make the MapReduce job fail or leave the output file with redundant data.
  I think mapreduce.reduce.speculative should be set to false for the
Extract Fact Table Distinct Columns step.


> multi FactDistinctColumnsReducer task write the same output file
> 
>
> Key: KYLIN-2366
> URL: https://issues.apache.org/jira/browse/KYLIN-2366
> Project: Kylin
>  Issue Type: Bug
>  Components: streaming
>Affects Versions: v1.6.0
>Reporter: zheng zhifeng
> Fix For: Future
>
>
>   Multiple FactDistinctColumnsReducer tasks will write the 
> distinct column to the same output file if the MapReduce
> property mapreduce.reduce.speculative is true. This can
> make the MapReduce job fail or leave the output file with redundant data.
>   I think mapreduce.reduce.speculative should be set to false for the
> Extract Fact Table Distinct Columns step.
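
The proposal above amounts to overriding one property for a single job step. A minimal sketch of that idea follows. It assumes a plain key/value map as a stand-in for the Hadoop job configuration so that it runs without Hadoop on the classpath; the class name FactDistinctJobConf and the method forFactDistinctStep are hypothetical and are not Kylin code. Only the property name comes from the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a Hadoop job configuration: a plain key/value map,
// so the sketch runs without Hadoop on the classpath.
class FactDistinctJobConf {
    // Property name taken from the issue description.
    static final String REDUCE_SPECULATIVE = "mapreduce.reduce.speculative";

    // Returns a copy of the cluster defaults with reduce-side speculation
    // switched off for this one step only; the defaults are left untouched.
    static Map<String, String> forFactDistinctStep(Map<String, String> clusterDefaults) {
        Map<String, String> conf = new LinkedHashMap<>(clusterDefaults);
        conf.put(REDUCE_SPECULATIVE, "false");
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put(REDUCE_SPECULATIVE, "true");
        Map<String, String> stepConf = forFactDistinctStep(defaults);
        System.out.println(REDUCE_SPECULATIVE + "=" + stepConf.get(REDUCE_SPECULATIVE));
    }
}
```

Copying the defaults keeps speculation enabled for other steps, which seems to match the intent of changing only the Extract Fact Table Distinct Columns step. With a real Hadoop job, the same property would be set on that job's own configuration rather than on a map.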



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KYLIN-2366) multi FactDistinctColumnsReducer task write the same output file

2017-01-07 Thread zheng zhifeng (JIRA)
zheng zhifeng created KYLIN-2366:


 Summary: multi FactDistinctColumnsReducer task write the same 
output file
 Key: KYLIN-2366
 URL: https://issues.apache.org/jira/browse/KYLIN-2366
 Project: Kylin
  Issue Type: Bug
  Components: streaming
Affects Versions: v1.6.0
Reporter: zheng zhifeng
 Fix For: Future


  Multiple FactDistinctColumnsReducer tasks will write the 
distinct columns to the same output file if the MapReduce
property mapreduce.reduce.speculative is true. This can
make the MapReduce job fail or leave the output file with redundant data.
  I think mapreduce.reduce.speculative should be set to false for the
Extract Fact Table Distinct Columns step.





[jira] [Created] (KYLIN-2365) multi FactDistinctColumnsReducer task write the same output file

2017-01-07 Thread zheng zhifeng (JIRA)
zheng zhifeng created KYLIN-2365:


 Summary: multi FactDistinctColumnsReducer task write the same 
output file
 Key: KYLIN-2365
 URL: https://issues.apache.org/jira/browse/KYLIN-2365
 Project: Kylin
  Issue Type: Bug
  Components: streaming
Affects Versions: v1.6.0
Reporter: zheng zhifeng
 Fix For: Future


  Multiple FactDistinctColumnsReducer tasks will write the 
distinct columns to the same output file if the MapReduce
property mapreduce.reduce.speculative is true. This can
make the MapReduce job fail or leave the output file with redundant data.
  I think mapreduce.reduce.speculative should be set to false for the
Extract Fact Table Distinct Columns step.





[jira] [Commented] (KYLIN-2364) Output table name to error info in LookupTable

2017-01-07 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15808688#comment-15808688
 ] 

kangkaisen commented on KYLIN-2364:
---

OK. Thank you, Billy.

> Output table name to error info in LookupTable
> --
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Fix For: v2.0.0
>
> Attachments: KYLIN-2364.patch
>
>
> We should output the table name so that the user knows which LookupTable is broken.





[jira] [Resolved] (KYLIN-2364) Output table name to error info in LookupTable

2017-01-07 Thread Billy Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Billy Liu resolved KYLIN-2364.
--
   Resolution: Fixed
Fix Version/s: v2.0.0

> Output table name to error info in LookupTable
> --
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Fix For: v2.0.0
>
> Attachments: KYLIN-2364.patch
>
>
> We should output the table name so that the user knows which LookupTable is broken.





[jira] [Commented] (KYLIN-2364) Output table name to error info in LookupTable

2017-01-07 Thread Billy Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15808675#comment-15808675
 ] 

Billy Liu commented on KYLIN-2364:
--

Thanks, [~kangkaisen], patch merged. 
https://github.com/apache/kylin/commit/e46d699e05100db084db354d7efb3786575d5c54

> Output table name to error info in LookupTable
> --
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Attachments: KYLIN-2364.patch
>
>
> We should output the table name so that the user knows which LookupTable is broken.





[jira] [Updated] (KYLIN-2364) Output table name to error info in LookupTable

2017-01-07 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-2364:
--
Attachment: KYLIN-2364.patch

This is the patch.

> Output table name to error info in LookupTable
> --
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Attachments: KYLIN-2364.patch
>
>
> We should output the table name so that the user knows which LookupTable is broken.





[jira] [Commented] (KYLIN-2338) refactor BitmapCounter.DataInputByteBuffer

2017-01-07 Thread kangkaisen (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807346#comment-15807346
 ] 

kangkaisen commented on KYLIN-2338:
---

OK, thank you very much.

> refactor BitmapCounter.DataInputByteBuffer
> --
>
> Key: KYLIN-2338
> URL: https://issues.apache.org/jira/browse/KYLIN-2338
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Fix For: v2.0.0
>
> Attachments: KYLIN-2338.patch
>
>
> Make BitmapCounter.DataInputByteBuffer simpler and more readable.





[jira] [Resolved] (KYLIN-2338) refactor BitmapCounter.DataInputByteBuffer

2017-01-07 Thread liyang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang resolved KYLIN-2338.
---
   Resolution: Fixed
Fix Version/s: v2.0.0

> refactor BitmapCounter.DataInputByteBuffer
> --
>
> Key: KYLIN-2338
> URL: https://issues.apache.org/jira/browse/KYLIN-2338
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Fix For: v2.0.0
>
> Attachments: KYLIN-2338.patch
>
>
> Make BitmapCounter.DataInputByteBuffer simpler and more readable.





[jira] [Resolved] (KYLIN-2353) Serialize BitmapCounter with distinct count

2017-01-07 Thread liyang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liyang resolved KYLIN-2353.
---
   Resolution: Fixed
Fix Version/s: v2.0.0

> Serialize BitmapCounter with distinct count
> ---
>
> Key: KYLIN-2353
> URL: https://issues.apache.org/jira/browse/KYLIN-2353
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
> Fix For: v2.0.0
>
> Attachments: KYLIN-2353.patch
>
>
> Currently, we deserialize the bitmap whether or not we need to aggregate.
> Instead, we could serialize {{BitmapCounter}} together with its distinct count, 
> read only that count when deserializing, and delay deserializing the bitmap 
> itself until we need to aggregate.
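
The lazy deserialization described above can be sketched as follows. This is not the KYLIN-2353 patch: the class name LazyBitmapCounter is hypothetical, java.util.BitSet stands in for the real bitmap type, and the wire layout (count, payload length, payload) is an assumption made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;

// Sketch only: java.util.BitSet stands in for the real bitmap type, and the
// wire layout (count, payload length, payload) is an assumption.
class LazyBitmapCounter {
    private final long count;      // distinct count, always available
    private final byte[] payload;  // serialized bitmap bytes, not yet decoded
    private BitSet bitmap;         // decoded on first use only

    private LazyBitmapCounter(long count, byte[] payload, BitSet bitmap) {
        this.count = count;
        this.payload = payload;
        this.bitmap = bitmap;
    }

    static LazyBitmapCounter of(BitSet bits) {
        BitSet copy = (BitSet) bits.clone();
        return new LazyBitmapCounter(copy.cardinality(), copy.toByteArray(), copy);
    }

    // Writes the count first so a reader can answer getCount() without decoding.
    void writeTo(ByteBuffer out) {
        out.putLong(count);
        out.putInt(payload.length);
        out.put(payload);
    }

    // Reads the count and copies the raw bytes; the bitmap is not decoded here.
    static LazyBitmapCounter readFrom(ByteBuffer in) {
        long count = in.getLong();
        byte[] raw = new byte[in.getInt()];
        in.get(raw);
        return new LazyBitmapCounter(count, raw, null);
    }

    long getCount() {
        return count;
    }

    boolean isDecoded() {
        return bitmap != null;
    }

    private BitSet bitmap() {
        if (bitmap == null) {
            bitmap = BitSet.valueOf(payload);
        }
        return bitmap;
    }

    // Aggregation is the only path that needs the decoded bitmap.
    LazyBitmapCounter merge(LazyBitmapCounter other) {
        BitSet union = (BitSet) bitmap().clone();
        union.or(other.bitmap());
        return of(union);
    }
}
```

In this sketch a counter read back with readFrom answers getCount() from the stored value alone, and only merge() pays the cost of decoding the bitmap.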





[jira] [Created] (KYLIN-2364) Output table name to error info in LookupTable

2017-01-07 Thread kangkaisen (JIRA)
kangkaisen created KYLIN-2364:
-

 Summary: Output table name to error info in LookupTable
 Key: KYLIN-2364
 URL: https://issues.apache.org/jira/browse/KYLIN-2364
 Project: Kylin
  Issue Type: Improvement
  Components: Metadata
Affects Versions: v1.6.0
Reporter: kangkaisen
Assignee: kangkaisen
Priority: Minor


We should output the table name so that the user knows which LookupTable is broken.





[jira] [Updated] (KYLIN-2357) Make ERROR_RECORD_LOG_THRESHOLD configurable

2017-01-07 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-2357:
--
Attachment: KYLIN-2357.patch

This is the patch.

> Make ERROR_RECORD_LOG_THRESHOLD configurable
> 
>
> Key: KYLIN-2357
> URL: https://issues.apache.org/jira/browse/KYLIN-2357
> Project: Kylin
>  Issue Type: Bug
>  Components: Job Engine
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Attachments: KYLIN-2357.patch
>
>
> Currently, {{BatchConstants.ERROR_RECORD_LOG_THRESHOLD}} is hardcoded to 
> 100. I wonder why we accept error records at all. 
> Normally, cubing should produce zero error records. Besides, even a single 
> error record makes the query results differ from Hive or Presto.
> So I think we could make ERROR_RECORD_LOG_THRESHOLD configurable, with a 
> default value of 0.
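
A configurable threshold with a default of 0, as proposed above, could look roughly like this. The sketch is not the attached patch: the property key is hypothetical and not an actual Kylin setting, and it assumes the threshold means the number of error records tolerated before the job fails.

```java
import java.util.Properties;

// Sketch only: the property key below is hypothetical, not an actual Kylin setting.
class ErrorRecordThreshold {
    static final String KEY = "example.job.error-record-threshold";
    static final int DEFAULT = 0; // default value proposed in the issue

    // Reads the threshold from configuration, falling back to the default.
    static int fromConfig(Properties props) {
        return Integer.parseInt(props.getProperty(KEY, String.valueOf(DEFAULT)).trim());
    }

    // Fails as soon as the number of error records exceeds the threshold.
    static void check(long errorRecords, int threshold) {
        if (errorRecords > threshold) {
            throw new IllegalStateException(
                "Error records " + errorRecords + " exceed threshold " + threshold);
        }
    }
}
```

With the default of 0 the first error record already fails the check, while a deployment that wants the old behaviour could set the value back to 100.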





[jira] [Updated] (KYLIN-2308) Allow user to set more columnFamily in web

2017-01-07 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-2308:
--
Affects Version/s: (was: v2.0.0)
   v1.6.0

> Allow user to set more columnFamily in web 
> ---
>
> Key: KYLIN-2308
> URL: https://issues.apache.org/jira/browse/KYLIN-2308
> Project: Kylin
>  Issue Type: Improvement
>  Components: Web 
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-2308.patch
>
>
> Currently, when a user sets dozens of precise count distinct metrics in one 
> cube, we put all the count distinct metric columns in one columnFamily. This 
> makes the HBase scan slow because a single {{KeyValue}} is too big. We 
> could set more columnFamilies to speed up the HBase scan in this scenario.
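
To illustrate the idea of spreading measures over several column families, here is a minimal sketch. It is not the attached patch: the class name ColumnFamilyPlanner and the fixed group size are hypothetical, and in the proposal the user would choose the grouping in the web UI rather than have it computed.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: groups measure names into column families of a fixed maximum size.
class ColumnFamilyPlanner {
    // Splits the measures into groups of at most maxPerFamily, one group per column family.
    static List<List<String>> split(List<String> measures, int maxPerFamily) {
        if (maxPerFamily < 1) {
            throw new IllegalArgumentException("maxPerFamily must be >= 1");
        }
        List<List<String>> families = new ArrayList<>();
        for (int i = 0; i < measures.size(); i += maxPerFamily) {
            int end = Math.min(i + maxPerFamily, measures.size());
            families.add(new ArrayList<>(measures.subList(i, end)));
        }
        return families;
    }
}
```

Smaller groups mean that each family holds fewer large count distinct values, which is the effect the issue is after.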





[jira] [Updated] (KYLIN-2308) Allow user to set more columnFamily in web

2017-01-07 Thread kangkaisen (JIRA)

 [ 
https://issues.apache.org/jira/browse/KYLIN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kangkaisen updated KYLIN-2308:
--
Attachment: KYLIN-2308.patch

This is the patch.

> Allow user to set more columnFamily in web 
> ---
>
> Key: KYLIN-2308
> URL: https://issues.apache.org/jira/browse/KYLIN-2308
> Project: Kylin
>  Issue Type: Improvement
>  Components: Web 
>Affects Versions: v1.6.0
>Reporter: kangkaisen
>Assignee: kangkaisen
> Attachments: KYLIN-2308.patch
>
>
> Currently, when a user sets dozens of precise count distinct metrics in one 
> cube, we put all the count distinct metric columns in one columnFamily. This 
> makes the HBase scan slow because a single {{KeyValue}} is too big. We 
> could set more columnFamilies to speed up the HBase scan in this scenario.





[jira] [Commented] (KYLIN-2338) refactor BitmapCounter.DataInputByteBuffer

2017-01-07 Thread liyang (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807281#comment-15807281
 ] 

liyang commented on KYLIN-2338:
---

I'm merging this one and KYLIN-2349, KYLIN-2353

> refactor BitmapCounter.DataInputByteBuffer
> --
>
> Key: KYLIN-2338
> URL: https://issues.apache.org/jira/browse/KYLIN-2338
> Project: Kylin
>  Issue Type: Improvement
>  Components: Metadata
>Reporter: kangkaisen
>Assignee: kangkaisen
>Priority: Minor
> Attachments: KYLIN-2338.patch
>
>
> Make BitmapCounter.DataInputByteBuffer simpler and more readable.





[jira] [Commented] (KYLIN-2363) support limit of dimensions in a cuboid

2017-01-07 Thread fengYu (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807081#comment-15807081
 ] 

fengYu commented on KYLIN-2363:
---

Yes, setting a range or enumerating all levels to be calculated is a more 
user-friendly solution.

> support limit of dimensions in a cuboid
> ---
>
> Key: KYLIN-2363
> URL: https://issues.apache.org/jira/browse/KYLIN-2363
> Project: Kylin
>  Issue Type: Improvement
>Reporter: fengYu
>
> The scenario is like this:
> I have 20+ dimensions. However, queries will use at most 5 of those dimensions, 
> so any cuboid that contains more than 5 dimensions (except the base cuboid) 
> is useless.
> I think we can add a configuration in the cube that limits the maximum number 
> of dimensions a cuboid includes.
> What's more, we can configure which levels (number of dimensions) need to be 
> calculated. In the above scenario, we only calculate levels 1, 2, 3, 4 and 5, and skip the levels above 5.
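
The effect of such a limit can be sketched by enumerating cuboids as bitmasks, one bit per dimension. This is an illustration only, not Kylin's cuboid scheduler: the class name CuboidLimiter is hypothetical, and the sketch ignores aggregation groups, mandatory dimensions and hierarchies.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: cuboids are modelled as bitmasks over the dimensions, one bit per dimension.
class CuboidLimiter {
    // Returns the cuboid masks to build: every cuboid with 1..maxDims dimensions,
    // plus the base cuboid that contains all dimensions.
    static List<Long> cuboidsToBuild(int dimensionCount, int maxDims) {
        if (dimensionCount < 1 || dimensionCount > 62) {
            throw new IllegalArgumentException("dimensionCount must be in 1..62");
        }
        long base = (1L << dimensionCount) - 1;
        List<Long> result = new ArrayList<>();
        for (long mask = 1; mask <= base; mask++) {
            if (mask == base || Long.bitCount(mask) <= maxDims) {
                result.add(mask);
            }
        }
        return result;
    }
}
```

Under these simplifications, 20 dimensions with a limit of 5 leave 21,700 cuboids to build out of 1,048,575 non-empty combinations. Enumerating every mask is only practical for small dimension counts like this one.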





[jira] [Commented] (KYLIN-2363) support limit of dimensions in a cuboid

2017-01-07 Thread XIE FAN (JIRA)

[ 
https://issues.apache.org/jira/browse/KYLIN-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807070#comment-15807070
 ] 

XIE FAN commented on KYLIN-2363:


I agree with this idea. And I think we can extend this idea a little: allow 
users to select exactly which cuboids they need in a visual way. Users can 
choose what cuboids they need in the front-end and only these cuboids will be 
materialized. For example, users can choose to calculate all the 1-D and 2-D 
cuboids and part of the 3-D and 4-D cuboids, and exclude the others.

> support limit of dimensions in a cuboid
> ---
>
> Key: KYLIN-2363
> URL: https://issues.apache.org/jira/browse/KYLIN-2363
> Project: Kylin
>  Issue Type: Improvement
>Reporter: fengYu
>
> The scenario is like this:
> I have 20+ dimensions. However, queries will use at most 5 of those dimensions, 
> so any cuboid that contains more than 5 dimensions (except the base cuboid) 
> is useless.
> I think we can add a configuration in the cube that limits the maximum number 
> of dimensions a cuboid includes.
> What's more, we can configure which levels (number of dimensions) need to be 
> calculated. In the above scenario, we only calculate levels 1, 2, 3, 4 and 5, and skip the levels above 5.


