[jira] [Updated] (KYLIN-2366) multi FactDistinctColumnsReducer task write the same output file
[ https://issues.apache.org/jira/browse/KYLIN-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zheng zhifeng updated KYLIN-2366:
--------------------------------
    Description: Multiple FactDistinctColumnsReducer tasks will write the distinct column to the same output file if the MapReduce property mapreduce.reduce.speculative is true. This can make the MapReduce job fail or leave redundant data in the output file. I think mapreduce.reduce.speculative should be set to false for the Extract Fact Table Distinct Columns step.

  (was: Multiple FactDistinctColumnsReducer tasks will write the distinct columns to the same output file if the MapReduce property mapreduce.reduce.speculative is true. This can make the MapReduce job fail or leave redundant data in the output file. I think mapreduce.reduce.speculative should be set to false for the Extract Fact Table Distinct Columns step.)

> multi FactDistinctColumnsReducer task write the same output file
> ----------------------------------------------------------------
>
> Key: KYLIN-2366
> URL: https://issues.apache.org/jira/browse/KYLIN-2366
> Project: Kylin
> Issue Type: Bug
> Components: streaming
> Affects Versions: v1.6.0
> Reporter: zheng zhifeng
> Fix For: Future
>
> Multiple FactDistinctColumnsReducer tasks will write the
> distinct column to the same output file if the MapReduce
> property mapreduce.reduce.speculative is true. This can
> make the MapReduce job fail or leave redundant data in the output file.
> I think mapreduce.reduce.speculative should be set to false for the
> Extract Fact Table Distinct Columns step.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (KYLIN-2366) multi FactDistinctColumnsReducer task write the same output file
zheng zhifeng created KYLIN-2366:
---------------------------------
    Summary: multi FactDistinctColumnsReducer task write the same output file
    Key: KYLIN-2366
    URL: https://issues.apache.org/jira/browse/KYLIN-2366
    Project: Kylin
    Issue Type: Bug
    Components: streaming
    Affects Versions: v1.6.0
    Reporter: zheng zhifeng
    Fix For: Future

Multiple FactDistinctColumnsReducer tasks will write the distinct columns to the same output file if the MapReduce property mapreduce.reduce.speculative is true. This can make the MapReduce job fail or leave redundant data in the output file. I think mapreduce.reduce.speculative should be set to false for the Extract Fact Table Distinct Columns step.
[jira] [Created] (KYLIN-2365) multi FactDistinctColumnsReducer task write the same output file
zheng zhifeng created KYLIN-2365:
---------------------------------
    Summary: multi FactDistinctColumnsReducer task write the same output file
    Key: KYLIN-2365
    URL: https://issues.apache.org/jira/browse/KYLIN-2365
    Project: Kylin
    Issue Type: Bug
    Components: streaming
    Affects Versions: v1.6.0
    Reporter: zheng zhifeng
    Fix For: Future

Multiple FactDistinctColumnsReducer tasks will write the distinct columns to the same output file if the MapReduce property mapreduce.reduce.speculative is true. This can make the MapReduce job fail or leave redundant data in the output file. I think mapreduce.reduce.speculative should be set to false for the Extract Fact Table Distinct Columns step.
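The proposal in KYLIN-2365/KYLIN-2366 amounts to a per-step override of one job property. The sketch below only illustrates that idea in Python. The property name mapreduce.reduce.speculative and the step name come from the report, but the dictionaries and the conf_for_step helper are hypothetical and do not reflect how Kylin actually assembles its job configuration.

```python
# Hypothetical cluster-wide defaults; Hadoop enables reduce speculation by default.
BASE_JOB_CONF = {"mapreduce.reduce.speculative": "true"}

# Hypothetical per-step overrides, keyed by the step name used in the report.
STEP_OVERRIDES = {
    "Extract Fact Table Distinct Columns": {
        "mapreduce.reduce.speculative": "false",
    },
}


def conf_for_step(step_name, base=BASE_JOB_CONF, overrides=STEP_OVERRIDES):
    """Return a copy of the base configuration with the step's overrides applied."""
    conf = dict(base)                          # never mutate the shared defaults
    conf.update(overrides.get(step_name, {}))  # steps without overrides keep defaults
    return conf


if __name__ == "__main__":
    print(conf_for_step("Extract Fact Table Distinct Columns"))
```

With speculation disabled for that one step, only a single attempt of each reducer runs, so no second attempt can write to the same output file. Other steps keep the cluster default.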
[jira] [Commented] (KYLIN-2364) Output table name to error info in LookupTable
[ https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15808688#comment-15808688 ]

kangkaisen commented on KYLIN-2364:
-----------------------------------
OK. Thank you, Billy.

> Output table name to error info in LookupTable
> ----------------------------------------------
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Fix For: v2.0.0
> Attachments: KYLIN-2364.patch
>
> We should output the table name so that the user knows which LookupTable is broken.
[jira] [Resolved] (KYLIN-2364) Output table name to error info in LookupTable
[ https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billy Liu resolved KYLIN-2364.
-----------------------------
    Resolution: Fixed
    Fix Version/s: v2.0.0

> Output table name to error info in LookupTable
> ----------------------------------------------
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Fix For: v2.0.0
> Attachments: KYLIN-2364.patch
>
> We should output the table name so that the user knows which LookupTable is broken.
[jira] [Commented] (KYLIN-2364) Output table name to error info in LookupTable
[ https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15808675#comment-15808675 ]

Billy Liu commented on KYLIN-2364:
----------------------------------
Thanks, [~kangkaisen], patch merged.
https://github.com/apache/kylin/commit/e46d699e05100db084db354d7efb3786575d5c54

> Output table name to error info in LookupTable
> ----------------------------------------------
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2364.patch
>
> We should output the table name so that the user knows which LookupTable is broken.
[jira] [Updated] (KYLIN-2364) Output table name to error info in LookupTable
[ https://issues.apache.org/jira/browse/KYLIN-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen updated KYLIN-2364:
-----------------------------
    Attachment: KYLIN-2364.patch

This is the patch.

> Output table name to error info in LookupTable
> ----------------------------------------------
>
> Key: KYLIN-2364
> URL: https://issues.apache.org/jira/browse/KYLIN-2364
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2364.patch
>
> We should output the table name so that the user knows which LookupTable is broken.
[jira] [Commented] (KYLIN-2338) refactor BitmapCounter.DataInputByteBuffer
[ https://issues.apache.org/jira/browse/KYLIN-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807346#comment-15807346 ]

kangkaisen commented on KYLIN-2338:
-----------------------------------
OK, thank you very much.

> refactor BitmapCounter.DataInputByteBuffer
> ------------------------------------------
>
> Key: KYLIN-2338
> URL: https://issues.apache.org/jira/browse/KYLIN-2338
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Fix For: v2.0.0
> Attachments: KYLIN-2338.patch
>
> Make BitmapCounter.DataInputByteBuffer simpler and more readable.
[jira] [Resolved] (KYLIN-2338) refactor BitmapCounter.DataInputByteBuffer
[ https://issues.apache.org/jira/browse/KYLIN-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang resolved KYLIN-2338.
--------------------------
    Resolution: Fixed
    Fix Version/s: v2.0.0

> refactor BitmapCounter.DataInputByteBuffer
> ------------------------------------------
>
> Key: KYLIN-2338
> URL: https://issues.apache.org/jira/browse/KYLIN-2338
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Fix For: v2.0.0
> Attachments: KYLIN-2338.patch
>
> Make BitmapCounter.DataInputByteBuffer simpler and more readable.
[jira] [Resolved] (KYLIN-2353) Serialize BitmapCounter with distinct count
[ https://issues.apache.org/jira/browse/KYLIN-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liyang resolved KYLIN-2353.
--------------------------
    Resolution: Fixed
    Fix Version/s: v2.0.0

> Serialize BitmapCounter with distinct count
> -------------------------------------------
>
> Key: KYLIN-2353
> URL: https://issues.apache.org/jira/browse/KYLIN-2353
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Fix For: v2.0.0
> Attachments: KYLIN-2353.patch
>
> Currently, we deserialize the bitmap whether or not we need to aggregate.
> Instead, we could serialize {{BitmapCounter}} together with its distinct count,
> read only the count when deserializing, and delay deserializing the bitmap
> itself until we actually need to aggregate it.
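The lazy-deserialization idea described in KYLIN-2353 can be sketched as follows. This is a minimal Python illustration, not the code from KYLIN-2353.patch: the class name LazyBitmapCounter, the 8-byte count prefix, and the use of a plain set of integers in place of a real bitmap are all assumptions made for the example.

```python
import struct


class LazyBitmapCounter:
    """Hypothetical stand-in for BitmapCounter; a set of ints plays the bitmap."""

    def __init__(self, values=None, raw=None, count=None):
        if values is None and (raw is None or count is None):
            raise ValueError("need either values, or raw bytes plus a count")
        self._values = set(values) if values is not None else None
        self._raw = raw                # undecoded payload, if any
        self._count = count if count is not None else len(self._values)
        self.decoded = False           # set once the payload is actually decoded

    def serialize(self):
        """Write the distinct count first, then the bitmap payload."""
        values = sorted(self._materialize())
        payload = struct.pack(f">{len(values)}i", *values)
        return struct.pack(">q", len(values)) + payload

    @classmethod
    def deserialize(cls, data):
        """Read only the count prefix and keep the payload as raw bytes."""
        (count,) = struct.unpack_from(">q", data, 0)
        return cls(raw=data[8:], count=count)

    def get_count(self):
        """Answer from the stored count without touching the payload."""
        return self._count

    def _materialize(self):
        """Decode the payload on first use only."""
        if self._values is None:
            n = len(self._raw) // 4
            self._values = set(struct.unpack(f">{n}i", self._raw))
            self.decoded = True
        return self._values

    def merge(self, other):
        """Aggregation is the one operation that needs the real bitmap."""
        self._materialize().update(other._materialize())
        self._count = len(self._values)
        self._raw = None               # the raw bytes are stale after a merge
```

A counter that is only read through get_count() never pays the decoding cost. The payload is decoded on the first merge().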
[jira] [Created] (KYLIN-2364) Output table name to error info in LookupTable
kangkaisen created KYLIN-2364:
------------------------------
    Summary: Output table name to error info in LookupTable
    Key: KYLIN-2364
    URL: https://issues.apache.org/jira/browse/KYLIN-2364
    Project: Kylin
    Issue Type: Improvement
    Components: Metadata
    Affects Versions: v1.6.0
    Reporter: kangkaisen
    Assignee: kangkaisen
    Priority: Minor

We should output the table name so that the user knows which LookupTable is broken.
[jira] [Updated] (KYLIN-2357) Make ERROR_RECORD_LOG_THRESHOLD configurable
[ https://issues.apache.org/jira/browse/KYLIN-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen updated KYLIN-2357:
-----------------------------
    Attachment: KYLIN-2357.patch

This is the patch.

> Make ERROR_RECORD_LOG_THRESHOLD configurable
> --------------------------------------------
>
> Key: KYLIN-2357
> URL: https://issues.apache.org/jira/browse/KYLIN-2357
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2357.patch
>
> Currently, {{BatchConstants.ERROR_RECORD_LOG_THRESHOLD}} is hardcoded to
> 100. I wonder why we accept error records at all.
> Normally, cubing should produce zero error records. Besides, even a single
> error record makes the query results differ from Hive or Presto.
> So I think we could make ERROR_RECORD_LOG_THRESHOLD configurable, with a
> default value of 0.
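A configurable error-record threshold of the kind requested in KYLIN-2357 could behave roughly as sketched below. This Python sketch assumes that the job tolerates up to `threshold` bad records and fails on the next one. The class ErrorRecordTracker and its method names are hypothetical and are not taken from Kylin's code or from the attached patch.

```python
class ErrorRecordTracker:
    """Counts bad records and fails the job once the threshold is exceeded."""

    def __init__(self, threshold=0):
        # Default of 0, as proposed in the issue: the first bad record fails the job.
        self.threshold = threshold
        self.error_count = 0

    def record_error(self, record):
        """Register one bad record; raise once more than `threshold` were seen."""
        self.error_count += 1
        if self.error_count > self.threshold:
            raise RuntimeError(
                f"{self.error_count} error record(s) exceed threshold "
                f"{self.threshold}; last bad record: {record!r}"
            )


if __name__ == "__main__":
    lenient = ErrorRecordTracker(threshold=100)  # the value currently hardcoded
    lenient.record_error("bad,row")              # tolerated: 1 <= 100
    strict = ErrorRecordTracker()                # proposed default of 0
    try:
        strict.record_error("bad,row")
    except RuntimeError as exc:
        print("job failed:", exc)
```

With the default of 0, the build fails on the first bad record instead of producing a cube whose results differ from Hive or Presto.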
[jira] [Updated] (KYLIN-2308) Allow user to set more columnFamily in web
[ https://issues.apache.org/jira/browse/KYLIN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen updated KYLIN-2308:
-----------------------------
    Affects Version/s: (was: v2.0.0)
                       v1.6.0

> Allow user to set more columnFamily in web
> ------------------------------------------
>
> Key: KYLIN-2308
> URL: https://issues.apache.org/jira/browse/KYLIN-2308
> Project: Kylin
> Issue Type: Improvement
> Components: Web
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2308.patch
>
> Currently, when a user sets dozens of precise count distinct metrics in one
> cube, we put all the count distinct metric columns in one columnFamily. As a
> result, HBase scans become slow because a single {{KeyValue}} is too big. We
> could allow more columnFamilies to be set to speed up the HBase scan in this scenario.
[jira] [Updated] (KYLIN-2308) Allow user to set more columnFamily in web
[ https://issues.apache.org/jira/browse/KYLIN-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

kangkaisen updated KYLIN-2308:
-----------------------------
    Attachment: KYLIN-2308.patch

This is the patch.

> Allow user to set more columnFamily in web
> ------------------------------------------
>
> Key: KYLIN-2308
> URL: https://issues.apache.org/jira/browse/KYLIN-2308
> Project: Kylin
> Issue Type: Improvement
> Components: Web
> Affects Versions: v1.6.0
> Reporter: kangkaisen
> Assignee: kangkaisen
> Attachments: KYLIN-2308.patch
>
> Currently, when a user sets dozens of precise count distinct metrics in one
> cube, we put all the count distinct metric columns in one columnFamily. As a
> result, HBase scans become slow because a single {{KeyValue}} is too big. We
> could allow more columnFamilies to be set to speed up the HBase scan in this scenario.
[jira] [Commented] (KYLIN-2338) refactor BitmapCounter.DataInputByteBuffer
[ https://issues.apache.org/jira/browse/KYLIN-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807281#comment-15807281 ]

liyang commented on KYLIN-2338:
-------------------------------
I'm merging this one and KYLIN-2349, KYLIN-2353

> refactor BitmapCounter.DataInputByteBuffer
> ------------------------------------------
>
> Key: KYLIN-2338
> URL: https://issues.apache.org/jira/browse/KYLIN-2338
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Reporter: kangkaisen
> Assignee: kangkaisen
> Priority: Minor
> Attachments: KYLIN-2338.patch
>
> Make BitmapCounter.DataInputByteBuffer simpler and more readable.
[jira] [Commented] (KYLIN-2363) support limit of dimensions in a cuboid
[ https://issues.apache.org/jira/browse/KYLIN-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807081#comment-15807081 ]

fengYu commented on KYLIN-2363:
-------------------------------
Yes, setting a range or enumerating all the levels to be calculated is a more user-friendly solution.

> support limit of dimensions in a cuboid
> ---------------------------------------
>
> Key: KYLIN-2363
> URL: https://issues.apache.org/jira/browse/KYLIN-2363
> Project: Kylin
> Issue Type: Improvement
> Reporter: fengYu
>
> The scenario is like this:
> I have 20+ dimensions, but queries will use at most 5 of them,
> so any cuboid that contains more than 5 dimensions (except the base cuboid)
> is useless.
> I think we can add a configuration to the cube that limits the maximum number
> of dimensions a cuboid may include.
> What's more, we could configure which levels (number of dimensions) need to be
> calculated. In the scenario above, we would calculate only levels 1, 2, 3, 4, 5
> and skip the levels above 5.
[jira] [Commented] (KYLIN-2363) support limit of dimensions in a cuboid
[ https://issues.apache.org/jira/browse/KYLIN-2363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15807070#comment-15807070 ]

XIE FAN commented on KYLIN-2363:
--------------------------------
I agree with this idea. And I think we can extend it a little: allow users to select exactly which cuboids they need in a visual way. Users can choose the cuboids they need in the front-end, and only these cuboids will be materialized. For example, users can choose to calculate all the 1-D and 2-D cuboids plus part of the 3-D and 4-D cuboids, and exclude the others.

> support limit of dimensions in a cuboid
> ---------------------------------------
>
> Key: KYLIN-2363
> URL: https://issues.apache.org/jira/browse/KYLIN-2363
> Project: Kylin
> Issue Type: Improvement
> Reporter: fengYu
>
> The scenario is like this:
> I have 20+ dimensions, but queries will use at most 5 of them,
> so any cuboid that contains more than 5 dimensions (except the base cuboid)
> is useless.
> I think we can add a configuration to the cube that limits the maximum number
> of dimensions a cuboid may include.
> What's more, we could configure which levels (number of dimensions) need to be
> calculated. In the scenario above, we would calculate only levels 1, 2, 3, 4, 5
> and skip the levels above 5.
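To give a feel for the saving proposed in KYLIN-2363, the short Python calculation below counts cuboids for exactly 20 dimensions with and without a 5-dimension limit. It assumes every combination of dimensions is a candidate cuboid, ignoring aggregation groups, mandatory dimensions and hierarchies, and that the base cuboid is always kept. The function name cuboid_counts is made up for this example.

```python
from math import comb


def cuboid_counts(num_dimensions, max_level):
    """Return (all non-empty cuboids, cuboids kept under the level limit)."""
    # Every non-empty subset of the dimensions is one candidate cuboid.
    total = 2 ** num_dimensions - 1
    # Level k holds C(n, k) cuboids; keep levels 1..max_level.
    kept = sum(comb(num_dimensions, k) for k in range(1, max_level + 1))
    # The base cuboid (all dimensions) is always built, even above the limit.
    if max_level < num_dimensions:
        kept += 1
    return total, kept


if __name__ == "__main__":
    total, kept = cuboid_counts(20, 5)
    print(total, kept)  # 1048575 21700
```

Under these assumptions the limit keeps 21,700 of 1,048,575 cuboids, roughly 2% of the full set.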