[jira] [Created] (CARBONDATA-4174) Handle exception for desc column
SHREELEKHYA GAMPA created CARBONDATA-4174:
---------------------------------------------

             Summary: Handle exception for desc column
                 Key: CARBONDATA-4174
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4174
             Project: CarbonData
          Issue Type: Bug
            Reporter: SHREELEKHYA GAMPA


DESCRIBE COLUMN does not validate child columns. It accepts a child column under a primitive datatype, and a non-existent higher-level child column under a complex datatype.

drop table if exists complexcarbontable;
create table complexcarbontable (deviceInformationId int,channelsId string,ROMSize string,purchasedate string,mobile struct,MAC array,gamePointId map,contractNumber double) STORED AS carbondata;
describe column deviceInformationId.x on complexcarbontable;
describe column channelsId.x on complexcarbontable;
describe column mobile.imei.x on complexcarbontable;
describe column MAC.item.x on complexcarbontable;
describe column gamePointId.key.x on complexcarbontable;

[Expected Result] :- The command should validate the child column path in both cases (a child column of a primitive datatype, and a non-existent higher-level child column of a complex datatype), and command execution should fail.

[Actual Issue] : - No such validation is present. As a result the command execution is successful.

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/21/c71035/7a3b04d78ceb4a489e6c038f4bb257db/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/21/c71035/7a3b04d78ceb4a489e6c038f4bb257db/image.png]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
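
The validation requested in this report can be sketched as a walk over a dotted column path. This is only an illustration, not CarbonData's implementation: the nested-dict schema, the child types, and the names `SCHEMA`, `InvalidColumnPath` and `resolve_column` are all hypothetical. Only the child names `imei`, `item` and `key` come from the statements in the report.

```python
# Hypothetical schema: a primitive column maps to a type name, a complex
# column maps to a dict of its children. Child types here are assumed.
SCHEMA = {
    "deviceinformationid": "int",
    "channelsid": "string",
    "mobile": {"imei": "string"},
    "mac": {"item": "string"},
    "gamepointid": {"key": "string", "value": "string"},
}


class InvalidColumnPath(ValueError):
    """Raised when a dotted column path does not resolve in the schema."""


def resolve_column(schema, path):
    """Walk a dotted path and return whatever the schema holds at its end."""
    node = schema
    walked = []
    for part in path.lower().split("."):
        if not isinstance(node, dict):
            # The parent is primitive, so it cannot have children.
            raise InvalidColumnPath(
                f"'{'.'.join(walked)}' is primitive and has no child '{part}'"
            )
        if part not in node:
            parent = ".".join(walked) or "<table>"
            raise InvalidColumnPath(f"'{part}' not found under '{parent}'")
        node = node[part]
        walked.append(part)
    return node
```

With this sketch, each of the five `describe column` paths in the report raises `InvalidColumnPath`, while a valid path such as `mobile.imei` resolves normally.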
[jira] [Created] (CARBONDATA-4173) Fix inverted index query issue
SHREELEKHYA GAMPA created CARBONDATA-4173:
---------------------------------------------

             Summary: Fix inverted index query issue
                 Key: CARBONDATA-4173
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4173
             Project: CarbonData
          Issue Type: Bug
            Reporter: SHREELEKHYA GAMPA


select query with filter column which is present in inverted_index column does not return any value

From Spark beeline/SQL/Shell execute the following queries

drop table if exists uniqdata6;
CREATE TABLE uniqdata6(cust_id int,cust_name string,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 int)stored as carbondata TBLPROPERTIES ('sort_columns'='CUST_ID,CUST_NAME', 'inverted_index'='CUST_ID,CUST_NAME','sort_scope'='global_sort');
LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table uniqdata6 OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME ,ACTIVE_EMUI_VERSION,DOB,DOJ, BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
select cust_name from uniqdata6 limit 5;
select * from uniqdata6 where CUST_NAME='CUST_NAME_2';
select * from uniqdata6 where CUST_NAME='CUST_NAME_3';

[Expected Result] :- select query with filter column which is present in inverted_index column should return correct value

[Actual Issue] : - select query with filter column which is present in inverted_index column does not return any value

[!https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/15/c71035/05443c9a9c11457e947645f1cf0ad347/image.png!|https://clouddevops.huawei.com/vision-file-storage/api/file/download/upload-v2/2021/3/15/c71035/05443c9a9c11457e947645f1cf0ad347/image.png]
[jira] [Created] (CARBONDATA-4172) Select query having parent and child struct column in projection returns incorrect results
Indhumathi Muthumurugesh created CARBONDATA-4172:
----------------------------------------------------

             Summary: Select query having parent and child struct column in projection returns incorrect results
                 Key: CARBONDATA-4172
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4172
             Project: CarbonData
          Issue Type: Bug
            Reporter: Indhumathi Muthumurugesh


struct column: col1 struct

insert: named_struct('a',1,'b',2,'c','a')

Query : select col1,col1.a from table;

Result:

col1                   col1.a
{a:1,b:null,c:null}    1
[jira] [Resolved] (CARBONDATA-4037) Improve the table status and segment file writing
     [ https://issues.apache.org/jira/browse/CARBONDATA-4037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-4037.
-----------------------------------------
    Fix Version/s: 2.2.0
       Resolution: Fixed

> Improve the table status and segment file writing
> -------------------------------------------------
>
>                 Key: CARBONDATA-4037
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4037
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: SHREELEKHYA GAMPA
>            Priority: Minor
>             Fix For: 2.2.0
>
>         Attachments: Improve table status and segment file writing_1.docx
>
>          Time Spent: 27.5h
>  Remaining Estimate: 0h
>
> Currently, we update the table status and segment files multiple times for a
> single iud/merge/compact operation and delete the index files immediately
> after merge. When concurrent queries are run, a user query may try to access
> segment index files that are no longer present, which is an availability issue.
>  * To solve the above issue, we can make mergeindex file generation mandatory
> and fail load/compaction if mergeindex fails. If merge index succeeds, we
> update the table status file and can delete the index files immediately.
> However, in legacy stores, when alter segment merge is called, we do not
> delete the index files immediately after merge index succeeds, as doing so
> may cause issues for parallel queries.
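
The ordering proposed in this issue can be sketched as follows. This is a simplified illustration under stated assumptions, not the committed change: `commit_segment`, `LoadFailed` and the three callables passed in are hypothetical stand-ins for the real merge-index, table-status and cleanup steps.

```python
class LoadFailed(RuntimeError):
    """Raised when merge index generation fails, failing the load/compaction."""


def commit_segment(merge_index, update_table_status, delete_index_files,
                   legacy_store=False):
    """Run the post-load steps in the order the issue proposes.

    merge_index returns True on success. The other two callables perform
    side effects. All three are hypothetical stand-ins.
    """
    # Merge index is mandatory: a failure aborts before table status changes.
    if not merge_index():
        raise LoadFailed("merge index failed; table status left unchanged")
    # Only after a successful merge do we publish the new table status.
    update_table_status()
    if legacy_store:
        # Legacy stores keep the index files so parallel queries still find them.
        return "index files kept"
    delete_index_files()
    return "index files deleted"
```

Failing before the table status update means readers never see a status entry that points at missing index files.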
[jira] [Updated] (CARBONDATA-4162) Leverage Secondary Index till segment level with SI as datamap and SI with plan rewrite
     [ https://issues.apache.org/jira/browse/CARBONDATA-4162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nihal kumar ojha updated CARBONDATA-4162:
-----------------------------------------
    Summary: Leverage Secondary Index till segment level with SI as datamap and SI with plan rewrite  (was: Leverage Secondary Index till segment level with Spark plan rewrite)

> Leverage Secondary Index till segment level with SI as datamap and SI with plan rewrite
> ---------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-4162
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4162
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Nihal kumar ojha
>            Priority: Major
>         Attachments: Support SI at segment level.pdf
>
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> *Background:*
> Secondary index tables are created as indexes and managed as child tables
> internally by Carbondata. In the existing architecture, if the parent (main)
> table and the SI table don’t have the same valid segments, then we disable
> the SI table. From the next query onwards, we scan and prune only the parent
> table until the next load or REINDEX command is triggered (as these commands
> bring the parent and SI table segments back in sync). Because of this,
> queries take more time to return results while SI is disabled.
> *Proposed Solution:*
> We are planning to leverage SI till the segment level. That is, instead of
> disabling the SI table (when parent and child table segments are not in
> sync), we will prune using the SI table for all of its valid segments
> (segments with status success, marked for update and load partial success),
> and the rest of the segments will be pruned by the parent table.
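
The segment-level split described in the proposal can be illustrated with a small sketch. The status strings, the `SI_USABLE` set and the `split_segments` function are hypothetical names chosen for illustration. Only the three statuses themselves (success, marked for update, load partial success) come from the issue text.

```python
# Statuses the proposal treats as valid for SI pruning. The spellings are
# assumptions, not CarbonData's actual enum names.
SI_USABLE = {"SUCCESS", "MARKED_FOR_UPDATE", "LOAD_PARTIAL_SUCCESS"}


def split_segments(parent_segments, si_status):
    """Split the parent's valid segments by which table prunes them.

    parent_segments: iterable of valid parent segment ids.
    si_status: dict mapping segment id to its status in the SI table;
               segments missing from the SI table are simply absent.
    """
    via_si, via_parent = [], []
    for seg in parent_segments:
        if si_status.get(seg) in SI_USABLE:
            via_si.append(seg)       # SI table is in sync for this segment
        else:
            via_parent.append(seg)   # fall back to pruning on the parent table
    return via_si, via_parent
```

For example, with parent segments `["0", "1", "2"]` and an SI table that has loaded only segments 0 and 1 successfully, segments 0 and 1 are pruned through the SI and segment 2 through the parent table, rather than the SI being disabled for the whole query.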
[jira] [Updated] (CARBONDATA-4170) Support dropping of parent complex columns(array/struct/map)
     [ https://issues.apache.org/jira/browse/CARBONDATA-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay updated CARBONDATA-4170:
-------------------------------
    Description: 
Drop complex columns(array/struct/map) from carbon table. For example -

arr1 array, struct1 struct, map1 map

Command -

ALTER TABLE DROP COLUMNS(arr1, struct1, map1)

Design document -
[https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]

  was:
Drop complex columns(array/struct/map) from carbon table. For example -

arr1 array, struct1 struct, map1 map

Command -

ALTER TABLE DROP COLUMNS(arr1, struct1,map1)

Design document -
[https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]


> Support dropping of parent complex columns(array/struct/map)
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-4170
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4170
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: spark-integration
>            Reporter: Akshay
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Drop complex columns(array/struct/map) from carbon table. For example -
> arr1 array, struct1 struct, map1 map
> Command -
> ALTER TABLE DROP COLUMNS(arr1, struct1, map1)
> Design document -
> [https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]
[jira] [Updated] (CARBONDATA-4170) Support dropping of parent complex columns(array/struct/map)
     [ https://issues.apache.org/jira/browse/CARBONDATA-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay updated CARBONDATA-4170:
-------------------------------
    Description: 
Drop complex columns(array/struct/map) from carbon table. For example -

arr1 array, struct1 struct, map1 map

Command -

ALTER TABLE DROP COLUMNS(arr1, struct1,map1)

Design document -
[https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]

  was:
Drop complex columns(only array and struct) from carbon table. For example -

arr1 array, struct1 struct

Command -

ALTER TABLE DROP COLUMNS(arr1, struct1)

Design document -
[https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]


> Support dropping of parent complex columns(array/struct/map)
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-4170
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4170
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: spark-integration
>            Reporter: Akshay
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Drop complex columns(array/struct/map) from carbon table. For example -
> arr1 array, struct1 struct, map1 map
> Command -
> ALTER TABLE DROP COLUMNS(arr1, struct1,map1)
> Design document -
> [https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]
[jira] [Updated] (CARBONDATA-4170) Support dropping of parent complex columns(array/struct/map)
     [ https://issues.apache.org/jira/browse/CARBONDATA-4170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay updated CARBONDATA-4170:
-------------------------------
    Summary: Support dropping of parent complex columns(array/struct/map)  (was: Support dropping of single & multi-level complex columns(array/struct))

> Support dropping of parent complex columns(array/struct/map)
> ------------------------------------------------------------
>
>                 Key: CARBONDATA-4170
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4170
>             Project: CarbonData
>          Issue Type: Sub-task
>          Components: spark-integration
>            Reporter: Akshay
>            Priority: Major
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Drop complex columns(only array and struct) from carbon table. For example -
> arr1 array, struct1 struct
> Command -
> ALTER TABLE DROP COLUMNS(arr1, struct1)
> Design document -
> [https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]
[jira] [Updated] (CARBONDATA-4160) Alter carbon schema related to complex columns
     [ https://issues.apache.org/jira/browse/CARBONDATA-4160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akshay updated CARBONDATA-4160:
-------------------------------
    Summary: Alter carbon schema related to complex columns  (was: Alter carbon schema related to complex columns(array/struct))

> Alter carbon schema related to complex columns
> ----------------------------------------------
>
>                 Key: CARBONDATA-4160
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4160
>             Project: CarbonData
>          Issue Type: New Feature
>          Components: spark-integration
>            Reporter: Akshay
>            Priority: Major
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> Add complex columns(only array and struct) to carbon table. For example -
> array, struct
> Command -
> ALTER TABLE ADD COLUMNS(arr array)
> Design document -
> [https://docs.google.com/document/d/1DhhkVXM8rMvOuKDZeccJpFEfO3VkA9C0c7JHCV88NXI/edit]
[jira] [Resolved] (CARBONDATA-4158) Make Secondary Index as a coarse grain datamap and use secondary indexes for Presto queries
     [ https://issues.apache.org/jira/browse/CARBONDATA-4158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor resolved CARBONDATA-4158.
--------------------------------------
    Fix Version/s: 2.2.0
       Resolution: Fixed

> Make Secondary Index as a coarse grain datamap and use secondary indexes for Presto queries
> -------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-4158
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4158
>             Project: CarbonData
>          Issue Type: New Feature
>            Reporter: Venugopal Reddy K
>            Priority: Minor
>             Fix For: 2.2.0
>
>          Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> *Background:*
> Secondary Indexes are created as carbon tables and are managed as child
> tables to the main table. And these indexes are leveraged for query pruning
> via spark plan modification during optimizer/execution phases of query
> execution. In order to make use of Secondary Indexes for queries from engines
> other than spark like presto etc, it is not feasible to modify the engine
> specific query execution plans as we desire in the current approach. It makes
> Secondary Indexes not usable for presto query pruning. Thus need arises for
> an engine agnostic approach to use Secondary Indexes for presto queries.
> *Description:*
> Current Secondary Index pruning is tightly coupled with spark because the
> query plan modification is specific to the spark engine. It is hard to reuse
> the solution for presto queries. Need a new solution to use secondary indexes
> with Presto queries. And it shouldn’t affect the existing customer using
> secondary index with spark.
[jira] [Updated] (CARBONDATA-4171) Transaction Manager, time travel and segment interface refactoring
     [ https://issues.apache.org/jira/browse/CARBONDATA-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajantha Bhat updated CARBONDATA-4171:
-------------------------------------
    Attachment: Transaction manager, time travel, segment interface_v1.pdf

> Transaction Manager, time travel and segment interface refactoring
> ------------------------------------------------------------------
>
>                 Key: CARBONDATA-4171
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4171
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Ajantha Bhat
>            Priority: Major
>         Attachments: Transaction manager, time travel, segment interface_v1.pdf
>
>
> *Goals:*
> *1) Implement a “Transaction Manager” with optimistic concurrency to provide
> within a table transaction / versioning.* (interfaces should also be flexible
> enough to support across table transactions)
> *2) Support time travel in carbonData.*
> *3) Decouple and clean up segment interfaces.* (which should also help in
> supporting segment concepts to other open format under carbondata metadata
> service)
[jira] [Closed] (CARBONDATA-2827) Refactor Segment Status Manager Interface
     [ https://issues.apache.org/jira/browse/CARBONDATA-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajantha Bhat closed CARBONDATA-2827.
------------------------------------
    Resolution: Duplicate

> Refactor Segment Status Manager Interface
> -----------------------------------------
>
>                 Key: CARBONDATA-2827
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2827
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Ravindra Pesala
>            Priority: Major
>         Attachments: Segment Management interface design_V3.pdf, Segment Status Management interface design_V1.docx, Segment Status Management interface design_V1_Ramana_reviewed.docx, Segment Status Management interface design_V2.pdf
>
>
> Carbon uses the tablestatus file to record the segment status and details of
> each segment during each load. This tablestatus enables carbon to support
> concurrent loads and reads without data inconsistency or corruption.
> So it is a very important feature of carbondata and we should have clean
> interfaces to maintain it. Currently, tablestatus updates are scattered
> across multiple places and there is no clean interface, so I am proposing to
> refactor the current SegmentStatusManager interface and bring all tablestatus
> operations into a single interface.
> This new interface allows the table status to be stored in any other storage,
> like a DB. This is needed for S3 type object stores as these are eventually
> consistent.
[jira] [Commented] (CARBONDATA-2827) Refactor Segment Status Manager Interface
    [ https://issues.apache.org/jira/browse/CARBONDATA-2827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17327125#comment-17327125 ]

Ajantha Bhat commented on CARBONDATA-2827:
------------------------------------------

will be handled as part of *CARBONDATA-4171*

*https://issues.apache.org/jira/browse/CARBONDATA-4171*

> Refactor Segment Status Manager Interface
> -----------------------------------------
>
>                 Key: CARBONDATA-2827
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-2827
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Ravindra Pesala
>            Priority: Major
>         Attachments: Segment Management interface design_V3.pdf, Segment Status Management interface design_V1.docx, Segment Status Management interface design_V1_Ramana_reviewed.docx, Segment Status Management interface design_V2.pdf
>
>
> Carbon uses the tablestatus file to record the segment status and details of
> each segment during each load. This tablestatus enables carbon to support
> concurrent loads and reads without data inconsistency or corruption.
> So it is a very important feature of carbondata and we should have clean
> interfaces to maintain it. Currently, tablestatus updates are scattered
> across multiple places and there is no clean interface, so I am proposing to
> refactor the current SegmentStatusManager interface and bring all tablestatus
> operations into a single interface.
> This new interface allows the table status to be stored in any other storage,
> like a DB. This is needed for S3 type object stores as these are eventually
> consistent.
[jira] [Updated] (CARBONDATA-4171) Transaction Manager, time travel and segment interface refactoring
     [ https://issues.apache.org/jira/browse/CARBONDATA-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ajantha Bhat updated CARBONDATA-4171:
-------------------------------------
    Description: 
*Goals:*

*1) Implement a “Transaction Manager” with optimistic concurrency to provide within a table transaction / versioning.* (interfaces should also be flexible enough to support across table transactions)

*2) Support time travel in carbonData.*

*3) Decouple and clean up segment interfaces.* (which should also help in supporting segment concepts to other open format under carbondata metadata service)

  was:
*Goals:*

*1) Implement a “Transaction Manager” with optimistic concurrency to provide within a table transaction / versioning.* (interfaces should also be flexible enough to support across table transactions)

*2) Support time travel in carbonData.*

***3) Decouple and clean up segment interfaces.* (which should also help in supporting segment concepts to other open format under carbondata metadata service)


> Transaction Manager, time travel and segment interface refactoring
> ------------------------------------------------------------------
>
>                 Key: CARBONDATA-4171
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4171
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Ajantha Bhat
>            Priority: Major
>
> *Goals:*
> *1) Implement a “Transaction Manager” with optimistic concurrency to provide
> within a table transaction / versioning.* (interfaces should also be flexible
> enough to support across table transactions)
> *2) Support time travel in carbonData.*
> *3) Decouple and clean up segment interfaces.* (which should also help in
> supporting segment concepts to other open format under carbondata metadata
> service)
[jira] [Created] (CARBONDATA-4171) Transaction Manager, time travel and segment interface refactoring
Ajantha Bhat created CARBONDATA-4171:
----------------------------------------

             Summary: Transaction Manager, time travel and segment interface refactoring
                 Key: CARBONDATA-4171
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4171
             Project: CarbonData
          Issue Type: Improvement
            Reporter: Ajantha Bhat


*Goals:*

*1) Implement a “Transaction Manager” with optimistic concurrency to provide within a table transaction / versioning.* (interfaces should also be flexible enough to support across table transactions)

*2) Support time travel in carbonData.*

***3) Decouple and clean up segment interfaces.* (which should also help in supporting segment concepts to other open format under carbondata metadata service)
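
As a rough illustration of goals 1 and 2, the sketch below shows what optimistic concurrency with per-table versioning could look like in miniature. It is not taken from the attached design document: `VersionedTable`, `ConflictError` and all method names are hypothetical, and a real transaction manager would persist versions rather than hold them in memory.

```python
import threading


class ConflictError(RuntimeError):
    """Raised when another writer committed after this transaction began."""


class VersionedTable:
    """In-memory toy: each committed version is an immutable set of segment ids."""

    def __init__(self):
        self._lock = threading.Lock()
        self._versions = [frozenset()]  # version 0 is the empty table

    def begin(self):
        """Start a transaction by remembering the version it is based on."""
        return len(self._versions) - 1

    def commit(self, base_version, new_segments):
        """Publish a new version only if nobody committed since base_version."""
        with self._lock:
            latest = len(self._versions) - 1
            if base_version != latest:
                # Optimistic check failed: the caller must retry on the new base.
                raise ConflictError(
                    f"based on version {base_version}, latest is {latest}"
                )
            self._versions.append(self._versions[-1] | frozenset(new_segments))
            return latest + 1

    def snapshot(self, version):
        """Time travel: return the segment set as of an earlier version."""
        return self._versions[version]
```

Writers do no locking while they prepare data. Only the short version check at commit time is serialized, and earlier versions stay readable through `snapshot`.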