[jira] [Resolved] (CARBONDATA-3233) JVM is getting crashed during dataload while compressing in snappy
[ https://issues.apache.org/jira/browse/CARBONDATA-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3233. -- Resolution: Fixed Fix Version/s: 1.5.2
> JVM is getting crashed during dataload while compressing in snappy
> Key: CARBONDATA-3233
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3233
> Project: CarbonData
> Issue Type: Bug
> Reporter: Akash R Nilugal
> Assignee: Akash R Nilugal
> Priority: Major
> Fix For: 1.5.2
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> When a huge data load is done, the load sometimes fails and the JVM crashes during snappy compression.
>
> Below are the logs:
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> j org.xerial.snappy.SnappyNative.rawCompress(JJJ)J+0
> j org.apache.carbondata.core.datastore.compression.SnappyCompressor.rawCompress(JIJ)J+9
> j org.apache.carbondata.core.datastore.page.UnsafeFixLengthColumnPage.compress(Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+50
> j org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveCodec.encodeAndCompressPage(Lorg/apache/carbondata/core/datastore/page/ColumnPage;Lorg/apache/carbondata/core/datastore/page/ColumnPageValueConverter;Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+85
> j org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveDeltaIntegralCodec$1.encodeData(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)[B+45
> j org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoder.encode(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+2
> j org.apache.carbondata.processing.store.TablePage.encodeAndCompressMeasures()[Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+54
> j org.apache.carbondata.processing.store.TablePage.encode()V+6
> j org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+86
> j org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(Lorg/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar;Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+2
> j org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Void;+8
> j org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Object;+1
> j java.util.concurrent.FutureTask.run()V+42
> j java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
> j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
> j java.lang.Thread.run()V+11
> v ~StubRoutines::call_stub
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
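The crash happens inside the native snappy call, where Java bounds checks no longer apply, so a mismatch between the logical page size and the allocated buffers corrupts memory instead of throwing. A minimal sketch of the invariant the native layer expects, using snappy-java's heap-based rawCompress (the compressPage helper and its buffer handling are illustrative assumptions, not CarbonData's actual fix):

{code:java}
import java.io.IOException;
import org.xerial.snappy.Snappy;

public final class SafeSnappyCompress {
  // Compresses a page buffer, sizing the destination with
  // Snappy.maxCompressedLength so the native rawCompress call can
  // never write past the end of the output array.
  public static byte[] compressPage(byte[] page, int usedBytes) throws IOException {
    byte[] dest = new byte[Snappy.maxCompressedLength(usedBytes)];
    // Passing the logical used size, not the full capacity, matters:
    // handing the native layer a length larger than the backing
    // allocation is exactly the kind of mismatch that crashes the JVM.
    int compressedLen = Snappy.rawCompress(page, 0, usedBytes, dest, 0);
    byte[] result = new byte[compressedLen];
    System.arraycopy(dest, 0, result, 0, compressedLen);
    return result;
  }
}
{code}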
[jira] [Resolved] (CARBONDATA-3241) Refactor the requested scan columns and the projection columns
[ https://issues.apache.org/jira/browse/CARBONDATA-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3241. -- Resolution: Fixed Fix Version/s: 1.5.2 > Refactor the requested scan columns and the projection columns > -- > > Key: CARBONDATA-3241 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3241 > Project: CarbonData > Issue Type: Improvement >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Trivial > Fix For: 1.5.2 > > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3223) Datasize and Indexsize showing 0B for 1.1 store when show segments is done
[ https://issues.apache.org/jira/browse/CARBONDATA-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3223. -- Resolution: Fixed Fix Version/s: 1.5.3 > Datasize and Indexsize showing 0B for 1.1 store when show segments is done > -- > > Key: CARBONDATA-3223 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3223 > Project: CarbonData > Issue Type: Bug >Reporter: MANISH NALLA >Assignee: MANISH NALLA >Priority: Minor > Fix For: 1.5.3 > > Time Spent: 4h 40m > Remaining Estimate: 0h > > # Create table and load in 1.1 store. > # Refresh and Load in 1.5.1 version. > # Show Segments on the table will give 0B for the older segment. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-3217) Optimize implicit filter expression performance by removing extra serialization
Manish Gupta created CARBONDATA-3217: Summary: Optimize implicit filter expression performance by removing extra serialization Key: CARBONDATA-3217 URL: https://issues.apache.org/jira/browse/CARBONDATA-3217 Project: CarbonData Issue Type: Bug Reporter: Manish Gupta
# Currently all the filter values are getting serialized for all the tasks, which increases the scheduler delay and thereby impacts query performance.
# For each task, deserialization happens twice on the executor side, which is not required; once is sufficient.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
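One common way to get deserialize-once behavior is to keep the payload in serialized form and materialize it lazily behind a transient field; a minimal sketch under that assumption (the ImplicitFilterValues class and its layout are hypothetical, not the actual patch):

{code:java}
import java.io.*;
import java.util.Set;

// Hypothetical wrapper: ships one serialized blob per task and
// deserializes it at most once per executor use.
public final class ImplicitFilterValues implements Serializable {
  private final byte[] serialized;               // sent over the wire once
  private transient volatile Set<String> values; // rebuilt lazily, never shipped

  public ImplicitFilterValues(byte[] serialized) {
    this.serialized = serialized;
  }

  @SuppressWarnings("unchecked")
  public Set<String> get() throws IOException, ClassNotFoundException {
    Set<String> local = values;
    if (local == null) {                         // first access on this executor
      synchronized (this) {
        if (values == null) {
          try (ObjectInputStream in =
              new ObjectInputStream(new ByteArrayInputStream(serialized))) {
            values = (Set<String>) in.readObject();
          }
        }
        local = values;
      }
    }
    return local;
  }
}
{code}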
[jira] [Resolved] (CARBONDATA-3202) updated schema is not updated in session catalog after add, drop or rename column.
[ https://issues.apache.org/jira/browse/CARBONDATA-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3202. -- Resolution: Fixed Assignee: Akash R Nilugal Fix Version/s: 1.5.3 > Updated schema is not updated in session catalog after add, drop or rename column > Key: CARBONDATA-3202 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3202 > Project: CarbonData > Issue Type: Bug > Reporter: Akash R Nilugal > Assignee: Akash R Nilugal > Priority: Minor > Fix For: 1.5.3 > Time Spent: 5h 10m > Remaining Estimate: 0h > > The updated schema is not reflected in the session catalog after an add, drop or rename column operation. Spark does not support drop column or rename column, and supports add column only from Spark 2.2 onwards; so after a rename, add or drop column, the new schema is not updated in the catalog. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3203) Compaction failing for table which is restructured
[ https://issues.apache.org/jira/browse/CARBONDATA-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3203. -- Resolution: Fixed Fix Version/s: 1.5.2 > Compaction failing for table which is restructured > Key: CARBONDATA-3203 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3203 > Project: CarbonData > Issue Type: Bug > Reporter: MANISH NALLA > Assignee: MANISH NALLA > Priority: Minor > Fix For: 1.5.2 > > Steps to reproduce: > # Create table with complex and primitive types. > # Load data 2-3 times. > # Drop one column. > # Trigger Compaction. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3196) Compaction Failing for Complex datatypes with Dictionary Include
[ https://issues.apache.org/jira/browse/CARBONDATA-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3196. -- Resolution: Fixed Fix Version/s: 1.5.2 > Compaction Failing for Complex datatypes with Dictionary Include > > > Key: CARBONDATA-3196 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3196 > Project: CarbonData > Issue Type: Bug >Reporter: MANISH NALLA >Assignee: MANISH NALLA >Priority: Minor > Fix For: 1.5.2 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Steps to reproduce: > # Create Table with Complex type and Dictionary Include Complex type. > # Load data into the table 2-3 times. > # Alter table compact 'major' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-45) Support MAP type
[ https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-45. Resolution: Fixed Assignee: Manish Gupta (was: Venkata Ramana G) Fix Version/s: 1.5.2
> Support MAP type
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
> Issue Type: New Feature
> Components: core, sql
> Reporter: cen yuhai
> Assignee: Manish Gupta
> Priority: Major
> Fix For: 1.5.2
> Attachments: MAP DATA-TYPE SUPPORT.pdf
>
> {code:sql}
> CREATE TABLE table1 (
>   deviceInformationId int,
>   channelsId string,
>   props map<int,string>)
> STORED BY 'org.apache.carbondata.format'
> insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
> {code}
> Format of data to be read from csv, with '$' as level 1 delimiter and map keys terminated by '#':
> {code:sql}
> load data local inpath '/tmp/data.csv' into table1 options
> ('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 'COMPLEX_DELIMITER_FOR_KEY'='#')
> 20,channel2,2#user2$100#usercommon
> 30,channel3,3#user3$100#usercommon
> 40,channel4,4#user3$100#usercommon
> select channelsId, props[100] from table1 where deviceInformationId > 10;
> 20, usercommon
> 30, usercommon
> 40, usercommon
> select channelsId, props from table1 where props[2] = 'user2';
> 20, {2,'user2', 100, 'usercommon'}
> {code}
> Following cases need to be handled:
> ||Sub feature||Pending activity||Remarks||
> |Basic Maptype support|Develop|Create table DDL, Load map data from CSV, select * from maptable|
> |Maptype lookup in projection and filter|Develop|Projection and filters need execution at spark|
> |NULL values, UDFs, Describe support|Develop||
> |Compaction support|Test + fix|As compaction works at byte level, no changes required. Needs to add test-cases|
> |Insert into table|Develop|Source table data containing Map data needs conversion from spark datatype to string, as carbon takes string as input row|
> |Support DDL for Map fields Dictionary Include and Dictionary Exclude|Develop|Also needs to handle CarbonDictionaryDecoder for the same.|
> |Support multilevel Map|Develop|Currently DDL is validated to allow only 2 levels; remove this restriction|
> |Support Map value to be a measure|Develop|Currently array and struct support only dimensions, which needs change|
> |Support Alter table to add and remove Map column|Develop|Implement DDL; requires default value handling|
> |Projections of Map lookup push down to carbon|Develop|This is an optimization, for when a large number of values are present in the Map|
> |Filter map lookup push down to carbon|Develop|This is an optimization, for when a large number of values are present in the Map|
> |Update Map values|Develop|Update map value|
> h4. Design suggestion:
> Map can be internally stored as Array<Struct<key,value>>, so that conversion to the Map data type is required only while giving data to spark. Schema will have a new column of map type, similar to Array.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3017) Create DDL Support for Map Type
[ https://issues.apache.org/jira/browse/CARBONDATA-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3017. -- Resolution: Fixed Fix Version/s: 1.5.2 > Create DDL Support for Map Type > --- > > Key: CARBONDATA-3017 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3017 > Project: CarbonData > Issue Type: Sub-task >Reporter: MANISH NALLA >Assignee: MANISH NALLA >Priority: Major > Fix For: 1.5.2 > > Time Spent: 13h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3134) Wrong result when a column is dropped and added using alter with blocklet cache.
[ https://issues.apache.org/jira/browse/CARBONDATA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3134. -- Resolution: Fixed Fix Version/s: 1.5.1
> Wrong result when a column is dropped and added using alter with blocklet cache.
> Key: CARBONDATA-3134
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3134
> Project: CarbonData
> Issue Type: Bug
> Reporter: Kunal Kapoor
> Assignee: Kunal Kapoor
> Priority: Major
> Fix For: 1.5.1
> Time Spent: 5h 10m
> Remaining Estimate: 0h
>
> *Steps to reproduce:*
> spark.sql("drop table if exists tile")
> spark.sql("create table tile(b int, s int,bi bigint, t timestamp) partitioned by (i int) stored by 'carbondata' TBLPROPERTIES ('DICTIONARY_EXCLUDE'='b,s,i,bi,t','SORT_COLUMS'='b,s,i,bi,t', 'cache_level'='blocklet')")
> spark.sql("load data inpath 'C:/Users/k00475610/Documents/en_all.csv' into table tile options('fileheader'='b,s,i,bi,t','DELIMITER'=',')")
> spark.sql("select * from tile")
> spark.sql("alter table tile drop columns(t)")
> spark.sql("alter table tile add columns(t timestamp)")
> spark.sql("load data inpath 'C:/Users/k00475610/Documents/en_all.csv' into table tile options('fileheader'='b,s,i,bi,t','DELIMITER'=',')")
> spark.sql("select * from tile").show()
>
> *Result:*
> +---+---+-----------+----+----+
> |  b|  s|         bi|   t|   i|
> +---+---+-----------+----+----+
> |100|  2|93405673097|null|1644|
> |100|  2|93405673097|null|1644|
> +---+---+-----------+----+----+
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3113) Fixed Local Dictionary Query Performance and Added reusable buffer for direct flow
[ https://issues.apache.org/jira/browse/CARBONDATA-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3113. -- Resolution: Fixed Fix Version/s: 1.5.1 > Fixed Local Dictionary Query Performance and Added reusable buffer for direct flow > Key: CARBONDATA-3113 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3113 > Project: CarbonData > Issue Type: Improvement > Reporter: kumar vishal > Assignee: kumar vishal > Priority: Major > Fix For: 1.5.1 > Time Spent: 1h 10m > Remaining Estimate: 0h > > 1. Added reusable buffer for direct flow. In query, a byte array is created for each page of each column; when the number of columns is high, this causes lots of minor GC and degrades query performance. As each page is uncompressed one by one, the same buffer can be reused for all the columns, resized based on the requested size. > 2. Fixed local dictionary performance issue. Reverted #2895 and fixed the NPE issue by setting null for local dictionary to vector in safe and unsafe VariableLengthDataChunkStore. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
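A minimal sketch of such a resize-on-demand reusable buffer (class and method names are illustrative, not necessarily CarbonData's actual class):

{code:java}
// Reused across all columns of a page batch, so only one allocation
// survives instead of one byte[] per page per column.
public final class ReusableDataBuffer {
  private byte[] buffer;

  // Returns a buffer of at least requestedSize bytes, reusing the
  // previous allocation when it is already large enough. This keeps
  // per-page allocations (and minor GC churn) to a minimum.
  public byte[] get(int requestedSize) {
    if (buffer == null || buffer.length < requestedSize) {
      // Grow with some slack so steadily growing pages don't
      // reallocate on every call.
      buffer = new byte[requestedSize + (requestedSize >> 2)];
    }
    return buffer;
  }
}
{code}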
[jira] [Resolved] (CARBONDATA-3112) Optimise decompressing while filling the vector during conversion of primitive types
[ https://issues.apache.org/jira/browse/CARBONDATA-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3112. -- Resolution: Fixed Assignee: Ravindra Pesala Fix Version/s: 1.5.1 > Optimise decompressing while filling the vector during conversion of > primitive types > > > Key: CARBONDATA-3112 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3112 > Project: CarbonData > Issue Type: Improvement >Reporter: Ravindra Pesala >Assignee: Ravindra Pesala >Priority: Major > Fix For: 1.5.1 > > Time Spent: 20m > Remaining Estimate: 0h > > We can possibly avoid one copy by filling the vector during the conversion of > primitive types in codecs. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3088) enhance compaction performance by using prefetch
[ https://issues.apache.org/jira/browse/CARBONDATA-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3088. -- Resolution: Fixed Fix Version/s: 1.5.1 > enhance compaction performance by using prefetch > > > Key: CARBONDATA-3088 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3088 > Project: CarbonData > Issue Type: Improvement >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Major > Fix For: 1.5.1 > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3106) WRITTEN_BY_APPNAME is not serialized in executor with GlobalSort
[ https://issues.apache.org/jira/browse/CARBONDATA-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3106. -- Resolution: Fixed Fix Version/s: 1.5.1 > WRITTEN_BY_APPNAME is not serialized in executor with GlobalSort > Key: CARBONDATA-3106 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3106 > Project: CarbonData > Issue Type: Bug > Reporter: Indhumathi Muthumurugesh > Assignee: Indhumathi Muthumurugesh > Priority: Major > Fix For: 1.5.1 > Time Spent: 2h 10m > Remaining Estimate: 0h > > Problem: WRITTEN_BY_APPNAME, when added in carbon property, is not serialized to the executor with global sort. > Steps to Reproduce: > # Create table and set sort_scope='global_sort' > # Load data into the table and observe the exception > *Exception: There is an unexpected error: null* > NOTE: This issue is reproducible only if driver and executor are running in a different JVM Process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3098) Negative value exponents giving wrong results
[ https://issues.apache.org/jira/browse/CARBONDATA-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3098. -- Resolution: Fixed Assignee: MANISH NALLA Fix Version/s: 1.5.1 > Negative value exponents giving wrong results > Key: CARBONDATA-3098 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3098 > Project: CarbonData > Issue Type: Bug > Reporter: MANISH NALLA > Assignee: MANISH NALLA > Priority: Major > Fix For: 1.5.1 > Time Spent: 1h 40m > Remaining Estimate: 0h > > Problem: When the exponent value is a negative number, the data becomes incorrect due to loss of precision in floating point values and a wrong calculation of the count of decimal points. > > Steps to reproduce: > -> "create table float_c(f float) using carbon" > -> "insert into float_c select '1.4E-38' " -- This message was sent by Atlassian JIRA (v7.6.3#76005)
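To see the decimal-count pitfall concretely, compare a naive count of digits after the decimal point with the scale that actually accounts for the exponent (a standalone illustration, not CarbonData's codec code):

{code:java}
import java.math.BigDecimal;

public final class ExponentPrecision {
  public static void main(String[] args) {
    // Counting characters after '.' in the raw string misses the
    // exponent entirely: it sees "4E-38" as five decimal places.
    String raw = "1.4E-38";
    int naiveCount = raw.length() - raw.indexOf('.') - 1; // 5, which is wrong
    // BigDecimal.scale() accounts for the exponent: 1.4E-38 = 14 * 10^-39,
    // so its scale is 39 decimal places.
    int actualScale = new BigDecimal(raw).scale();
    System.out.println(naiveCount + " vs " + actualScale); // prints: 5 vs 39
  }
}
{code}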
[jira] [Resolved] (CARBONDATA-3081) NPE when boolean column has null values with Vectorized SDK reader
[ https://issues.apache.org/jira/browse/CARBONDATA-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3081. -- Resolution: Fixed Assignee: Kunal Kapoor Fix Version/s: 1.5.1 > NPE when boolean column has null values with Vectorized SDK reader > -- > > Key: CARBONDATA-3081 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3081 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Major > Fix For: 1.5.1 > > Time Spent: 4h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-3077) Fixed query failure in fileformat due to stale cache issue
[ https://issues.apache.org/jira/browse/CARBONDATA-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-3077: - Attachment: 20181102101536.jpg > Fixed query failure in fileformat due to stale cache issue > Key: CARBONDATA-3077 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3077 > Project: CarbonData > Issue Type: Bug > Reporter: Manish Gupta > Assignee: Manish Gupta > Priority: Major > Attachments: 20181102101536.jpg > > *Problem* > While using the FileFormat API, if a table is created, dropped and then recreated with the same name, the query fails because of a schema mismatch issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-3077) Fixed query failure in fileformat due to stale cache issue
Manish Gupta created CARBONDATA-3077: Summary: Fixed query failure in fileformat due to stale cache issue Key: CARBONDATA-3077 URL: https://issues.apache.org/jira/browse/CARBONDATA-3077 Project: CarbonData Issue Type: Bug Reporter: Manish Gupta Assignee: Manish Gupta *Problem* While using the FileFormat API, if a table is created, dropped and then recreated with the same name, the query fails because of a schema mismatch issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3057) Implement Vectorized CarbonReader for SDK
[ https://issues.apache.org/jira/browse/CARBONDATA-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3057. -- Resolution: Fixed Assignee: Naman Rastogi Fix Version/s: 1.5.1 > Implement Vectorized CarbonReader for SDK > Key: CARBONDATA-3057 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3057 > Project: CarbonData > Issue Type: Sub-task > Reporter: Naman Rastogi > Assignee: Naman Rastogi > Priority: Minor > Fix For: 1.5.1 > Time Spent: 16h 10m > Remaining Estimate: 0h > > Implement a Vectorized Reader and expose an API for the user to switch between the CarbonReader and the vectorized reader. Additionally, an API would be provided for the user to extract the columnar batch instead of rows, allowing deeper integration with carbon. The reduction in method calls for the vectorized reader would also improve the read time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
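For context, the existing row-based SDK read path looks like the sketch below (path, table name and projection are placeholders). The vectorized switch and batch-extraction calls are what this issue introduces, so their names are deliberately not shown here:

{code:java}
import java.io.IOException;
import org.apache.carbondata.sdk.file.CarbonReader;

public final class SdkReadExample {
  public static void main(String[] args) throws IOException, InterruptedException {
    // Row-based SDK read: one method call per row. Per this issue, a
    // builder switch selects the vectorized reader and a separate call
    // exposes whole columnar batches, cutting per-row call overhead.
    CarbonReader reader = CarbonReader.builder("/tmp/carbon_out", "_temp")
        .projection(new String[]{"name", "age"})
        .build();
    while (reader.hasNext()) {
      Object[] row = (Object[]) reader.readNextRow();
      System.out.println(row[0] + " " + row[1]);
    }
    reader.close();
  }
}
{code}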
[jira] [Resolved] (CARBONDATA-3066) ADD documentation for new APIs in SDK
[ https://issues.apache.org/jira/browse/CARBONDATA-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3066. -- Resolution: Fixed Fix Version/s: 1.5.1 > ADD documentation for new APIs in SDK > - > > Key: CARBONDATA-3066 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3066 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > Fix For: 1.5.1 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > ADD documentation for new APIs in SDK -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3062) Fix Compatibility issue with cache_level as blocklet
[ https://issues.apache.org/jira/browse/CARBONDATA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3062. -- Resolution: Fixed Fix Version/s: 1.5.1 > Fix Compatibility issue with cache_level as blocklet > Key: CARBONDATA-3062 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3062 > Project: CarbonData > Issue Type: Improvement > Reporter: Indhumathi Muthumurugesh > Assignee: Indhumathi Muthumurugesh > Priority: Major > Fix For: 1.5.1 > Time Spent: 2.5h > Remaining Estimate: 0h > > Please find below steps to reproduce the issue: > # Create table and load data in legacy store > # In new store, load data and alter table set table properties 'CACHE_LEVEL'='BLOCKLET' > # Perform filter operation on that table and find the below exception: > Error: java.io.IOException: Problem in loading segment blocks. (state=,code=0) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-3062) Fix Compatibility issue with cache_level as blocklet
[ https://issues.apache.org/jira/browse/CARBONDATA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-3062: - Issue Type: Bug (was: Improvement) > Fix Compatibility issue with cache_level as blocklet > Key: CARBONDATA-3062 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3062 > Project: CarbonData > Issue Type: Bug > Reporter: Indhumathi Muthumurugesh > Assignee: Indhumathi Muthumurugesh > Priority: Major > Fix For: 1.5.1 > Time Spent: 2.5h > Remaining Estimate: 0h > > Please find below steps to reproduce the issue: > # Create table and load data in legacy store > # In new store, load data and alter table set table properties 'CACHE_LEVEL'='BLOCKLET' > # Perform filter operation on that table and find the below exception: > Error: java.io.IOException: Problem in loading segment blocks. (state=,code=0) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3054) Dictionary file cannot be read in S3a with CarbonDictionaryDecoder.doConsume() codeGen
[ https://issues.apache.org/jira/browse/CARBONDATA-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3054. -- Resolution: Fixed Fix Version/s: 1.5.1 > Dictionary file cannot be read in S3a with CarbonDictionaryDecoder.doConsume() codeGen > Key: CARBONDATA-3054 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3054 > Project: CarbonData > Issue Type: Improvement > Reporter: Ajantha Bhat > Assignee: Ajantha Bhat > Priority: Major > Fix For: 1.5.1 > Time Spent: 5h 20m > Remaining Estimate: 0h > > Problem: In an S3a environment, when querying data which has dictionary files, the dictionary file cannot be read with CarbonDictionaryDecoder.doConsume() codeGen even though the file is present. > > Cause: CarbonDictionaryDecoder.doConsume() codeGen doesn't set the hadoop conf in the thread-local variable; only doExecute() sets it. Hence, when getDictionaryWrapper() is called from doConsume() codeGen, AbstractDictionaryCache.getDictionaryMetaCarbonFile() returns false for the fileExists() operation. > > Solution: In CarbonDictionaryDecoder.doConsume() codeGen, set the hadoop conf in the thread-local variable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
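The underlying pattern is a thread-local holder that file-system accesses read their Configuration from; a generic sketch of it (the holder class here is illustrative, CarbonData has its own):

{code:java}
import org.apache.hadoop.conf.Configuration;

public final class ThreadLocalConf {
  private static final ThreadLocal<Configuration> CONF = new ThreadLocal<>();

  // Each execution path (doExecute, doConsume codegen, ...) must call
  // this on the thread that will later touch the file system.
  public static void set(Configuration conf) {
    CONF.set(conf);
  }

  public static Configuration get() {
    // Callers like dictionary-file existence checks read the conf from
    // here; if the codegen path never calls set(), this returns null and
    // the S3A credentials/settings are silently missing, so fileExists()
    // appears to fail even though the file is present.
    return CONF.get();
  }
}
{code}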
[jira] [Created] (CARBONDATA-3061) Add validation for supported format version and Encoding type to throw proper exception to the user while reading a file
Manish Gupta created CARBONDATA-3061: Summary: Add validation for supported format version and Encoding type to throw proper exception to the user while reading a file Key: CARBONDATA-3061 URL: https://issues.apache.org/jira/browse/CARBONDATA-3061 Project: CarbonData Issue Type: Improvement Reporter: Manish Gupta Assignee: Manish Gupta This jira is raised to handle forward compatibility. With this change, if a data file written by a newer release is read using a lower version (1.5.1 or later), a proper exception will be thrown when the columnar format version or any encoding type is not supported for read in that version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
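A minimal sketch of what such fail-fast validation can look like (the enum and class names are illustrative, not the actual CarbonData types):

{code:java}
// Versions and encodings this reader knows how to decode.
enum SupportedVersion { V1, V2, V3 }

final class FormatValidator {
  // Reject files written with a format version newer than this reader.
  static SupportedVersion checkVersion(int versionNumber) {
    SupportedVersion[] known = SupportedVersion.values();
    if (versionNumber < 1 || versionNumber > known.length) {
      throw new UnsupportedOperationException(
          "Data file written with format version " + versionNumber
              + " cannot be read by this reader; please upgrade CarbonData");
    }
    return known[versionNumber - 1];
  }

  // Reject pages using an encoding this reader does not implement.
  static void checkEncoding(String encoding, java.util.Set<String> supported) {
    if (!supported.contains(encoding)) {
      throw new UnsupportedOperationException(
          "Encoding " + encoding + " is not supported for read in this version");
    }
  }
}
{code}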
[jira] [Resolved] (CARBONDATA-3042) Column Schema objects are present in Driver and Executor even after dropping table
[ https://issues.apache.org/jira/browse/CARBONDATA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3042. -- Resolution: Fixed Assignee: Indhumathi Muthumurugesh Fix Version/s: 1.5.1 > Column Schema objects are present in Driver and Executor even after dropping > table > -- > > Key: CARBONDATA-3042 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3042 > Project: CarbonData > Issue Type: Improvement >Reporter: Indhumathi Muthumurugesh >Assignee: Indhumathi Muthumurugesh >Priority: Major > Fix For: 1.5.1 > > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-3052) Improve drop table performance by reducing the namenode RPC calls during physical deletion of files
Manish Gupta created CARBONDATA-3052: Summary: Improve drop table performance by reducing the namenode RPC calls during physical deletion of files Key: CARBONDATA-3052 URL: https://issues.apache.org/jira/browse/CARBONDATA-3052 Project: CarbonData Issue Type: Improvement Reporter: Manish Gupta Assignee: Manish Gupta The current drop table command takes more than 1 minute to delete 3000 files from HDFS during the drop table operation. This JIRA is raised to improve drop table performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
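One plausible way to cut the RPC count, sketched below with the standard Hadoop FileSystem API (an illustration of the idea, not necessarily the actual patch): delete the table directory with a single recursive call instead of one delete RPC per file.

{code:java}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class DropTableFiles {
  // Issues a single recursive delete on the table directory, so the
  // client makes one RPC instead of listing and deleting thousands of
  // files individually.
  public static void dropTableDir(String tablePath, Configuration conf)
      throws IOException {
    Path dir = new Path(tablePath);
    FileSystem fs = dir.getFileSystem(conf);
    if (fs.exists(dir)) {
      fs.delete(dir, true); // recursive=true removes the whole tree
    }
  }
}
{code}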
[jira] [Resolved] (CARBONDATA-2977) Write uncompress_size to ChunkCompressMeta in the file
[ https://issues.apache.org/jira/browse/CARBONDATA-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2977. -- Resolution: Fixed Assignee: Jacky Li Fix Version/s: 1.5.1 > Write uncompress_size to ChunkCompressMeta in the file > -- > > Key: CARBONDATA-2977 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2977 > Project: CarbonData > Issue Type: New Feature >Reporter: Jacky Li >Assignee: Jacky Li >Priority: Major > Fix For: 1.5.1 > > Time Spent: 6h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2998) Refresh column schema for old store(before V3) for SORT_COLUMNS option
[ https://issues.apache.org/jira/browse/CARBONDATA-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2998. -- Resolution: Fixed Fix Version/s: 1.5.1 > Refresh column schema for old store(before V3) for SORT_COLUMNS option > -- > > Key: CARBONDATA-2998 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2998 > Project: CarbonData > Issue Type: Bug >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.5.1 > > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-3022) Refactor ColumnPageWrapper
[ https://issues.apache.org/jira/browse/CARBONDATA-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-3022. -- Resolution: Fixed Fix Version/s: 1.5.1 > Refactor ColumnPageWrapper > -- > > Key: CARBONDATA-3022 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3022 > Project: CarbonData > Issue Type: Improvement >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.5.1 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2995) Queries slow down after some time due to broadcast issue
Manish Gupta created CARBONDATA-2995: Summary: Queries slow down after some time due to broadcast issue Key: CARBONDATA-2995 URL: https://issues.apache.org/jira/browse/CARBONDATA-2995 Project: CarbonData Issue Type: Bug Reporter: Manish Gupta Assignee: Manish Gupta *Problem Description* It is observed that during consecutive query runs, queries slow down after some time, degrading query performance. No exception is thrown in the driver or executor logs, but the logs show that the time to broadcast the hadoop conf increases after every query run. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
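If the conf is effectively immutable across queries, the growing broadcast time can be avoided by broadcasting it once and reusing the handle; a sketch of that idea (the class and caching scheme are assumptions, not the actual fix):

{code:java}
import java.util.HashMap;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

public final class ConfBroadcastCache {
  private static volatile Broadcast<HashMap<String, String>> cached;

  // Broadcast the (serializable) conf snapshot once and hand the same
  // Broadcast handle to every query, instead of re-broadcasting per
  // query run, which is what accumulates over time.
  public static Broadcast<HashMap<String, String>> get(
      JavaSparkContext jsc, HashMap<String, String> confSnapshot) {
    if (cached == null) {
      synchronized (ConfBroadcastCache.class) {
        if (cached == null) {
          cached = jsc.broadcast(confSnapshot);
        }
      }
    }
    return cached;
  }
}
{code}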
[jira] [Resolved] (CARBONDATA-2876) Support Avro datatype conversion to Carbon Format
[ https://issues.apache.org/jira/browse/CARBONDATA-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2876. -- Resolution: Fixed Assignee: Indhumathi Muthumurugesh Fix Version/s: 1.5.0 > Support Avro datatype conversion to Carbon Format > - > > Key: CARBONDATA-2876 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2876 > Project: CarbonData > Issue Type: Improvement >Reporter: Indhumathi Muthumurugesh >Assignee: Indhumathi Muthumurugesh >Priority: Major > Fix For: 1.5.0 > > Time Spent: 12h 50m > Remaining Estimate: 0h > > 1.Support Avro Complex Types: Enum, Union, Fixed with Carbon. > 2.Support Avro Logical Types: TimeMillis, TimeMicros, Decimal with Carbon. > > Please find the design document in the below link: > https://docs.google.com/document/d/1Jne8vNZ3OSYmJ_72hTIk_5I4EeIVtxGNE5mN_hBlnVE/edit?usp=sharing -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2990) JVM crashes when rebuilding the datamap.
[ https://issues.apache.org/jira/browse/CARBONDATA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2990. -- Resolution: Fixed Assignee: Ravindra Pesala Fix Version/s: 1.5.0 > JVM crashes when rebuilding the datamap. > > > Key: CARBONDATA-2990 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2990 > Project: CarbonData > Issue Type: Bug >Reporter: Ravindra Pesala >Assignee: Ravindra Pesala >Priority: Major > Fix For: 1.5.0 > > Time Spent: 2.5h > Remaining Estimate: 0h > > create table brinjal (imei string,AMSize string,channelsId > string,ActiveCountry string, Activecity string,gamePointId > double,deviceInformationId double,productionDate Timestamp,deliveryDate > timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' > TBLPROPERTIES('table_blocksize'='1'); > > LOAD DATA INPATH '/home/root1/Downloads/vardhandaterestruct.csv' INTO TABLE > brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= > '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= > 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); > CREATE DATAMAP dm_brinjal ON TABLE brinjal USING 'bloomfilter' DMPROPERTIES > ('INDEX_COLUMNS' = 'AMSize', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1'); > > Error: org.apache.spark.SparkException: Job aborted due to stage failure: > Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in > stage 4.0 (TID 13, 192.168.0.12, executor 11): ExecutorLostFailure (executor > 11 exited caused by one of the running tasks) Reason: Remote RPC client > disassociated. Likely due to containers exceeding thresholds, or network > issues. Check driver logs for WARN messages. > Driver stacktrace: (state=,code=0) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2982) CarbonSchemaReader doesn't support Array
[ https://issues.apache.org/jira/browse/CARBONDATA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2982. -- Resolution: Fixed
> CarbonSchemaReader doesn't support Array
> Key: CARBONDATA-2982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2982
> Project: CarbonData
> Issue Type: Bug
> Components: other
> Affects Versions: 1.5.0
> Reporter: xubo245
> Assignee: xubo245
> Priority: Major
> Fix For: 1.5.0
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> CarbonSchemaReader doesn't support Array when we read the schema from an index file and the data includes the array data type.
> Run org.apache.carbondata.examples.sdk.CarbonReaderExample:
> {code:java}
> Schema schema = CarbonSchemaReader
>     .readSchemaInIndexFile(dataFiles[0].getAbsolutePath())
>     .asOriginOrder();
> // Transform the schema
> String[] strings = new String[schema.getFields().length];
> for (int i = 0; i < schema.getFields().length; i++) {
>   strings[i] = (schema.getFields())[i].getFieldName();
>   System.out.println(strings[i] + "\t" + schema.getFields()[i].getSchemaOrdinal());
> }
> {code}
> It throws this exception:
> {code:java}
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> arrayfield.val0 -1
> stringfield 0
> shortfield 1
> intfield 2
> longfield 3
> doublefield 4
> boolfield 5
> datefield 6
> timefield 7
> decimalfield 8
> varcharfield 9
> arrayfield 10
> Complex child columns projection NOT supported through CarbonReader
> java.lang.UnsupportedOperationException: Complex child columns projection NOT supported through CarbonReader
> at org.apache.carbondata.sdk.file.CarbonReaderBuilder.build(CarbonReaderBuilder.java:155)
> at org.apache.carbondata.examples.sdk.CarbonReaderExample.main(CarbonReaderExample.java:110)
> {code}
> It prints arrayfield.val0 with ordinal -1, i.e. the child schema.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2980) clear bloomindex cache when dropping datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2980. -- Resolution: Fixed Fix Version/s: 1.5.0 > clear bloomindex cache when dropping datamap > Key: CARBONDATA-2980 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2980 > Project: CarbonData > Issue Type: Bug > Reporter: xuchuanyin > Assignee: xuchuanyin > Priority: Major > Fix For: 1.5.0 > Time Spent: 2h 20m > Remaining Estimate: 0h > > The bloomindex cache should be cleared when we drop a datamap; otherwise a query will fail if we drop and recreate a brand new table and datamap while the stale cache still exists. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
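A minimal sketch of the invalidation idea, keyed by table path prefix (the cache layout here is illustrative; CarbonData's actual cache keys differ):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class BloomIndexCache {
  private final Map<String, Object> cache = new ConcurrentHashMap<>();

  public void put(String shardPathAndColumn, Object bloomIndex) {
    cache.put(shardPathAndColumn, bloomIndex);
  }

  // Called on DROP DATAMAP: remove every entry belonging to the
  // datamap's table path, so a recreated table/datamap with the same
  // name can never hit stale bloom filters.
  public void invalidateTable(String tablePath) {
    cache.keySet().removeIf(key -> key.startsWith(tablePath));
  }
}
{code}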
[jira] [Assigned] (CARBONDATA-2979) select count fails when carbondata file is written through SDK and read through sparkfileformat for complex datatype map(struct->array->map)
[ https://issues.apache.org/jira/browse/CARBONDATA-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta reassigned CARBONDATA-2979: Assignee: Manish Gupta
> select count fails when carbondata file is written through SDK and read through sparkfileformat for complex datatype map(struct->array->map)
> Key: CARBONDATA-2979
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2979
> Project: CarbonData
> Issue Type: Bug
> Components: file-format
> Affects Versions: 1.5.0
> Reporter: Rahul Singha
> Assignee: Manish Gupta
> Priority: Minor
> Attachments: MapSchema_15_int.avsc
>
> *Steps:*
> Create carbondata and carbonindex files using the SDK.
> Place the files in an HDFS location.
> Read the files using the spark file format:
> create table schema15_int using carbon location 'hdfs://hacluster/user/rahul/map/mapschema15_int';
> Select count(*) from schema15_int;
> *Actual Result:*
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in stage 24.0 (TID 34, BLR114238, executor 3): java.io.IOException: All the files doesn't have same schema. Unsupported operation on nonTransactional table. Check logs.
> at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.updateColumns(AbstractQueryExecutor.java:276)
> at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getDataBlocks(AbstractQueryExecutor.java:234)
> at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.initQuery(AbstractQueryExecutor.java:141)
> at org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getBlockExecutionInfos(AbstractQueryExecutor.java:401)
> at org.apache.carbondata.core.scan.executor.impl.VectorDetailQueryExecutor.execute(VectorDetailQueryExecutor.java:44)
> at org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.initialize(VectorizedCarbonRecordReader.java:143)
> at org.apache.spark.sql.carbondata.execution.datasources.SparkCarbonFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(SparkCarbonFileFormat.scala:395)
> at org.apache.spark.sql.carbondata.execution.datasources.SparkCarbonFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(SparkCarbonFileFormat.scala:361)
> at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:124)
> at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:174)
> at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:105)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown Source)
> at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
> at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:108)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2972) Debug Logs and a function for type of Adaptive Encoding
[ https://issues.apache.org/jira/browse/CARBONDATA-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2972. -- Resolution: Fixed Assignee: MANISH NALLA Fix Version/s: 1.5.0 > Debug Logs and a function for type of Adaptive Encoding > --- > > Key: CARBONDATA-2972 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2972 > Project: CarbonData > Issue Type: Improvement >Reporter: MANISH NALLA >Assignee: MANISH NALLA >Priority: Major > Fix For: 1.5.0 > > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2973) Add Documentation for complex Columns for Local Dictionary Support
[ https://issues.apache.org/jira/browse/CARBONDATA-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2973. -- Resolution: Fixed Fix Version/s: 1.5.0 > Add Documentation for complex Columns for Local Dictionary Support > -- > > Key: CARBONDATA-2973 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2973 > Project: CarbonData > Issue Type: Improvement >Reporter: Praveen M P >Assignee: Praveen M P >Priority: Minor > Fix For: 1.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2896) Adaptive encoding for primitive data types
[ https://issues.apache.org/jira/browse/CARBONDATA-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2896. -- Resolution: Fixed Fix Version/s: 1.5.0 > Adaptive encoding for primitive data types > Key: CARBONDATA-2896 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2896 > Project: CarbonData > Issue Type: New Feature > Reporter: dhatchayani > Assignee: dhatchayani > Priority: Major > Fix For: 1.5.0 > Time Spent: 22h 20m > Remaining Estimate: 0h > > Currently, encoding and decoding are present only for dictionary and measure columns; for no-dictionary primitive types, encoding is *absent*. Encoding is a technique used to reduce the storage size, and after encoding the result is compressed with snappy compression to further reduce the storage size. With this feature, we support encoding on the no-dictionary primitive data types as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
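To make the idea concrete, here is a minimal sketch of one adaptive scheme, delta against the page minimum stored in a narrower type; it is a standalone illustration that simplifies what CarbonData's codecs actually do (and it assumes the value range does not overflow a long subtraction):

{code:java}
public final class AdaptiveDeltaEncode {
  // Encode longs as int deltas against the page minimum when the
  // page's (max - min) range fits in an int, halving the stored size
  // before snappy compression is applied on top.
  public static int[] encode(long[] values) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    for (long v : values) {
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    if (max - min > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("range too wide for int deltas");
    }
    int[] encoded = new int[values.length];
    for (int i = 0; i < values.length; i++) {
      encoded[i] = (int) (values[i] - min); // delta against page min
    }
    return encoded; // 'min' must be persisted in page metadata to decode
  }

  public static long[] decode(int[] deltas, long min) {
    long[] out = new long[deltas.length];
    for (int i = 0; i < deltas.length; i++) {
      out[i] = min + deltas[i];
    }
    return out;
  }
}
{code}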
[jira] [Created] (CARBONDATA-2943) Add configurable min max writing support for streaming table
Manish Gupta created CARBONDATA-2943: Summary: Add configurable min max writing support for streaming table Key: CARBONDATA-2943 URL: https://issues.apache.org/jira/browse/CARBONDATA-2943 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Add configurable min max writing support for streaming table -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2942) Add read and write support for writing min max based on configurable bytes count
Manish Gupta created CARBONDATA-2942: Summary: Add read and write support for writing min max based on configurable bytes count Key: CARBONDATA-2942 URL: https://issues.apache.org/jira/browse/CARBONDATA-2942 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Add read and write support for writing min max based on configurable bytes count for transactional and non transactional table which covers standard carbon table, File format and SDK -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2941) Support decision based min max writing for a column
Manish Gupta created CARBONDATA-2941: Summary: Support decision based min max writing for a column Key: CARBONDATA-2941 URL: https://issues.apache.org/jira/browse/CARBONDATA-2941 Project: CarbonData Issue Type: Improvement Reporter: Manish Gupta Assignee: Manish Gupta
*Background*
Currently we are storing min max for all the columns: page min max and blocklet min max in the file footer, and all the blocklet metadata entries in the shard. Consider the case where each column's data size is more than 1 characters. In this case, if we write min max, then min max will be written 3 times for each column, which will increase the store size and impact query performance.
*Design proposal*
1. We will introduce a configurable system-level property for max characters, *"carbon.string.allowed.character.count"*. If the data crosses this limit, min max will not be stored for that column.
2. If a page does not contain min max for a column, then the blocklet min max will also not contain the entry for min max of that column.
3. The thrift file will be modified to introduce an optional Boolean flag, which will be used in query to identify whether min max is stored for the filter column or not.
4. As of now it will be supported only for dimensions of string/varchar type. We can extend it further to support bigDecimal type measures in the future, if required.
5. Block and blocklet dataMap cache will also store the min max Boolean flag for dimension columns, based on which filter pruning will be done. If min max is not written for a column, isScanRequired will return true in driver pruning.
6. In the executor, page and blocklet level min max will again be checked for the filter column. If min max is not written, the complete page data will be scanned.
*Backward compatibility*
1. For stores prior to 1.5.0, the min max flag for all columns will be set to true while loading the dataMap in the query flow.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
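A minimal sketch of the decision rule in point 1, assuming the property named above; the surrounding class and method are illustrative, not the actual implementation:

{code:java}
public final class MinMaxDecision {
  static final String PROP = "carbon.string.allowed.character.count";

  // Returns false when any value in the page crosses the configured
  // character limit, in which case min/max is skipped for this column.
  public static boolean shouldStoreMinMax(byte[][] pageValues, int allowedCount) {
    for (byte[] value : pageValues) {
      if (value != null && value.length > allowedCount) {
        // One oversized value disables min/max for the whole column:
        // storing huge min/max values would bloat footer/shard metadata.
        return false;
      }
    }
    return true;
  }
}
{code}

During pruning, a false flag simply makes isScanRequired() return true, so the block is scanned rather than wrongly pruned (points 5 and 6 above).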
[jira] [Created] (CARBONDATA-2924) Fix parsing issue for map as a nested array child and change the error message in sort column validation for SDK
Manish Gupta created CARBONDATA-2924: Summary: Fix parsing issue for map as a nested array child and change the error message in sort column validation for SDK Key: CARBONDATA-2924 URL: https://issues.apache.org/jira/browse/CARBONDATA-2924 Project: CarbonData Issue Type: Bug Reporter: Manish Gupta Assignee: Manish Gupta Attachments: issue1.png, issue2.png
*Issue1:* Parsing exception thrown while parsing map as a child of a nested array type, like array<map> and struct<map> (image attached).
*Issue2:* Wrong error message is displayed when map type is specified as a sort column while writing through the SDK (image attached).
*Issue3:* When the complex type data length is more than the short data type length for one row during loading, a NegativeArraySize exception is thrown:
java.lang.NegativeArraySizeException
at org.apache.carbondata.processing.loading.sort.SortStepRowHandler.unpackNoSortFromBytes(SortStepRowHandler.java:271)
at org.apache.carbondata.processing.loading.sort.SortStepRowHandler.readRowFromMemoryWithNoSortFieldConvert(SortStepRowHandler.java:461)
at org.apache.carbondata.processing.loading.sort.unsafe.UnsafeCarbonRowPage.getRow(UnsafeCarbonRowPage.java:93)
at org.apache.carbondata.processing.loading.sort.unsafe.holder.UnsafeInmemoryHolder.readRow(UnsafeInmemoryHolder.java:61)
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2910) Support backward compatibility in fileformat and support different sort columns per load
[ https://issues.apache.org/jira/browse/CARBONDATA-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2910. -- Resolution: Fixed Assignee: Ravindra Pesala Fix Version/s: 1.5.0 > Support backward compatibility in fileformat and support different sort columns per load > Key: CARBONDATA-2910 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2910 > Project: CarbonData > Issue Type: Bug > Reporter: Ravindra Pesala > Assignee: Ravindra Pesala > Priority: Major > Fix For: 1.5.0 > Time Spent: 4.5h > Remaining Estimate: 0h > > Currently, if the data was loaded by an old version with all columns dictionary-excluded, the carbon fileformat cannot read it. > Also, loading through the SDK does not work if different sort columns are given per load. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2894) Add support for complex map type through spark carbon file format API
Manish Gupta created CARBONDATA-2894: Summary: Add support for complex map type through spark carbon file format API Key: CARBONDATA-2894 URL: https://issues.apache.org/jira/browse/CARBONDATA-2894 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-45) Support MAP type
[ https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-45: --- Attachment: MAP DATA-TYPE SUPPORT.pdf
> Support MAP type
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
> Issue Type: New Feature
> Components: core, sql
> Reporter: cen yuhai
> Assignee: Venkata Ramana G
> Priority: Major
> Attachments: MAP DATA-TYPE SUPPORT.pdf
>
> {code:sql}
> CREATE TABLE table1 (
>   deviceInformationId int,
>   channelsId string,
>   props map<int,string>)
> STORED BY 'org.apache.carbondata.format'
> insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
> {code}
> Format of data to be read from csv, with '$' as level 1 delimiter and map keys terminated by '#':
> {code:sql}
> load data local inpath '/tmp/data.csv' into table1 options
> ('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 'COMPLEX_DELIMITER_FOR_KEY'='#')
> 20,channel2,2#user2$100#usercommon
> 30,channel3,3#user3$100#usercommon
> 40,channel4,4#user3$100#usercommon
> select channelsId, props[100] from table1 where deviceInformationId > 10;
> 20, usercommon
> 30, usercommon
> 40, usercommon
> select channelsId, props from table1 where props[2] = 'user2';
> 20, {2,'user2', 100, 'usercommon'}
> {code}
> Following cases need to be handled:
> ||Sub feature||Pending activity||Remarks||
> |Basic Maptype support|Develop|Create table DDL, Load map data from CSV, select * from maptable|
> |Maptype lookup in projection and filter|Develop|Projection and filters need execution at spark|
> |NULL values, UDFs, Describe support|Develop||
> |Compaction support|Test + fix|As compaction works at byte level, no changes required. Needs to add test-cases|
> |Insert into table|Develop|Source table data containing Map data needs conversion from spark datatype to string, as carbon takes string as input row|
> |Support DDL for Map fields Dictionary Include and Dictionary Exclude|Develop|Also needs to handle CarbonDictionaryDecoder for the same.|
> |Support multilevel Map|Develop|Currently DDL is validated to allow only 2 levels; remove this restriction|
> |Support Map value to be a measure|Develop|Currently array and struct support only dimensions, which needs change|
> |Support Alter table to add and remove Map column|Develop|Implement DDL; requires default value handling|
> |Projections of Map lookup push down to carbon|Develop|This is an optimization, for when a large number of values are present in the Map|
> |Filter map lookup push down to carbon|Develop|This is an optimization, for when a large number of values are present in the Map|
> |Update Map values|Develop|Update map value|
> h4. Design suggestion:
> Map can be internally stored as Array<Struct<key,value>>, so that conversion to the Map data type is required only while giving data to spark. Schema will have a new column of map type, similar to Array.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (CARBONDATA-2869) SDK support for Map DataType
[ https://issues.apache.org/jira/browse/CARBONDATA-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta reassigned CARBONDATA-2869: Assignee: Manish Gupta > SDK support for Map DataType > > > Key: CARBONDATA-2869 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2869 > Project: CarbonData > Issue Type: Sub-task >Reporter: Indhumathi Muthumurugesh >Assignee: Manish Gupta >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store
[ https://issues.apache.org/jira/browse/CARBONDATA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2829: - Issue Type: Bug (was: Improvement) > Fix creating merge index on older V1 V2 store > - > > Key: CARBONDATA-2829 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2829 > Project: CarbonData > Issue Type: Bug >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 2h > Remaining Estimate: 0h > > Block creating merge index on older V1 V2 version -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store
[ https://issues.apache.org/jira/browse/CARBONDATA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2829. -- Resolution: Fixed Fix Version/s: 1.4.1 > Fix creating merge index on older V1 V2 store > - > > Key: CARBONDATA-2829 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2829 > Project: CarbonData > Issue Type: Bug >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 2h > Remaining Estimate: 0h > > Block creating merge index on older V1 V2 version -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2832) Block loading error for select query executed after merge index command executed on V1/V2 store table
[ https://issues.apache.org/jira/browse/CARBONDATA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2832. -- Resolution: Fixed Fix Version/s: 1.4.1 > Block loading error for select query executed after merge index command executed on V1/V2 store table > Key: CARBONDATA-2832 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2832 > Project: CarbonData > Issue Type: Bug > Components: data-query > Affects Versions: 1.4.1 > Environment: Spark 2.1 > Reporter: Chetan Bhat > Assignee: dhatchayani > Priority: Minor > Fix For: 1.4.1 > > Steps: > *Create and load data in V1/V2 carbon store:* > create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1'); > LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge'); > *In 1.4.1:* > refresh table brinjal; > alter table brinjal compact 'segment_index'; > select * from brinjal where AMSize='8RAM size'; > *Issue:* Block loading error for select query executed after merge index command executed on V1/V2 store table. > 0: jdbc:hive2://10.18.98.101:22550/default> select * from brinjal where AMSize='8RAM size'; > *Error: java.io.IOException: Problem in loading segment blocks. (state=,code=0)* > *Expected:* The select query executed after the merge index command on a V1/V2 store table should return the correct result set without error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2813) Major compaction on partition table created in 1.3.x store is throwing Unable to get file status error.
[ https://issues.apache.org/jira/browse/CARBONDATA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2813. -- Resolution: Fixed Fix Version/s: 1.4.1 > Major compaction on partition table created in 1.3.x store is throwing Unable > to get file status error. > --- > > Key: CARBONDATA-2813 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2813 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Steps to reproduce: > # Create a partitioned table in 1.3.x version. > # Load data into the table. > # move the table to current version cluster(1.4.x). > # Load data into table on 1.4.x version > # Run major compaction -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2805) Wrong order in custom compaction
[ https://issues.apache.org/jira/browse/CARBONDATA-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2805. -- Resolution: Fixed Fix Version/s: 1.4.1 > Wrong order in custom compaction > > > Key: CARBONDATA-2805 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2805 > Project: CarbonData > Issue Type: Bug >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Major > Fix For: 1.4.1 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > When the table has segments 0 to 6 and segments 1, 2 and 3 are given for custom compaction, the compacted segment should be created as 1.1, but sometimes it is created as 3.1, which is wrong.
> +-----------------+---------+--------------------+--------------------+---------+-----------+
> |SegmentSequenceId|   Status|     Load Start Time|       Load End Time|Merged To|File Format|
> +-----------------+---------+--------------------+--------------------+---------+-----------+
> |                4|  Success|2018-07-27 07:25:...|2018-07-27 07:25:...|       NA|COLUMNAR_V3|
> |              3.1|  Success|2018-07-27 07:25:...|2018-07-27 07:25:...|       NA|COLUMNAR_V3|
> |                3|Compacted|2018-07-27 07:25:...|2018-07-27 07:25:...|      3.1|COLUMNAR_V3|
> |                2|Compacted|2018-07-27 07:25:...|2018-07-27 07:25:...|      3.1|COLUMNAR_V3|
> |                1|Compacted|2018-07-27 07:25:...|2018-07-27 07:25:...|      3.1|COLUMNAR_V3|
> |                0|  Success|2018-07-27 07:25:...|2018-07-27 07:25:...|       NA|COLUMNAR_V3|
> +-----------------+---------+--------------------+--------------------+---------+-----------+
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
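A minimal sketch of the scenario, assuming the CUSTOM compaction syntax documented for CarbonData and a CarbonData-enabled SparkSession; the table name is hypothetical. The merged segment is expected to take the lowest participating segment id, here 1.1.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("carbondata-2805-repro").getOrCreate()

// Custom compaction over segments 1, 2 and 3 of a hypothetical table t.
// The compacted segment should be created as 1.1 (lowest participating id),
// not 3.1 as was sometimes observed before this fix.
spark.sql("ALTER TABLE t COMPACT 'CUSTOM' WHERE SEGMENT.ID IN (1, 2, 3)")
spark.sql("SHOW SEGMENTS FOR TABLE t").show(false)
{code}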
[jira] [Resolved] (CARBONDATA-2788) Fix bugs in incorrect query result with bloom datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2788. -- Resolution: Fixed Fix Version/s: 1.4.1 > Fix bugs in incorrect query result with bloom datamap > - > > Key: CARBONDATA-2788 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2788 > Project: CarbonData > Issue Type: Sub-task >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Major > Fix For: 1.4.1 > > Time Spent: 13h 20m > Remaining Estimate: 0h > > revert modification in PR2539 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2778) Empty result in query after IUD delete operation
[ https://issues.apache.org/jira/browse/CARBONDATA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2778. -- Resolution: Fixed Fix Version/s: 1.4.1 > Empty result in query after IUD delete operation > > > Key: CARBONDATA-2778 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2778 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > # drop table if exists t1 > # create table t1 (c1 int,c2 string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1', 'dictionary_exclude'='c2') > # LOAD DATA LOCAL INPATH 'test.csv' INTO table t1 options('fileheader'='c1,c2') > # Run a delete command which should delete a whole block > # Run the clean files operation > # select * from t1 (see the sketch below) > > *NOTE*: Disable the merge index property -- This message was sent by Atlassian JIRA (v7.6.3#76005)
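A minimal sketch of the listed steps, assuming a CarbonData-enabled SparkSession; the CSV file and the delete predicate are illustrative (any predicate that empties a whole block will do), and the merge index property is assumed to be disabled in carbon.properties.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("carbondata-2778-repro").getOrCreate()

spark.sql("DROP TABLE IF EXISTS t1")
spark.sql("""CREATE TABLE t1 (c1 INT, c2 STRING)
             STORED BY 'org.apache.carbondata.format'
             TBLPROPERTIES('table_blocksize'='1', 'dictionary_exclude'='c2')""")
spark.sql("LOAD DATA LOCAL INPATH 'test.csv' INTO TABLE t1 OPTIONS('fileheader'='c1,c2')")

// Delete enough rows that a whole block becomes empty (illustrative predicate).
spark.sql("DELETE FROM t1 WHERE c1 IS NOT NULL")
spark.sql("CLEAN FILES FOR TABLE t1")

// Returned an empty result before the fix.
spark.sql("SELECT * FROM t1").show()
{code}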
[jira] [Updated] (CARBONDATA-2778) Empty result in query after IUD delete operation
[ https://issues.apache.org/jira/browse/CARBONDATA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2778: - Priority: Minor (was: Major) > Empty result in query after IUD delete operation > > > Key: CARBONDATA-2778 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2778 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > # drop table if exists t1 > # create table t1 (c1 int,c2 string) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1', 'dictionary_exclude'='c2') > # LOAD DATA LOCAL INPATH 'test.csv' INTO table t1 options('fileheader'='c1,c2') > # Run a delete command which should delete a whole block > # Run the clean files operation > # select * from t1 > > *NOTE*: Disable the merge index property -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2779) Filter query is failing for store created with V1/V2 format
[ https://issues.apache.org/jira/browse/CARBONDATA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2779: - Issue Type: Bug (was: Improvement) > Filter query is failing for store created with V1/V2 format > --- > > Key: CARBONDATA-2779 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2779 > Project: CarbonData > Issue Type: Bug >Reporter: kumar vishal >Assignee: kumar vishal >Priority: Major > Fix For: 1.4.1 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Filter query is failing with an ArrayIndexOutOfBoundsException for a store created with the V1/V2 format -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2779) Filter query is failing for store created with V1/V2 format
[ https://issues.apache.org/jira/browse/CARBONDATA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2779. -- Resolution: Fixed Fix Version/s: 1.4.1 > Filter query is failing for store created with V1/V2 format > --- > > Key: CARBONDATA-2779 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2779 > Project: CarbonData > Issue Type: Bug >Reporter: kumar vishal >Assignee: kumar vishal >Priority: Major > Fix For: 1.4.1 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > Filter query is failing with an ArrayIndexOutOfBoundsException for a store created with the V1/V2 format -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache
[ https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2638. -- Resolution: Fixed Fix Version/s: 1.4.1 > Implement driver min max caching for specified columns and segregate block > and blocklet cache > - > > Key: CARBONDATA-2638 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2638 > Project: CarbonData > Issue Type: New Feature >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Major > Fix For: 1.4.1 > > Attachments: Driver_Block_Cache.docx > > > *Background* > The current implementation of Blocklet dataMap caching in the driver caches the min and max values of all the columns in the schema by default. > *Problem* > The problem with this implementation is that as the number of loads increases, the memory required to hold the min and max values also increases considerably. In most scenarios there is a single driver, and the memory configured for the driver is small compared to the executors. With a continuously increasing memory requirement the driver can even go out of memory, which makes the situation even worse. > *Solution* > 1. Cache only the required columns in the driver > 2. Segregate the block and blocklet level cache > For more details please check the attached document -- This message was sent by Atlassian JIRA (v7.6.3#76005)
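The two table properties this feature introduces are the subject of the sub-tasks CARBONDATA-2647 and CARBONDATA-2648 later in this digest; a minimal usage sketch follows, assuming a CarbonData-enabled SparkSession and hypothetical table and column names.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("carbondata-2638-cache").getOrCreate()

// Cache min/max in the driver only for the columns that are actually
// filtered on, and keep the cache at block rather than blocklet granularity.
spark.sql("""CREATE TABLE sales (id INT, city STRING, amount DOUBLE)
             STORED BY 'carbondata'
             TBLPROPERTIES('COLUMN_META_CACHE'='id,city',
                           'CACHE_LEVEL'='BLOCK')""")

// Both properties can also be changed after creation via ALTER TABLE.
spark.sql("ALTER TABLE sales SET TBLPROPERTIES('CACHE_LEVEL'='BLOCKLET')")
{code}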
[jira] [Commented] (CARBONDATA-2651) Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties
[ https://issues.apache.org/jira/browse/CARBONDATA-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16555654#comment-16555654 ] Manish Gupta commented on CARBONDATA-2651: -- https://github.com/apache/carbondata/pull/2558 > Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties > --- > > Key: CARBONDATA-2651 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2651 > Project: CarbonData > Issue Type: Sub-task >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Minor > Fix For: 1.4.1 > > > Update document for caching properties -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2651) Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties
[ https://issues.apache.org/jira/browse/CARBONDATA-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2651. -- Resolution: Fixed Assignee: Gururaj Shetty (was: Manish Gupta) Fix Version/s: 1.4.1 > Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties > --- > > Key: CARBONDATA-2651 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2651 > Project: CarbonData > Issue Type: Sub-task >Reporter: Manish Gupta >Assignee: Gururaj Shetty >Priority: Minor > Fix For: 1.4.1 > > > Update document for caching properties -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2621) Lock problem in index datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2621. -- Resolution: Fixed Fix Version/s: (was: 1.5.0) 1.4.1 > Lock problem in index datamap > - > > Key: CARBONDATA-2621 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2621 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Mohammad Shahid Khan >Assignee: Mohammad Shahid Khan >Priority: Major > Fix For: 1.4.1 > > Time Spent: 9h 10m > Remaining Estimate: 0h > > The locking for the index datamap is not correct. > The HDFS lock will not work properly, because the lock is getting created on the local filesystem instead of on HDFS. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
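A minimal Hadoop filesystem sketch of the pitfall described above (the lock path is hypothetical): resolving the FileSystem from the lock path itself is what puts the lock file on HDFS, whereas asking for the local filesystem silently puts it on the local disk of whichever JVM took the lock.

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()
val lockPath = new Path("hdfs://ns/carbon/store/db/tbl/datamap.lock")

// Wrong: always resolves to the local filesystem, so the "HDFS lock"
// lands on the local disk and is invisible to other nodes.
val wrongFs = FileSystem.getLocal(conf)

// Right: derive the filesystem from the path itself (HDFS in this case).
val fs = lockPath.getFileSystem(conf)
val out = fs.create(lockPath, true) // zero-length lock file on HDFS
out.close()
{code}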
[jira] [Resolved] (CARBONDATA-2753) Fix Compatibility issues
[ https://issues.apache.org/jira/browse/CARBONDATA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2753. -- Resolution: Fixed Assignee: dhatchayani (was: Indhumathi Muthumurugesh) Fix Version/s: 1.4.1 > Fix Compatibility issues > > > Key: CARBONDATA-2753 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2753 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthumurugesh >Assignee: dhatchayani >Priority: Major > Fix For: 1.4.1 > > Time Spent: 9h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2754) fix failing UT for HiveMetastore
[ https://issues.apache.org/jira/browse/CARBONDATA-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2754. -- Resolution: Fixed Fix Version/s: 1.4.1 > fix failing UT for HiveMetastore > > > Key: CARBONDATA-2754 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2754 > Project: CarbonData > Issue Type: Improvement >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Fix For: 1.4.1 > > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2710) Refactor CarbonSparkSqlParser for better code reuse.
[ https://issues.apache.org/jira/browse/CARBONDATA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2710. -- Resolution: Fixed Assignee: Mohammad Shahid Khan Fix Version/s: 1.4.1 > Refactor CarbonSparkSqlParser for better code reuse. > > > Key: CARBONDATA-2710 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2710 > Project: CarbonData > Issue Type: Improvement >Reporter: Mohammad Shahid Khan >Assignee: Mohammad Shahid Khan >Priority: Major > Fix For: 1.4.1 > > Time Spent: 4.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2704) Index file size in describe formatted command is not updated correctly with the segment file
[ https://issues.apache.org/jira/browse/CARBONDATA-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2704. -- Resolution: Fixed Fix Version/s: 1.4.1 > Index file size in describe formatted command is not updated correctly with > the segment file > > > Key: CARBONDATA-2704 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2704 > Project: CarbonData > Issue Type: Bug >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 6h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2684) Code Generator Error is thrown when Select filter contains more than one count of distinct of ComplexColumn with group by Clause
[ https://issues.apache.org/jira/browse/CARBONDATA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2684. -- Resolution: Fixed Fix Version/s: 1.4.1 > Code Generator Error is thrown when Select filter contains more than one > count of distinct of ComplexColumn with group by Clause > > > Key: CARBONDATA-2684 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2684 > Project: CarbonData > Issue Type: Bug >Reporter: Indhumathi Muthumurugesh >Assignee: Indhumathi Muthumurugesh >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2701) Refactor code to store minimal required info in Block and Blocklet Cache
Manish Gupta created CARBONDATA-2701: Summary: Refactor code to store minimal required info in Block and Blocklet Cache Key: CARBONDATA-2701 URL: https://issues.apache.org/jira/browse/CARBONDATA-2701 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Refactor code to store minimal required info in Block and Blocklet Cache -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache
[ https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2638: - Attachment: Driver_Block_Cache.docx > Implement driver min max caching for specified columns and segregate block > and blocklet cache > - > > Key: CARBONDATA-2638 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2638 > Project: CarbonData > Issue Type: New Feature >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Major > Attachments: Driver_Block_Cache.docx > > > *Background* > The current implementation of Blocklet dataMap caching in the driver caches the min and max values of all the columns in the schema by default. > *Problem* > The problem with this implementation is that as the number of loads increases, the memory required to hold the min and max values also increases considerably. In most scenarios there is a single driver, and the memory configured for the driver is small compared to the executors. With a continuously increasing memory requirement the driver can even go out of memory, which makes the situation even worse. > *Solution* > 1. Cache only the required columns in the driver > 2. Segregate the block and blocklet level cache > For more details please check the attached document -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache
[ https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2638: - Attachment: (was: Driver_Block_Cache.docx) > Implement driver min max caching for specified columns and segregate block > and blocklet cache > - > > Key: CARBONDATA-2638 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2638 > Project: CarbonData > Issue Type: New Feature >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Major > Attachments: Driver_Block_Cache.docx > > > *Background* > The current implementation of Blocklet dataMap caching in the driver caches the min and max values of all the columns in the schema by default. > *Problem* > The problem with this implementation is that as the number of loads increases, the memory required to hold the min and max values also increases considerably. In most scenarios there is a single driver, and the memory configured for the driver is small compared to the executors. With a continuously increasing memory requirement the driver can even go out of memory, which makes the situation even worse. > *Solution* > 1. Cache only the required columns in the driver > 2. Segregate the block and blocklet level cache > For more details please check the attached document -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2651) Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties
Manish Gupta created CARBONDATA-2651: Summary: Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties Key: CARBONDATA-2651 URL: https://issues.apache.org/jira/browse/CARBONDATA-2651 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Update document for caching properties -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2649) Add code for caching min/max only for specified columns
Manish Gupta created CARBONDATA-2649: Summary: Add code for caching min/max only for specified columns Key: CARBONDATA-2649 URL: https://issues.apache.org/jira/browse/CARBONDATA-2649 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Add code for caching min/max only for specified columns -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2648) Add support for COLUMN_META_CACHE in create table and alter table properties
Manish Gupta created CARBONDATA-2648: Summary: Add support for COLUMN_META_CACHE in create table and alter table properties Key: CARBONDATA-2648 URL: https://issues.apache.org/jira/browse/CARBONDATA-2648 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Add support for COLUMN_META_CACHE in create table and alter table properties -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2647) Add support for CACHE_LEVEL in create table and alter table properties
Manish Gupta created CARBONDATA-2647: Summary: Add support for CACHE_LEVEL in create table and alter table properties Key: CARBONDATA-2647 URL: https://issues.apache.org/jira/browse/CARBONDATA-2647 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Add support for CACHE_LEVEL in create table and alter table properties -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2645) Segregate block and blocklet cache
Manish Gupta created CARBONDATA-2645: Summary: Segregate block and blocklet cache Key: CARBONDATA-2645 URL: https://issues.apache.org/jira/browse/CARBONDATA-2645 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Separate block and blocklet cache using the cache level configuration -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache
Manish Gupta created CARBONDATA-2638: Summary: Implement driver min max caching for specified columns and segregate block and blocklet cache Key: CARBONDATA-2638 URL: https://issues.apache.org/jira/browse/CARBONDATA-2638 Project: CarbonData Issue Type: New Feature Reporter: Manish Gupta Assignee: Manish Gupta Attachments: Driver_Block_Cache.docx *Background* The current implementation of Blocklet dataMap caching in the driver caches the min and max values of all the columns in the schema by default. *Problem* The problem with this implementation is that as the number of loads increases, the memory required to hold the min and max values also increases considerably. In most scenarios there is a single driver, and the memory configured for the driver is small compared to the executors. With a continuously increasing memory requirement the driver can even go out of memory, which makes the situation even worse. *Solution* 1. Cache only the required columns in the driver 2. Segregate the block and blocklet level cache For more details please check the attached document -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2623) Add DataMap Pre and Post event listener
[ https://issues.apache.org/jira/browse/CARBONDATA-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2623. -- Resolution: Fixed Assignee: Mohammad Shahid Khan Fix Version/s: 1.4.1 > Add DataMap Pre and Post event listener > --- > > Key: CARBONDATA-2623 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2623 > Project: CarbonData > Issue Type: Bug >Reporter: Mohammad Shahid Khan >Assignee: Mohammad Shahid Khan >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2623) Add DataMap Pre and Post event listener
[ https://issues.apache.org/jira/browse/CARBONDATA-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2623: - Issue Type: Bug (was: Improvement) > Add DataMap Pre and Post event listener > --- > > Key: CARBONDATA-2623 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2623 > Project: CarbonData > Issue Type: Bug >Reporter: Mohammad Shahid Khan >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2623) Add DataMap Pre and Post event listener
[ https://issues.apache.org/jira/browse/CARBONDATA-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2623: - Priority: Minor (was: Major) > Add DataMap Pre and Post event listener > --- > > Key: CARBONDATA-2623 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2623 > Project: CarbonData > Issue Type: Bug >Reporter: Mohammad Shahid Khan >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2617) Invalid tuple and block id getting formed for non partition table
[ https://issues.apache.org/jira/browse/CARBONDATA-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2617. -- Resolution: Fixed Fix Version/s: 1.4.1 > Invalid tuple and block id getting formed for non partition table > - > > Key: CARBONDATA-2617 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2617 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.4.1 >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Fix For: 1.4.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > While creating a partition table, a segment file was written in the Metadata folder under the table structure. This was introduced during development of the partition table feature. At that time the segment file was written only for partition tables and was used in the code to distinguish between partition and non-partition tables. > But later the code was modified to write the segment file for both partition and non-partition tables, while the code distinguishing partition from non-partition tables was not updated, which is causing this incorrect formation of block and tuple ids. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2617) Invalid tuple and block id getting formed for non partition table
[ https://issues.apache.org/jira/browse/CARBONDATA-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2617: - Issue Type: Bug (was: Improvement) > Invalid tuple and block id getting formed for non partition table > - > > Key: CARBONDATA-2617 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2617 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.4.1 >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > While creating a partition table, a segment file was written in the Metadata folder under the table structure. This was introduced during development of the partition table feature. At that time the segment file was written only for partition tables and was used in the code to distinguish between partition and non-partition tables. > But later the code was modified to write the segment file for both partition and non-partition tables, while the code distinguishing partition from non-partition tables was not updated, which is causing this incorrect formation of block and tuple ids. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2617) Invalid tuple and block id getting formed for non partition table
[ https://issues.apache.org/jira/browse/CARBONDATA-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2617: - Affects Version/s: 1.4.1 > Invalid tuple and block id getting formed for non partition table > - > > Key: CARBONDATA-2617 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2617 > Project: CarbonData > Issue Type: Bug >Affects Versions: 1.4.1 >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > While creating a partition table, a segment file was written in the Metadata folder under the table structure. This was introduced during development of the partition table feature. At that time the segment file was written only for partition tables and was used in the code to distinguish between partition and non-partition tables. > But later the code was modified to write the segment file for both partition and non-partition tables, while the code distinguishing partition from non-partition tables was not updated, which is causing this incorrect formation of block and tuple ids. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2604) getting ArrayIndexOutOfBoundsException during compaction after IUD in cluster
[ https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2604. -- Resolution: Fixed Fix Version/s: 1.4.1 > getting ArrayIndexOutOfBoundsException during compaction after IUD in cluster > > > Key: CARBONDATA-2604 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2604 > Project: CarbonData > Issue Type: Bug >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 2h > Remaining Estimate: 0h > > *Exception :* > !image-2018-06-12-19-19-05-257.png! > *To reproduce the issue, follow these steps:* > {quote} * *create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *insert into brinjal select * from brinjal;* > * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';* > * *delete from brinjal where AMSize='8RAM size';* > * *delete from table brinjal where segment.id IN(0);* > * *clean files for table brinjal;* > * *alter table brinjal compact 'minor';* > * *alter table brinjal compact 'major';*{quote} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2604) getting ArrayIndexOutOfBoundsException during compaction after IUD in cluster
[ https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2604: - Issue Type: Bug (was: Improvement) > getting ArrayIndexOutOfBoundsException during compaction after IUD in cluster > > > Key: CARBONDATA-2604 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2604 > Project: CarbonData > Issue Type: Bug >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > *Exception :* > !image-2018-06-12-19-19-05-257.png! > *To reproduce the issue, follow these steps:* > {quote} * *create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *insert into brinjal select * from brinjal;* > * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';* > * *delete from brinjal where AMSize='8RAM size';* > * *delete from table brinjal where segment.id IN(0);* > * *clean files for table brinjal;* > * *alter table brinjal compact 'minor';* > * *alter table brinjal compact 'major';*{quote} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2604) getting ArrayIndexOutOfBoundsException during compaction after IUD in cluster
[ https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2604: - Priority: Minor (was: Major) > getting ArrayIndexOutOfBoundsException during compaction after IUD in cluster > > > Key: CARBONDATA-2604 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2604 > Project: CarbonData > Issue Type: Bug >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Minor > Time Spent: 2h > Remaining Estimate: 0h > > *Exception :* > !image-2018-06-12-19-19-05-257.png! > *To reproduce the issue, follow these steps:* > {quote} * *create table brinjal (imei string,AMSize string,channelsId string,ActiveCountry string, Activecity string,gamePointId double,deviceInformationId double,productionDate Timestamp,deliveryDate timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');* > * *insert into brinjal select * from brinjal;* > * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';* > * *delete from brinjal where AMSize='8RAM size';* > * *delete from table brinjal where segment.id IN(0);* > * *clean files for table brinjal;* > * *alter table brinjal compact 'minor';* > * *alter table brinjal compact 'major';*{quote} > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2571) Calculating the carbonindex and carbondata file size of a table is wrong
[ https://issues.apache.org/jira/browse/CARBONDATA-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2571. -- Resolution: Fixed Fix Version/s: 1.4.1 > Calculating the carbonindex and carbondata file size of a table is wrong > > > Key: CARBONDATA-2571 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2571 > Project: CarbonData > Issue Type: Bug >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2538) No exception is thrown if writer path has only lock files
[ https://issues.apache.org/jira/browse/CARBONDATA-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2538. -- Resolution: Fixed Fix Version/s: 1.4.0 > No exception is thrown if writer path has only lock files > - > > Key: CARBONDATA-2538 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2538 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Major > Fix For: 1.4.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Steps to reproduce: > # Create external table > # Manually delete the index and carbon files > # Describe table (lock files would be created) > # Select from table -- This message was sent by Atlassian JIRA (v7.6.3#76005)
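A minimal sketch of the reproduction above, assuming a CarbonData-enabled SparkSession; the external location is hypothetical, and the index and data files under it are deleted by hand between the create and describe steps.

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("carbondata-2538-repro").getOrCreate()

spark.sql("""CREATE EXTERNAL TABLE ext_tbl STORED BY 'carbondata'
             LOCATION 'hdfs://ns/sdk_output/ext_tbl'""")

// ... manually delete the .carbonindex and .carbondata files here ...

spark.sql("DESCRIBE FORMATTED ext_tbl").show(false) // leaves lock files behind
spark.sql("SELECT * FROM ext_tbl").show()           // should fail clearly, not silently
{code}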
[jira] [Resolved] (CARBONDATA-2514) Duplicate columns in CarbonWriter is throwing NullPointerException
[ https://issues.apache.org/jira/browse/CARBONDATA-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2514. -- Resolution: Fixed Assignee: Kunal Kapoor Fix Version/s: 1.4.1 > Duplicate columns in CarbonWriter is throwing NullPointerException > -- > > Key: CARBONDATA-2514 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2514 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.4.1 > > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2503) Data write fails if empty value is provided for sort columns in sdk
[ https://issues.apache.org/jira/browse/CARBONDATA-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2503. -- Resolution: Fixed Fix Version/s: 1.4.0 > Data write fails if empty value is provided for sort columns in sdk > --- > > Key: CARBONDATA-2503 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2503 > Project: CarbonData > Issue Type: Bug >Reporter: Rahul Kumar >Assignee: Rahul Kumar >Priority: Major > Fix For: 1.4.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > *Reproduce step :* > Use SDK to write data where empty value is provided for sort columns > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2496) Change the bloom implementation to hadoop for better performance and compression
[ https://issues.apache.org/jira/browse/CARBONDATA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2496. -- Resolution: Fixed Assignee: Ravindra Pesala Fix Version/s: 1.4.0 > Change the bloom implementation to hadoop for better performance and > compression > > > Key: CARBONDATA-2496 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2496 > Project: CarbonData > Issue Type: Improvement >Reporter: Ravindra Pesala >Assignee: Ravindra Pesala >Priority: Major > Fix For: 1.4.0 > > Time Spent: 6.5h > Remaining Estimate: 0h > > The current bloom implementation does not give good performance or compression, and it also adds a new Guava dependency to Carbon. So remove the Guava dependency and use the Hadoop bloom filter instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
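For reference, a minimal sketch of the Hadoop bloom filter API (org.apache.hadoop.util.bloom) that this change moves to; the sizing numbers are illustrative, not the values chosen by the actual change.

{code:scala}
import org.apache.hadoop.util.bloom.{BloomFilter, Key}
import org.apache.hadoop.util.hash.Hash

// Illustrative sizing only: 1M bits and 5 hash functions.
val vectorSize = 1 << 20
val numHashes  = 5
val bloom = new BloomFilter(vectorSize, numHashes, Hash.MURMUR_HASH)

bloom.add(new Key("8RAM size".getBytes("UTF-8")))

// Membership test: false means "definitely absent", so pruning code can
// safely skip the data; true means only "possibly present".
val maybePresent = bloom.membershipTest(new Key("8RAM size".getBytes("UTF-8")))
println(maybePresent)
{code}

Hadoop's BloomFilter also implements Writable, which helps with the compact on-disk representation mentioned in the summary.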
[jira] [Resolved] (CARBONDATA-2227) Add Partition Values and Location information in describe formatted for Standard partition feature
[ https://issues.apache.org/jira/browse/CARBONDATA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2227. -- Resolution: Fixed Fix Version/s: 1.4.0 > Add Partition Values and Location information in describe formatted for > Standard partition feature > -- > > Key: CARBONDATA-2227 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2227 > Project: CarbonData > Issue Type: Improvement >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.4.0 > > Time Spent: 5h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2448) Adding compacted segments to load and alter events
[ https://issues.apache.org/jira/browse/CARBONDATA-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2448. -- Resolution: Fixed Assignee: dhatchayani Fix Version/s: 1.4.0 > Adding compacted segments to load and alter events > -- > > Key: CARBONDATA-2448 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2448 > Project: CarbonData > Issue Type: Improvement >Reporter: dhatchayani >Assignee: dhatchayani >Priority: Minor > Fix For: 1.4.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2436) Block pruning problem post the carbon schema restructure.
[ https://issues.apache.org/jira/browse/CARBONDATA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2436. -- Resolution: Fixed Assignee: Mohammad Shahid Khan Fix Version/s: 1.4.0 > Block pruning problem post the carbon schema restructure. > - > > Key: CARBONDATA-2436 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2436 > Project: CarbonData > Issue Type: Bug >Reporter: Mohammad Shahid Khan >Assignee: Mohammad Shahid Khan >Priority: Major > Fix For: 1.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Currently the datamap prunes with the SegmentProperties taken from the 0th block of the BlockletDataMap, which is not correct. After a restructure, if the table is updated, the blocks within the same segment will not all have a symmetric schema. > Fix: it must be ensured that each block is pruned with its own matching schema. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2433) Executor OOM because of GC when blocklet pruning is done using Lucene datamap
Manish Gupta created CARBONDATA-2433: Summary: Executor OOM because of GC when blocklet pruning is done using Lucene datamap Key: CARBONDATA-2433 URL: https://issues.apache.org/jira/browse/CARBONDATA-2433 Project: CarbonData Issue Type: Sub-task Affects Versions: 1.4.0 Reporter: Manish Gupta Assignee: Manish Gupta -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (CARBONDATA-2433) Executor OOM because of GC when blocklet pruning is done using Lucene datamap
[ https://issues.apache.org/jira/browse/CARBONDATA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta updated CARBONDATA-2433: - Description: While searching using Lucene, a PriorityQueue is created to hold the documents. As the size is not specified, by default the PriorityQueue size is equal to the number of Lucene documents. As the documents start getting added to the heap, the GC time increases, and after some time the task fails due to excessive GC and an executor OOM occurs. Reference blog: *http://lucene.472066.n3.nabble.com/Optimization-of-memory-usage-in-PriorityQueue-td590355.html* > Executor OOM because of GC when blocklet pruning is done using Lucene datamap > - > > Key: CARBONDATA-2433 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2433 > Project: CarbonData > Issue Type: Sub-task >Affects Versions: 1.4.0 >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Major > > While searching using Lucene, a PriorityQueue is created to hold the documents. As the size is not specified, by default the PriorityQueue size is equal to the number of Lucene documents. As the documents start getting added to the heap, the GC time increases, and after some time the task fails due to excessive GC and an executor OOM occurs. > Reference blog: > *http://lucene.472066.n3.nabble.com/Optimization-of-memory-usage-in-PriorityQueue-td590355.html* -- This message was sent by Atlassian JIRA (v7.6.3#76005)
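A minimal Lucene sketch of the general mitigation direction, bounding the number of collected hits instead of letting the priority queue grow to the full document count; the index path, query, and limit are illustrative and not the exact change made in CarbonData.

{code:scala}
import java.nio.file.Paths
import org.apache.lucene.index.DirectoryReader
import org.apache.lucene.search.{IndexSearcher, MatchAllDocsQuery}
import org.apache.lucene.store.FSDirectory

val reader   = DirectoryReader.open(FSDirectory.open(Paths.get("/tmp/lucene-dm")))
val searcher = new IndexSearcher(reader)

// search(query, n) caps the internal priority queue at n entries, so heap
// usage no longer scales with the total number of indexed documents.
val topDocs = searcher.search(new MatchAllDocsQuery(), 10000)
println(topDocs.totalHits)
reader.close()
{code}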
[jira] [Resolved] (CARBONDATA-2410) Error message correction when column value length exceeds 32000 characters
[ https://issues.apache.org/jira/browse/CARBONDATA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2410. -- Resolution: Fixed Assignee: Mohammad Shahid Khan Fix Version/s: 1.4.0 > Error message correction when column value length exceeds 32000 characters > -- > > Key: CARBONDATA-2410 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2410 > Project: CarbonData > Issue Type: Bug >Reporter: Mohammad Shahid Khan >Assignee: Mohammad Shahid Khan >Priority: Trivial > Fix For: 1.4.0 > > Time Spent: 3h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2396) Add CTAS support for using DataSource Syntax
[ https://issues.apache.org/jira/browse/CARBONDATA-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2396. -- Resolution: Fixed Fix Version/s: 1.4.0 > Add CTAS support for using DataSource Syntax > > > Key: CARBONDATA-2396 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2396 > Project: CarbonData > Issue Type: Improvement >Reporter: Indhumathi Muthumurugesh >Assignee: Indhumathi Muthumurugesh >Priority: Minor > Fix For: 1.4.0 > > Time Spent: 7h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (CARBONDATA-2406) Dictionary Server and Dictionary Client MD5 Validation failed with hive.server2.enable.doAs = true
[ https://issues.apache.org/jira/browse/CARBONDATA-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2406. -- Resolution: Fixed Assignee: Mohammad Shahid Khan Fix Version/s: 1.4.0 > Dictionary Server and Dictionary Client MD5 Validation failed with > hive.server2.enable.doAs = true > > > Key: CARBONDATA-2406 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2406 > Project: CarbonData > Issue Type: Bug >Reporter: Mohammad Shahid Khan >Assignee: Mohammad Shahid Khan >Priority: Major > Fix For: 1.4.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > With hive.server2.enable.doAs = true, the dictionary server is started as the user who submits the load request, but the dictionary client runs as the user who started the executor process. Due to this, the dictionary client cannot successfully communicate with the dictionary server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
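A minimal UserGroupInformation sketch of the identity mismatch behind this bug; the user name is hypothetical and this is not the CarbonData fix itself. With doAs enabled, server-side work runs as the proxied end user, while executors run as the process (login) user, so the two ends present different identities.

{code:scala}
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

// Executor-side identity: the user that started the JVM.
val loginUser = UserGroupInformation.getLoginUser

// Server-side identity under doAs: a proxy for the (hypothetical) end user.
val proxyUser = UserGroupInformation.createProxyUser("endUser", loginUser)

// Work executed inside doAs carries the proxied identity, not the login one.
proxyUser.doAs(new PrivilegedExceptionAction[Unit] {
  override def run(): Unit = {
    println(UserGroupInformation.getCurrentUser.getShortUserName) // prints endUser
  }
})
{code}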
[jira] [Resolved] (CARBONDATA-2275) Query Failed for 0 byte deletedelta file
[ https://issues.apache.org/jira/browse/CARBONDATA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2275. -- Resolution: Fixed Assignee: Babulal Fix Version/s: 1.4.0 > Query Failed for 0 byte deletedelta file > - > > Key: CARBONDATA-2275 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2275 > Project: CarbonData > Issue Type: Bug > Components: data-query, sql >Affects Versions: 1.3.0 >Reporter: Babulal >Assignee: Babulal >Priority: Major > Fix For: 1.4.0 > > Time Spent: 11h 10m > Remaining Estimate: 0h > > When a delete fails at the write step because of an exception from HDFS, a 0 byte deletedelta file is currently created and is not cleaned up. > So when any select query is triggered, it fails with the exception "Problem in loading segment blocks." > Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 > at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getLocations(AbstractDFSCarbonFile.java:514) > at org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:142) > ... 109 more > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
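A defensive sketch of the underlying idea, with a hypothetical delta path: a zero-length delete delta carries no deleted-row ids, so a reader can detect and skip it rather than fail (the actual fix may also clean such files up).

{code:scala}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf  = new Configuration()
val delta = new Path("hdfs://ns/store/db/tbl/Fact/Part0/Segment_0/x.deletedelta")
val fs    = delta.getFileSystem(conf)

// A 0 byte delete delta left behind by a failed delete has nothing to parse.
if (fs.exists(delta) && fs.getFileStatus(delta).getLen == 0L) {
  println(s"skipping empty delete delta: $delta")
}
{code}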
[jira] [Resolved] (CARBONDATA-2251) Refactored sdv failures running on different environment
[ https://issues.apache.org/jira/browse/CARBONDATA-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta resolved CARBONDATA-2251. -- Resolution: Fixed Fix Version/s: 1.4.0 > Refactored sdv failures running on different environment > > > Key: CARBONDATA-2251 > URL: https://issues.apache.org/jira/browse/CARBONDATA-2251 > Project: CarbonData > Issue Type: Improvement > Components: test >Affects Versions: 1.3.1 >Reporter: Jatin >Assignee: Jatin >Priority: Minor > Fix For: 1.4.0 > > Attachments: h2.PNG, hi.PNG > > Time Spent: 4h 10m > Remaining Estimate: 0h > > # The MergeIndex test case in sdv fails if executed with a different number of executors or in standalone Spark. > # Change test cases that use Hive UDAFs like histogram_numeric, which behave unexpectedly; the recommended way is to write such test cases using standard aggregations. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CARBONDATA-2405) Implement columnar filling during query result preparation in DictionaryBasedResultCollector
Manish Gupta created CARBONDATA-2405: Summary: Implement columnar filling during query result preparation in DictionaryBasedResultCollector Key: CARBONDATA-2405 URL: https://issues.apache.org/jira/browse/CARBONDATA-2405 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta When the number of columns in a query is greater than 100, DictionaryBasedResultCollector is selected for result preparation. DictionaryBasedResultCollector fills the result row wise, which reduces query performance. As was done for compaction, we need to implement columnar filling of results in DictionaryBasedResultCollector so as to improve query performance when the number of columns in a query is greater than 100. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
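An illustrative contrast of the two filling orders using plain arrays (hypothetical structures, not CarbonData classes): columnar filling streams through one column page at a time, which is far more cache-friendly than touching every column page once per row when a query projects 100+ columns.

{code:scala}
// Hypothetical column pages: pages(col)(row) holds the decoded value.
val numRows = 4
val numCols = 3
val pages: Array[Array[AnyRef]] =
  Array.fill(numCols)(Array.fill[AnyRef](numRows)("v"))

// Row-wise filling: for each row, visit every column page (poor locality).
val rowWise = Array.tabulate(numRows, numCols)((row, col) => pages(col)(row))

// Columnar filling: exhaust one column page before moving to the next.
val columnar = Array.ofDim[AnyRef](numRows, numCols)
for (col <- 0 until numCols; row <- 0 until numRows) {
  columnar(row)(col) = pages(col)(row)
}
{code}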
[jira] [Created] (CARBONDATA-2391) Thread leak in compaction operation if prefetch is enabled and compaction process is killed
Manish Gupta created CARBONDATA-2391: Summary: Thread leak in compaction operation if prefetch is enabled and compaction process is killed Key: CARBONDATA-2391 URL: https://issues.apache.org/jira/browse/CARBONDATA-2391 Project: CarbonData Issue Type: Bug Reporter: Manish Gupta Assignee: Manish Gupta Problem: Thread leak in the compaction operation if prefetch is enabled and the compaction process is killed. Analysis: During compaction, if prefetch is enabled, RawResultIterator launches an executor service for prefetching the data. If compaction fails or the process is killed, this can lead to a thread leak because the executor service is still in the running state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
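A generic sketch of the leak pattern and the usual remedy, not the CarbonData code itself: a prefetch executor must be shut down on every exit path, or its threads outlive the operation that started them.

{code:scala}
import java.util.concurrent.{ExecutorService, Executors, TimeUnit}

// Single-threaded pool standing in for the prefetch executor.
val prefetchPool: ExecutorService = Executors.newSingleThreadExecutor()
try {
  prefetchPool.submit(new Runnable {
    override def run(): Unit = { /* prefetch the next batch of rows */ }
  })
  // ... iterate over compaction results ...
} finally {
  // Always stop the prefetch thread, whether iteration finished or failed;
  // otherwise the still-running executor service leaks its thread.
  prefetchPool.shutdownNow()
  prefetchPool.awaitTermination(10, TimeUnit.SECONDS)
}
{code}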