[jira] [Updated] (CARBONDATA-1419) Add adaptive encoding for Double data type
[ https://issues.apache.org/jira/browse/CARBONDATA-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li updated CARBONDATA-1419:
---------------------------------
    Description:
Add a new encoding for the Double data type:
1. AdaptiveFloatingCodec: it multiplies the column value by Math.pow(10, decimalCount) and casts the result from double to a target data type such as byte, short, or int

  was:
Add two new encodings for the Double data type:
1. AdaptiveFloatingCodec: it multiplies the column value by Math.pow(10, decimalCount) and casts the result from double to a target data type such as byte, short, or int
2. AdaptiveDeltaFloatingCodec: it first calculates the delta between the column value and the maximum value, multiplies it by Math.pow(10, decimalCount), and casts the result from double to a target data type such as byte, short, or int

> Add adaptive encoding for Double data type
> ------------------------------------------
>
> Key: CARBONDATA-1419
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1419
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Jacky Li
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Add a new encoding for the Double data type:
> 1. AdaptiveFloatingCodec: it multiplies the column value by Math.pow(10, decimalCount) and casts the result from double to a target data type such as byte, short, or int

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
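The scaling step described for AdaptiveFloatingCodec can be sketched as follows. This is a minimal Python illustration, not CarbonData's Java implementation: the function names are hypothetical, and rounding is used in place of a plain cast (an assumption, made to avoid binary floating-point artifacts such as 1.1 * 100 = 110.00000000000001).

```python
# Hypothetical sketch of the "scale by 10^decimalCount, then narrow" idea.
# Names and the choice of round() are assumptions, not CarbonData's API.

INT_TYPES = [("byte", 8), ("short", 16), ("int", 32)]


def pick_target_type(max_abs_scaled):
    """Return the narrowest signed integer type that holds every scaled value."""
    for name, bits in INT_TYPES:
        if max_abs_scaled <= 2 ** (bits - 1) - 1:
            return name
    return "long"


def adaptive_floating_encode(values, decimal_count):
    """Scale each double by 10^decimal_count and store it as an integer."""
    factor = 10 ** decimal_count
    scaled = [round(v * factor) for v in values]
    target = pick_target_type(max(abs(s) for s in scaled))
    return target, scaled


def adaptive_floating_decode(scaled, decimal_count):
    """Reverse the scaling to recover the original doubles."""
    factor = 10 ** decimal_count
    return [s / factor for s in scaled]
```

With two decimal places, the column [1.25, -0.5, 0.75] scales to [125, -50, 75], which fits in a single byte per value instead of eight.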
[jira] [Updated] (CARBONDATA-1419) Add adaptive encoding for Double data type
[ https://issues.apache.org/jira/browse/CARBONDATA-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li updated CARBONDATA-1419:
---------------------------------
    Description:
Add two new encodings for the Double data type:
1. AdaptiveFloatingCodec: it multiplies the column value by Math.pow(10, decimalCount) and casts the result from double to a target data type such as byte, short, or int
2. AdaptiveDeltaFloatingCodec: it first calculates the delta between the column value and the maximum value, multiplies it by Math.pow(10, decimalCount), and casts the result from double to a target data type such as byte, short, or int

> Add adaptive encoding for Double data type
> ------------------------------------------
>
> Key: CARBONDATA-1419
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1419
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Jacky Li
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Add two new encodings for the Double data type:
> 1. AdaptiveFloatingCodec: it multiplies the column value by Math.pow(10, decimalCount) and casts the result from double to a target data type such as byte, short, or int
> 2. AdaptiveDeltaFloatingCodec: it first calculates the delta between the column value and the maximum value, multiplies it by Math.pow(10, decimalCount), and casts the result from double to a target data type such as byte, short, or int
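The delta variant (AdaptiveDeltaFloatingCodec) can be illustrated with a short Python sketch. It is a hedged illustration only: the function names are hypothetical, the delta is taken as maximum minus value, and rounding replaces the plain cast (both assumptions, since the description does not spell them out).

```python
# Hypothetical sketch of delta-from-max encoding for doubles.
# Not CarbonData's Java code; names and rounding are assumptions.


def adaptive_delta_floating_encode(values, decimal_count):
    """Store each value as a scaled, non-negative distance from the column maximum."""
    factor = 10 ** decimal_count
    max_value = max(values)
    deltas = [round((max_value - v) * factor) for v in values]
    return max_value, deltas


def adaptive_delta_floating_decode(max_value, deltas, decimal_count):
    """Recover the doubles from the stored maximum and the integer deltas."""
    factor = 10 ** decimal_count
    return [max_value - d / factor for d in deltas]
```

The benefit shows when values are large but close together: [1000.5, 1000.25, 999.75] scaled directly by 100 needs an int (100050), whereas the deltas [0, 25, 75] fit in a byte.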
[jira] [Resolved] (CARBONDATA-1422) Major and Minor Compaction Failing
[ https://issues.apache.org/jira/browse/CARBONDATA-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1422.
----------------------------------
    Resolution: Fixed

> Major and Minor Compaction Failing
> ----------------------------------
>
> Key: CARBONDATA-1422
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1422
> Project: CarbonData
> Issue Type: Bug
> Affects Versions: 1.2.0
> Reporter: Pallavi Singh
> Assignee: Ravindra Pesala
> Fix For: 1.2.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Both major and minor compaction are failing.
> Compaction fails in the default scenario, where the table property dictionary_include is not specified.
> Please find the error logs below:
> 0: jdbc:hive2://localhost:1> show segments for table uniqdata;
> +-------------------+---------+-------------------------+-------------------------+
> | SegmentSequenceId | Status  | Load Start Time         | Load End Time           |
> +-------------------+---------+-------------------------+-------------------------+
> | 9                 | Success | 2017-08-29 11:17:29.927 | 2017-08-29 11:17:30.555 |
> | 8                 | Success | 2017-08-29 11:17:27.572 | 2017-08-29 11:17:28.363 |
> | 7                 | Success | 2017-08-29 11:17:23.583 | 2017-08-29 11:17:25.112 |
> | 6                 | Success | 2017-08-29 11:17:07.966 | 2017-08-29 11:17:09.322 |
> | 5                 | Success | 2017-08-29 10:38:15.727 | 2017-08-29 10:38:16.548 |
> | 4                 | Success | 2017-08-29 10:37:13.053 | 2017-08-29 10:37:13.888 |
> | 3                 | Success | 2017-08-29 10:36:57.851 | 2017-08-29 10:36:59.08  |
> | 2                 | Success | 2017-08-29 10:36:49.439 | 2017-08-29 10:36:50.373 |
> | 1                 | Success | 2017-08-29 10:36:37.365 | 2017-08-29 10:36:38.768 |
> | 0                 | Success | 2017-08-29 10:36:21.011 | 2017-08-29 10:36:26.1   |
> +-------------------+---------+-------------------------+-------------------------+
> 10 rows selected (0.081 seconds)
> 0: jdbc:hive2://localhost:1> ALTER TABLE uniqdata COMPACT 'MINOR';
> Error: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction Failure in Merger Rdd. (state=,code=0)
> 0: jdbc:hive2://localhost:1> ALTER TABLE uniqdata COMPACT 'MAJOR';
> Error: java.lang.RuntimeException: Compaction failed. Please check logs for more info. Exception in compaction java.lang.Exception: Compaction Failure in Merger Rdd. (state=,code=0)
> 0: jdbc:hive2://localhost:1>
[jira] [Created] (CARBONDATA-1539) Change DataType from enum to class
Jacky Li created CARBONDATA-1539:
------------------------------------

Summary: Change DataType from enum to class
Key: CARBONDATA-1539
URL: https://issues.apache.org/jira/browse/CARBONDATA-1539
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li
Fix For: 1.3.0

DataType should be a Java class instead of an enum, so that a data type object can hold more information for decimal and complex types. This is required so that:
1. ColumnPage does not need to store extra information for decimal and complex types. It can process all data types in a unified way.
2. Carbon core can be decoupled from Spark, so that core does not depend on Spark.
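The motivation above can be illustrated with a small Python sketch. It is not the CarbonData class hierarchy: the class and function names are hypothetical, and it only shows why a class-based type can carry per-instance details (precision, scale, element type) that a fixed enum constant cannot.

```python
# Hypothetical sketch: data types as classes so instances can hold extra info.


class DataType:
    """Base type; simple types need only a name."""

    def __init__(self, name):
        self.name = name


class DecimalType(DataType):
    """Decimal carries its own precision and scale."""

    def __init__(self, precision, scale):
        super().__init__("DECIMAL")
        self.precision = precision
        self.scale = scale


class ArrayType(DataType):
    """A complex type that carries its element type."""

    def __init__(self, element_type):
        super().__init__("ARRAY")
        self.element_type = element_type


def describe(data_type):
    """Render any type uniformly, without side tables for decimal or complex info."""
    if isinstance(data_type, DecimalType):
        return "DECIMAL(%d,%d)" % (data_type.precision, data_type.scale)
    if isinstance(data_type, ArrayType):
        return "ARRAY<%s>" % describe(data_type.element_type)
    return data_type.name
```

A caller such as a page encoder can then ask the type object itself for precision and scale instead of storing them separately.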
[jira] [Resolved] (CARBONDATA-1533) Fixed decimal data load fail issue and restricted max characters per column
[ https://issues.apache.org/jira/browse/CARBONDATA-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1533.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Fixed decimal data load fail issue and restricted max characters per column
> ---------------------------------------------------------------------------
>
> Key: CARBONDATA-1533
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1533
> Project: CarbonData
> Issue Type: Bug
> Reporter: Manish Gupta
> Assignee: Manish Gupta
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> 1. Data load fails for the decimal data type when both the precision and the data fall in the integer range, with the exception below:
> 17/09/28 16:36:28 ERROR CarbonFactDataHandlerColumnar: pool-20-thread-1 Error in producer
> java.lang.RuntimeException: internal error: org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveIntegralCodec[src type: DECIMAL, target type: INT, stats(org.apache.carbondata.core.datastore.page.statistics.PrimitivePageStatsCollector@2afaa54e)]
>     at org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveIntegralCodec$3.encode(AdaptiveIntegralCodec.java:142)
>     at org.apache.carbondata.core.datastore.page.SafeDecimalColumnPage.convertValue(SafeDecimalColumnPage.java:209)
>     at org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveIntegralCodec$1.encodeData(AdaptiveIntegralCodec.java:67)
>     at org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoder.encode(ColumnPageEncoder.java:57)
>     at org.apache.carbondata.processing.store.TablePage.encodeAndCompressMeasures(TablePage.java:284)
>     at org.apache.carbondata.processing.store.TablePage.encode(TablePage.java:269)
>     at org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(CarbonFactDataHandlerColumnar.java:350)
>     at org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(CarbonFactDataHandlerColumnar.java:62)
>     at org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call(CarbonFactDataHandlerColumnar.java:724)
>     at org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call(CarbonFactDataHandlerColumnar.java:701)
> 2. A negative array size exception is thrown during data load when max characters per column is greater than the Short max value.
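The issue does not say where the negative size comes from. One way such an error can arise, shown here purely as an assumption, is a length above the Short max value (32767) being narrowed to a signed 16-bit field. The Python sketch below mimics that narrowing; the helper name is hypothetical.

```python
import struct


def as_signed_short(n):
    """Reinterpret the low 16 bits of n as a signed 16-bit integer,
    which is what a narrowing cast to a Java short does."""
    return struct.unpack("<h", struct.pack("<H", n & 0xFFFF))[0]
```

A length of 32767 survives unchanged, but 32768 becomes -32768 and 40000 becomes -25536; using such a value as an array size would fail.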
[jira] [Resolved] (CARBONDATA-1352) Test case Execute while creating Carbondata jar.
[ https://issues.apache.org/jira/browse/CARBONDATA-1352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1352.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Test case Execute while creating Carbondata jar.
> ------------------------------------------------
>
> Key: CARBONDATA-1352
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1352
> Project: CarbonData
> Issue Type: Bug
> Components: other
> Environment: Spark 2.1
> Reporter: Vinod Rohilla
> Assignee: Srigopal Mohanty
> Priority: Minor
> Fix For: 1.3.0
>
> Attachments: TestCaseExecution.png, TestcaseExecution.png
>
>
> Steps to Reproduce:
> 1: Run the command: mvn -DskipTests -Pspark-2.1 -Dspark.version=2.1.0 clean package
> 2: Check the attached screenshots.
> Expected Result:
> 1: All the test cases should be skipped while creating a jar.
[jira] [Created] (CARBONDATA-1597) Remove spark1 integration
Jacky Li created CARBONDATA-1597:
------------------------------------

Summary: Remove spark1 integration
Key: CARBONDATA-1597
URL: https://issues.apache.org/jira/browse/CARBONDATA-1597
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li

As voted by the community, the spark1 integration with carbon can be removed.
[jira] [Created] (CARBONDATA-1594) Add scale and decimal to DecimalType
Jacky Li created CARBONDATA-1594:
------------------------------------

Summary: Add scale and decimal to DecimalType
Key: CARBONDATA-1594
URL: https://issues.apache.org/jira/browse/CARBONDATA-1594
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li
Fix For: 1.3.0

DecimalType should include scale and precision, and all scale and precision class members outside DecimalType should be removed.
[jira] [Created] (CARBONDATA-1817) Reject create datamap on streaming table
Jacky Li created CARBONDATA-1817:
------------------------------------

Summary: Reject create datamap on streaming table
Key: CARBONDATA-1817
URL: https://issues.apache.org/jira/browse/CARBONDATA-1817
Project: CarbonData
Issue Type: Sub-task
Reporter: Jacky Li
Assignee: Jacky Li
Fix For: 1.3.0

Since streaming segments do not yet support building indexes and pre-aggregates, a streaming table should not support the create datamap operation.
[jira] [Resolved] (CARBONDATA-1812) provide API to get table dynamic information(table size and last modified time)
[ https://issues.apache.org/jira/browse/CARBONDATA-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1812.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> provide API to get table dynamic information(table size and last modified time)
> -------------------------------------------------------------------------------
>
> Key: CARBONDATA-1812
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1812
> Project: CarbonData
> Issue Type: Improvement
> Reporter: QiangCai
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Provide an API to get dynamic table information (table size and last modified time).
[jira] [Created] (CARBONDATA-1811) Use TableInfo and StructType to create the table
Jacky Li created CARBONDATA-1811:
------------------------------------

Summary: Use TableInfo and StructType to create the table
Key: CARBONDATA-1811
URL: https://issues.apache.org/jira/browse/CARBONDATA-1811
Project: CarbonData
Issue Type: Bug
Reporter: Jacky Li
Fix For: 1.3.0

CarbonCreateTableCommand and CarbonAlterTableAddColumnCommand should use TableInfo and StructType to create the table.
It is required to implement CREATE TABLE AS SELECT syntax.
[jira] [Resolved] (CARBONDATA-1804) Make FileOperations Pluggable
[ https://issues.apache.org/jira/browse/CARBONDATA-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1804.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Make FileOperations Pluggable
> -----------------------------
>
> Key: CARBONDATA-1804
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1804
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Reporter: Manohar Vanam
> Assignee: Manohar Vanam
> Fix For: 1.3.0
>
> Time Spent: 4h
> Remaining Estimate: 0h
>
> 1. Refactor FileFactory based on FileType to support pluggable file handlers, so that custom file handlers can have their own specific logic.
> Example: users can provide their own implementations by extending the existing FileTypes.
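The pluggable-handler idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the refactored FileFactory: every class and method name below, and the `myfs://` scheme, is hypothetical.

```python
# Hypothetical sketch of a factory that lets callers register custom file handlers.


class LocalFileHandler:
    """Default handler; acts as the fallback for any path."""

    def can_handle(self, path):
        return True

    def describe(self, path):
        return "local:" + path


class MyFsFileHandler:
    """Example custom handler supplied by a user for an invented scheme."""

    def can_handle(self, path):
        return path.startswith("myfs://")

    def describe(self, path):
        return "custom:" + path


class FileFactory:
    def __init__(self):
        self._handlers = [LocalFileHandler()]

    def register(self, handler):
        # Custom handlers go first so they take precedence over the default.
        self._handlers.insert(0, handler)

    def get_handler(self, path):
        for handler in self._handlers:
            if handler.can_handle(path):
                return handler
        raise ValueError("no handler for " + path)
```

The design choice here is first-match-wins with user handlers checked before built-in ones, so a custom handler can override default behaviour for the paths it claims.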
[jira] [Resolved] (CARBONDATA-1833) Should fix BindException in TestStreamingTableOperation
[ https://issues.apache.org/jira/browse/CARBONDATA-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1833.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Should fix BindException in TestStreamingTableOperation
> -------------------------------------------------------
>
> Key: CARBONDATA-1833
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1833
> Project: CarbonData
> Issue Type: Bug
> Reporter: QiangCai
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Test case TestStreamingTableOperation throws BindException: Address already in use
[jira] [Created] (CARBONDATA-1832) Table cache should be cleared when dropping table
Jacky Li created CARBONDATA-1832:
------------------------------------

Summary: Table cache should be cleared when dropping table
Key: CARBONDATA-1832
URL: https://issues.apache.org/jira/browse/CARBONDATA-1832
Project: CarbonData
Issue Type: Bug
Reporter: Jacky Li
Assignee: Jacky Li
Fix For: 1.3.0
[jira] [Resolved] (CARBONDATA-1592) Add Event Listener interface to Carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1592.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Add Event Listener interface to Carbondata
> ------------------------------------------
>
> Key: CARBONDATA-1592
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1592
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Manish Gupta
> Assignee: Manish Gupta
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 10.5h
> Remaining Estimate: 0h
>
> Add an Event Listener interface to Carbondata. This will allow the current functionality of various commands to be extended to perform other operations.
> Example: after the load process completes, if any aggregate tables have been created on that table, the data load operation needs to be done for the aggregate tables as well. In this case we can create a listener such as AggregateLoadListener and register it with the event bus. This listener can then be called once the load operation is completed, and it will take care of loading the aggregate table.
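The listener example in the description can be sketched as below. This is a hedged Python illustration, not the CarbonData event API: apart from the name AggregateLoadListener, which the issue itself suggests, all class and method names are hypothetical, and "loading" an aggregate table is reduced to recording its name.

```python
from collections import defaultdict


class LoadCompletedEvent:
    """Hypothetical event fired after a table load finishes."""

    def __init__(self, table_name):
        self.table_name = table_name


class EventBus:
    """Maps event types to the listeners registered for them."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def register(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def fire(self, event):
        for listener in self._listeners[type(event)]:
            listener.on_event(event)


class AggregateLoadListener:
    """On load completion, triggers a load for each aggregate table of the parent."""

    def __init__(self, aggregate_tables):
        self.aggregate_tables = aggregate_tables  # parent table -> aggregate tables
        self.loaded = []

    def on_event(self, event):
        for aggregate in self.aggregate_tables.get(event.table_name, []):
            self.loaded.append(aggregate)  # stand-in for the real load
```

The load command only fires the event; it needs no knowledge of aggregate tables, which is the extension point the issue asks for.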
[jira] [Resolved] (CARBONDATA-1756) Improve Boolean data compress rate by changing RLE to SNAPPY algorithm
[ https://issues.apache.org/jira/browse/CARBONDATA-1756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1756.
----------------------------------
    Resolution: Fixed

> Improve Boolean data compress rate by changing RLE to SNAPPY algorithm
> ----------------------------------------------------------------------
>
> Key: CARBONDATA-1756
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1756
> Project: CarbonData
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.2.0
> Reporter: xubo245
> Assignee: xubo245
> Fix For: 1.3.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Improve the compression rate of Boolean data by switching from the RLE algorithm to SNAPPY, because RLE compresses Boolean data less well than SNAPPY in most scenarios.
[jira] [Resolved] (CARBONDATA-1820) Should extract CarbonTable.buildUniqueName method and re-factory code to invoke this method
[ https://issues.apache.org/jira/browse/CARBONDATA-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1820.
----------------------------------
    Resolution: Fixed

> Should extract CarbonTable.buildUniqueName method and re-factory code to invoke this method
> --------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-1820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1820
> Project: CarbonData
> Issue Type: Improvement
> Reporter: QiangCai
> Assignee: QiangCai
> Priority: Trivial
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Extract a CarbonTable.buildUniqueName method and refactor the code to invoke it.
[jira] [Updated] (CARBONDATA-1811) Use TableInfo and StructType to create the table
[ https://issues.apache.org/jira/browse/CARBONDATA-1811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li updated CARBONDATA-1811:
---------------------------------
    Issue Type: Improvement  (was: Bug)

> Use TableInfo and StructType to create the table
> ------------------------------------------------
>
> Key: CARBONDATA-1811
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1811
> Project: CarbonData
> Issue Type: Improvement
> Reporter: Jacky Li
> Fix For: 1.3.0
>
>
> CarbonCreateTableCommand and CarbonAlterTableAddColumnCommand should use TableInfo and StructType to create the table.
> It is required to implement CREATE TABLE AS SELECT syntax.
[jira] [Created] (CARBONDATA-1815) Add AtomicRunnableCommand abstraction
Jacky Li created CARBONDATA-1815:
------------------------------------

Summary: Add AtomicRunnableCommand abstraction
Key: CARBONDATA-1815
URL: https://issues.apache.org/jira/browse/CARBONDATA-1815
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
Fix For: 1.3.0

Some CarbonData commands need to process both metadata and data, and must do so atomically. These commands need to support undo if any step fails.
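A minimal Python sketch of such an abstraction follows. It is an assumption about the shape of the idea, not the actual CarbonData trait: the method names, the in-memory catalog, and the example command are all hypothetical.

```python
# Hypothetical sketch: run a metadata step and a data step, undoing metadata on failure.


class AtomicRunnableCommand:
    def process_metadata(self):
        raise NotImplementedError

    def process_data(self):
        raise NotImplementedError

    def undo_metadata(self, error):
        """Revert metadata changes; the default does nothing."""

    def run(self):
        self.process_metadata()
        try:
            self.process_data()
        except Exception as error:
            # The data step failed after metadata was changed: roll back, then re-raise.
            self.undo_metadata(error)
            raise


class CreateAndLoadTable(AtomicRunnableCommand):
    """Example command: register a table in a catalog, then load rows into it."""

    def __init__(self, catalog, table_name, rows):
        self.catalog = catalog
        self.table_name = table_name
        self.rows = rows

    def process_metadata(self):
        self.catalog[self.table_name] = []

    def process_data(self):
        if any(row is None for row in self.rows):
            raise ValueError("bad record")
        self.catalog[self.table_name].extend(self.rows)

    def undo_metadata(self, error):
        self.catalog.pop(self.table_name, None)
```

If the load fails, the catalog entry created by the metadata step is removed, so callers never observe a table without its data.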
[jira] [Resolved] (CARBONDATA-1586) Support handoff from row format to columnar format
[ https://issues.apache.org/jira/browse/CARBONDATA-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1586.
----------------------------------
    Resolution: Fixed

> Support handoff from row format to columnar format
> --------------------------------------------------
>
> Key: CARBONDATA-1586
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1586
> Project: CarbonData
> Issue Type: Sub-task
> Affects Versions: 1.2.0
> Reporter: Jacky Li
> Assignee: QiangCai
> Fix For: 1.3.0
>
> Time Spent: 4h
> Remaining Estimate: 0h
>
> When the number of files in the streaming segment exceeds the configured value, the system should support converting row files into columnar files, to avoid accumulating too many row files and to keep improving query performance on the streaming table.
[jira] [Resolved] (CARBONDATA-1810) Bad record path is not correct for UT
[ https://issues.apache.org/jira/browse/CARBONDATA-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1810.
----------------------------------
    Resolution: Fixed

> Bad record path is not correct for UT
> -------------------------------------
>
> Key: CARBONDATA-1810
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1810
> Project: CarbonData
> Issue Type: Bug
> Components: test
> Reporter: xubo245
> Assignee: xubo245
> Fix For: 1.3.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Bad record path is not correct for UT
[jira] [Resolved] (CARBONDATA-1818) Should make carbon.streaming.segment.max.size as configurable
[ https://issues.apache.org/jira/browse/CARBONDATA-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1818.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Should make carbon.streaming.segment.max.size as configurable
> -------------------------------------------------------------
>
> Key: CARBONDATA-1818
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1818
> Project: CarbonData
> Issue Type: Improvement
> Reporter: QiangCai
> Assignee: QiangCai
> Priority: Trivial
> Fix For: 1.3.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> carbon.streaming.segment.max.size should be made configurable.
[jira] [Reopened] (CARBONDATA-1610) ALTER TABLE set streaming property
[ https://issues.apache.org/jira/browse/CARBONDATA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li reopened CARBONDATA-1610:
----------------------------------

> ALTER TABLE set streaming property
> ----------------------------------
>
> Key: CARBONDATA-1610
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1610
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Jacky Li
> Assignee: Jacky Li
> Fix For: 1.3.0
>
>
> For an existing table, the user should be able to use
> ALTER TABLE source SET TBLPROPERTIES('streaming'='true')
> to set the table property so that the table can accept streaming ingestion.
[jira] [Resolved] (CARBONDATA-1610) ALTER TABLE set streaming property
[ https://issues.apache.org/jira/browse/CARBONDATA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1610.
----------------------------------
    Resolution: Fixed
    Assignee: QiangCai  (was: Jacky Li)

> ALTER TABLE set streaming property
> ----------------------------------
>
> Key: CARBONDATA-1610
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1610
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Jacky Li
> Assignee: QiangCai
> Fix For: 1.3.0
>
>
> For an existing table, the user should be able to use
> ALTER TABLE source SET TBLPROPERTIES('streaming'='true')
> to set the table property so that the table can accept streaming ingestion.
[jira] [Resolved] (CARBONDATA-1778) Support clean garbage segments for all
[ https://issues.apache.org/jira/browse/CARBONDATA-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1778.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Support clean garbage segments for all
> --------------------------------------
>
> Key: CARBONDATA-1778
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1778
> Project: CarbonData
> Issue Type: Improvement
> Reporter: chenerlu
> Assignee: chenerlu
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 4h 10m
> Remaining Estimate: 0h
>
[jira] [Created] (CARBONDATA-1843) Add configuration to enable features to improve usability
Jacky Li created CARBONDATA-1843:
------------------------------------

Summary: Add configuration to enable features to improve usability
Key: CARBONDATA-1843
URL: https://issues.apache.org/jira/browse/CARBONDATA-1843
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
Priority: Minor
Fix For: 1.3.0

1. Add configuration for supporting dictionary and complex type
2. Block 'external' and 'CTAS' syntax
3. Some other minor fixes to catch exceptions
[jira] [Resolved] (CARBONDATA-1895) Fix issue of create table if not exits
[ https://issues.apache.org/jira/browse/CARBONDATA-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1895.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Fix issue of create table if not exits
> ---------------------------------------
>
> Key: CARBONDATA-1895
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1895
> Project: CarbonData
> Issue Type: Bug
> Reporter: chenerlu
> Assignee: chenerlu
> Fix For: 1.3.0
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
[jira] [Resolved] (CARBONDATA-1856) Support insert/load data for partition table.
[ https://issues.apache.org/jira/browse/CARBONDATA-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1856.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Support insert/load data for partition table.
> ----------------------------------------------
>
> Key: CARBONDATA-1856
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1856
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Ravindra Pesala
> Fix For: 1.3.0
>
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> For a partition table, change carbonrelation to HadoopFSRelation inside the optimizer for the insert statement, and also update to HadoopFSRelation for the Load command.
> Implement Spark's FileFormat interface for carbon and use carbonoutputformat inside it.
> Create a partition.map file inside each segment to map between partitions and index files.
[jira] [Resolved] (CARBONDATA-1805) Optimize pruning for dictionary loading
[ https://issues.apache.org/jira/browse/CARBONDATA-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1805.
----------------------------------
    Resolution: Fixed

> Optimize pruning for dictionary loading
> ---------------------------------------
>
> Key: CARBONDATA-1805
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1805
> Project: CarbonData
> Issue Type: Improvement
> Components: data-load, spark-integration
> Reporter: xuchuanyin
> Assignee: xuchuanyin
> Fix For: 1.3.0
>
> Time Spent: 11h 10m
> Remaining Estimate: 0h
>
> # SCENARIO
> Recently I tried the dictionary feature in Carbondata and found that its dictionary generating phase in data loading is quite slow. My scenario is as below:
> + Input Data: 35.8GB CSV file with 199 columns and 126 million lines
> + Dictionary columns: 3 columns containing 19213, 4, and 9 distinct values respectively
> The whole data loading consumes about 2.9min for dictionary generating and 4.6min for fact data loading -- about 39% of the time is spent on the dictionary.
> Having observed the nmon result, I found that CPU usage was quite high during the dictionary generating phase, while disk and network usage were quite normal.
> # ANALYZE
> After going through the dictionary generating code, I found that Carbondata already prunes non-dictionary columns before generating the dictionary. But the problem is that `the pruning comes after data file reading`, which causes some overhead. We can optimize it by `prune while reading data file`.
> # RESOLVE
> Refactor the `loadDataFrame` method in `GlobalDictionaryUtil` so that the non-dictionary columns are pruned while reading the data file.
> After implementing the above optimization, dictionary generating costs only `29s` -- `about 6 times better than before` (2.9min) -- and fact data loading costs the same as before (4.6min), so about 10% of the time is spent on the dictionary.
> # NOTE
> + Currently only `load data file` will benefit from this optimization, while `load data frame` will not.
> + Before implementing this solution, I tried another solution -- caching the dataframe of the data file -- and the performance was even worse: the dictionary generating time was 5.6min.
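The "prune while reading" idea can be sketched in plain Python. This is only an illustration of the principle, not the refactored `loadDataFrame` in `GlobalDictionaryUtil`: the function names are hypothetical, and a real Spark reader would avoid materializing the unused columns at all, which the `csv` module here does not.

```python
import csv


def read_pruned(lines, keep_columns):
    """Yield one dict per row that contains only the requested columns,
    so later stages never see the non-dictionary columns."""
    reader = csv.reader(lines)
    header = next(reader)
    positions = [(name, header.index(name)) for name in keep_columns]
    for row in reader:
        yield {name: row[pos] for name, pos in positions}


def build_dictionaries(lines, dictionary_columns):
    """Collect the distinct values of each dictionary column from pruned rows."""
    distinct = {name: set() for name in dictionary_columns}
    for pruned_row in read_pruned(lines, dictionary_columns):
        for name, value in pruned_row.items():
            distinct[name].add(value)
    return {name: sorted(values) for name, values in distinct.items()}
```

With 3 dictionary columns out of 199, the dictionary stage then carries roughly 1.5% of the fields per row. The reported figures are consistent with the text: 2.9 / (2.9 + 4.6) is about 39%, 174 s / 29 s is 6, and 29 s / (29 s + 276 s) is about 9.5%.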
[jira] [Resolved] (CARBONDATA-1884) Add CTAS support to carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1884.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Add CTAS support to carbondata
> ------------------------------
>
> Key: CARBONDATA-1884
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1884
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Manish Gupta
> Assignee: Manish Gupta
> Fix For: 1.3.0
>
> Attachments: Create_Table_As_Select_Design.docx
>
> Time Spent: 6.5h
> Remaining Estimate: 0h
>
> Implement the create table as select (CTAS) feature in carbondata
[jira] [Created] (CARBONDATA-1897) Remove column group information in DESC TABLE command
Jacky Li created CARBONDATA-1897:
------------------------------------

Summary: Remove column group information in DESC TABLE command
Key: CARBONDATA-1897
URL: https://issues.apache.org/jira/browse/CARBONDATA-1897
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li
Fix For: 1.3.0

Column group information is not valid; remove it from the DESC TABLE output.
[jira] [Resolved] (CARBONDATA-1906) Update registerTempTable method because it was marked deprecated
[ https://issues.apache.org/jira/browse/CARBONDATA-1906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1906.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Update registerTempTable method because it was marked deprecated
> ----------------------------------------------------------------
>
> Key: CARBONDATA-1906
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1906
> Project: CarbonData
> Issue Type: Improvement
> Reporter: xubo245
> Assignee: xubo245
> Fix For: 1.3.0
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> Update registerTempTable method because it was marked deprecated
[jira] [Resolved] (CARBONDATA-1880) Global Sort maybe generates many small files
[ https://issues.apache.org/jira/browse/CARBONDATA-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1880.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Global Sort maybe generates many small files
> --------------------------------------------
>
> Key: CARBONDATA-1880
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1880
> Project: CarbonData
> Issue Type: Improvement
> Reporter: QiangCai
> Assignee: QiangCai
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 5h 40m
> Remaining Estimate: 0h
>
> Without the option "carbon.load.global.sort.partitions", Global Sort may generate many small files, which makes select queries slower.
[jira] [Resolved] (CARBONDATA-1855) Add outputformat in carbon.
[ https://issues.apache.org/jira/browse/CARBONDATA-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1855.
----------------------------------
    Resolution: Fixed
    Fix Version/s: 1.3.0

> Add outputformat in carbon.
> ---------------------------
>
> Key: CARBONDATA-1855
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1855
> Project: CarbonData
> Issue Type: Sub-task
> Reporter: Ravindra Pesala
> Fix For: 1.3.0
>
> Time Spent: 12h 40m
> Remaining Estimate: 0h
>
> Support the standard Hadoop outputformat interface for carbon. It will be helpful for integration with execution engines such as Spark, Hive, and Presto.
> It should also maintain segment management while writing the data, to support the incremental loading feature.
[jira] [Resolved] (CARBONDATA-1752) There are some scalastyle error should be optimized in CarbonData
[ https://issues.apache.org/jira/browse/CARBONDATA-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacky Li resolved CARBONDATA-1752.
----------------------------------
    Resolution: Fixed

> There are some scalastyle error should be optimized in CarbonData
> -----------------------------------------------------------------
>
> Key: CARBONDATA-1752
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1752
> Project: CarbonData
> Issue Type: Bug
> Components: file-format
> Affects Versions: 1.2.0
> Reporter: xubo245
> Assignee: xubo245
> Priority: Minor
> Fix For: 1.3.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> There are some scalastyle errors in CarbonData that should be fixed, including removing unused imports, tidying method definitions, and so on.
[jira] [Created] (CARBONDATA-1739) Clean up store path interface
Jacky Li created CARBONDATA-1739:
------------------------------------

Summary: Clean up store path interface
Key: CARBONDATA-1739
URL: https://issues.apache.org/jira/browse/CARBONDATA-1739
Project: CarbonData
Issue Type: Improvement
Reporter: Jacky Li
Fix For: 1.3.0

There are many getStorePath APIs; they should be unified in one place.
[jira] [Created] (CARBONDATA-1746) Count Star optimization
Jacky Li created CARBONDATA-1746: Summary: Count Star optimization Key: CARBONDATA-1746 URL: https://issues.apache.org/jira/browse/CARBONDATA-1746 Project: CarbonData Issue Type: New Feature Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 Since carbon records the number of rows in metadata, a count star query can leverage it to improve performance. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
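The idea in CARBONDATA-1746 can be sketched roughly as follows. This is only an illustration under stated assumptions: `BlockMeta` and `countStar` are hypothetical names invented here, not CarbonData classes, and the issue does not say where in the metadata the row counts actually live.

```java
import java.util.Arrays;
import java.util.List;

public class CountStarSketch {

    /** Hypothetical metadata entry for one data block; only the row count is used. */
    static final class BlockMeta {
        final String path;
        final long rowCount;

        BlockMeta(String path, long rowCount) {
            this.path = path;
            this.rowCount = rowCount;
        }
    }

    /** Answers COUNT(*) by summing recorded row counts; no data file is read. */
    static long countStar(List<BlockMeta> blocks) {
        long total = 0L;
        for (BlockMeta block : blocks) {
            total += block.rowCount;
        }
        return total;
    }

    public static void main(String[] args) {
        List<BlockMeta> blocks = Arrays.asList(
            new BlockMeta("segment_0/part-0", 100L),
            new BlockMeta("segment_0/part-1", 250L),
            new BlockMeta("segment_1/part-0", 0L));
        System.out.println(countStar(blocks)); // prints 350
    }
}
```

The cost of such a query depends on the number of metadata entries rather than on the number of rows, which is where the performance gain would come from.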
[jira] [Created] (CARBONDATA-1741) Remove AKSK in Log
Jacky Li created CARBONDATA-1741: Summary: Remove AKSK in Log Key: CARBONDATA-1741 URL: https://issues.apache.org/jira/browse/CARBONDATA-1741 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Fix For: 1.3.0 In order to provide better security, AKSK credential information should be removed from the log. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
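One possible way to keep AKSK values out of log output is to mask them before the message is written. The sketch below is an assumption, not the change made for CARBONDATA-1741: the class name `LogRedactionSketch`, the `redact` helper, and the pattern (any property name ending in `access.key` or `secret.key` followed by `=`) are all hypothetical.

```java
import java.util.regex.Pattern;

public class LogRedactionSketch {

    // Assumed pattern: "<...>access.key=<value>" or "<...>secret.key=<value>".
    // Group 1 keeps the key name and '='; the value itself is dropped.
    private static final Pattern AKSK =
        Pattern.compile("(?i)((?:access|secret)\\.key\\s*=\\s*)\\S+");

    /** Returns the message with access/secret key values replaced by asterisks. */
    static String redact(String message) {
        return AKSK.matcher(message).replaceAll("$1******");
    }

    public static void main(String[] args) {
        String raw = "fs.s3a.access.key=AKIA123 fs.s3a.secret.key=abc/def path=/tmp";
        // prints: fs.s3a.access.key=****** fs.s3a.secret.key=****** path=/tmp
        System.out.println(redact(raw));
    }
}
```

Masking at the point of logging keeps the rest of the message useful for debugging; the alternative is simply not to log configuration values that may carry credentials.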
[jira] [Created] (CARBONDATA-1745) Remove local metastore path
Jacky Li created CARBONDATA-1745: Summary: Remove local metastore path Key: CARBONDATA-1745 URL: https://issues.apache.org/jira/browse/CARBONDATA-1745 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 If the user does not specify a metastore path, use the default metastore path from Hive. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1764) Fix issue of when create table with short data type
[ https://issues.apache.org/jira/browse/CARBONDATA-1764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1764. -- Resolution: Fixed > Fix issue of when create table with short data type > --- > > Key: CARBONDATA-1764 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1764 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.2.0 >Reporter: xubo245 >Assignee: xubo245 > Fix For: 1.3.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Fix issue of when create table with short data type -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1651) Unsupported Spark2 BooleanType
[ https://issues.apache.org/jira/browse/CARBONDATA-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1651. -- Resolution: Fixed Fix Version/s: 1.3.0 > Unsupported Spark2 BooleanType > -- > > Key: CARBONDATA-1651 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1651 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.3.0 >Reporter: Roman Timrov >Assignee: anubhav tarar > Fix For: 1.3.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > Unable to save Dataset if it contains field with BooleanType > class CarbonDataFrameWriter > method convertToCarbonType doesn't support it -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1750) SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty
[ https://issues.apache.org/jira/browse/CARBONDATA-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1750. -- Resolution: Fixed Assignee: QiangCai Fix Version/s: 1.3.0 > SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty > -- > > Key: CARBONDATA-1750 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1750 > Project: CarbonData > Issue Type: Bug >Reporter: QiangCai >Assignee: QiangCai >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 40m > Remaining Estimate: 0h > > SegmentStatusManager.readLoadMetadata showing NPE if tablestatus file is empty -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1614) SHOW SEGMENT should include the streaming property
[ https://issues.apache.org/jira/browse/CARBONDATA-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1614. -- Resolution: Implemented > SHOW SEGMENT should include the streaming property > -- > > Key: CARBONDATA-1614 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1614 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1576) Support create and drop datamap by SQL
[ https://issues.apache.org/jira/browse/CARBONDATA-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1576. -- Resolution: Fixed Assignee: Ravindra Pesala > Support create and drop datamap by SQL > -- > > Key: CARBONDATA-1576 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1576 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Ravindra Pesala > Fix For: 1.3.0 > > Time Spent: 17h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Reopened] (CARBONDATA-1614) SHOW SEGMENT should include the streaming property
[ https://issues.apache.org/jira/browse/CARBONDATA-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li reopened CARBONDATA-1614: -- Assignee: Jacky Li > SHOW SEGMENT should include the streaming property > -- > > Key: CARBONDATA-1614 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1614 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Jacky Li > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1707) Log the taken time of each stream batch and fix StreamExample issue
[ https://issues.apache.org/jira/browse/CARBONDATA-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1707. -- Resolution: Fixed > Log the taken time of each stream batch and fix StreamExample issue > --- > > Key: CARBONDATA-1707 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1707 > Project: CarbonData > Issue Type: Improvement >Reporter: QiangCai >Priority: Trivial > Time Spent: 50m > Remaining Estimate: 0h > > Log the taken time of each stream batch and fix StreamExample issue -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (CARBONDATA-1615) DELETE SEGMENT BY DATE should ignore the streaming segment
[ https://issues.apache.org/jira/browse/CARBONDATA-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li reassigned CARBONDATA-1615: Assignee: Jacky Li > DELETE SEGMENT BY DATE should ignore the streaming segment > -- > > Key: CARBONDATA-1615 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1615 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Jacky Li > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (CARBONDATA-1612) Block DELETE SEGMENT BY ID for streaming table
[ https://issues.apache.org/jira/browse/CARBONDATA-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li reassigned CARBONDATA-1612: Assignee: Jacky Li > Block DELETE SEGMENT BY ID for streaming table > -- > > Key: CARBONDATA-1612 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1612 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Jacky Li > Fix For: 1.3.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Streaming segment should be managed by carbon internally and it should not be > deleted by user -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1579) Support show/describe datamap information by SQL
[ https://issues.apache.org/jira/browse/CARBONDATA-1579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1579. -- Resolution: Fixed Assignee: Ravindra Pesala > Support show/describe datamap information by SQL > > > Key: CARBONDATA-1579 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1579 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Ravindra Pesala > Fix For: 1.3.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1765) Remove repeat code of Boolean
[ https://issues.apache.org/jira/browse/CARBONDATA-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1765. -- Resolution: Fixed > Remove repeat code of Boolean > - > > Key: CARBONDATA-1765 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1765 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 1.2.0 >Reporter: xubo245 >Assignee: xubo245 >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1768) Upgrade univocity parser to 2.2.1
Jacky Li created CARBONDATA-1768: Summary: Upgrade univocity parser to 2.2.1 Key: CARBONDATA-1768 URL: https://issues.apache.org/jira/browse/CARBONDATA-1768 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Assignee: Jacky Li The Univocity CSV parser has improved performance in 2.2.1; upgrade the dependency to use it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1767) Remove dependency of Java 1.8
[ https://issues.apache.org/jira/browse/CARBONDATA-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1767: - Summary: Remove dependency of Java 1.8 (was: Make carbon compatible with Java 1.7) > Remove dependency of Java 1.8 > - > > Key: CARBONDATA-1767 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1767 > Project: CarbonData > Issue Type: Bug >Reporter: Jacky Li >Assignee: Jacky Li > Fix For: 1.3.0 > > Time Spent: 10m > Remaining Estimate: 0h > > carbon should be able to compile with Java 1.7 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1767) Make carbon compatible with Java 1.7
Jacky Li created CARBONDATA-1767: Summary: Make carbon compatible with Java 1.7 Key: CARBONDATA-1767 URL: https://issues.apache.org/jira/browse/CARBONDATA-1767 Project: CarbonData Issue Type: Bug Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 carbon should be able to compile with Java 1.7 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1326) Fixed high priority findbug issues
[ https://issues.apache.org/jira/browse/CARBONDATA-1326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1326. -- Resolution: Fixed Fix Version/s: 1.3.0 > Fixed high priority findbug issues > -- > > Key: CARBONDATA-1326 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1326 > Project: CarbonData > Issue Type: Bug >Reporter: Manish Gupta >Assignee: Manish Gupta >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 23h 50m > Remaining Estimate: 0h > > Currently there are a lot of FindBugs issues in the carbondata code. These need > to be prioritized and fixed. This JIRA addresses all high-priority FindBugs > issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1732) Add S3 support in FileFactory
[ https://issues.apache.org/jira/browse/CARBONDATA-1732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1732: - Fix Version/s: 1.3.0 > Add S3 support in FileFactory > - > > Key: CARBONDATA-1732 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1732 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Assignee: Jacky Li > Fix For: 1.3.0 > > > Add S3 file prefix support to FileFactory -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1732) Add S3 support in FileFactory
Jacky Li created CARBONDATA-1732: Summary: Add S3 support in FileFactory Key: CARBONDATA-1732 URL: https://issues.apache.org/jira/browse/CARBONDATA-1732 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Assignee: Jacky Li Add S3 file prefix support to FileFactory -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1668) Remove isTableSplitPartition while loading
Jacky Li created CARBONDATA-1668: Summary: Remove isTableSplitPartition while loading Key: CARBONDATA-1668 URL: https://issues.apache.org/jira/browse/CARBONDATA-1668 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 This option is always false, related code can be removed -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1666) Clean up redundant code
Jacky Li created CARBONDATA-1666: Summary: Clean up redundant code Key: CARBONDATA-1666 URL: https://issues.apache.org/jira/browse/CARBONDATA-1666 Project: CarbonData Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Jacky Li Fix For: 1.3.0 Some features have been removed from the carbon project; the redundant code left behind should be removed for better readability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1667) Remove DirectLoad feature
Jacky Li created CARBONDATA-1667: Summary: Remove DirectLoad feature Key: CARBONDATA-1667 URL: https://issues.apache.org/jira/browse/CARBONDATA-1667 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li Currently carbon is always using DirectLoad from CSV. So this option can be removed in the code -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (CARBONDATA-1666) Clean up redundant code
[ https://issues.apache.org/jira/browse/CARBONDATA-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li reassigned CARBONDATA-1666: Assignee: Jacky Li > Clean up redundant code > --- > > Key: CARBONDATA-1666 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1666 > Project: CarbonData > Issue Type: Improvement >Affects Versions: 1.2.0 >Reporter: Jacky Li >Assignee: Jacky Li > Fix For: 1.3.0 > > > Some features have been removed from the carbon project; the redundant code > left behind should be removed for better readability. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1669) Clean up code in CarbonDataRDDFactory
Jacky Li created CARBONDATA-1669: Summary: Clean up code in CarbonDataRDDFactory Key: CARBONDATA-1669 URL: https://issues.apache.org/jira/browse/CARBONDATA-1669 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 Inside CarbonDataRDDFactory.loadCarbonData, many functions are defined inside other functions, which makes the loading logic very hard to read. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1572) Support Streaming Ingest
[ https://issues.apache.org/jira/browse/CARBONDATA-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1572: - Attachment: CarbonData Streaming Ingest_v1.4.pdf > Support Streaming Ingest > > > Key: CARBONDATA-1572 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1572 > Project: CarbonData > Issue Type: New Feature >Reporter: QiangCai >Assignee: QiangCai > Attachments: CarbonData Streaming Ingest_v1.1.pdf, CarbonData > Streaming Ingest_v1.4.pdf > > Time Spent: 4h > Remaining Estimate: 0h > > CarbonData should support streaming ingest. > [^CarbonData Streaming Ingest_v1.1.pdf] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1572) Support Streaming Ingest
[ https://issues.apache.org/jira/browse/CARBONDATA-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1572: - Attachment: (was: CarbonData Streaming Ingest_v1.3.pdf) > Support Streaming Ingest > > > Key: CARBONDATA-1572 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1572 > Project: CarbonData > Issue Type: New Feature >Reporter: QiangCai >Assignee: QiangCai > Attachments: CarbonData Streaming Ingest_v1.1.pdf, CarbonData > Streaming Ingest_v1.4.pdf > > Time Spent: 4h > Remaining Estimate: 0h > > CarbonData should support streaming ingest. > [^CarbonData Streaming Ingest_v1.1.pdf] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1572) Support Streaming Ingest
[ https://issues.apache.org/jira/browse/CARBONDATA-1572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1572: - Attachment: (was: CarbonData Streaming Ingest_v1.2.pdf) > Support Streaming Ingest > > > Key: CARBONDATA-1572 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1572 > Project: CarbonData > Issue Type: New Feature >Reporter: QiangCai >Assignee: QiangCai > Attachments: CarbonData Streaming Ingest_v1.1.pdf, CarbonData > Streaming Ingest_v1.4.pdf > > Time Spent: 4h > Remaining Estimate: 0h > > CarbonData should support streaming ingest. > [^CarbonData Streaming Ingest_v1.1.pdf] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1611) Block UPDATE/DELETE command for streaming table
[ https://issues.apache.org/jira/browse/CARBONDATA-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1611. -- Resolution: Fixed > Block UPDATE/DELETE command for streaming table > --- > > Key: CARBONDATA-1611 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1611 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Jacky Li > Fix For: 1.3.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > A streaming table uses a row file format, which is not updatable, so > UPDATE/DELETE commands should be rejected -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Closed] (CARBONDATA-1610) ALTER TABLE set streaming property
[ https://issues.apache.org/jira/browse/CARBONDATA-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li closed CARBONDATA-1610. Resolution: Invalid > ALTER TABLE set streaming property > -- > > Key: CARBONDATA-1610 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1610 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li >Assignee: Jacky Li > Fix For: 1.3.0 > > > For existing table, user should be able to use > ALTER TABLE source SET TBLPROPERTIES('streaming'='true') > to set the table property so that this table can be streaming ingested -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1624) If SORT_SCOPE is non-GLOBAL_SORT with Spark, set 'carbon.number.of.cores.while.loading' dynamically as per the available executor cores
[ https://issues.apache.org/jira/browse/CARBONDATA-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1624. -- Resolution: Fixed Fix Version/s: 1.3.0 > If SORT_SCOPE is non-GLOBAL_SORT with Spark, set > 'carbon.number.of.cores.while.loading' dynamically as per the available > executor cores > > > Key: CARBONDATA-1624 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1624 > Project: CarbonData > Issue Type: Improvement > Components: data-load, spark-integration >Affects Versions: 1.3.0 >Reporter: Zhichao Zhang >Assignee: Zhichao Zhang >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 12h 40m > Remaining Estimate: 0h > > If we are using carbondata + spark to load data, we can set > carbon.number.of.cores.while.loading to the number of executor cores. > For example, when the number of executor cores is set to 6, there are at > least 6 cores per node available for loading data, so we can set > carbon.number.of.cores.while.loading to 6 automatically. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
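The behaviour requested in CARBONDATA-1624 could look roughly like the sketch below. It uses plain maps in place of the real Spark and Carbon configuration objects, and `LoadingCoresSketch`, `resolveLoadingCores`, `apply`, and the fallback parameter are hypothetical names; only the two property keys `spark.executor.cores` and `carbon.number.of.cores.while.loading` are real.

```java
import java.util.HashMap;
import java.util.Map;

public class LoadingCoresSketch {

    static final String SPARK_EXECUTOR_CORES = "spark.executor.cores";
    static final String CARBON_LOADING_CORES = "carbon.number.of.cores.while.loading";

    /** Reads the executor core count, falling back when it is missing or invalid. */
    static int resolveLoadingCores(Map<String, String> sparkConf, int fallback) {
        String value = sparkConf.get(SPARK_EXECUTOR_CORES);
        if (value == null) {
            return fallback;
        }
        try {
            int cores = Integer.parseInt(value.trim());
            return cores > 0 ? cores : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Sets the carbon loading cores, but only when the sort scope is not GLOBAL_SORT. */
    static void apply(String sortScope, Map<String, String> sparkConf,
                      Map<String, String> carbonProps, int fallback) {
        if ("GLOBAL_SORT".equalsIgnoreCase(sortScope)) {
            return; // the issue only covers non-GLOBAL_SORT loads
        }
        carbonProps.put(CARBON_LOADING_CORES,
            String.valueOf(resolveLoadingCores(sparkConf, fallback)));
    }

    public static void main(String[] args) {
        Map<String, String> sparkConf = new HashMap<>();
        sparkConf.put(SPARK_EXECUTOR_CORES, "6");
        Map<String, String> carbonProps = new HashMap<>();
        apply("LOCAL_SORT", sparkConf, carbonProps, 2);
        System.out.println(carbonProps.get(CARBON_LOADING_CORES)); // prints 6
    }
}
```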
[jira] [Created] (CARBONDATA-1692) Refactor Segment abstraction
Jacky Li created CARBONDATA-1692: Summary: Refactor Segment abstraction Key: CARBONDATA-1692 URL: https://issues.apache.org/jira/browse/CARBONDATA-1692 Project: CarbonData Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Jacky Li Since the streaming segment has been added, there should be a common operation and interface abstraction for both segment types: streaming segments and batch segments. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1693) Change SegmentStatus from String the enum
Jacky Li created CARBONDATA-1693: Summary: Change SegmentStatus from String the enum Key: CARBONDATA-1693 URL: https://issues.apache.org/jira/browse/CARBONDATA-1693 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1596) ClassCastException is thrown by IntermediateFileMerger for decimal columns
[ https://issues.apache.org/jira/browse/CARBONDATA-1596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1596. -- Resolution: Fixed Fix Version/s: 1.3.0 > ClassCastException is thrown by IntermediateFileMerger for decimal columns > -- > > Key: CARBONDATA-1596 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1596 > Project: CarbonData > Issue Type: Bug >Reporter: Kunal Kapoor >Assignee: Kunal Kapoor >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > When intermediate file merger tries to merge the sort files it converts the > row data to their appropriate datatypes. > While converting decimal types ClassCastException was being thrown. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1617) Merging carbonindex files for each segment.
[ https://issues.apache.org/jira/browse/CARBONDATA-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1617. -- Resolution: Fixed Fix Version/s: 1.3.0 > Merging carbonindex files for each segment. > --- > > Key: CARBONDATA-1617 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1617 > Project: CarbonData > Issue Type: New Feature >Reporter: Ravindra Pesala >Priority: Major > Fix For: 1.3.0 > > Time Spent: 10h 20m > Remaining Estimate: 0h > > Hi, > Problem : > The first query on a carbon table is very slow, because the driver has to read > many small carbonindex files and cache them the first time. > Many carbonindex files are created in the following case: > Loading data in a large cluster > For example, if the cluster size is 100 nodes, then each load creates 100 index > files per segment. So after 100 loads, the number of carbonindex > files becomes 10,000. > Reading all these files from the driver is slow because it takes a lot of > namenode calls and IO operations. > Solution : > Merge the carbonindex files in two levels, so that we can reduce the IO calls > to the namenode and improve the read performance. > Merge within a segment. > Merge the carbonindex files into a single file within the segment immediately > after the load completes. It would be named as a .carbonindexmerge file. It is > not a true data merge but a simple file merge, so the current > structure of carbonindex files does not change. While reading, we read just > one file instead of many carbonindex files within the segment. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
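The "simple file merge" described in CARBONDATA-1617 can be illustrated with the sketch below. The container layout used here (entry count, then name, length, and raw bytes per file) is an assumption made purely for illustration and is not the actual .carbonindexmerge format; `IndexMergeSketch`, `merge`, and `split` are hypothetical names. The point it shows is that each index file's bytes are stored unchanged, so a reader opens one file instead of many.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class IndexMergeSketch {

    /** Packs several index files (name -> raw bytes) into one byte array. */
    static byte[] merge(Map<String, byte[]> indexFiles) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(indexFiles.size());
            for (Map.Entry<String, byte[]> entry : indexFiles.entrySet()) {
                out.writeUTF(entry.getKey());          // original file name
                out.writeInt(entry.getValue().length); // content length
                out.write(entry.getValue());           // content, byte for byte
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Recovers the original files from a merged byte array. */
    static Map<String, byte[]> split(byte[] merged) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(merged))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String name = in.readUTF();
                byte[] content = new byte[in.readInt()];
                in.readFully(content);
                files.put(name, content);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }
}
```

Because the per-file content is untouched, existing code that parses a single carbonindex file could in principle be reused on each entry after `split`.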
[jira] [Resolved] (CARBONDATA-1808) (Carbon1.3.0 - Alter Table) Inconsistency in create table and alter table usage for char and varchar column
[ https://issues.apache.org/jira/browse/CARBONDATA-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1808. -- Resolution: Fixed Fix Version/s: 1.3.0 > (Carbon1.3.0 - Alter Table) Inconsistency in create table and alter table > usage for char and varchar column > --- > > Key: CARBONDATA-1808 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1808 > Project: CarbonData > Issue Type: Bug > Components: sql >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat >Assignee: anubhav tarar >Priority: Minor > Labels: Functional > Fix For: 1.3.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Steps: > User creates a table with char datatype --> Create table is success. > 0: jdbc:hive2://10.18.98.34:23040> CREATE TABLE > sensor_reading_blockblank_false(id char) STORED BY 'carbondata'; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.688 seconds) > User tries to alter the table using column name for char datatype in the same > way. > alter table sensor_reading_blockblank_false add columns(id1 char); > Issue : Alter table fails with parsing error as shown below > 0: jdbc:hive2://10.18.98.34:23040> alter table > sensor_reading_blockblank_false add columns(id1 char); > Error: java.lang.RuntimeException: > BaseSqlParser > Operation not allowed: alter table add columns(line 1, pos 0) > == SQL == > alter table sensor_reading_blockblank_false add columns(id1 char) > ^^^ > CarbonSqlParser [1.65] failure: ``('' expected but `)' found > alter table sensor_reading_blockblank_false add columns(id1 char) > ^ > (state=,code=0) > Similar consistency issue is observed for varchar data type create table and > alter table usage. 
> 0: jdbc:hive2://10.18.98.34:23040> CREATE TABLE > sensor_reading_blockblank_false(id varchar) STORED BY 'carbondata'; > +-+--+ > | Result | > +-+--+ > +-+--+ > No rows selected (0.244 seconds) > 0: jdbc:hive2://10.18.98.34:23040> alter table > sensor_reading_blockblank_false add columns(id1 varchar); > Error: java.lang.RuntimeException: > BaseSqlParser > Operation not allowed: alter table add columns(line 1, pos 0) > == SQL == > alter table sensor_reading_blockblank_false add columns(id1 varchar) > ^^^ > CarbonSqlParser [1.68] failure: ``('' expected but `)' found > alter table sensor_reading_blockblank_false add columns(id1 varchar) >^ > (state=,code=0) > Expected : The create table and alter table output should be consistent for > char and varchar types for similar syntax usage. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1838) Refactor SortStepRowUtil to make it more readable
[ https://issues.apache.org/jira/browse/CARBONDATA-1838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1838. -- Resolution: Fixed Fix Version/s: 1.3.0 > Refactor SortStepRowUtil to make it more readable > - > > Key: CARBONDATA-1838 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1838 > Project: CarbonData > Issue Type: Improvement > Components: data-load >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > Refactor and optimize `SortRowStepUtil` to make it efficient and more > readable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1831) Carbon 1.3.0 - BAD_RECORDS: Data Loading with Action as Redirect & logger enable is not logging the logs in the defined path.
[ https://issues.apache.org/jira/browse/CARBONDATA-1831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1831. -- Resolution: Fixed Fix Version/s: 1.3.0 > Carbon 1.3.0 - BAD_RECORDS: Data Loading with Action as Redirect & logger > enable is not logging the logs in the defined path. > - > > Key: CARBONDATA-1831 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1831 > Project: CarbonData > Issue Type: Bug >Reporter: Ayushi Sharma >Assignee: dhatchayani > Fix For: 1.3.0 > > Time Spent: 3.5h > Remaining Estimate: 0h > > Steps: > 1. CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION > string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 > bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 > decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 > int) STORED BY 'org.apache.carbondata.format' > 2. LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/InsertData/2000_UniqData.csv' > into table uniqdata OPTIONS('DELIMITER'=',' , > 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', > 'BAD_RECORDS_ACTION'='REDIRECT','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1') > Issue: > Data Load is not creating the logs when bad_records location is specified in > carbon.properties. > Expected: > Bad_Records log should be created in the specified path. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1847) Add inputSize for row
Jacky Li created CARBONDATA-1847: Summary: Add inputSize for row Key: CARBONDATA-1847 URL: https://issues.apache.org/jira/browse/CARBONDATA-1847 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Assignee: Jacky Li -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1864) Using org.apache.spark.SPARK_VERSION instead of sparkSession.version
[ https://issues.apache.org/jira/browse/CARBONDATA-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1864. -- Resolution: Fixed Assignee: QiangCai Fix Version/s: 1.3.0 > Using org.apache.spark.SPARK_VERSION instead of sparkSession.version > > > Key: CARBONDATA-1864 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1864 > Project: CarbonData > Issue Type: Improvement >Reporter: QiangCai >Assignee: QiangCai >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Using org.apache.spark.SPARK_VERSION instead of sparkSession.version -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1873) Refactor and annotate carbon property
[ https://issues.apache.org/jira/browse/CARBONDATA-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1873: - Summary: Refactor and annotate carbon property (was: Refactor carbon property for better maintenance) > Refactor and annotate carbon property > - > > Key: CARBONDATA-1873 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1873 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li > Fix For: 1.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1871) Add annotation for interface compatibility
Jacky Li created CARBONDATA-1871: Summary: Add annotation for interface compatibility Key: CARBONDATA-1871 URL: https://issues.apache.org/jira/browse/CARBONDATA-1871 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Fix For: 1.3.0 All user-facing APIs should be annotated with a proper stability level. InterfaceStability levels include: 1. Forever: API in this level is compatible across major versions 2. Stable: API in this level is compatible across minor versions, but may break across major versions 3. Evolving: API in this level is compatible across maintenance versions, but may break across minor versions 4. Unstable: API in this level is not guaranteed to be backward compatible -- This message was sent by Atlassian JIRA (v6.4.14#64029)
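The four stability levels listed in CARBONDATA-1871 could be expressed as Java marker annotations, for example as sketched below. The level names come from the issue; everything else (the wrapper class `StabilitySketch`, the retention and target choices, the example class `ExampleLoadOptions`, and the `levelOf` helper) is an assumption for illustration and may differ from what CarbonData adopted.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class StabilitySketch {

    /** Compatible across major versions. */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    public @interface Forever {}

    /** Compatible across minor versions; may break across major versions. */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    public @interface Stable {}

    /** Compatible across maintenance versions; may break across minor versions. */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    public @interface Evolving {}

    /** No backward-compatibility guarantee. */
    @Documented @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    public @interface Unstable {}

    /** Hypothetical user-facing class carrying a stability level. */
    @Evolving
    static class ExampleLoadOptions {}

    /** Reports the declared stability level of a class, if any. */
    static String levelOf(Class<?> type) {
        if (type.isAnnotationPresent(Forever.class)) return "Forever";
        if (type.isAnnotationPresent(Stable.class)) return "Stable";
        if (type.isAnnotationPresent(Evolving.class)) return "Evolving";
        if (type.isAnnotationPresent(Unstable.class)) return "Unstable";
        return "Unannotated";
    }

    public static void main(String[] args) {
        System.out.println(levelOf(ExampleLoadOptions.class)); // prints Evolving
        System.out.println(levelOf(String.class));             // prints Unannotated
    }
}
```

Runtime retention is chosen here so that a build-time or test-time check can list user-facing classes that still lack a level; source-only retention would also work if the annotations are meant purely as documentation.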
[jira] [Created] (CARBONDATA-1874) Add document for table property
Jacky Li created CARBONDATA-1874: Summary: Add document for table property Key: CARBONDATA-1874 URL: https://issues.apache.org/jira/browse/CARBONDATA-1874 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1873) Refactor carbon property for better maintenance
Jacky Li created CARBONDATA-1873: Summary: Refactor carbon property for better maintenance Key: CARBONDATA-1873 URL: https://issues.apache.org/jira/browse/CARBONDATA-1873 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1874) Refactor table property for better maintenance
[ https://issues.apache.org/jira/browse/CARBONDATA-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1874: - Summary: Refactor table property for better maintenance (was: Add document for table property) > Refactor table property for better maintenance > -- > > Key: CARBONDATA-1874 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1874 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li > Fix For: 1.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1872) Clean up unused constant in CarbonCommonConstant
Jacky Li created CARBONDATA-1872: Summary: Clean up unused constant in CarbonCommonConstant Key: CARBONDATA-1872 URL: https://issues.apache.org/jira/browse/CARBONDATA-1872 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1871) Add annotation for interface compatibility
[ https://issues.apache.org/jira/browse/CARBONDATA-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1871: - Description: All user-facing APIs should be annotated with a proper stability level. InterfaceStability levels include: 1. Forever: API at this level is compatible across major versions 2. Stable: API at this level is compatible across minor versions, but may break across major versions 3. Evolving: API at this level is compatible across maintenance versions, but may break across minor versions 4. Unstable: API at this level has no backward-compatibility guarantee Since users mainly use SQL with CarbonData, the APIs that need to be annotated include: 1. Table Property in create table 2. Load Option in load data and dataframe api 3. Carbon Property was: All user-facing APIs should be annotated with a proper stability level. InterfaceStability levels include: 1. Forever: API at this level is compatible across major versions 2. Stable: API at this level is compatible across minor versions, but may break across major versions 3. Evolving: API at this level is compatible across maintenance versions, but may break across minor versions 4. Unstable: API at this level has no backward-compatibility guarantee > Add annotation for interface compatibility > -- > > Key: CARBONDATA-1871 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1871 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li > Fix For: 1.3.0 > > > All user-facing APIs should be annotated with a proper stability level. > InterfaceStability levels include: > 1. Forever: API at this level is compatible across major versions > 2. Stable: API at this level is compatible across minor versions, but may break > across major versions > 3. Evolving: API at this level is compatible across maintenance versions, > but may break across minor versions > 4. Unstable: API at this level has no backward-compatibility guarantee > Since users mainly use SQL with CarbonData, the APIs that need to be annotated > include: > 1. Table Property in create table > 2. Load Option in load data and dataframe api > 3. Carbon Property -- This message was sent by Atlassian JIRA (v6.4.14#64029)
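The four stability levels listed in CARBONDATA-1871 could be expressed as Java annotations roughly as follows. This is only a sketch under stated assumptions: the nested annotation names mirror the levels in the issue text, but the class layout, the runtime retention, the `TableProperties` holder, the `StabilityLookup` helper, and the levels assigned to each property are all hypothetical and are not taken from CarbonData's code.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Field;

// Hypothetical container for the four levels named in the issue.
final class InterfaceStability {
  private InterfaceStability() {}

  /** Compatible across major versions. */
  @Documented @Retention(RetentionPolicy.RUNTIME)
  @interface Forever {}

  /** Compatible across minor versions; may break across major versions. */
  @Documented @Retention(RetentionPolicy.RUNTIME)
  @interface Stable {}

  /** Compatible across maintenance versions; may break across minor versions. */
  @Documented @Retention(RetentionPolicy.RUNTIME)
  @interface Evolving {}

  /** No backward-compatibility guarantee. */
  @Documented @Retention(RetentionPolicy.RUNTIME)
  @interface Unstable {}
}

// Hypothetical table-property holder; the levels chosen here are illustrative only.
final class TableProperties {
  private TableProperties() {}

  @InterfaceStability.Stable
  static final String TABLE_BLOCKSIZE = "table_blocksize";

  @InterfaceStability.Evolving
  static final String STREAMING = "streaming";

  // Deliberately left unannotated to show the fallback case.
  static final String INTERNAL_FLAG = "internal_flag";
}

// Hypothetical helper that reads the level back through reflection.
final class StabilityLookup {
  private StabilityLookup() {}

  static String levelOf(Class<?> holder, String fieldName) {
    final Field field;
    try {
      field = holder.getDeclaredField(fieldName);
    } catch (NoSuchFieldException e) {
      throw new IllegalArgumentException("No such field: " + fieldName, e);
    }
    if (field.isAnnotationPresent(InterfaceStability.Forever.class)) return "Forever";
    if (field.isAnnotationPresent(InterfaceStability.Stable.class)) return "Stable";
    if (field.isAnnotationPresent(InterfaceStability.Evolving.class)) return "Evolving";
    if (field.isAnnotationPresent(InterfaceStability.Unstable.class)) return "Unstable";
    return "Unannotated";
  }
}
```

Runtime retention is chosen here only so that a tool or test could list every property and its level; a source-only marker would serve equally well for documentation purposes.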
[jira] [Updated] (CARBONDATA-1874) Refactor and annotate table property
[ https://issues.apache.org/jira/browse/CARBONDATA-1874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1874: - Summary: Refactor and annotate table property (was: Refactor table property for better maintenance) > Refactor and annotate table property > > > Key: CARBONDATA-1874 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1874 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li > Fix For: 1.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1875) Refactor and annotate load options
Jacky Li created CARBONDATA-1875: Summary: Refactor and annotate load options Key: CARBONDATA-1875 URL: https://issues.apache.org/jira/browse/CARBONDATA-1875 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1843) Block CTAS and external table syntax
[ https://issues.apache.org/jira/browse/CARBONDATA-1843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1843: - Description: 1. Block 'external' syntax 2. Block 'CTAS' syntax was: 1. Add configuration for support dictionary and complex type 2. Block 'external', 'CTAS' syntax 3. Some other minor fix to catch exceptions Summary: Block CTAS and external table syntax (was: Add configuration to enable features to improve usability) > Block CTAS and external table syntax > > > Key: CARBONDATA-1843 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1843 > Project: CarbonData > Issue Type: Improvement >Reporter: Jacky Li >Assignee: Jacky Li >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > 1. Block 'external' syntax > 2. Block 'CTAS' syntax -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1848) Streaming sink should adapt spark 2.2
[ https://issues.apache.org/jira/browse/CARBONDATA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1848. -- Resolution: Fixed > Streaming sink should adapt spark 2.2 > - > > Key: CARBONDATA-1848 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1848 > Project: CarbonData > Issue Type: Bug >Reporter: QiangCai >Priority: Minor > Time Spent: 2h 10m > Remaining Estimate: 0h > > Streaming sink should adapt spark 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1848) Streaming sink should adapt spark 2.2
[ https://issues.apache.org/jira/browse/CARBONDATA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1848: - Fix Version/s: 1.3.0 > Streaming sink should adapt spark 2.2 > - > > Key: CARBONDATA-1848 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1848 > Project: CarbonData > Issue Type: Bug >Reporter: QiangCai >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Streaming sink should adapt spark 2.2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1837) Reusing old row to reduce memory consumption
[ https://issues.apache.org/jira/browse/CARBONDATA-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1837. -- Resolution: Fixed Fix Version/s: 1.3.0 > Reusing old row to reduce memory consumption > > > Key: CARBONDATA-1837 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1837 > Project: CarbonData > Issue Type: Improvement > Components: data-load >Reporter: xuchuanyin >Assignee: xuchuanyin >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > In the data conversion step of data loading, CarbonData converts each row > to another row, batch by batch. > Currently, it creates a new batch to store the converted rows. I think this > can be optimized to reuse the old row batch's space, which would reduce > memory consumption and GC overhead. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
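The difference between allocating a new batch and reusing the old one can be sketched as below. The `RowBatch` class and its method names are hypothetical, invented for illustration; the issue does not describe CarbonData's actual batch classes, and the converter is modelled as a plain function on `Object[]` rows.

```java
import java.util.function.UnaryOperator;

// Hypothetical fixed-capacity batch of rows, for illustration only.
final class RowBatch {
  private final Object[][] rows;
  private int size;

  RowBatch(int capacity) {
    this.rows = new Object[capacity][];
  }

  void add(Object[] row) {
    rows[size++] = row;
  }

  int size() {
    return size;
  }

  Object[] get(int index) {
    return rows[index];
  }

  /** Behaviour described as current: a second batch is allocated for the output. */
  RowBatch convertToNewBatch(UnaryOperator<Object[]> converter) {
    RowBatch converted = new RowBatch(rows.length);
    for (int i = 0; i < size; i++) {
      converted.add(converter.apply(rows[i]));
    }
    return converted;
  }

  /** Proposed behaviour: each slot is overwritten, so no second batch is allocated. */
  RowBatch convertInPlace(UnaryOperator<Object[]> converter) {
    for (int i = 0; i < size; i++) {
      rows[i] = converter.apply(rows[i]);
    }
    return this;
  }
}
```

In-place conversion is only safe when nothing else still needs the unconverted rows, which is an assumption of this sketch rather than something the issue states.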
[jira] [Resolved] (CARBONDATA-1781) (Carbon1.3.0 - Streaming) Select * & select column fails but select count(*) is success when .streaming file is removed from HDFS
[ https://issues.apache.org/jira/browse/CARBONDATA-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1781. -- Resolution: Fixed Fix Version/s: 1.3.0 > (Carbon1.3.0 - Streaming) Select * & select column fails but select count(*) > is success when .streaming file is removed from HDFS > - > > Key: CARBONDATA-1781 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1781 > Project: CarbonData > Issue Type: Bug > Components: data-query >Affects Versions: 1.3.0 > Environment: 3 node ant cluster >Reporter: Chetan Bhat > Labels: DFX > Fix For: 1.3.0 > > Time Spent: 1h > Remaining Estimate: 0h > > *Steps :* > Thrift server is started using the command - bin/spark-submit --master > yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G > --num-executors 3 --class > org.apache.carbondata.spark.thriftserver.CarbonThriftServer > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > "hdfs://hacluster/user/hive/warehouse/carbon.store" > Spark shell is opened using the command - bin/spark-shell --master > yarn-client --executor-memory 10G --executor-cores 5 --driver-memory 5G > --num-executors 3 --jars > /srv/spark2.2Bigdata/install/spark/sparkJdbc/carbonlib/carbondata_2.11-1.3.0-SNAPSHOT-shade-hadoop2.7.2.jar > From spark shell the below code is executed - > import java.io.{File, PrintWriter} > import java.net.ServerSocket > import org.apache.spark.sql.{CarbonEnv, SparkSession} > import org.apache.spark.sql.hive.CarbonRelation > import org.apache.spark.sql.streaming.{ProcessingTime, StreamingQuery} > import org.apache.carbondata.core.constants.CarbonCommonConstants > import org.apache.carbondata.core.util.CarbonProperties > import org.apache.carbondata.core.util.path.{CarbonStorePath, CarbonTablePath} > CarbonProperties.getInstance().addProperty(CarbonCommonConstants.CARBON_TIMESTAMP_FORMAT, > "/MM/dd") > import org.apache.spark.sql.CarbonSession._ > val 
carbonSession = SparkSession. > builder(). > appName("StreamExample"). > > getOrCreateCarbonSession("hdfs://hacluster/user/hive/warehouse/carbon.store") > > carbonSession.sparkContext.setLogLevel("INFO") > def sql(sql: String) = carbonSession.sql(sql) > def writeSocket(serverSocket: ServerSocket): Thread = { > val thread = new Thread() { > override def run(): Unit = { > // wait for a client connection request and accept it > val clientSocket = serverSocket.accept() > val socketWriter = new PrintWriter(clientSocket.getOutputStream()) > var index = 0 > for (_ <- 1 to 1000) { > // write 101 records per iteration (0 to 100 inclusive) > for (_ <- 0 to 100) { > index = index + 1 > socketWriter.println(index.toString + ",name_" + index >+ ",city_" + index + "," + (index * > 1.00).toString + >",school_" + index + ":school_" + index + > index + "$" + index) > } > socketWriter.flush() > Thread.sleep(2000) > } > socketWriter.close() > System.out.println("Socket closed") > } > } > thread.start() > thread > } > > def startStreaming(spark: SparkSession, tablePath: CarbonTablePath, > tableName: String, port: Int): Thread = { > val thread = new Thread() { > override def run(): Unit = { > var qry: StreamingQuery = null > try { > val readSocketDF = spark.readStream > .format("socket") > .option("host", "10.18.98.34") > .option("port", port) > .load() > qry = readSocketDF.writeStream > .format("carbondata") > .trigger(ProcessingTime("5 seconds")) > .option("checkpointLocation", tablePath.getStreamingCheckpointDir) > .option("tablePath", tablePath.getPath).option("tableName", > tableName) > .start() > qry.awaitTermination() > } catch { > case ex: Throwable => > ex.printStackTrace() > println("Done reading and writing streaming data") > } finally { > if (qry != null) qry.stop() > } > } > } > thread.start() > thread > } > val streamTableName = "brinjal" > sql(s"drop table brinjal").show > sql(s"create table brinjal (imei string,AMSize string,channelsId > string,ActiveCountry string, Activecity string,gamePointId >
double,deviceInformationId double,productionDate Timestamp,deliveryDate > timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' > TBLPROPERTIES('streaming'='true','table_blocksize'='1')") > sql(s"LOAD DATA INPATH
[jira] [Resolved] (CARBONDATA-1761) (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete if respective id is given in delete segment by id query
[ https://issues.apache.org/jira/browse/CARBONDATA-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1761. -- Resolution: Fixed Fix Version/s: 1.3.0 > (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete > if respective id is given in delete segment by id query > - > > Key: CARBONDATA-1761 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1761 > Project: CarbonData > Issue Type: Bug > Components: data-load >Affects Versions: 1.3.0 > Environment: 3 Node ant cluster > Description >Reporter: Ajeet Rai >Assignee: Akash R Nilugal > Labels: dfx > Fix For: 1.3.0 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > (Carbon1.3.0 - DELETE SEGMENT BY ID) In Progress Segment is marked for delete > if respective id is given in delete segment by id query. > 1: Create a table > CREATE TABLE IF NOT EXISTS flow_carbon_new999(txn_dte String,dt String,txn_bk > String,txn_br String,own_bk String,own_br String,opp_bk String,bus_opr_cde > String,opt_prd_cde String,cus_no String,cus_ac String,opp_ac_nme > String,opp_ac String,bv_no String,aco_ac String,ac_dte String,txn_cnt > int,jrn_par int,mfm_jrn_no String,cbn_jrn_no String,ibs_jrn_no String,vch_no > String,vch_seq String,srv_cde String,bus_cd_no String,id_flg String,bv_cde > String,txn_time String,txn_tlr String,ety_tlr String,ety_bk String,ety_br > String,bus_pss_no String,chk_flg String,chk_tlr String,chk_jrn_no String, > bus_sys_no String,txn_sub_cde String,fin_bus_cde String,fin_bus_sub_cde > String,chl String,tml_id String,sus_no String,sus_seq String, cho_seq String, > itm_itm String,itm_sub String,itm_sss String,dc_flg String,amt > decimal(15,2),bal decimal(15,2),ccy String,spv_flg String,vch_vld_dte > String,pst_bk String,pst_br String,ec_flg String,aco_tlr String,gen_flg > String,his_rec_sum_flg String,his_flg String,vch_typ String,val_dte > String,opp_ac_flg String,cmb_flg String,ass_vch_flg String,cus_pps_flg > String,bus_rmk_cde 
String,vch_bus_rmk String,tec_rmk_cde String,vch_tec_rmk > String,gems_last_upd_d String,maps_date String,maps_job String)STORED BY > 'org.apache.carbondata.format' > TBLPROPERTIES('DICTIONARY_INCLUDE'='txn_cnt,jrn_par,amt,bal','No_Inverted_Index'= > 'txn_dte,dt,txn_bk,txn_br,own_bk ,own_br ,opp_bk ,bus_opr_cde ,opt_prd_cde > ,cus_no ,cus_ac ,opp_ac_nme ,opp_ac ,bv_no ,aco_ac ,ac_dte ,txn_cnt ,jrn_par > ,mfm_jrn_no ,cbn_jrn_no ,ibs_jrn_no ,vch_no ,vch_seq ,srv_cde ,bus_cd_no > ,id_flg ,bv_cde ,txn_time ,txn_tlr ,ety_tlr ,ety_bk ,ety_br ,bus_pss_no > ,chk_flg ,chk_tlr ,chk_jrn_no , bus_sys_no ,txn_sub_cde ,fin_bus_cde > ,fin_bus_sub_cde ,chl ,tml_id ,sus_no ,sus_seq , cho_seq , itm_itm ,itm_sub > ,itm_sss ,dc_flg ,amt,bal,ccy ,spv_flg ,vch_vld_dte ,pst_bk ,pst_br ,ec_flg > ,aco_tlr ,gen_flg ,his_rec_sum_flg ,his_flg ,vch_typ ,val_dte ,opp_ac_flg > ,cmb_flg ,ass_vch_flg ,cus_pps_flg ,bus_rmk_cde ,vch_bus_rmk ,tec_rmk_cde > ,vch_tec_rmk ,gems_last_upd_d ,maps_date ,maps_job' ); > 2: start a data load. > LOAD DATA inpath 'hdfs://hacluster/user/test/20140101_1_1.csv' into > table flow_carbon_new999 options('DELIMITER'=',', > 'QUOTECHAR'='"','header'='false'); > 3: run a insert into/overwrite job > insert into table flow_carbon_new999 select * from flow_carbon_new666; > 4: show segments for table flow_carbon_new999; > 5: Observe that load/insert/overwrite job is started with new segment id > 6: now run a delete segment by id query with this id. > DELETE FROM TABLE ajeet.flow_carbon_new999 WHERE SEGMENT.ID IN (34) > 7: again run show segment and see this segment which is still in progress is > marked for delete. > 8: Observe that insert/load job is still running and after some time(in next > job of load/insert/overwrite), this job fails with below error: > Error: java.lang.RuntimeException: It seems insert overwrite has been issued > during load (state=,code=0) > This is not correct behaviour and it should be handled. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
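The behaviour requested in CARBONDATA-1761 amounts to a status check before a segment is marked for delete. A minimal sketch follows, assuming a hypothetical `SegmentStatus` enum and an in-memory `SegmentRegistry`; these names are invented for illustration and do not describe how CarbonData stores or updates segment status.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical segment states, for illustration only.
enum SegmentStatus { IN_PROGRESS, SUCCESS, MARKED_FOR_DELETE }

// Hypothetical in-memory stand-in for a table's segment list.
final class SegmentRegistry {
  private final Map<String, SegmentStatus> segments = new LinkedHashMap<>();

  void register(String segmentId, SegmentStatus status) {
    segments.put(segmentId, status);
  }

  SegmentStatus statusOf(String segmentId) {
    return segments.get(segmentId);
  }

  /**
   * Marks the given segments for delete and returns the ids that were rejected.
   * Unknown segments and segments whose load is still in progress are rejected
   * and left untouched.
   */
  List<String> deleteByIds(List<String> segmentIds) {
    List<String> rejected = new ArrayList<>();
    for (String id : segmentIds) {
      SegmentStatus status = segments.get(id);
      if (status == null || status == SegmentStatus.IN_PROGRESS) {
        rejected.add(id);
      } else {
        segments.put(id, SegmentStatus.MARKED_FOR_DELETE);
      }
    }
    return rejected;
  }
}
```

With a guard of this kind, step 6 of the report would leave segment 34 in progress and report it as rejected instead of marking it for delete, so the running load would not fail later.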
[jira] [Updated] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1516: - Attachment: CarbonData Pre-aggregation Table_v1.1.pdf > Support pre-aggregate tables and timeseries in carbondata > - > > Key: CARBONDATA-1516 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1516 > Project: CarbonData > Issue Type: New Feature >Reporter: Ravindra Pesala > Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData > Pre-aggregation Table_v1.1.pdf > > > Currently CarbonData has standard SQL capability on distributed data > sets. CarbonData should support pre-aggregate tables for timeseries to > improve query performance. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1628) Re-factory LoadTableCommand to reuse code for streaming ingest in the future
[ https://issues.apache.org/jira/browse/CARBONDATA-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1628. -- Resolution: Fixed Assignee: QiangCai Fix Version/s: 1.3.0 > Re-factory LoadTableCommand to reuse code for streaming ingest in the future > > > Key: CARBONDATA-1628 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1628 > Project: CarbonData > Issue Type: Improvement > Components: spark-integration >Reporter: QiangCai >Assignee: QiangCai >Priority: Minor > Fix For: 1.3.0 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Re-factory LoadTableCommand to reuse code for streaming ingest in the future -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1615) DELETE SEGMENT BY DATE should ignore the streaming segment
Jacky Li created CARBONDATA-1615: Summary: DELETE SEGMENT BY DATE should ignore the streaming segment Key: CARBONDATA-1615 URL: https://issues.apache.org/jira/browse/CARBONDATA-1615 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1613) streaming table should support INSERT OVERWRITE
Jacky Li created CARBONDATA-1613: Summary: streaming table should support INSERT OVERWRITE Key: CARBONDATA-1613 URL: https://issues.apache.org/jira/browse/CARBONDATA-1613 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li INSERT OVERWRITE should take care of streaming segment when executing the command -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (CARBONDATA-1612) Block DELETE SEGMENT BY ID for streaming table
[ https://issues.apache.org/jira/browse/CARBONDATA-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li updated CARBONDATA-1612: - Fix Version/s: 1.3.0 > Block DELETE SEGMENT BY ID for streaming table > -- > > Key: CARBONDATA-1612 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1612 > Project: CarbonData > Issue Type: Sub-task >Reporter: Jacky Li > Fix For: 1.3.0 > > > Streaming segment should be managed by carbon internally and it should not be > deleted by user -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1612) Block DELETE SEGMENT BY ID for streaming table
Jacky Li created CARBONDATA-1612: Summary: Block DELETE SEGMENT BY ID for streaming table Key: CARBONDATA-1612 URL: https://issues.apache.org/jira/browse/CARBONDATA-1612 Project: CarbonData Issue Type: Sub-task Reporter: Jacky Li Streaming segment should be managed by carbon internally and it should not be deleted by user -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (CARBONDATA-1662) Make ArrayType and StructType contain child DataType
Jacky Li created CARBONDATA-1662: Summary: Make ArrayType and StructType contain child DataType Key: CARBONDATA-1662 URL: https://issues.apache.org/jira/browse/CARBONDATA-1662 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (CARBONDATA-1658) Thread Leak Issue in No Sort
[ https://issues.apache.org/jira/browse/CARBONDATA-1658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-1658. -- Resolution: Fixed Fix Version/s: 1.3.0 > Thread Leak Issue in No Sort > > > Key: CARBONDATA-1658 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1658 > Project: CarbonData > Issue Type: Bug >Reporter: kumar vishal >Assignee: kumar vishal > Fix For: 1.3.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Threads are not getting closed in case of no sort -- This message was sent by Atlassian JIRA (v6.4.14#64029)
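The issue only states that threads are not closed in the no-sort case; it does not describe the fix. A common Java pattern for avoiding this kind of leak is to shut the executor down in a `finally` block, sketched below with a hypothetical `NoSortTaskRunner` helper that is not taken from CarbonData's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, for illustration only.
final class NoSortTaskRunner {
  private NoSortTaskRunner() {}

  /**
   * Runs all tasks on a fixed-size pool and returns how many finished without
   * throwing. The pool is shut down on every exit path, so its worker threads
   * do not outlive the call.
   */
  static int runAll(List<Runnable> tasks, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    int completed = 0;
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (Runnable task : tasks) {
        futures.add(pool.submit(task));
      }
      for (Future<?> future : futures) {
        try {
          future.get();
          completed++;
        } catch (ExecutionException e) {
          // The task failed; keep waiting for the remaining tasks.
        }
      }
    } catch (InterruptedException e) {
      // Restore the interrupt flag and fall through to the shutdown below.
      Thread.currentThread().interrupt();
    } finally {
      // Without this call the worker threads stay alive after the load ends.
      pool.shutdownNow();
    }
    return completed;
  }
}
```

`shutdownNow()` is used so that an interrupted or failed run also cancels tasks that are still queued; when every task has already finished it only releases the idle workers.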
[jira] [Created] (CARBONDATA-1663) Decouple spark in carbon modules
Jacky Li created CARBONDATA-1663: Summary: Decouple spark in carbon modules Key: CARBONDATA-1663 URL: https://issues.apache.org/jira/browse/CARBONDATA-1663 Project: CarbonData Issue Type: Improvement Reporter: Jacky Li Assignee: Jacky Li Fix For: 1.3.0 carbon-core, carbon-processing, carbon-hadoop modules should not depend on spark -- This message was sent by Atlassian JIRA (v6.4.14#64029)