[jira] [Updated] (CARBONDATA-882) Add SORT_COLUMNS option support in dataframe writer

2017-04-06 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-882:

Description: User should be able to specify the SORT_COLUMNS option when 
using dataframe.write  (was: User can specify not to sort during loading by 
adding an option in dataframe.write)

> Add SORT_COLUMNS option support in dataframe writer
> ---
>
> Key: CARBONDATA-882
> URL: https://issues.apache.org/jira/browse/CARBONDATA-882
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
> Fix For: 1.2.0-incubating
>
>
> User should be able to specify the SORT_COLUMNS option when using 
> dataframe.write



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-882) Add SORT_COLUMNS option support in dataframe writer

2017-04-06 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-882:

Summary: Add SORT_COLUMNS option support in dataframe writer  (was: Add no 
sort support in dataframe writer)

> Add SORT_COLUMNS option support in dataframe writer
> ---
>
> Key: CARBONDATA-882
> URL: https://issues.apache.org/jira/browse/CARBONDATA-882
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
> Fix For: 1.2.0-incubating
>
>
> User can specify not to sort during loading by adding an option in 
> dataframe.write





[jira] [Created] (CARBONDATA-882) Add no sort support in dataframe writer

2017-04-06 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-882:
---

 Summary: Add no sort support in dataframe writer
 Key: CARBONDATA-882
 URL: https://issues.apache.org/jira/browse/CARBONDATA-882
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.2.0-incubating


User can specify not to sort during loading by adding an option in 
dataframe.write





[jira] [Resolved] (CARBONDATA-830) Incorrect schedule for NewCarbonDataLoadRDD

2017-03-29 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-830.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Incorrect schedule for NewCarbonDataLoadRDD
> ---
>
> Key: CARBONDATA-830
> URL: https://issues.apache.org/jira/browse/CARBONDATA-830
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: Spark 2.1 + Carbon 1.0.0
>Reporter: Weizhong
>Assignee: Weizhong
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently NewCarbonDataLoadRDD's getPreferredLocations returns all locations 
> rather than one, so Spark may pick the same node for two tasks. One node is 
> then overloaded while another has no task to do, which hurts performance 
> even though nothing fails.
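
The scheduling effect described above can be illustrated with a deliberately simplified model. The sketch below is not Spark's scheduler: the `schedule` function, the node names, and the rule that the first preferred location always wins are assumptions made only to show why advertising a single location per task can spread the load.

```python
from collections import Counter

def schedule(preferred_locations):
    """Toy locality-only scheduler: run each task on its first preferred node.

    Illustrative assumption only; Spark's real task placement is more involved.
    """
    return Counter(locs[0] for locs in preferred_locations.values())

# Every task advertises all nodes that hold a replica of its data.
all_locations = {"task0": ["node1", "node2"], "task1": ["node1", "node2"]}
# Every task advertises exactly one node, already balanced across nodes.
one_location = {"task0": ["node1"], "task1": ["node2"]}

load_all = schedule(all_locations)
load_one = schedule(one_location)
print(dict(load_all))
print(dict(load_one))
```

In this toy model the first mapping puts both tasks on one node, while the second gives each node one task.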





[jira] [Resolved] (CARBONDATA-821) Remove Kettle related code and flow from carbon.

2017-03-29 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-821.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Remove Kettle related code and flow from carbon.
> 
>
> Key: CARBONDATA-821
> URL: https://issues.apache.org/jira/browse/CARBONDATA-821
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Remove Kettle related code and flow from carbon. It has become difficult for 
> developers to handle all bugs and features in both flows.





[jira] [Resolved] (CARBONDATA-832) Data loading is failing with duplicate header column in csv file

2017-03-29 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-832.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Data loading is failing with duplicate header column in csv file
> 
>
> Key: CARBONDATA-832
> URL: https://issues.apache.org/jira/browse/CARBONDATA-832
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.1.0-incubating
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Problem: data mismatch when the CSV file has a duplicate column header.
> Solution: the row parser's logic for resolving column indexes is faulty and 
> needs to be fixed.
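
One way a duplicate header can lead to mismatched columns is a name-to-index lookup that silently keeps only one position per name. The sketch below is a hypothetical illustration, not CarbonData's row parser; `index_by_name` and `duplicate_columns` are made-up helper names.

```python
def index_by_name(header):
    """Naive lookup table: a duplicated name keeps only its last position."""
    return {name: i for i, name in enumerate(header)}

def duplicate_columns(header):
    """Return header names that occur more than once, in first-seen order."""
    seen, duplicates = set(), []
    for name in header:
        key = name.strip().lower()
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates

header = ["id", "name", "name", "city"]
lookup = index_by_name(header)
print(lookup["name"])             # position the naive lookup resolves to
print(duplicate_columns(header))  # names a parser could reject or disambiguate
```

Detecting the duplicates up front lets a parser either fail early or resolve columns by position instead of by name.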





[jira] [Created] (CARBONDATA-829) DICTIONARY_EXCLUDE is not working when using Spark Datasource DDL

2017-03-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-829:
---

 Summary: DICTIONARY_EXCLUDE is not working when using Spark 
Datasource DDL
 Key: CARBONDATA-829
 URL: https://issues.apache.org/jira/browse/CARBONDATA-829
 Project: CarbonData
  Issue Type: Bug
Reporter: Jacky Li


When creating a table for TPC-H, I found that the following operation fails:
create table car(  
L_SHIPDATE string,
L_SHIPMODE string,
L_SHIPINSTRUCT string,
L_RETURNFLAG string,
L_RECEIPTDATE string,
L_ORDERKEY string,
L_PARTKEY string,
L_SUPPKEY   string,
L_LINENUMBER int,
L_QUANTITY decimal,
L_EXTENDEDPRICE decimal,
L_DISCOUNT decimal,
L_TAX decimal,
L_LINESTATUS string,
L_COMMITDATE string,
L_COMMENT  string
) 
USING org.apache.spark.sql.CarbonSource
OPTIONS (tableName "car", DICTIONARY_EXCLUDE "L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_COMMENT");
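
The report does not state why the option is not honoured. As a hedged illustration only, the sketch below shows how a comma-separated column list such as the DICTIONARY_EXCLUDE value above might be normalized before it is matched against the table's columns. The helper names are hypothetical, and this is not CarbonData's option parser.

```python
def parse_column_list(option_value):
    """Split a comma-separated option value, trimming spaces and lowercasing."""
    return [part.strip().lower() for part in option_value.split(",") if part.strip()]

def unknown_columns(option_value, table_columns):
    """Return option entries that do not match any table column."""
    known = {c.lower() for c in table_columns}
    return [c for c in parse_column_list(option_value) if c not in known]

columns = ["L_ORDERKEY", "L_PARTKEY", "L_SUPPKEY", "L_COMMENT"]
option = "L_ORDERKEY, L_PARTKEY, L_SUPPKEY, L_COMMENT"
parsed = parse_column_list(option)
print(parsed)
print(unknown_columns(option, columns))
```

Without the `strip()` call, every entry after the first would keep its leading space and fail an exact name comparison.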






[jira] [Created] (CARBONDATA-827) Query statistics log format is incorrect

2017-03-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-827:
---

 Summary: Query statistics log format is incorrect
 Key: CARBONDATA-827
 URL: https://issues.apache.org/jira/browse/CARBONDATA-827
 Project: CarbonData
  Issue Type: Bug
Reporter: Jacky Li


The output log for query statistics contains repeated numbers, which is incorrect.





[jira] [Resolved] (CARBONDATA-696) NPE when select query run on measure having double data type without fraction.

2017-03-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-696.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> NPE  when select query run on measure having double data type without 
> fraction.
> ---
>
> Key: CARBONDATA-696
> URL: https://issues.apache.org/jira/browse/CARBONDATA-696
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
>Reporter: Babulal
>Assignee: Kunal Kapoor
> Fix For: 1.1.0-incubating
>
> Attachments: logs, oscon_10.csv
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Create table as below
> cc.sql("create table oscon_carbon_old  (CUST_PRFRD_FLG String,PROD_BRAND_NAME 
> String,PROD_COLOR String,CUST_LAST_RVW_DATE String,CUST_COUNTRY 
> String,CUST_CITY String,PRODUCT_NAME String,CUST_JOB_TITLE String,CUST_STATE 
> String,CUST_BUY_POTENTIAL String,PRODUCT_MODEL String,ITM_ID String,ITM_NAME 
> String,PRMTION_ID String,PRMTION_NAME String,SHP_MODE_ID String,SHP_MODE 
> String,DELIVERY_COUNTRY String,DELIVERY_STATE String,DELIVERY_CITY 
> String,DELIVERY_DISTRICT String,ACTIVE_EMUI_VERSION String,WH_NAME 
> String,STR_ORDER_DATE String,OL_ORDER_NO String,OL_ORDER_DATE String,OL_SITE 
> String,CUST_FIRST_NAME String,CUST_LAST_NAME String,CUST_BIRTH_DY 
> String,CUST_BIRTH_MM String,CUST_BIRTH_YR String,CUST_BIRTH_COUNTRY 
> String,CUST_SEX String,CUST_ADDRESS_ID String,CUST_STREET_NO 
> String,CUST_STREET_NAME String,CUST_AGE String,CUST_SUITE_NO String,CUST_ZIP 
> String,CUST_COUNTY String,PRODUCT_ID String,PROD_SHELL_COLOR 
> String,DEVICE_NAME String,PROD_SHORT_DESC String,PROD_LONG_DESC 
> String,PROD_THUMB String,PROD_IMAGE String,PROD_UPDATE_DATE String,PROD_LIVE 
> String,PROD_LOC String,PROD_RAM String,PROD_ROM String,PROD_CPU_CLOCK 
> String,PROD_SERIES String,ITM_REC_START_DATE String,ITM_REC_END_DATE 
> String,ITM_BRAND_ID String,ITM_BRAND String,ITM_CLASS_ID String,ITM_CLASS 
> String,ITM_CATEGORY_ID String,ITM_CATEGORY String,ITM_MANUFACT_ID 
> String,ITM_MANUFACT String,ITM_FORMULATION String,ITM_COLOR 
> String,ITM_CONTAINER String,ITM_MANAGER_ID String,PRM_START_DATE 
> String,PRM_END_DATE String,PRM_CHANNEL_DMAIL String,PRM_CHANNEL_EMAIL 
> String,PRM_CHANNEL_CAT String,PRM_CHANNEL_TV String,PRM_CHANNEL_RADIO 
> String,PRM_CHANNEL_PRESS String,PRM_CHANNEL_EVENT String,PRM_CHANNEL_DEMO 
> String,PRM_CHANNEL_DETAILS String,PRM_PURPOSE String,PRM_DSCNT_ACTIVE 
> String,SHP_CODE String,SHP_CARRIER String,SHP_CONTRACT String,CHECK_DATE 
> String,CHECK_YR String,CHECK_MM String,CHECK_DY String,CHECK_HOUR String,BOM 
> String,INSIDE_NAME String,PACKING_DATE String,PACKING_YR String,PACKING_MM 
> String,PACKING_DY String,PACKING_HOUR String,DELIVERY_PROVINCE 
> String,PACKING_LIST_NO String,ACTIVE_CHECK_TIME String,ACTIVE_CHECK_YR 
> String,ACTIVE_CHECK_MM String,ACTIVE_CHECK_DY String,ACTIVE_CHECK_HOUR 
> String,ACTIVE_AREA_ID String,ACTIVE_COUNTRY String,ACTIVE_PROVINCE 
> String,ACTIVE_CITY String,ACTIVE_DISTRICT String,ACTIVE_NETWORK 
> String,ACTIVE_FIRMWARE_VER String,ACTIVE_OS_VERSION String,LATEST_CHECK_TIME 
> String,LATEST_CHECK_YR String,LATEST_CHECK_MM String,LATEST_CHECK_DY 
> String,LATEST_CHECK_HOUR String,LATEST_AREAID String,LATEST_COUNTRY 
> String,LATEST_PROVINCE String,LATEST_CITY String,LATEST_DISTRICT 
> String,LATEST_FIRMWARE_VER String,LATEST_EMUI_VERSION 
> String,LATEST_OS_VERSION String,LATEST_NETWORK String,WH_ID 
> String,WH_STREET_NO String,WH_STREET_NAME String,WH_STREET_TYPE 
> String,WH_SUITE_NO String,WH_CITY String,WH_COUNTY String,WH_STATE 
> String,WH_ZIP String,WH_COUNTRY String,OL_SITE_DESC String,OL_RET_ORDER_NO 
> String,OL_RET_DATE String,PROD_MODEL_ID String,CUST_ID String,PROD_UNQ_MDL_ID 
> String,CUST_NICK_NAME String,CUST_LOGIN String,CUST_EMAIL_ADDR 
> String,PROD_UNQ_DEVICE_ADDR String,PROD_UQ_UUID String,PROD_BAR_CODE 
> String,TRACKING_NO String,STR_ORDER_NO String,CUST_DEP_COUNT 
> double,CUST_VEHICLE_COUNT double,CUST_ADDRESS_CNT double,CUST_CRNT_CDEMO_CNT 
> double,CUST_CRNT_HDEMO_CNT double,CUST_CRNT_ADDR_DM 
> double,CUST_FIRST_SHIPTO_CNT double,CUST_FIRST_SALES_CNT 
> double,CUST_GMT_OFFSET double,CUST_DEMO_CNT double,CUST_INCOME 
> double,PROD_UNLIMITED double,PROD_OFF_PRICE double,PROD_UNITS 
> double,TOTAL_PRD_COST double,TOTAL_PRD_DISC double,PROD_WEIGHT 
> double,REG_UNIT_PRICE double,EXTENDED_AMT double,UNIT_PRICE_DSCNT_PCT 
> double,DSCNT_AMT double,PROD_STD_CST double,TOTAL_TX_AMT double,FREIGHT_CHRG 
> double,WAITING_PERIOD double,DELIVERY_PERIOD double,ITM_CRNT_PRICE 
> double,ITM_UNITS double,ITM_WSLE_CST double,ITM_SIZE double,PRM_CST 
> double,PRM_RESPONSE_TARGET double,PRM_ITM_DM double,SHP_MODE_CNT 
> double,WH_GMT_OFFSET 

[jira] [Resolved] (CARBONDATA-818) The file_name stored in carbonindex is wrong

2017-03-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-818.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> The file_name stored in carbonindex is wrong
> 
>
> Key: CARBONDATA-818
> URL: https://issues.apache.org/jira/browse/CARBONDATA-818
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Yadong Qi
>Assignee: Yadong Qi
> Fix For: 1.1.0-incubating
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The file_name stored in carbonindex is a local path that is used as a temp 
> dir on the executor:
> {code}
> /tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/part-0-0_batchno0-0-1490345609845.carbondata
> {code}
> But I think we want to store the actual carbondata path, like:
> {code}
> part-0-0_batchno0-0-1490345609845.carbondata
> {code}
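
A minimal sketch of the intended transformation, assuming the stored value should be just the final path component as in the second example above. `stored_file_name` is a hypothetical helper, not a CarbonData function.

```python
import posixpath

def stored_file_name(local_path):
    """Keep only the file name, dropping the executor's temp directory."""
    return posixpath.basename(local_path)

local_path = ("/tmp/6937581525189542/0/default/carbon_v3/Fact/Part0/Segment_0/0/"
              "part-0-0_batchno0-0-1490345609845.carbondata")
file_name = stored_file_name(local_path)
print(file_name)
```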





[jira] [Resolved] (CARBONDATA-820) Redundant BitSet created in data load

2017-03-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-820.
-
Resolution: Fixed
  Assignee: Jacky Li

> Redundant BitSet created in data load
> -
>
> Key: CARBONDATA-820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-820
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method





[jira] [Created] (CARBONDATA-823) Refactory of data write step

2017-03-26 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-823:
---

 Summary: Refactory of data write step
 Key: CARBONDATA-823
 URL: https://issues.apache.org/jira/browse/CARBONDATA-823
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.1.0-incubating








[jira] [Resolved] (CARBONDATA-783) Loading data with Single Pass 'true' option is throwing an exception

2017-03-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-783.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Loading data with Single Pass 'true' option is throwing an exception
> 
>
> Key: CARBONDATA-783
> URL: https://issues.apache.org/jira/browse/CARBONDATA-783
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0-incubating
> Environment: spark 2.1
>Reporter: Geetika Gupta
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
> Attachments: 7000_UniqData.csv
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> I tried to create table using the following query:
> CREATE TABLE uniq_include_dictionary (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,Double_COLUMN2,DECIMAL_COLUMN2');
> Table creation was successful, but when I tried to load data into the table 
> it showed the following error:
> ERROR 16-03 13:41:32,354 - nioEventLoopGroup-8-2 
> java.lang.IndexOutOfBoundsException: readerIndex(64) + length(25) exceeds 
> writerIndex(80): UnpooledUnsafeDirectByteBuf(ridx: 64, widx: 80, cap: 80)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1161)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1155)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:694)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:702)
>   at 
> org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage.readData(DictionaryMessage.java:70)
>   at 
> org.apache.carbondata.core.dictionary.server.DictionaryServerHandler.channelRead(DictionaryServerHandler.java:59)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>   at java.lang.Thread.run(Thread.java:745)
> ERROR 16-03 13:41:32,355 - nioEventLoopGroup-8-2 exceptionCaught
> java.lang.IndexOutOfBoundsException: readerIndex(64) + length(25) exceeds 
> writerIndex(80): UnpooledUnsafeDirectByteBuf(ridx: 64, widx: 80, cap: 80)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1161)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1155)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:694)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:702)
>   at 
> org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage.readData(DictionaryMessage.java:70)
>   at 
> org.apache.carbondata.core.dictionary.server.DictionaryServerHandler.channelRead(DictionaryServerHandler.java:59)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerCont
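
The exception text says that 25 bytes were requested at reader index 64 while the writer index was 80, so only 16 bytes were readable. The report does not say why. One common way this happens on a stream transport is decoding a message before all of its bytes have arrived. The sketch below shows the usual guard under stated assumptions: a hypothetical 4-byte big-endian length prefix and a made-up helper `try_read_message`. It is not CarbonData's DictionaryMessage wire format.

```python
import struct

def try_read_message(buf, reader_index):
    """Return (payload, new_reader_index), or None if the frame is incomplete."""
    readable = len(buf) - reader_index
    if readable < 4:
        return None                      # length prefix not fully received yet
    (length,) = struct.unpack_from(">i", buf, reader_index)
    if readable - 4 < length:
        return None                      # wait for more bytes instead of failing
    start = reader_index + 4
    return bytes(buf[start:start + length]), start + length

partial = struct.pack(">i", 25) + b"x" * 16   # announces 25 bytes, holds 16
complete = struct.pack(">i", 3) + b"abc"
print(try_read_message(partial, 0))
print(try_read_message(complete, 0))
```

A caller would keep the partial buffer and retry once more data has been appended.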

[jira] [Resolved] (CARBONDATA-809) Union with alias is returning wrong result.

2017-03-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-809.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.1.0-incubating

> Union with alias is returning wrong result.
> ---
>
> Key: CARBONDATA-809
> URL: https://issues.apache.org/jira/browse/CARBONDATA-809
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A union with an alias returns the wrong result.
> Test case:
> {code}
> SELECT t.c1 a FROM (select c1 from  carbon_table1 union all  select c1 from  
> carbon_table2) t
> {code}
> The above query returns data from only one table, and that data is duplicated.
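
For reference, the expected semantics of the query can be checked against another SQL engine. The sketch below runs the same statement on an in-memory SQLite database with made-up sample rows. It only illustrates what UNION ALL with an alias should return and says nothing about how CarbonData evaluates it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carbon_table1 (c1 TEXT);
    CREATE TABLE carbon_table2 (c1 TEXT);
    INSERT INTO carbon_table1 VALUES ('a'), ('b');
    INSERT INTO carbon_table2 VALUES ('c'), ('d');
""")
rows = conn.execute(
    "SELECT t.c1 a FROM (select c1 from carbon_table1 union all "
    "select c1 from carbon_table2) t"
).fetchall()
# Sort because UNION ALL does not guarantee an order without ORDER BY.
result = sorted(r[0] for r in rows)
print(result)
conn.close()
```

Every row of both tables appears exactly once, which is the behaviour the test case expects.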





[jira] [Resolved] (CARBONDATA-812) make vectorized reader as default reader

2017-03-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-812.
-
Resolution: Fixed
  Assignee: Jacky Li

> make vectorized reader as default reader
> 
>
> Key: CARBONDATA-812
> URL: https://issues.apache.org/jira/browse/CARBONDATA-812
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-820) Redundant BitSet created in data load

2017-03-25 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-820:

Request participants:   (was: )
 Description: In 
CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method

> Redundant BitSet created in data load
> -
>
> Key: CARBONDATA-820
> URL: https://issues.apache.org/jira/browse/CARBONDATA-820
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.0.0-incubating
>Reporter: Jacky Li
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>
> In CarbonFactDataHandlerColumnar.getMeasureNullValueIndexBitSet method





[jira] [Created] (CARBONDATA-820) Redundant BitSet created in data load

2017-03-25 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-820:
---

 Summary: Redundant BitSet created in data load
 Key: CARBONDATA-820
 URL: https://issues.apache.org/jira/browse/CARBONDATA-820
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.0.0-incubating
Reporter: Jacky Li
Priority: Minor
 Fix For: 1.1.0-incubating








[jira] [Created] (CARBONDATA-812) make vectorized reader as default reader

2017-03-23 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-812:
---

 Summary: make vectorized reader as default reader
 Key: CARBONDATA-812
 URL: https://issues.apache.org/jira/browse/CARBONDATA-812
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.1.0-incubating








[jira] [Resolved] (CARBONDATA-742) Add batch sort to improve the loading performance

2017-03-19 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-742.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Add batch sort to improve the loading performance
> -
>
> Key: CARBONDATA-742
> URL: https://issues.apache.org/jira/browse/CARBONDATA-742
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 8h 20m
>  Remaining Estimate: 0h
>
> Current problem:
> The sort step is a major bottleneck because it is a blocking step. It must 
> receive all the data and write the sort temp files to disk; only after that 
> can the data writer step start.
> Solution:
> Make the sort step non-blocking so that the data writer step does not have 
> to wait. Process the data in the sort step in batches sized to the in-memory 
> capacity of the machine. For example, if the machine can allocate 4 GB to 
> process data in memory, the sort step can sort the data in batches of 2 GB 
> and hand each batch to the data writer step. While the data writer step 
> consumes one batch, the sort step receives and sorts the next. All steps 
> therefore work continuously, and the sort step does no disk IO.
> The data writer step never waits for the whole sort to finish: as soon as 
> the sort step has sorted a batch in memory, the data writer can start 
> writing it.
> This can significantly improve performance.
> Advantages:
> Loading performance increases because there is no intermediate IO and the 
> sort step does not block.
> Compaction needs no extra effort; the current flow can handle it.
> Disadvantages:
> The number of driver-side btrees will increase, so memory usage might grow, 
> but it can be controlled by the current LRU cache implementation.
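
The batching idea can be sketched as a generator that sorts one in-memory batch at a time and hands it to the writer straight away. This is a single-threaded illustration with hypothetical names (`sorted_batches`, `write_all`) and a row count standing in for the memory budget. The real design would overlap sorting and writing, which this sketch does not attempt.

```python
from itertools import islice

def sorted_batches(rows, batch_size):
    """Yield rows in independently sorted batches of at most batch_size."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        batch.sort()          # in-memory sort only, no temp files on disk
        yield batch

def write_all(rows, batch_size):
    """Stand-in for the data writer: consume each batch as soon as it is ready."""
    written = []
    for batch in sorted_batches(rows, batch_size):
        written.append(batch)
    return written

result = write_all([5, 3, 8, 1, 9, 2, 7], batch_size=3)
print(result)
```

Each batch is sorted on its own rather than globally, which matches the stated disadvantage: more independently sorted units means more index structures to keep track of.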





[jira] [Resolved] (CARBONDATA-775) Update Documentation for Supported Datatypes

2017-03-17 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-775.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Update Documentation for Supported Datatypes
> 
>
> Key: CARBONDATA-775
> URL: https://issues.apache.org/jira/browse/CARBONDATA-775
> Project: CarbonData
>  Issue Type: Improvement
>  Components: docs
>Reporter: Pallavi Singh
>Assignee: Pallavi Singh
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-730) unsupported type: DecimalType

2017-03-17 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-730.
-
Resolution: Fixed

> unsupported type: DecimalType
> -
>
> Key: CARBONDATA-730
> URL: https://issues.apache.org/jira/browse/CARBONDATA-730
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6.2 Hadoop 2.6
>Reporter: Sanoj MG
>Assignee: anubhav tarar
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> The exception below is thrown while trying to save a dataframe with a 
> decimal column type.
> scala> df.printSchema
>  |-- account: integer (nullable = true)
>  |-- currency: integer (nullable = true)
>  |-- branch: integer (nullable = true)
>  |-- country: integer (nullable = true)
>  |-- date: date (nullable = true)
>  |-- fcbalance: decimal(16,3) (nullable = true)
>  |-- lcbalance: decimal(16,3) (nullable = true)
> scala> df.write.format("carbondata").option("tableName", 
> "accBal").option("compress", "true").mode(SaveMode.Overwrite).save()
> java.lang.RuntimeException: unsupported type: DecimalType(16,3)
> at scala.sys.package$.error(package.scala:27)
> at 
> org.apache.carbondata.spark.CarbonDataFrameWriter.org$apache$carbondata$spark$CarbonDataFrameWriter$$convertToCarbonType(CarbonDataFrameWriter.scala:172)
> at 
> org.apache.carbondata.spark.CarbonDataFrameWriter$$anonfun$2.apply(CarbonDataFrameWriter.scala:178)
> at 
> org.apache.carbondata.spark.CarbonDataFrameWriter$$anonfun$2.apply(CarbonDataFrameWriter.scala:177)
> at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> This works fine with the change below:
> git diff
> diff --git a/integration/spark/src/main/scala/org/apache/carbondata/spark/CarbonDataFrameWriter.scala b/integration/spark/src/main/scala/org/apache/carbondata/spark/CarbonDataFrameWriter.scala
> index b843f59..cf9a775 100644
> --- a/integration/spark/src/main/scala/org/apache/carbondata/spark/CarbonDataFrameWriter.scala
> +++ b/integration/spark/src/main/scala/org/apache/carbondata/spark/CarbonDataFrameWriter.scala
> @@ -169,6 +169,7 @@ class CarbonDataFrameWriter(val dataFrame: DataFrame) {
>case BooleanType => CarbonType.DOUBLE.getName
>case TimestampType => CarbonType.TIMESTAMP.getName
>case DateType => CarbonType.DATE.getName
> +  case dt: DecimalType => s"${CarbonType.DECIMAL.getName}(${dt.precision}, ${dt.scale})"
>case other => sys.error(s"unsupported type: $other")
>  }
>}
> Can I create a pull request?





[jira] [Resolved] (CARBONDATA-769) Support Codegen in CarbonDictionaryDecoder

2017-03-16 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-769.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Support Codegen in CarbonDictionaryDecoder
> --
>
> Key: CARBONDATA-769
> URL: https://issues.apache.org/jira/browse/CARBONDATA-769
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Support Codegen in CarbonDictionaryDecoder to leverage the whole-stage 
> codegen performance of Spark 2.1





[jira] [Resolved] (CARBONDATA-762) modify all schemaName->databaseName, cubeName->tableName

2017-03-16 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-762.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> modify all schemaName->databaseName, cubeName->tableName
> 
>
> Key: CARBONDATA-762
> URL: https://issues.apache.org/jira/browse/CARBONDATA-762
> Project: CarbonData
>  Issue Type: Bug
>Reporter: QiangCai
>Assignee: Cao, Lionel
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> modify all schemaName->databaseName, cubeName->tableName





[jira] [Resolved] (CARBONDATA-786) Data mismatch if the data data is loaded across blocklet groups

2017-03-16 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-786.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.1.0-incubating

> Data mismatch if the data data is loaded across blocklet groups
> ---
>
> Key: CARBONDATA-786
> URL: https://issues.apache.org/jira/browse/CARBONDATA-786
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Data mismatch if the data is loaded across blocklet groups and a filter is 
> applied on the second column onwards.
> The test case follows:
> {code} 
> CarbonProperties.getInstance()
>   .addProperty("carbon.blockletgroup.size.in.mb", "16")
>   .addProperty("carbon.enable.vector.reader", "true")
>   .addProperty("enable.unsafe.sort", "true")
> val rdd = sqlContext.sparkContext
>   .parallelize(1 to 120, 4)
>   .map { x =>
> ("city" + x % 8, "country" + x % 1103, "planet" + x % 10007, 
> x.toString,
>   (x % 16).toShort, x / 2, (x << 1).toLong, x.toDouble / 13, 
> x.toDouble / 11)
>   }.map { x =>
>   Row(x._1, x._2, x._3, x._4, x._5, x._6, x._7, x._8, x._9)
> }
> val schema = StructType(
>   Seq(
> StructField("city", StringType, nullable = false),
> StructField("country", StringType, nullable = false),
> StructField("planet", StringType, nullable = false),
> StructField("id", StringType, nullable = false),
> StructField("m1", ShortType, nullable = false),
> StructField("m2", IntegerType, nullable = false),
> StructField("m3", LongType, nullable = false),
> StructField("m4", DoubleType, nullable = false),
> StructField("m5", DoubleType, nullable = false)
>   )
> )
> val input = sqlContext.createDataFrame(rdd, schema)
> sql(s"drop table if exists testBigData")
> input.write
>   .format("carbondata")
>   .option("tableName", "testBigData")
>   .option("tempCSV", "false")
>   .option("single_pass", "true")
>   .option("dictionary_exclude", "id") // id is high cardinality column
>   .mode(SaveMode.Overwrite)
>   .save()
> sql(s"select city, sum(m1) from testBigData " +
>   s"where country='country12' group by city order by city").show()
> {code}
> The above code is supposed to return the following data, but does not.
> {code}
> +-----+-------+
> | city|sum(m1)|
> +-----+-------+
> |city0|    544|
> |city1|    680|
> |city2|    816|
> |city3|    952|
> |city4|   1088|
> |city5|   1224|
> |city6|   1360|
> |city7|   1496|
> +-----+-------+
> {code}
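
The expected table can be cross-checked without Spark by recomputing the aggregate directly. Note that with the row range exactly as it appears in the test case (1 to 120) only one row matches `country12`. The sketch below therefore assumes the input was 1 to 1200000, a value not present in the text but one that reproduces the table. `expected_sums` and `ASSUMED_ROWS` are hypothetical names.

```python
def expected_sums(n_rows):
    """sum(m1) per city for rows where country == 'country12'."""
    sums = {}
    # country is "country" + x % 1103, so matching ids are 12, 1115, 2218, ...
    for x in range(12, n_rows + 1, 1103):
        city = "city%d" % (x % 8)
        sums[city] = sums.get(city, 0) + x % 16   # m1 is (x % 16).toShort
    return dict(sorted(sums.items()))

ASSUMED_ROWS = 1200000  # assumption: not the literal value shown in the test case
print(expected_sums(ASSUMED_ROWS))
print(expected_sums(120))
```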





[jira] [Resolved] (CARBONDATA-753) Fix Date and Timestamp format issues

2017-03-13 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-753.
-
Resolution: Fixed

> Fix Date and Timestamp format issues
> 
>
> Key: CARBONDATA-753
> URL: https://issues.apache.org/jira/browse/CARBONDATA-753
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, examples
>Affects Versions: 1.0.0-incubating
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.1.0-incubating, 1.0.1-incubating
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Fix Date and Timestamp format issues:
> 1. Improve the description of CARBON_TIMESTAMP_FORMAT and CARBON_DATE_FORMAT in 
> CarbonCommonConstants.java.
> 2. Correct the field definitions of Date and Timestamp in the examples.
> 3. Add an example script showing how to display timestamps in the raw data's 
> format. Currently spark.sql.show() uses "yyyy-mm-dd hh:mm:ss.f", the 
> Timestamp.toString() format, by default, but users usually want the displayed 
> data to match the raw data format.
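
The display problem in item 3 can be illustrated with a minimal sketch that uses only JDK classes, not any CarbonData API. java.sql.Timestamp.toString() always prints the JDBC escape format, while java.text.SimpleDateFormat can render the same value in the pattern of the raw data. The object name and the pattern "yyyy/MM/dd HH:mm:ss" are assumptions chosen for illustration.

```scala
import java.sql.Timestamp
import java.text.SimpleDateFormat

object TimestampDisplay {
  // Hypothetical raw-data pattern; substitute the format used by the source CSV.
  val rawPattern = "yyyy/MM/dd HH:mm:ss"

  // Render a timestamp with an explicit pattern instead of Timestamp.toString().
  def format(ts: Timestamp, pattern: String = rawPattern): String =
    new SimpleDateFormat(pattern).format(ts)

  def main(args: Array[String]): Unit = {
    val ts = Timestamp.valueOf("2017-03-13 10:15:30")
    println(ts.toString) // JDBC escape format
    println(format(ts))  // raw-data-style format
  }
}
```

Inside a Spark query the same idea would probably be expressed with a formatting function on the column rather than on the driver side; this sketch only shows the difference between the two string forms.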





[jira] [Resolved] (CARBONDATA-756) RLE encoding issue

2017-03-13 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-756.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> RLE encoding issue
> -
>
> Key: CARBONDATA-756
> URL: https://issues.apache.org/jira/browse/CARBONDATA-756
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.1.0-incubating
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> Problem: The RLE index size can be larger than the actual data size.
> Solution: If the RLE index size is more than the data size, or more than 70% of 
> the data size, disable RLE encoding for that column.
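
The rule above can be sketched as a small predicate. This is an illustration under stated assumptions, not the CarbonData implementation: the object and method names are hypothetical, and the threshold is read as 70 percent of the data size.

```scala
object RleDecision {
  // Assumed threshold: the issue text is read as "70 percent of the data size".
  val MaxIndexPercent = 70L

  // Returns true when RLE should stay enabled for a column.
  // Integer arithmetic avoids floating-point rounding at the boundary.
  def keepRle(rleIndexSizeBytes: Long, dataSizeBytes: Long): Boolean =
    dataSizeBytes > 0 && rleIndexSizeBytes * 100 <= dataSizeBytes * MaxIndexPercent
}
```

Note that an index larger than the data is automatically above the 70 percent threshold, so a single comparison covers both conditions in the description.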





[jira] [Resolved] (CARBONDATA-751) Adding Header and making footer optional

2017-03-13 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-751.
-
   Resolution: Fixed
 Assignee: kumar vishal
Fix Version/s: 1.1.0-incubating

> Adding Header and making footer optional
> 
>
> Key: CARBONDATA-751
> URL: https://issues.apache.org/jira/browse/CARBONDATA-751
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.1.0-incubating
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Currently Carbon does not support an appendable format. The changes below add 
> support for it in the V3 data file format by making the footer optional and 
> adding a header to the V3 carbon data file.





[jira] [Resolved] (CARBONDATA-736) Dictionary Loading issue in Decoder

2017-03-05 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-736.
-
   Resolution: Fixed
 Assignee: kumar vishal
Fix Version/s: 1.1.0-incubating

> Dictionary Loading issue in Decoder
> ---
>
> Key: CARBONDATA-736
> URL: https://issues.apache.org/jira/browse/CARBONDATA-736
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Problem:
> The Carbon dictionary decoder currently loads dictionary files one at a time 
> through the get API. When the number of columns is high, it can use the 
> getAll API to load dictionary data concurrently.
> Solution:
> Use the getAll API.
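
The difference between the two calls can be sketched as follows. The class below is a hypothetical stand-in written for illustration, not the CarbonData cache interface: `get` loads one column's dictionary, while `getAll` starts all loads as Scala Futures and waits for them together. The one-minute timeout is an arbitrary assumption.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical dictionary cache; `loadOne` stands in for reading a dictionary file.
class DictionaryCacheSketch(loadOne: String => Seq[String]) {
  // Loads the dictionary for a single column.
  def get(column: String): Seq[String] = loadOne(column)

  // Loads all requested dictionaries concurrently, preserving the input order.
  def getAll(columns: Seq[String]): Seq[Seq[String]] = {
    val pending = columns.map(c => Future(loadOne(c)))
    Await.result(Future.sequence(pending), 1.minute)
  }
}
```

With many columns, the sequential cost of repeated `get` calls is replaced by roughly the cost of the slowest single load, limited by the size of the thread pool.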





[jira] [Created] (CARBONDATA-747) Add simple performance test for spark2.1 carbon integration

2017-03-05 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-747:
---

 Summary: Add simple performance test for spark2.1 carbon 
integration
 Key: CARBONDATA-747
 URL: https://issues.apache.org/jira/browse/CARBONDATA-747
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.1.0-incubating








[jira] [Created] (CARBONDATA-746) Support spark-sql CLI for spark2.1 carbon integration

2017-03-05 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-746:
---

 Summary: Support spark-sql CLI for spark2.1 carbon integration
 Key: CARBONDATA-746
 URL: https://issues.apache.org/jira/browse/CARBONDATA-746
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
 Fix For: 1.1.0-incubating








[jira] [Resolved] (CARBONDATA-715) Optimize Single pass data load

2017-02-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-715.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Optimize Single pass data load
> --
>
> Key: CARBONDATA-715
> URL: https://issues.apache.org/jira/browse/CARBONDATA-715
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> 1. Upgrade to latest netty-4.1.8 
> 2. Optimize the serialization of key for passing in network.
> 3. Launch individual dictionary client for each loading thread.





[jira] [Resolved] (CARBONDATA-726) Update with V3 format for better IO and processing optimization.

2017-02-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-726.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Update with V3 format for better IO and processing optimization.
> 
>
> Key: CARBONDATA-726
> URL: https://issues.apache.org/jira/browse/CARBONDATA-726
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Problems in the current format:
> 1. IO reads are slow because reading column blocklets requires multiple seeks 
> on the file. The current blocklet size is 120000, so scanning a column needs 
> many reads from the file. Alternatively, we could increase the blocklet size, 
> but filter queries would suffer because each blocklet to filter would be 
> large.
> 2. Decompression is slow in the current format. We use an inverted index for 
> faster filter queries and NumberCompressor to compress the inverted index 
> with bit-wise packing, which is slow, so we should avoid NumberCompressor. 
> One alternative is to keep the blocklet size within 32000 so that the 
> inverted index can be written as shorts, but IO reads would suffer a lot.
> To overcome these two issues we are introducing a new format, V3.
> Here each blocklet has multiple pages of size 32000, and the number of pages 
> per blocklet is configurable. Since each page stays within the short limit, 
> there is no need to compress the inverted index.
> The max/min values are maintained for each page to further prune filter 
> queries.
> The blocklet is read with all its pages at once and kept in off-heap memory.
> During filtering, first check the max/min range, and only if it is valid 
> decompress the page to filter further.
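
The page-level pruning described above can be sketched as below. This is a simplified illustration, not the V3 reader: the names are hypothetical, values are plain Long columns held in memory, and "decompressing a page" is reduced to selecting its index.

```scala
object PagePruning {
  // Page size from the issue text; it fits within the range of a signed short.
  val PageSize = 32000

  // Per-page statistics kept next to the (still compressed) page data.
  final case class PageStats(min: Long, max: Long)

  // Splits column values into pages and records min/max for each page.
  def buildStats(values: Seq[Long], pageSize: Int = PageSize): Vector[PageStats] =
    values.grouped(pageSize).map(p => PageStats(p.min, p.max)).toVector

  // Indexes of pages whose [min, max] range may contain `target`;
  // only these pages would need to be decompressed for an equality filter.
  def pagesToScan(stats: Seq[PageStats], target: Long): Seq[Int] =
    stats.zipWithIndex.collect {
      case (s, i) if s.min <= target && target <= s.max => i
    }
}
```

For 100000 sorted values this yields four pages, and an equality filter touches at most the one page whose range covers the target value.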





[jira] [Resolved] (CARBONDATA-692) Support scalar subquery in carbon

2017-02-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-692.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Support scalar subquery in carbon
> -
>
> Key: CARBONDATA-692
> URL: https://issues.apache.org/jira/browse/CARBONDATA-692
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Carbon cannot run scalar subqueries like the one below:
> {code}
> select sum(salary) from scalarsubquery t1
> where ID < (select sum(ID) from scalarsubquery t2 where t1.name = t2.name)
> {code}





[jira] [Resolved] (CARBONDATA-705) Make the partition distribution as configurable and keep spark distribution as default

2017-02-21 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-705.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Make the partition distribution as configurable and keep spark distribution 
> as default
> --
>
> Key: CARBONDATA-705
> URL: https://issues.apache.org/jira/browse/CARBONDATA-705
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Make the partition distribution configurable and keep Spark distribution as 
> the default.





[jira] [Resolved] (CARBONDATA-325) Create table with columns contains spaces in name.

2017-02-20 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-325.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Create table with columns contains spaces in name.
> --
>
> Key: CARBONDATA-325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-325
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Harmeet Singh
>Assignee: Harmeet Singh
> Fix For: 1.1.0-incubating
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> I want to create a table with column names that contain spaces. I am using the 
> Thrift Server and the Beeline client to access carbon data. Whenever I try to 
> create a table whose column names contain spaces, I get an error. 
> Below are the steps:
> Step 1:
> create table three (`first name` string, `age` int) stored by 'carbondata';
> Whenever I execute the above query, I get the error below:
> Error: org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> Unsupported data type : FieldSchema(name:first name, type:string, 
> comment:null).getType (state=,code=0)
> The error is misleading: it suggests that a wrong data type is being used. 
> If I remove `stored by 'carbondata'` from the query, it works fine because it 
> runs on Hive.





[jira] [Resolved] (CARBONDATA-685) Able to create table with spaces using carbon source

2017-02-20 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-685.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Able to create table with spaces using carbon source
> 
>
> Key: CARBONDATA-685
> URL: https://issues.apache.org/jira/browse/CARBONDATA-685
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: spark 2.1 single node cluster
>Reporter: anubhav tarar
>Assignee: Rahul Kumar
>Priority: Trivial
> Fix For: 1.1.0-incubating
>
>
> When using the carbon source, I am able to create a table whose name contains 
> spaces.
> Logs:
> 0: jdbc:hive2://localhost:1> CREATE TABLE table (ID Int, date Timestamp, 
> country String, name String, phonetype String, serialname String,salary 
> Int) USING org.apache.spark.sql.CarbonSource OPTIONS("tableName"="t a b l e 
> ");
> +---------+--+
> | Result  |
> +---------+--+
> +---------+--+
> No rows select
> Here a table with spaces in its name is created in HDFS.
> This should not be allowed.





[jira] [Resolved] (CARBONDATA-690) Carbon data load fails with default option for USE_KETTLE(False)

2017-02-20 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-690.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Carbon data load fails with default option for USE_KETTLE(False)
> 
>
> Key: CARBONDATA-690
> URL: https://issues.apache.org/jira/browse/CARBONDATA-690
> Project: CarbonData
>  Issue Type: Bug
> Environment: Spark 2.1
>Reporter: Ramakrishna
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When a load query is run with the default option for USE_KETTLE, it fails at 
> mdkey generation.
> Sample query and issue:
> LOAD DATA  inpath 
> 'hdfs://hacluster/user/OSCON/sparkhive/warehouse/communication.db/flow_text_1/20140113_0_120.csv'
>  into table flow_carbon options('USE_KETTLE'='FALSE', 'DELIMITER'=',', 
> 'QUOTECHAR'='"','FILEHEADER'='aco_ac,ac_dte,txn_cnt,jrn_par,mfm_jrn_no,cbn_jrn_no,ibs_jrn_no,vch_no,vch_seq,srv_cde,cus_no,bus_cd_no,id_flg,cus_ac,bv_cde,bv_no,txn_dte,txn_time,txn_tlr,txn_bk,txn_br,ety_tlr,ety_bk,ety_br,bus_pss_no,chk_flg,chk_tlr,chk_jrn_no,bus_sys_no,bus_opr_cde,txn_sub_cde,fin_bus_cde,fin_bus_sub_cde,opt_prd_cde,chl,tml_id,sus_no,sus_seq,cho_seq,itm_itm,itm_sub,itm_sss,dc_flg,amt,bal,ccy,spv_flg,vch_vld_dte,pst_bk,pst_br,ec_flg,aco_tlr,opp_ac,opp_ac_nme,opp_bk,gen_flg,his_rec_sum_flg,his_flg,vch_typ,val_dte,opp_ac_flg,cmb_flg,ass_vch_flg,cus_pps_flg,bus_rmk_cde,vch_bus_rmk,tec_rmk_cde,vch_tec_rmk,rsv_ara,own_br,own_bk,gems_last_upd_d,gems_last_upd_d_bat,maps_date,maps_job,dt');
> Error: java.lang.Exception: DataLoad failure: There is an unexpected error: 
> unable to generate the mdkey (state=,code=0)





[jira] [Resolved] (CARBONDATA-222) Query issue for all dimensions are no dictionary columns

2017-02-20 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-222.
-
Resolution: Fixed
  Assignee: Gin-zhj

> Query issue for all dimensions are no dictionary columns
> 
>
> Key: CARBONDATA-222
> URL: https://issues.apache.org/jira/browse/CARBONDATA-222
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Gin-zhj
>Assignee: Gin-zhj
>Priority: Minor
> Fix For: 0.1.1-incubating, 0.2.0-incubating
>
>
> step 1:
> CREATE TABLE uniqdata_no (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='CUST_NAME,ACTIVE_EMUI_VERSION');
> step 2:
> LOAD DATA INPATH 'D:/download/3lakh_3.csv' into table uniqdata_no 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> step 3:
> select * from uniqdata_no limit 5;
> the fact file is:
> ,,,0
> query failed, catch exception:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.carbondata.core.util.ByteUtil$UnsafeComparer.compareTo(ByteUtil.java:197)
>   at 
> org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder.compareIndexes(BTreeDataRefNodeFinder.java:243)
>   at 
> org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder.findFirstLeafNode(BTreeDataRefNodeFinder.java:121)
>   at 
> org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder.findFirstDataBlock(BTreeDataRefNodeFinder.java:80)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getDataBlocksOfIndex(CarbonInputFormat.java:546)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:473)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:342)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getSplitsNonFilter(CarbonInputFormat.java:304)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:277)





[jira] [Reopened] (CARBONDATA-222) Query issue for all dimensions are no dictionary columns

2017-02-20 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li reopened CARBONDATA-222:
-
  Assignee: (was: Gin-zhj)

> Query issue for all dimensions are no dictionary columns
> 
>
> Key: CARBONDATA-222
> URL: https://issues.apache.org/jira/browse/CARBONDATA-222
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Gin-zhj
>Priority: Minor
> Fix For: 0.2.0-incubating, 0.1.1-incubating
>
>
> step 1:
> CREATE TABLE uniqdata_no (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='CUST_NAME,ACTIVE_EMUI_VERSION');
> step 2:
> LOAD DATA INPATH 'D:/download/3lakh_3.csv' into table uniqdata_no 
> OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> step 3:
> select * from uniqdata_no limit 5;
> the fact file is:
> ,,,0
> query failed, catch exception:
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 4
>   at 
> org.apache.carbondata.core.util.ByteUtil$UnsafeComparer.compareTo(ByteUtil.java:197)
>   at 
> org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder.compareIndexes(BTreeDataRefNodeFinder.java:243)
>   at 
> org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder.findFirstLeafNode(BTreeDataRefNodeFinder.java:121)
>   at 
> org.apache.carbondata.core.carbon.datastore.impl.btree.BTreeDataRefNodeFinder.findFirstDataBlock(BTreeDataRefNodeFinder.java:80)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getDataBlocksOfIndex(CarbonInputFormat.java:546)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:473)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:342)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getSplitsNonFilter(CarbonInputFormat.java:304)
>   at 
> org.apache.carbondata.hadoop.CarbonInputFormat.getSplits(CarbonInputFormat.java:277)





[jira] [Resolved] (CARBONDATA-681) CSVReader related code improvement

2017-02-14 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-681.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> CSVReader related code improvement
> --
>
> Key: CARBONDATA-681
> URL: https://issues.apache.org/jira/browse/CARBONDATA-681
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: hadoop-integration
>Reporter: Jihong MA
>Assignee: Jihong MA
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Refactor CSV reader support during data loading, and move the relevant classes 
> out of the Carbon Hadoop component into the data loading component 
> (processing).





[jira] (CARBONDATA-683) Reduce test time

2017-01-29 Thread Jacky Li (JIRA)
Jacky Li created an issue

 CarbonData / CARBONDATA-683
 Reduce test time

 Issue Type: Improvement
 Affects Versions: 1.0.0-incubating
 Assignee: Unassigned
 Created: 29/Jan/17 10:09
 Priority: Major
 Reporter: Jacky Li

Reduce test time by:
1. remove all unnecessary print
2. make sample csv file size smaller
3. change logger.audit to initialize strings in constructor

[jira] [Resolved] (CARBONDATA-680) Add stats like rows processed in each step. And also fix unsafe sort enable issue.

2017-01-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-680.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.1.0-incubating

> Add stats like rows processed in each step. And also fix unsafe sort enable 
> issue.
> --
>
> Key: CARBONDATA-680
> URL: https://issues.apache.org/jira/browse/CARBONDATA-680
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently, stats such as the number of rows processed in each step are not 
> recorded in the no-kettle flow. Please add them.
> Also, unsafe sort is not enabled even when the user enables it in the 
> property file.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CARBONDATA-682) Fix license header for FloatDataTypeTestCase.scala and DateTypeTest.scala

2017-01-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-682:

Fix Version/s: (was: 1.0.0-incubating)
   1.1.0-incubating

> Fix license header for FloatDataTypeTestCase.scala and DateTypeTest.scala
> -
>
> Key: CARBONDATA-682
> URL: https://issues.apache.org/jira/browse/CARBONDATA-682
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Fix license header for FloatDataTypeTestCase.scala and DateTypeTest.scala





[jira] [Resolved] (CARBONDATA-682) Fix license header for FloatDataTypeTestCase.scala and DateTypeTest.scala

2017-01-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-682.
-
   Resolution: Fixed
Fix Version/s: (was: 1.1.0-incubating)
   1.0.0-incubating

> Fix license header for FloatDataTypeTestCase.scala and DateTypeTest.scala
> -
>
> Key: CARBONDATA-682
> URL: https://issues.apache.org/jira/browse/CARBONDATA-682
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Fix license header for FloatDataTypeTestCase.scala and DateTypeTest.scala





[jira] [Resolved] (CARBONDATA-659) Should add WhitespaceAround and ParenPad to javastyle

2017-01-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-659.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Should add WhitespaceAround and ParenPad to javastyle
> -
>
> Key: CARBONDATA-659
> URL: https://issues.apache.org/jira/browse/CARBONDATA-659
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Trivial
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-676) Code clean

2017-01-22 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-676.
-
   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Code clean
> --
>
> Key: CARBONDATA-676
> URL: https://issues.apache.org/jira/browse/CARBONDATA-676
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: zhangshunyu
>Assignee: zhangshunyu
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Clean up some code:
> Correct spelling mistakes.
> Remove unused functions.
> Iterate over the Array instead of transforming it to a List.





[jira] [Resolved] (CARBONDATA-655) Make nokettle dataload flow as default in carbon

2017-01-19 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-655.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Make nokettle dataload flow as default in carbon
> 
>
> Key: CARBONDATA-655
> URL: https://issues.apache.org/jira/browse/CARBONDATA-655
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Make nokettle dataload flow as default in carbon





[jira] [Closed] (CARBONDATA-531) Eliminate spark dependency in carbon core

2017-01-16 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li closed CARBONDATA-531.
---
Resolution: Invalid

Because the code base has changed a lot, this improvement will be considered later.

> Eliminate spark dependency in carbon core
> -
>
> Key: CARBONDATA-531
> URL: https://issues.apache.org/jira/browse/CARBONDATA-531
> Project: CarbonData
>  Issue Type: Improvement
>Affects Versions: 0.2.0-incubating
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Clean up the interface and remove the Spark dependency from the carbon-core module.





[jira] [Resolved] (CARBONDATA-617) Insert query not working with UNION

2017-01-15 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-617.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Insert query not working with UNION
> ---
>
> Key: CARBONDATA-617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-617
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.0.0-incubating
> Environment: Spark 1.6
> Hadoop 2.6
>Reporter: Deepti Bhardwaj
>Assignee: QiangCai
>Priority: Minor
> Fix For: 1.0.0-incubating
>
> Attachments: 2000_UniqData.csv, 
> thrift-error-log-during-insert-with-union
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> I created 3 tables, all with the same schema.
> Create table commands:
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> CREATE TABLE student (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> CREATE TABLE department (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format';
> I loaded the uniqdata and department tables with the attached 
> CSV (2000_UniqData.csv).
> The insert query used to load data into the student table was:
> insert into student select 
> CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1
>  from uniqdata UNION select 
> CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1
>  from department;
> When I try to insert data into student with a UNION operation, it fails with 
> java.lang.Exception: DataLoad failure (attached below).
> The UNION query works well on its own, but it fails when combined with an 
> insert.
> Also, if I use Hive tables instead of Carbon tables, the insert does not work.





[jira] [Created] (CARBONDATA-638) Move package in carbon-core module

2017-01-14 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-638:
---

 Summary: Move package in carbon-core module
 Key: CARBONDATA-638
 URL: https://issues.apache.org/jira/browse/CARBONDATA-638
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
 Fix For: 1.0.0-incubating


move org.apache.carbondata.core.carbon to org.apache.carbondata.core
move org.apache.carbondata.common.ext to org.apache.carbondata.core.service
move org.apache.carbondata.common.iudprocessor.iuddata to 
org.apache.carbondata.core.update
move org.apache.carbondata.core.partition to org.apache.carbondata.processing
move org.apache.carbondata.fileoperation to 
org.apache.carbondata.core.fileoperation
move org.apache.carbondata.locks to org.apache.carbondata.core.locks
move CarbonDataLoadSchema to carbon-processing
move all Identifier classes to org.apache.carbondata.core.metadata
move org.apache.carbondata.core.datastorage to 
org.apache.carbondata.core.datastore





[jira] [Created] (CARBONDATA-637) Remove table_status file

2017-01-14 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-637:
---

 Summary: Remove table_status file
 Key: CARBONDATA-637
 URL: https://issues.apache.org/jira/browse/CARBONDATA-637
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
 Fix For: 1.0.0-incubating








[jira] [Resolved] (CARBONDATA-622) Should use the same fileheader reader for dict generation and data loading

2017-01-11 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-622.
-
Resolution: Fixed

> Should use the same fileheader reader for dict generation and data loading
> --
>
> Key: CARBONDATA-622
> URL: https://issues.apache.org/jira/browse/CARBONDATA-622
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> The file header can come from the DDL command or from the CSV file. 
> 1. If the file header comes from the DDL command, split it by 
> comma ",".
> 2. If the file header comes from the CSV file, split it by the 
> delimiter specified in the DDL command.
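
The two cases can be sketched with one shared splitting routine, which is the point of the issue title. This is a minimal illustration with hypothetical names, not the CarbonData reader; it assumes the delimiter is a literal string and does not handle quoted fields.

```scala
import java.util.regex.Pattern

object FileHeaderSplit {
  // Shared splitter; `delimiter` is treated literally, not as a regex.
  private def split(header: String, delimiter: String): List[String] =
    header.split(Pattern.quote(delimiter), -1).toList

  // Header supplied in the command (e.g. a FILEHEADER option): always comma-separated.
  def fromDdl(fileHeader: String): List[String] = split(fileHeader, ",")

  // Header read from the first line of the CSV file: uses the command's delimiter.
  def fromCsv(firstLine: String, delimiter: String): List[String] =
    split(firstLine, delimiter)
}
```

Quoting the delimiter matters for characters such as "|" that have a special meaning in regular expressions, since String.split interprets its argument as a regex.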





[jira] [Resolved] (CARBONDATA-607) Cleanup ValueCompressionHolder class and all sub-classes

2017-01-10 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-607.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Cleanup ValueCompressionHolder class and all sub-classes
> 
>
> Key: CARBONDATA-607
> URL: https://issues.apache.org/jira/browse/CARBONDATA-607
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: core
>Reporter: Jihong MA
>Assignee: Jihong MA
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Rewrite ValueCompressionHolder class as a base class for compressing or 
> uncompressing numeric data for measure column chunks. 
> Refactor all sub-classes under
> org.apache.carbondata.core.datastorage.store.compression.decimal.*
> org.apache.carbondata.core.datastorage.store.compression.nonDecimal.*
> org.apache.carbondata.core.datastorage.store.compression.none.*
> org.apache.carbondata.core.datastorage.store.compression.type.*
> As part of the work, also fix a performance bug to avoid creating unnecessary 
> compression/decompression value holders during compression or decompression. 





[jira] [Resolved] (CARBONDATA-616) Remove the duplicated class CarbonDataWriterException.java

2017-01-10 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-616.
-
Resolution: Fixed

> Remove the duplicated class CarbonDataWriterException.java
> --
>
> Key: CARBONDATA-616
> URL: https://issues.apache.org/jira/browse/CARBONDATA-616
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.0.0-incubating
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Remove the duplicated class CarbonDataWriterException.java [1]
> [1]org.apache.carbondata.core.writer.exception.CarbonDataWriterException.java 
> [2]org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException.java
>  





[jira] [Resolved] (CARBONDATA-595) Drop Table for carbon throws NPE with HDFS lock type.

2017-01-09 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-595.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.0.0-incubating

> Drop Table for carbon throws NPE with HDFS lock type.
> -
>
> Key: CARBONDATA-595
> URL: https://issues.apache.org/jira/browse/CARBONDATA-595
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 0.2.0-incubating
>Reporter: Babulal
>Assignee: Ravindra Pesala
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Start version :- 1.6.2 
> Start carbon thrift server
> set HDFS LOCK Type
> drop table from beeline
> 0: jdbc:hive2://hacluster> drop table oscon_new_1;
> Error: java.lang.NullPointerException (state=,code=0)
> Error in thriftserver 
> 17/01/04 20:40:08 AUDIT DropTableCommand: 
> [hadoop-master][anonymous][Thread-182]Deleted table [oscon_new_1] under 
> database [default]
> 17/01/04 20:40:08 ERROR AbstractDFSCarbonFile: pool-25-thread-12 Exception 
> occured:File does not exist: 
> hdfs://hacluster/opt/CarbonStore/default/oscon_new_1/droptable.lock
> 17/01/04 20:40:08 ERROR SparkExecuteStatementOperation: Error executing 
> query, currentState RUNNING,
> java.lang.NullPointerException
> at 
> org.apache.carbondata.core.datastorage.store.filesystem.AbstractDFSCarbonFile.delete(AbstractDFSCarbonFile.java:128)
> at 
> org.apache.carbondata.lcm.locks.HdfsFileLock.unlock(HdfsFileLock.java:110)
> at 
> org.apache.spark.sql.execution.command.DropTableCommand.run(carbonTableSchema.scala:613)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> Note :- the lock file and data are deleted successfully, but beeline reports an 
> ERROR message instead of success. 
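
The report does not say how the NPE was fixed. As a sketch only, a lock release can treat "file already gone" as success rather than failing. The example below uses the local filesystem through java.nio.file instead of HDFS, and the class name LockFileRelease is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper on the local filesystem; the real code path goes through HDFS.
final class LockFileRelease {
    /**
     * Releases a lock by deleting its file. Returns true when the file is gone
     * afterwards, including the case where it had already been removed.
     */
    static boolean unlock(Path lockFile) {
        if (lockFile == null) {
            return true; // nothing was locked, so nothing to release
        }
        try {
            Files.deleteIfExists(lockFile); // does not throw when the file is missing
            return true;
        } catch (IOException e) {
            return false; // a real I/O failure is reported to the caller
        }
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempFile("droptable", ".lock");
        System.out.println(unlock(lock)); // file existed
        System.out.println(unlock(lock)); // file was already deleted
    }
}
```

With this shape, a DROP TABLE that has already removed the table directory (and the lock file inside it) would still report success.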





[jira] [Resolved] (CARBONDATA-608) Compilation Error with spark 1.6 profile

2017-01-08 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-608.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.0.0-incubating

> Compilation Error with spark 1.6 profile
> 
>
> Key: CARBONDATA-608
> URL: https://issues.apache.org/jira/browse/CARBONDATA-608
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: Prabhat Kashyap
>Assignee: Ravindra Pesala
>Priority: Critical
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-606) Add a Flink example to read CarbonData files

2017-01-07 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-606:
---

 Summary: Add a Flink example to read CarbonData files 
 Key: CARBONDATA-606
 URL: https://issues.apache.org/jira/browse/CARBONDATA-606
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
 Fix For: 1.0.0-incubating


Add a Flink example to read CarbonData files written by Spark





[jira] [Resolved] (CARBONDATA-572) clean up code for carbon-spark-common module

2017-01-04 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-572.
-
Resolution: Fixed
  Assignee: Jacky Li

> clean up code for carbon-spark-common module
> 
>
> Key: CARBONDATA-572
> URL: https://issues.apache.org/jira/browse/CARBONDATA-572
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.0.0-incubating
>
>






[jira] [Resolved] (CARBONDATA-218) Remove Dependency: spark-csv and Unify CSV Reader for dataloading

2017-01-04 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-218.
-
Resolution: Fixed

> Remove Dependency: spark-csv and Unify CSV Reader for dataloading
> -
>
> Key: CARBONDATA-218
> URL: https://issues.apache.org/jira/browse/CARBONDATA-218
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-401) Look forward to support reading csv file only once in data loading

2016-12-29 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-401.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Look forward to support reading csv file only once in data loading 
> ---
>
> Key: CARBONDATA-401
> URL: https://issues.apache.org/jira/browse/CARBONDATA-401
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Lionx
>Assignee: Lionx
> Fix For: 1.0.0-incubating
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Currently, in the Carbon data loading module, global dictionary generation is 
> an independent step. Carbon reads the CSV file twice: once to generate the 
> global dictionary and once to load the carbon data. We look forward to reading 
> the CSV file only once.





[jira] [Resolved] (CARBONDATA-558) Load performance bad when use_kettle=false

2016-12-29 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-558.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Load performance bad when use_kettle=false
> --
>
> Key: CARBONDATA-558
> URL: https://issues.apache.org/jira/browse/CARBONDATA-558
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Gin-zhj
>Assignee: Gin-zhj
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When you import a data file whose measure column contains many empty strings 
> and use_kettle=false, the load performance declines sharply.
> I checked the executor logs; many warnings like the following are printed:
>  16/12/22 07:03:12 WARN MeasureFieldConverterImpl: pool-22-thread-6 Cant not 
> convert :  to Numeric type value. Value considered as null.
> 16/12/22 07:03:12 WARN MeasureFieldConverterImpl: pool-22-thread-1 Cant not 
> convert :  to Numeric type value. Value considered as null.
> 16/12/22 07:03:12 WARN MeasureFieldConverterImpl: pool-22-thread-6 Cant not 
> convert :  to Numeric type value. Value considered as null.
> 16/12/22 07:03:12 WARN MeasureFieldConverterImpl: pool-22-thread-1 Cant not 
> convert :  to Numeric type value. Value considered as null.
> 16/12/22 07:03:12 WARN MeasureFieldConverterImpl: pool-22-thread-2 Cant not 
> convert :  to Numeric type value. Value considered as null.
> 16/12/22 07:03:12 WARN MeasureFieldConverterImpl: pool-22-thread-3 Cant not 
> convert :  to Numeric type value. Value considered as null.
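
The report does not name the root cause. One plausible reading of the log is that every empty field goes through a failed numeric parse plus a warning. The sketch below, with the hypothetical class MeasureParser, shows the general idea of checking for empty input before parsing so the common case costs neither an exception nor a log line.

```java
import java.math.BigDecimal;

// Hypothetical helper; it is not MeasureFieldConverterImpl and only illustrates the idea.
final class MeasureParser {
    /** Returns the parsed value, or null when the field is empty or not numeric. */
    static BigDecimal parseOrNull(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return null; // fast path: no exception and no warning for empty fields
        }
        try {
            return new BigDecimal(raw.trim());
        } catch (NumberFormatException e) {
            return null; // genuinely malformed input; a caller could log this rarer case
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull(""));
        System.out.println(parseOrNull("12.5"));
        System.out.println(parseOrNull("abc"));
    }
}
```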





[jira] [Resolved] (CARBONDATA-564) long time ago, carbon may use dimension table csv file to make dictionary, but now unused, so remove

2016-12-28 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-564.
-
   Resolution: Fixed
 Assignee: Jay
Fix Version/s: 1.0.0-incubating

> long time ago, carbon may use dimension table csv file to make dictionary, 
> but now unused, so remove 
> 
>
> Key: CARBONDATA-564
> URL: https://issues.apache.org/jira/browse/CARBONDATA-564
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jay
>Assignee: Jay
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A long time ago, carbon could use a dimension table csv file to make the 
> dictionary, but now, with coldict, allDictionary and so on, there is no need 
> for a dimension table file to make the dictionary. To keep the carbondata code 
> easy to read, these unused parts should be removed.





[jira] [Resolved] (CARBONDATA-467) CREATE TABLE extension to support bucket table.

2016-12-28 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-467.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.0.0-incubating

> CREATE TABLE extension to support bucket table.
> ---
>
> Key: CARBONDATA-467
> URL: https://issues.apache.org/jira/browse/CARBONDATA-467
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.0.0-incubating
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> 1. CREATE TABLE Statement extension.
> {code}
> CREATE TABLE test(user_id BIGINT, firstname STRING, lastname STRING)
> CLUSTERED BY(user_id) INTO 32 BUCKETS STORED BY 'carbondata';
> {code}
> 2. Carbon file format update (Thrift definition extension)
> 3. Respect the bucket definition during data load. Store the bucket id in the 
> carbondata index file
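
Step 3 implies that each row is mapped to one of the 32 buckets by its user_id. The issue does not state which hash function CarbonData uses, so the sketch below is only an assumption: it uses Java's own Long.hashCode, and the class name BucketAssigner is hypothetical.

```java
// Hypothetical helper; the hash function actually used by CarbonData is not stated above.
final class BucketAssigner {
    /** Maps a bucket column value to a bucket id in the range [0, numBuckets). */
    static int bucketId(long userId, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        // floorMod keeps the result non-negative even when the hash code is negative
        return Math.floorMod(Long.hashCode(userId), numBuckets);
    }

    public static void main(String[] args) {
        System.out.println(bucketId(5L, 32));
        System.out.println(bucketId(-7L, 32));
    }
}
```

Whatever function is used, it has to be deterministic so that rows with equal user_id values always land in the same bucket across loads.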





[jira] [Resolved] (CARBONDATA-576) Add mvn build guide

2016-12-28 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-576.
-
Resolution: Fixed

> Add mvn build guide
> ---
>
> Key: CARBONDATA-576
> URL: https://issues.apache.org/jira/browse/CARBONDATA-576
> Project: CarbonData
>  Issue Type: Improvement
>Affects Versions: NONE
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Add mvn build guide to github





[jira] [Resolved] (CARBONDATA-574) Add thrift server support to Spark 2.0 carbon integration

2016-12-28 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-574.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.0.0-incubating

> Add thrift server support to Spark 2.0 carbon integration
> -
>
> Key: CARBONDATA-574
> URL: https://issues.apache.org/jira/browse/CARBONDATA-574
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.0.0-incubating
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Add thrift server support to Spark 2.0 carbon integration





[jira] [Resolved] (CARBONDATA-540) Support insertInto without kettle for spark2

2016-12-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-540.
-
Resolution: Fixed

> Support insertInto without kettle for spark2
> 
>
> Key: CARBONDATA-540
> URL: https://issues.apache.org/jira/browse/CARBONDATA-540
> Project: CarbonData
>  Issue Type: Improvement
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Support insertInto without kettle for spark2





[jira] [Created] (CARBONDATA-572) clean up code for carbon-spark-common module

2016-12-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-572:
---

 Summary: clean up code for carbon-spark-common module
 Key: CARBONDATA-572
 URL: https://issues.apache.org/jira/browse/CARBONDATA-572
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Jacky Li








[jira] [Updated] (CARBONDATA-569) clean up code for carbon-processing module

2016-12-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-569:

Summary: clean up code for carbon-processing module   (was: clean up code 
for carbon-core module )

> clean up code for carbon-processing module 
> ---
>
> Key: CARBONDATA-569
> URL: https://issues.apache.org/jira/browse/CARBONDATA-569
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
> Fix For: 1.0.0-incubating
>
>






[jira] [Created] (CARBONDATA-571) clean up code for carbon-spark module

2016-12-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-571:
---

 Summary: clean up code for carbon-spark module
 Key: CARBONDATA-571
 URL: https://issues.apache.org/jira/browse/CARBONDATA-571
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Jacky Li








[jira] [Created] (CARBONDATA-570) clean up code for carbon-hadoop module

2016-12-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-570:
---

 Summary: clean up code for carbon-hadoop module
 Key: CARBONDATA-570
 URL: https://issues.apache.org/jira/browse/CARBONDATA-570
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Jacky Li








[jira] [Reopened] (CARBONDATA-569) clean up code for carbon-core module

2016-12-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li reopened CARBONDATA-569:
-

> clean up code for carbon-core module 
> -
>
> Key: CARBONDATA-569
> URL: https://issues.apache.org/jira/browse/CARBONDATA-569
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
> Fix For: 1.0.0-incubating
>
>






[jira] [Closed] (CARBONDATA-569) clean up code for carbon-core module

2016-12-27 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li closed CARBONDATA-569.
---
Resolution: Duplicate

> clean up code for carbon-core module 
> -
>
> Key: CARBONDATA-569
> URL: https://issues.apache.org/jira/browse/CARBONDATA-569
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
> Fix For: 1.0.0-incubating
>
>






[jira] [Created] (CARBONDATA-569) clean up code for carbon-core module

2016-12-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-569:
---

 Summary: clean up code for carbon-core module 
 Key: CARBONDATA-569
 URL: https://issues.apache.org/jira/browse/CARBONDATA-569
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Jacky Li








[jira] [Created] (CARBONDATA-568) clean up code for carbon-core module

2016-12-27 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-568:
---

 Summary: clean up code for carbon-core module 
 Key: CARBONDATA-568
 URL: https://issues.apache.org/jira/browse/CARBONDATA-568
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Jacky Li








[jira] [Updated] (CARBONDATA-566) clean up code for carbon-spark2 module

2016-12-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-566:

Assignee: Jacky Li

> clean up code for carbon-spark2 module
> --
>
> Key: CARBONDATA-566
> URL: https://issues.apache.org/jira/browse/CARBONDATA-566
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.0.0-incubating
>
>






[jira] [Created] (CARBONDATA-566) clean up code for carbon-spark2 module

2016-12-26 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-566:
---

 Summary: clean up code for carbon-spark2 module
 Key: CARBONDATA-566
 URL: https://issues.apache.org/jira/browse/CARBONDATA-566
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Jacky Li








[jira] [Updated] (CARBONDATA-565) Clean up code suggested by IDE analyzer

2016-12-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-565:

Summary: Clean up code suggested by IDE analyzer  (was: Clean up code )

> Clean up code suggested by IDE analyzer
> ---
>
> Key: CARBONDATA-565
> URL: https://issues.apache.org/jira/browse/CARBONDATA-565
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
> Fix For: 1.0.0-incubating
>
>






[jira] [Created] (CARBONDATA-565) Clean up code

2016-12-26 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-565:
---

 Summary: Clean up code 
 Key: CARBONDATA-565
 URL: https://issues.apache.org/jira/browse/CARBONDATA-565
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.0.0-incubating








[jira] [Resolved] (CARBONDATA-547) Add CarbonSession and enabled parser to use all carbon commands

2016-12-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-547.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Add CarbonSession and enabled parser to use all carbon commands
> ---
>
> Key: CARBONDATA-547
> URL: https://issues.apache.org/jira/browse/CARBONDATA-547
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Currently, DDL commands such as CREATE, LOAD, ALTER, DROP, DESCRIBE, SHOW LOADS, 
> DELETE SEGMENTS etc. do not work in the Spark 2.0 integration.
> So please add CarbonSession and override the SQL parser to make all these 
> commands work.





[jira] [Resolved] (CARBONDATA-560) In QueryExecutionException, can not use executorService.shutdownNow() to shut down immediately.

2016-12-26 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-560.
-
Resolution: Fixed

> In QueryExecutionException, can not use executorService.shutdownNow() to shut 
> down immediately.
> ---
>
> Key: CARBONDATA-560
> URL: https://issues.apache.org/jira/browse/CARBONDATA-560
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In QueryExecutionException, can not use executorService.shutdownNow() to shut 
> down immediately.





[jira] [Resolved] (CARBONDATA-563) Select Queries are not working with spark 1.6.2.

2016-12-25 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-563.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.0.0-incubating

> Select Queries are  not working with spark 1.6.2.
> -
>
> Key: CARBONDATA-563
> URL: https://issues.apache.org/jira/browse/CARBONDATA-563
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-query
>Affects Versions: 0.2.0-incubating
>Reporter: Babulal
>Assignee: Ravindra Pesala
> Fix For: 1.0.0-incubating
>
> Attachments: issue_snapshot.jpg
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Create carbon table 
> create table x (a int ,b string) stored by 'carbondata'
> Load data to carbon table 
> run query  select count(*) from x;  
> Java.lang.ClassCastException:[Ljava.lang.Object;can not be cast to 
> org.apache.sql.catalyst.InternalRow
> The log snapshot is attached. 





[jira] [Closed] (CARBONDATA-537) Bug fix for DICTIONARY_EXCLUDE option in spark2 integration

2016-12-22 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li closed CARBONDATA-537.
---
Resolution: Won't Fix

> Bug fix for DICTIONARY_EXCLUDE option in spark2 integration
> ---
>
> Key: CARBONDATA-537
> URL: https://issues.apache.org/jira/browse/CARBONDATA-537
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Jacky Li
> Fix For: 1.0.0-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> 1. Fix bug for dictionary_exclude option in spark2 integration. In spark2, 
> the data type name is changed from "string" to "stringtype", but 
> `isStringAndTimestampColDictionaryExclude` is not modified.
> 2. Fix bug for data loading with no-kettle. No-kettle loading should not 
> ask the user to set the kettle home environment variable.
> 3. Clean up scala code style in `GlobalDictionaryUtil`





[jira] [Resolved] (CARBONDATA-412) in windows, when load into table whose name has "_", the old segment will be deleted.

2016-12-21 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-412.
-
   Resolution: Fixed
 Assignee: Jay
Fix Version/s: 1.0.0-incubating

> in windows, when load into table whose name has "_", the old segment will be 
> deleted.
> -
>
> Key: CARBONDATA-412
> URL: https://issues.apache.org/jira/browse/CARBONDATA-412
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Jay
>Assignee: Jay
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When a carbon table name contains "_", such as "load_test", and data is loaded 
> into the table twice, the second load deletes the first segment (segment 0).





[jira] [Created] (CARBONDATA-546) Extract data management command to carbon-spark-common module

2016-12-20 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-546:
---

 Summary: Extract data management command to carbon-spark-common 
module
 Key: CARBONDATA-546
 URL: https://issues.apache.org/jira/browse/CARBONDATA-546
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
Assignee: Jacky Li
 Fix For: 1.0.0-incubating


Currently there is duplicated code for data management commands in the carbon-spark 
and carbon-spark2 modules. In this PR, the following commands are removed from 
carbonTableSchema.scala and extracted to carbon-spark-common:

- ShowLoads
- DeleteLoadById
- DeleteLoadByDate
- CleanFiles





[jira] [Resolved] (CARBONDATA-519) Enable vector reader in Carbon-Spark 2.0 integration and Carbon layer

2016-12-19 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-519.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> Enable vector reader in Carbon-Spark 2.0 integration and Carbon layer
> -
>
> Key: CARBONDATA-519
> URL: https://issues.apache.org/jira/browse/CARBONDATA-519
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.0.0-incubating
>
>  Time Spent: 6h
>  Remaining Estimate: 0h
>
> Spark 2.0 supports a vectorized reader and uses whole-stage codegen to improve 
> performance. Carbon will enable a vectorized reader integrated with Spark to 
> take advantage of the new features of Spark 2.x.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CARBONDATA-539) Return empty row in map reduce application

2016-12-18 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-539:
---

 Summary: Return empty row in map reduce application
 Key: CARBONDATA-539
 URL: https://issues.apache.org/jira/browse/CARBONDATA-539
 Project: CarbonData
  Issue Type: Bug
Reporter: Jacky Li
Assignee: Jacky Li
 Fix For: 1.0.0-incubating


There is a bug where Carbon returns an empty row in a map reduce app if 
projection columns are not set.
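
The issue does not describe the intended behavior once fixed. One plausible choice, shown here purely as an assumption, is to fall back to all table columns when no projection is configured. The class ProjectionResolver and its method are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the actual fix in the carbon-hadoop input format may differ.
final class ProjectionResolver {
    /** Returns the requested projection, or every table column when none was set. */
    static List<String> resolve(List<String> requested, List<String> tableColumns) {
        if (requested == null || requested.isEmpty()) {
            return tableColumns; // assumption: "not set" means "read everything"
        }
        return requested;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("id", "name", "city");
        System.out.println(resolve(null, all));
        System.out.println(resolve(Arrays.asList("name"), all));
    }
}
```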





[jira] [Resolved] (CARBONDATA-516) [SPARK2]update union class in CarbonLateDecoderRule for Spark 2.x integration

2016-12-15 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-516.
-
   Resolution: Fixed
Fix Version/s: 1.0.0-incubating

> [SPARK2]update union class in CarbonLateDecoderRule for Spark 2.x integration
> -
>
> Key: CARBONDATA-516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 1.0.0-incubating
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> In spark2, the Union class is no longer a sub-class of BinaryNode. 





[jira] [Created] (CARBONDATA-538) Add test case to spark2 integration

2016-12-15 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-538:
---

 Summary: Add test case to spark2 integration
 Key: CARBONDATA-538
 URL: https://issues.apache.org/jira/browse/CARBONDATA-538
 Project: CarbonData
  Issue Type: Improvement
Reporter: Jacky Li
 Fix For: 1.0.0-incubating


Currently the spark2 integration has very few test cases; this should be improved.





[jira] [Resolved] (CARBONDATA-536) Initialize GlobalDictionaryUtil.updateTableMetadataFunc for Spark 2.x

2016-12-15 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-536.
-

> Initialize GlobalDictionaryUtil.updateTableMetadataFunc for Spark 2.x
> -
>
> Key: CARBONDATA-536
> URL: https://issues.apache.org/jira/browse/CARBONDATA-536
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.0.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 1.0.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> GlobalDictionaryUtil.updateTableMetadataFunc needs to be initialized.





[jira] [Created] (CARBONDATA-537) Bug fix for DICTIONARY_EXCLUDE option in spark2 integration

2016-12-15 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-537:
---

 Summary: Bug fix for DICTIONARY_EXCLUDE option in spark2 
integration
 Key: CARBONDATA-537
 URL: https://issues.apache.org/jira/browse/CARBONDATA-537
 Project: CarbonData
  Issue Type: Bug
Reporter: Jacky Li
 Fix For: 1.0.0-incubating


1. Fix bug for dictionary_exclude option in spark2 integration. In spark2, 
the data type name is changed from "string" to "stringtype", but 
`isStringAndTimestampColDictionaryExclude` is not modified.
2. Fix bug for data loading with no-kettle. No-kettle loading should not 
ask the user to set the kettle home environment variable.
3. Clean up scala code style in `GlobalDictionaryUtil`



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CARBONDATA-535) carbondata should support datatype: Date and Char

2016-12-15 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-535.
-
Resolution: Fixed

> carbondata should support datatype: Date and Char
> -
>
> Key: CARBONDATA-535
> URL: https://issues.apache.org/jira/browse/CARBONDATA-535
> Project: CarbonData
>  Issue Type: Improvement
>  Components: file-format
>Affects Versions: 1.0.0-incubating
>Reporter: QiangCai
>Assignee: QiangCai
> Fix For: 1.0.0-incubating
>
>
> carbondata should support datatype: Date and Char





[jira] [Created] (CARBONDATA-531) Remove spark dependency in carbon core

2016-12-13 Thread Jacky Li (JIRA)
Jacky Li created CARBONDATA-531:
---

 Summary: Remove spark dependency in carbon core
 Key: CARBONDATA-531
 URL: https://issues.apache.org/jira/browse/CARBONDATA-531
 Project: CarbonData
  Issue Type: Improvement
Affects Versions: 0.2.0-incubating
Reporter: Jacky Li
 Fix For: 1.0.0-incubating


Carbon-core module should not depend on spark 





[jira] [Resolved] (CARBONDATA-470) Add unsafe offheap and on-heap sort in carbondata loading

2016-12-13 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-470.
-
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.0.0-incubating

> Add unsafe offheap and on-heap sort in carbondata loading
> 
>
> Key: CARBONDATA-470
> URL: https://issues.apache.org/jira/browse/CARBONDATA-470
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
> Fix For: 1.0.0-incubating
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> The current data loading performance in carbondata is poor because the data 
> must be sorted at the executor level during loading. Carbondata collects a 
> batch of data, sorts it, and dumps it to temporary files; it then runs a 
> merge sort over those temporary files to finish sorting. This causes two 
> major issues: disk IO and GC pressure. Even though the data is dumped to 
> files, carbondata still suffers heavy GC because each batch is sorted in 
> memory before it is written to the temporary files.
> To solve these problems we can introduce Unsafe Storage and Unsafe Sort.
> Unsafe Storage: The user can configure a memory limit for the amount of data 
> kept in memory. All of that data is kept in a contiguous memory region, 
> either off-heap or on-heap, using Unsafe. Once the configured limit is 
> exceeded, the remaining data is spilled to disk.
> Unsafe Sort: The data stored in memory using Unsafe can be sorted using 
> Unsafe sort.
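
The memory-limit-then-spill flow described above can be sketched roughly as 
follows. This is only an illustration of the idea in Python, not CarbonData's 
implementation: the function name, the `memory_limit_rows` and 
`spill_batch_rows` parameters, and the one-value-per-line run files are all 
assumptions made for the example, and an `array` of 64-bit integers stands in 
for the contiguous Unsafe memory region.

```python
import heapq
import os
import tempfile
from array import array


def _write_run(sorted_values):
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "w") as f:
        f.writelines(f"{v}\n" for v in sorted_values)
    return path


def _read_run(path):
    """Stream the values of one spilled run back in sorted order."""
    with open(path) as f:
        for line in f:
            yield int(line)


def sort_with_memory_limit(rows, memory_limit_rows, spill_batch_rows):
    """Sort integer rows, keeping at most memory_limit_rows in the buffer.

    Hypothetical sketch: rows beyond the limit are sorted in small batches
    and spilled to disk, then merged with the in-memory data at the end.
    """
    in_memory = array("q")  # contiguous buffer of 64-bit integers
    overflow = []           # small staging batch for rows past the limit
    run_paths = []
    for row in rows:
        if len(in_memory) < memory_limit_rows:
            in_memory.append(row)
        else:
            overflow.append(row)
            if len(overflow) >= spill_batch_rows:
                run_paths.append(_write_run(sorted(overflow)))
                overflow = []
    if overflow:
        run_paths.append(_write_run(sorted(overflow)))
    try:
        # k-way merge of the sorted in-memory data and the spilled runs
        runs = [_read_run(p) for p in run_paths]
        return list(heapq.merge(sorted(in_memory), *runs))
    finally:
        for p in run_paths:
            os.remove(p)
```

For example, `sort_with_memory_limit([5, 3, 9, 1, 7, 2, 8], 3, 2)` keeps the 
first three rows in the buffer and spills the rest in two runs before merging. 
If the limit is large enough to hold every row, nothing is written to disk.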





[jira] [Closed] (CARBONDATA-331) Support no compression option while loading

2016-12-13 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li closed CARBONDATA-331.
---
Resolution: Won't Fix

> Support no compression option while loading
> ---
>
> Key: CARBONDATA-331
> URL: https://issues.apache.org/jira/browse/CARBONDATA-331
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Jacky Li
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Modify the compressor interface and add a DummyCompressor that performs no 
> compression.
> This interface can be extended later to add new compressors.
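
A minimal sketch of what such a pluggable interface could look like, written 
in Python for illustration only. The `Compressor` and `ZlibCompressor` class 
shapes, the method names, and the `get_compressor` registry are assumptions 
for this example and do not reflect CarbonData's actual Java interface; only 
the `DummyCompressor` name comes from the issue.

```python
import zlib
from abc import ABC, abstractmethod


class Compressor(ABC):
    """Hypothetical compressor interface: bytes in, bytes out."""

    @abstractmethod
    def compress(self, data: bytes) -> bytes:
        ...

    @abstractmethod
    def uncompress(self, data: bytes) -> bytes:
        ...


class DummyCompressor(Compressor):
    """Pass-through compressor: returns the input unchanged."""

    def compress(self, data: bytes) -> bytes:
        return data

    def uncompress(self, data: bytes) -> bytes:
        return data


class ZlibCompressor(Compressor):
    """Example of a later extension, backed by the standard zlib module."""

    def compress(self, data: bytes) -> bytes:
        return zlib.compress(data)

    def uncompress(self, data: bytes) -> bytes:
        return zlib.decompress(data)


# Hypothetical registry so the loader can pick a compressor by name.
_COMPRESSORS = {"none": DummyCompressor, "zlib": ZlibCompressor}


def get_compressor(name: str) -> Compressor:
    return _COMPRESSORS[name]()
```

With this shape, selecting "no compression" is just 
`get_compressor("none")`, and callers do not need a separate code path for 
uncompressed data.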





[jira] [Updated] (CARBONDATA-431) Analysis compression for numeric datatype compared with Parquet/ORC

2016-12-12 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li updated CARBONDATA-431:

Fix Version/s: 1.0.0-incubating

> Analysis compression for numeric datatype compared with Parquet/ORC
> ---
>
> Key: CARBONDATA-431
> URL: https://issues.apache.org/jira/browse/CARBONDATA-431
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: suo tong
>Assignee: Ashok Kumar
> Fix For: 1.0.0-incubating
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Carbon's string type has a better compression ratio than the other formats, 
> but for numeric types ORC has the best compression. We should analyze the 
> numeric datatypes in carbon to get a better compression ratio.
> DataType                  | Text | Parquet | Orc | Carbon
> decimal                   | 16G  | 11G     | 6G  | 13G
> int                       | 5G   | 1G      | 1G  | 3G
> String (high cardinality) | 24G  | 22G     | 11G | 3G (no dictionary)
> String (low cardinality)  | 30G  | 4G      | 4G  | 1G (dictionary encode)
>                           |      |         |     | 1G (dictionary encode without inverted index)
>                           |      |         |     | 3G (no dictionary encode)
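
To make the numeric gap concrete, the ratios relative to the text size can be 
worked out from the decimal and int rows of the table above. The snippet is 
just arithmetic on the reported sizes; the dictionary layout and the 
`ratio_vs_text` helper are names chosen for this example.

```python
# Sizes in G as reported in the issue, numeric rows only.
sizes = {
    "decimal": {"Text": 16, "Parquet": 11, "Orc": 6, "Carbon": 13},
    "int": {"Text": 5, "Parquet": 1, "Orc": 1, "Carbon": 3},
}


def ratio_vs_text(row):
    """Return text size divided by each format's size, rounded to 2 places."""
    return {
        fmt: round(row["Text"] / size, 2)
        for fmt, size in row.items()
        if fmt != "Text"
    }


for datatype, row in sizes.items():
    print(datatype, ratio_vs_text(row))
```

For decimal this gives roughly 2.67x for Orc against 1.23x for Carbon, and 
for int 5.0x for Orc against 1.67x for Carbon, which is the gap the issue asks 
to investigate.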





[jira] [Resolved] (CARBONDATA-431) Analysis compression for numeric datatype compared with Parquet/ORC

2016-12-12 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-431.
-
Resolution: Fixed
  Assignee: Ashok Kumar  (was: Raghunandan S)

> Analysis compression for numeric datatype compared with Parquet/ORC
> ---
>
> Key: CARBONDATA-431
> URL: https://issues.apache.org/jira/browse/CARBONDATA-431
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: suo tong
>Assignee: Ashok Kumar
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Carbon's string type has a better compression ratio than the other formats, 
> but for numeric types ORC has the best compression. We should analyze the 
> numeric datatypes in carbon to get a better compression ratio.
> DataType                  | Text | Parquet | Orc | Carbon
> decimal                   | 16G  | 11G     | 6G  | 13G
> int                       | 5G   | 1G      | 1G  | 3G
> String (high cardinality) | 24G  | 22G     | 11G | 3G (no dictionary)
> String (low cardinality)  | 30G  | 4G      | 4G  | 1G (dictionary encode)
>                           |      |         |     | 1G (dictionary encode without inverted index)
>                           |      |         |     | 3G (no dictionary encode)





[jira] [Resolved] (CARBONDATA-528) to support octal escape delimiter char

2016-12-12 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-528.
-
Resolution: Fixed
  Assignee: zhaowei

> to support octal escape delimiter char 
> ---
>
> Key: CARBONDATA-528
> URL: https://issues.apache.org/jira/browse/CARBONDATA-528
> Project: CarbonData
>  Issue Type: Improvement
>Affects Versions: 0.2.0-incubating
>Reporter: zhaowei
>Assignee: zhaowei
>Priority: Minor
> Fix For: 1.0.0-incubating
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
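
The issue carries no description, so the following is only a guess at the 
kind of handling the title refers to: turning an octal escape such as `\001` 
in a delimiter option into the actual character. The `parse_delimiter` name 
and the one-to-three-digit rule are assumptions for this sketch, not 
CarbonData behavior.

```python
import re

# A backslash followed by one to three octal digits, e.g. \001 or \054.
_OCTAL_ESCAPE = re.compile(r"\\[0-7]{1,3}")


def parse_delimiter(spec: str) -> str:
    """Return the delimiter character for spec.

    Hypothetical helper: an octal escape is decoded to its character,
    anything else is returned unchanged.
    """
    if _OCTAL_ESCAPE.fullmatch(spec):
        return chr(int(spec[1:], 8))
    return spec


# Split a record on the control character written as \001.
fields = "a\x01b\x01c".split(parse_delimiter(r"\001"))
```

Here `fields` holds the three values `a`, `b`, and `c`; a plain delimiter 
such as `,` passes through `parse_delimiter` untouched.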






[jira] [Resolved] (CARBONDATA-521) Depends on more stable class of spark in spark2

2016-12-10 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-521.
-
Resolution: Fixed
  Assignee: Fei Wang

> Depends on more stable class of spark in spark2
> ---
>
> Key: CARBONDATA-521
> URL: https://issues.apache.org/jira/browse/CARBONDATA-521
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Fei Wang
>Assignee: Fei Wang
> Fix For: 1.0.0-incubating
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Avoid using unstable Spark classes in the spark2 integration; otherwise this 
> leads to compatibility issues with Spark.





[jira] [Resolved] (CARBONDATA-520) Executor can not get the read support class

2016-12-10 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-520.
-
Resolution: Fixed

> Executor can not get the read support class 
> 
>
> Key: CARBONDATA-520
> URL: https://issues.apache.org/jira/browse/CARBONDATA-520
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Fei Wang
>Assignee: Fei Wang
> Fix For: 1.0.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The executor cannot get the read support class, which leads to a cast 
> exception when running carbon on spark2.





[jira] [Resolved] (CARBONDATA-517) Use carbon property to get the store path/kettle home

2016-12-09 Thread Jacky Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jacky Li resolved CARBONDATA-517.
-
Resolution: Fixed

> Use carbon property to get the store path/kettle home
> -
>
> Key: CARBONDATA-517
> URL: https://issues.apache.org/jira/browse/CARBONDATA-517
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Affects Versions: 0.2.0-incubating
>Reporter: Fei Wang
>Assignee: Fei Wang
> Fix For: 1.0.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> To distinguish carbon config from Spark config, carbon config values are 
> read through carbon properties.




