[jira] [Created] (CARBONDATA-955) CacheProvider test fails

2017-04-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-955:
--

 Summary: CacheProvider test fails 
 Key: CARBONDATA-955
 URL: https://issues.apache.org/jira/browse/CARBONDATA-955
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala
Priority: Trivial


CacheProvider test fails in core package.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-953) Add validations to Unsafe dataload. And control the data added to threads

2017-04-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-953:
--

 Summary: Add validations to Unsafe dataload. And control the data 
added to threads
 Key: CARBONDATA-953
 URL: https://issues.apache.org/jira/browse/CARBONDATA-953
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Add validations to the unsafe data load path: currently there is no validation 
of how large the chunk size can be configured or of how much memory the working 
threads use. There is also no control over how much data is added to the sort 
threads, which may lead to out-of-memory errors.
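The two controls described above can be sketched generically. This is an
illustrative Python sketch, not CarbonData's implementation; the names
`validated_chunk_size`, `produce`, and `consume`, and the queue bound of 8, are
hypothetical. A bounded queue applies backpressure so producers cannot outrun
the sort threads, and the configured chunk size is clamped against the memory
available to a working thread.

```python
import queue
import threading

def validated_chunk_size(requested_mb, working_memory_mb, min_mb=1):
    # Clamp the configured chunk size so a misconfigured value cannot
    # exceed the memory available to a working thread.
    return max(min_mb, min(requested_mb, working_memory_mb))

def produce(rows, q):
    # put() blocks when the queue is full, so the producer is throttled
    # instead of buffering rows unboundedly (which could cause OOM).
    for row in rows:
        q.put(row)
    q.put(None)  # sentinel: no more data

def consume(q, out):
    while True:
        row = q.get()
        if row is None:
            break
        out.append(row)

q = queue.Queue(maxsize=8)  # bounded: at most 8 in-flight rows
out = []
t = threading.Thread(target=consume, args=(q, out))
t.start()
produce(range(100), q)
t.join()
```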





[jira] [Assigned] (CARBONDATA-953) Add validations to Unsafe dataload. And control the data added to threads

2017-04-18 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-953:
--

Assignee: Ravindra Pesala

> Add validations to Unsafe dataload. And control the data added to threads
> -
>
> Key: CARBONDATA-953
> URL: https://issues.apache.org/jira/browse/CARBONDATA-953
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>
> Add validations to the unsafe data load path: currently there is no validation 
> of how large the chunk size can be configured or of how much memory the 
> working threads use. There is also no control over how much data is added to 
> the sort threads, which may lead to out-of-memory errors.





[jira] [Resolved] (CARBONDATA-949) Compaction gives NullPointerException after alter table query

2017-04-18 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-949.

   Resolution: Fixed
 Assignee: Rahul Kumar
Fix Version/s: 1.1.0-incubating

> Compaction gives NullPointerException after alter table query
> -
>
> Key: CARBONDATA-949
> URL: https://issues.apache.org/jira/browse/CARBONDATA-949
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Trivial
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> If a new column is added to the table after a load and a minor compaction is 
> then performed, the new column contains null values by default, or when bad 
> records exist. The cast should not be applied to those null values.
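The null-handling rule can be illustrated with a minimal sketch (a hypothetical
helper, not project code): during compaction, a cast is skipped whenever the
value for a newly added column is null, instead of failing with a
NullPointerException.

```python
def safe_cast(value, target_type, default=None):
    # A newly added column has no data in old segments, so its value is
    # None by default (or when bad records exist); casting None must be
    # skipped rather than attempted, or compaction fails.
    if value is None:
        return default
    return target_type(value)
```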





[jira] [Resolved] (CARBONDATA-898) When select query and alter table rename table is triggered concurrently, NullPointerException is getting thrown

2017-04-13 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-898.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> When select query and alter table rename table is triggered concurrently, 
> NullPointerException is getting thrown
> 
>
> Key: CARBONDATA-898
> URL: https://issues.apache.org/jira/browse/CARBONDATA-898
> Project: CarbonData
>  Issue Type: Bug
> Environment: Spark 2.1
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a user triggers a select query and an alter table rename command 
> concurrently, the select query throws a NullPointerException if the files do 
> not exist in HDFS.
> When the dictionary file or schema file does not exist, a 
> FileNotFoundException should be thrown instead.
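The fail-fast behaviour asked for above can be sketched as follows (an
illustrative Python sketch with hypothetical names, not CarbonData code): check
for the file up front and raise a clear file-not-found error, instead of letting
a later dereference surface as a NullPointerException-style failure.

```python
import os

def open_required(path):
    # Raise a descriptive error at the point the file is first needed,
    # rather than returning None and failing later with a null access.
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    return path

def missing_raises(path):
    # Small demo: confirms a missing path raises FileNotFoundError.
    try:
        open_required(path)
        return False
    except FileNotFoundError:
        return True
```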





[jira] [Resolved] (CARBONDATA-919) result_size query stats is not giving proper row count if vector reader is enabled.

2017-04-13 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-919.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> result_size query stats is not giving proper row count if vector reader is 
> enabled.
> ---
>
> Key: CARBONDATA-919
> URL: https://issues.apache.org/jira/browse/CARBONDATA-919
> Project: CarbonData
>  Issue Type: Bug
> Environment: Spark 2.1
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Trivial
> Fix For: 1.1.0-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> In case of the vector reader we return a ColumnarBatch whose row count is the 
> size of the batch, but the row count was being incremented by 1 per batch, and 
> that incorrect result was printed in the query stats log.
> Moved the result_size calculation into the respective reader and logged the 
> results after the task completes in the executor.
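The counting bug can be shown with a tiny sketch (hypothetical names, not the
actual reader code): with a vector reader each item returned is a batch holding
many rows, so the counter must advance by the batch size, not by 1 per item.

```python
def total_rows(batches, vector_reader=True):
    # With a vector reader each item is a columnar batch holding many
    # rows; counting "+1 per item" under-reports the result size.
    count = 0
    for batch in batches:
        count += len(batch) if vector_reader else 1
    return count

# Two batches of 3 and 5 rows: correct count is 8, buggy count is 2.
batches = [list(range(3)), list(range(5))]
```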





[jira] [Resolved] (CARBONDATA-903) data load is not failing even though bad records exists in the data in case of unsafe sort or batch sort

2017-04-13 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-903.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> data load is not failing even though bad records exists in the data in case 
> of unsafe sort or batch sort
> 
>
> Key: CARBONDATA-903
> URL: https://issues.apache.org/jira/browse/CARBONDATA-903
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Critical
> Fix For: 1.1.0-incubating
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-916) Major compaction is failing

2017-04-13 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-916.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Major compaction is failing
> ---
>
> Key: CARBONDATA-916
> URL: https://issues.apache.org/jira/browse/CARBONDATA-916
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: Manohar Vanam
>Assignee: Manohar Vanam
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Executing a major compaction query on an already compacted table throws an 
> exception.





[jira] [Created] (CARBONDATA-915) Call getAll dictionary from codegen of dictionary decoder to improve dictionary load performance

2017-04-12 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-915:
--

 Summary: Call getAll dictionary from codegen of dictionary decoder 
to improve dictionary load performance
 Key: CARBONDATA-915
 URL: https://issues.apache.org/jira/browse/CARBONDATA-915
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Currently the dictionary is fetched individually from the cache, which is 
inefficient because the dictionaries are not loaded in parallel. It is also not 
thread safe to call the per-dictionary lookup instead of getAll.
Call getAll dictionary from the codegen of the dictionary decoder to improve 
dictionary load performance.
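The getAll idea can be sketched generically (an illustrative Python sketch with
hypothetical names such as `load_dictionary`; not CarbonData's cache API): one
batched call loads all column dictionaries in parallel, where per-column
sequential lookups cannot overlap their load cost.

```python
from concurrent.futures import ThreadPoolExecutor

def load_dictionary(column):
    # Stand-in for an expensive per-column dictionary load.
    return {column: "dict-for-" + column}

def get_all(columns):
    # One batched, parallel fetch amortises the load cost across
    # columns instead of loading each dictionary one at a time.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = pool.map(load_dictionary, columns)
    merged = {}
    for r in results:
        merged.update(r)
    return merged
```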





[jira] [Assigned] (CARBONDATA-915) Call getAll dictionary from codegen of dictionary decoder to improve dictionary load performance

2017-04-12 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-915:
--

Assignee: Ravindra Pesala

> Call getAll dictionary from codegen of dictionary decoder to improve 
> dictionary load performance
> 
>
> Key: CARBONDATA-915
> URL: https://issues.apache.org/jira/browse/CARBONDATA-915
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>
> Currently the dictionary is fetched individually from the cache, which is 
> inefficient because the dictionaries are not loaded in parallel. It is also 
> not thread safe to call the per-dictionary lookup instead of getAll.
> Call getAll dictionary from the codegen of the dictionary decoder to improve 
> dictionary load performance.





[jira] [Resolved] (CARBONDATA-890) For Spark 2.1 LRU cache size at driver is getting configured with the executor lru cache size.

2017-04-12 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-890.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> For Spark 2.1 LRU cache size at driver is getting configured with the 
> executor lru cache size.
> --
>
> Key: CARBONDATA-890
> URL: https://issues.apache.org/jira/browse/CARBONDATA-890
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-880) when explain extended is done on a query then store path is getting printed to the user.

2017-04-12 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-880.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> when explain extended is done on a query then store path is getting printed 
> to the user.
> 
>
> Key: CARBONDATA-880
> URL: https://issues.apache.org/jira/browse/CARBONDATA-880
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Reporter: ravikiran
>Assignee: ravikiran
> Fix For: 1.1.0-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When explain extended is run on a query, the store path is printed to the 
> user in CarbonDatasourceHadoopRelation.
> The path should not be shown to the user.
> example :
> CarbonDatasourceHadoopRelation(org.apache.spark.sql.CarbonSession@1ed6b3b9,[Ljava.lang.String;@21b14680,Map(path
>  -> 
> file:/D:/Carbon/incubator-carbondata/examples/spark2/target/warehouse/carbon_table,
>  serialization.format -> 1, dbname -> default, tablepath -> 
> D:Carbonincubator-carbondata/examples/spark2/target/storedefaultcarbon_table, 
> tablename -> 
> carbon_table),Some(StructType(StructField(shortField,ShortType,true), 
> StructField(intField,IntegerType,true)





[jira] [Created] (CARBONDATA-893) MR testcase hangs in Hadoop 2.7.2 version profile

2017-04-10 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-893:
--

 Summary: MR testcase hangs in Hadoop 2.7.2 version profile
 Key: CARBONDATA-893
 URL: https://issues.apache.org/jira/browse/CARBONDATA-893
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


MR testcase hangs in Hadoop 2.7.2 version profile





[jira] [Resolved] (CARBONDATA-891) Fix compilation issue of LocalFileLockTest generate new folder "carbon.store"

2017-04-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-891.

Resolution: Fixed

> Fix compilation issue of LocalFileLockTest generate new folder "carbon.store"
> -
>
> Key: CARBONDATA-891
> URL: https://issues.apache.org/jira/browse/CARBONDATA-891
> Project: CarbonData
>  Issue Type: Bug
>  Components: build, core
>Reporter: Liang Chen
>Assignee: Liang Chen
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Fix the compilation issue where LocalFileLockTest generates a new folder 
> "carbon.store"





[jira] [Resolved] (CARBONDATA-877) String datatype is throwing an error when included in DIctionary_Exclude in a alter query

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-877.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> String datatype is throwing an error when included in DIctionary_Exclude in a 
> alter query
> -
>
> Key: CARBONDATA-877
> URL: https://issues.apache.org/jira/browse/CARBONDATA-877
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SWATI RAO
>Assignee: Kunal Kapoor
> Fix For: 1.1.0-incubating
>
> Attachments: 2000_UniqData.csv
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> ALTER TABLE uniqdata RENAME TO uniqdata1;
> alter table uniqdata1 drop columns(CUST_NAME);
> alter table uniqdata1 add columns(CUST_NAME string) 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='CUST_NAME', 
> 'DEFAULT.VALUE.CUST_NAME'='testuser') ;
> The column is added successfully, but when we execute:
> select distinct(CUST_NAME) from uniqdata1 ; 
> &
> select count(CUST_NAME) from uniqdata1 ;
> it throws an error :
> "Job aborted due to stage failure: Task 0 in stage 9.0 failed 1 times, most 
> recent failure: Lost task 0.0 in stage 9.0 (TID 206, localhost, executor 
> driver): java.lang.ArrayIndexOutOfBoundsException: 4186"





[jira] [Resolved] (CARBONDATA-846) Add support to revert changes to alter table commands if there is a failure while executing the changes on hive.

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-846.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Add support to revert changes to alter table commands if there is a failure 
> while executing the changes on hive.
> 
>
> Key: CARBONDATA-846
> URL: https://issues.apache.org/jira/browse/CARBONDATA-846
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-847) Select query not working properly after alter.

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-847.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Select query not working properly after alter.
> --
>
> Key: CARBONDATA-847
> URL: https://issues.apache.org/jira/browse/CARBONDATA-847
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.1.0-incubating
> Environment: Spark2.1
>Reporter: SWATI RAO
>Assignee: Kunal Kapoor
> Fix For: 1.1.0-incubating
>
> Attachments: 2000_UniqData.csv
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Execute the following set of queries:
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> ALTER TABLE uniqdata RENAME TO uniqdata1;
> alter table uniqdata1 add columns(dict int) 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='dict','DEFAULT.VALUE.dict'= '');
> select distinct(dict) from uniqdata2 ;
> This displays the result, but when we execute:
> select * from uniqdata1 ;
> it displays an error message:
> Job aborted due to stage failure: Task 3 in stage 59.0 failed 1 times, most 
> recent failure: Lost task 3.0 in stage 59.0 (TID 714, localhost, executor 
> driver): java.lang.NullPointerException





[jira] [Resolved] (CARBONDATA-780) Alter table support for compaction through sort step

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-780.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Alter table support for compaction through sort step
> 
>
> Key: CARBONDATA-780
> URL: https://issues.apache.org/jira/browse/CARBONDATA-780
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Manish Gupta
> Fix For: 1.1.0-incubating
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> The alter table flow needs to support a compaction process in which the 
> complete data is sorted again and then written to file.
> Currently, during compaction, data is given directly to the writer step, 
> where it is split into columns and written. But since columns are sorted from 
> left to right, after dropping a column the data becomes unsorted, because the 
> dropped column's data is not considered during compaction. In these scenarios 
> the complete data needs to be sorted again and then submitted to the writer 
> step.
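The re-sort requirement can be shown with a small sketch (hypothetical helper,
not the actual compaction code): rows sorted on all columns left-to-right lose
their ordering once a column is removed, so the remaining columns must be fully
re-sorted before writing.

```python
def compact_after_drop(rows, dropped_index):
    # Rows were sorted on all columns left-to-right. Removing a column
    # can break that ordering, so the remaining data must be re-sorted
    # before it is handed to the writer step.
    remaining = [tuple(v for i, v in enumerate(r) if i != dropped_index)
                 for r in rows]
    return sorted(remaining)
```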





[jira] [Resolved] (CARBONDATA-400) [Bad Records] Load data is fail and displaying the string value in beeline as exception

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-400.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> [Bad Records] Load data is fail and displaying the string value in beeline as 
> exception
> ---
>
> Key: CARBONDATA-400
> URL: https://issues.apache.org/jira/browse/CARBONDATA-400
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 0.1.0-incubating
> Environment: 3node cluster
>Reporter: MAKAMRAGHUVARDHAN
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Steps
> 1. Create table
> CREATE TABLE String_test2 (string_col string) STORED BY 
> 'org.apache.carbondata.format';
> 2. Load the data with parameter 'BAD_RECORDS_ACTION'='FORCE' and csv contains 
> a string value that is out of boundary.
> LOAD DATA INPATH 'hdfs://hacluster/Carbon/Priyal/string5.csv' into table 
> String_test2 OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_LOGGER_ENABLE'='TRUE', 
> 'BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='string_col');
> Actual Result: The data load fails and the string value is displayed in 
> beeline as an exception trace.
> Expected Result: A correct error message should be displayed, and the 
> exception trace should not be printed on the console.
> Exception thrown on console is as shown below.
> Error: com.univocity.parsers.common.TextParsingException: Error processing 
> input: Length of parsed input (11) exceeds the maximum number of 
> characters defined in your parser settings (10).
> Hint: Number of characters processed may have exceeded limit of 10 
> characters per column. Use settings.setMaxCharsPerColumn(int) to define the 
> maximum number of characters a column can have
> Ensure your configuration is correct, with delimiters, quotes and escape 
> sequences that match the input format you are trying to parse
> Parser Configuration: CsvParserSettings:
> Column reordering enabled=true
> Empty value=null
> Header extraction enabled=false
> Headers=null
> Ignore leading whitespaces=true
> Ignore trailing whitespaces=true
> Input buffer size=128
> Input reading on separate thread=false
> Line separator detection enabled=false
> Maximum number of characters per column=10
> Maximum number of columns=20480
> Null value=
> Number of records to read=all
> Parse unescaped quotes=true
> Row processor=none
> Selected fields=none
> Skip empty lines=trueFormat configuration:
> CsvFormat:
> Comment character=#
> Field delimiter=,
> Line separator (normalized)=\n
> Line separator sequence=\n
> Quote character="
> Quote escape character=quote escape
> Quote escape escape character=\0, line=0, char=12. 
> Content parsed: 
> [hellohowareyouwelcomehellohellohellohellohellohellohellohelloheellooabcdefghijklmnopqrstuvwxyzabcqwertuyioplkjhgfdsazxcvbnmpoiuytrewqasdfghjklmnbvcxzasdghskhdgkhdbkshkjchskdhfssudkdjdudusdjhdshdshsjddshjdkdhgdhdshdhdududushdudududududududududududududududududuudududududududuudududududududududududududududududududududududududududuhellohowareyouwelcomehellohellohellohellohellohellohellohelloheellooabcdefghijklmnopqrstuvwxyzabcqwertuyioplkjhgfdsazxcvbnmpoiuytrewqasdfghjklmnbvcxzasdghskhdgkhdbkshkjchskdhfssudkdjdudusdjhdshdshsjddshjdkdhgdhdshdhdududushdudududududududududududududududududuudududududududuududududududududuu





[jira] [Resolved] (CARBONDATA-839) Table lock file is not getting deleted after table rename is successful

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-839.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Table lock file is not getting deleted after table rename is successful
> ---
>
> Key: CARBONDATA-839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-839
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Table lock file is not getting deleted after table rename is successful





[jira] [Resolved] (CARBONDATA-814) bad record log file writing is not correct

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-814.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> bad record log file writing is not correct
> --
>
> Key: CARBONDATA-814
> URL: https://issues.apache.org/jira/browse/CARBONDATA-814
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-871) If locktype is not configured and store type is HDFS set HDFS lock as default

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-871.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> If locktype is not configured and store type is HDFS set HDFS lock as default
> -
>
> Key: CARBONDATA-871
> URL: https://issues.apache.org/jira/browse/CARBONDATA-871
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: Manohar Vanam
>Assignee: Manohar Vanam
> Fix For: 1.1.0-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If the lock type is not configured and the store type is HDFS, set HDFS lock 
> as the default.





[jira] [Resolved] (CARBONDATA-875) create database ddl is creating the database folder with case sensitive name.

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-875.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> create database ddl is creating the database folder with case sensitive name.
> -
>
> Key: CARBONDATA-875
> URL: https://issues.apache.org/jira/browse/CARBONDATA-875
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Reporter: ravikiran
>Assignee: ravikiran
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Create database DBNAME. Here the database name should be case insensitive.
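A minimal sketch of the fix direction (hypothetical helper, not the actual DDL
code): normalise the database name to one case before creating the on-disk
folder, so "MyDB" and "mydb" map to the same folder.

```python
def normalized_db_folder(name):
    # Fold the database name to lower case so the folder name is the
    # same regardless of how the user cased it in the DDL.
    return name.lower()
```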





[jira] [Resolved] (CARBONDATA-873) Drop table command throwing table already exists exception

2017-04-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-873.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Drop table command throwing table already exists exception
> --
>
> Key: CARBONDATA-873
> URL: https://issues.apache.org/jira/browse/CARBONDATA-873
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Reporter: Manohar Vanam
>Assignee: Manohar Vanam
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The drop table command throws a table already exists exception.





[jira] [Resolved] (CARBONDATA-863) Support creation and deletion of dictionary files through RDD during alter add and drop

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-863.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Support creation and deletion of dictionary files through RDD during alter 
> add and drop
> ---
>
> Key: CARBONDATA-863
> URL: https://issues.apache.org/jira/browse/CARBONDATA-863
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently, during alter add and drop column operations, dictionary files for 
> columns are added and dropped in a single thread, so the operation becomes 
> very slow as the number of columns increases.
> Instead, this operation should be done through an RDD, which makes optimal 
> use of the configured executor cores and improves performance in proportion 
> to the number of cores.
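The parallelisation idea can be sketched generically (hypothetical helper, not
the actual RDD code): split the per-column work into one chunk per core so file
creation and deletion run in parallel instead of serially.

```python
def partition(columns, num_cores):
    # Distribute per-column dictionary work round-robin into one chunk
    # per executor core, so chunks can be processed in parallel.
    chunks = [[] for _ in range(num_cores)]
    for i, col in enumerate(columns):
        chunks[i % num_cores].append(col)
    return chunks
```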





[jira] [Created] (CARBONDATA-874) select * from table order by limit query is failing

2017-04-05 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-874:
--

 Summary: select * from table order by limit query is failing
 Key: CARBONDATA-874
 URL: https://issues.apache.org/jira/browse/CARBONDATA-874
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Queries like the one below fail in Carbon with Spark 2.1:
select * from alldatatypestablesort order by empname limit 10





[jira] [Resolved] (CARBONDATA-870) Folders and files not getting cleaned up created locally during data load operation

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-870.

Resolution: Fixed

> Folders and files not getting cleaned up created locally during data load 
> operation
> ---
>
> Key: CARBONDATA-870
> URL: https://issues.apache.org/jira/browse/CARBONDATA-870
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Folders and files created in the local temp store location during data load 
> and insert into operations are not cleaned up. Over time this fills up local 
> disk space and eventually leads to data load failure once the threshold limit 
> is reached.
> To fix this, all locally created folders and files need to be deleted once 
> the operation completes.
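The cleanup guarantee can be sketched as follows (an illustrative Python sketch
with hypothetical names, not CarbonData's load code): create the local temp
store for the operation and remove it in a finally block, so it is deleted even
when the load fails.

```python
import os
import shutil
import tempfile

def run_with_temp_store(task):
    # Create a local temp store for one load operation and guarantee
    # it is removed afterwards, even on failure, so local disk space
    # is never filled by leftover folders.
    path = tempfile.mkdtemp(prefix="carbon_load_")
    try:
        return task(path)
    finally:
        shutil.rmtree(path, ignore_errors=True)

# Demo: the directory exists during the task and is gone afterwards.
created = {}
def demo_task(path):
    created["path"] = path
    return os.path.exists(path)

existed_during = run_with_temp_store(demo_task)
```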





[jira] [Resolved] (CARBONDATA-845) Insert Select into same table is not working

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-845.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Insert Select into same table is not working
> 
>
> Key: CARBONDATA-845
> URL: https://issues.apache.org/jira/browse/CARBONDATA-845
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.1.0-incubating
>Reporter: sounak chakraborty
>Assignee: sounak chakraborty
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Insert-select from the same table is not working in Spark 2.1.
> Insert into table1 select * from table1
> gives the error:
> Error: org.apache.spark.sql.AnalysisException: Cannot insert overwrite into 
> table that is also being read from.





[jira] [Updated] (CARBONDATA-744) The property "spark.carbon.custom.distribution" should be change to carbon.custom.block.distribution and should be part of CarbonProperties

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-744:
---
Summary: The property "spark.carbon.custom.distribution" should be change 
to carbon.custom.block.distribution and should be part of CarbonProperties  
(was: he property "spark.carbon.custom.distribution" should be change to 
carbon.custom.block.distribution and should be part of CarbonProperties)

> The property "spark.carbon.custom.distribution" should be change to 
> carbon.custom.block.distribution and should be part of CarbonProperties
> ---
>
> Key: CARBONDATA-744
> URL: https://issues.apache.org/jira/browse/CARBONDATA-744
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Trivial
> Fix For: 1.1.0-incubating
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The property "spark.carbon.custom.distribution" should be part of 
> CarbonProperties.
> Following the naming style adopted in Carbon, the key should be named 
> carbon.custom.distribution



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-659) Should add WhitespaceAround and ParenPad to javastyle

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-659:
---
Issue Type: Task  (was: Improvement)

> Should add WhitespaceAround and ParenPad to javastyle
> -
>
> Key: CARBONDATA-659
> URL: https://issues.apache.org/jira/browse/CARBONDATA-659
> Project: CarbonData
>  Issue Type: Task
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Trivial
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-565) Clean up code suggested by IDE analyzer

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-565:
---
Issue Type: Task  (was: Improvement)

> Clean up code suggested by IDE analyzer
> ---
>
> Key: CARBONDATA-565
> URL: https://issues.apache.org/jira/browse/CARBONDATA-565
> Project: CarbonData
>  Issue Type: Task
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.0.0-incubating
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-565) Clean up code suggested by IDE analyzer

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-565:
---
Fix Version/s: (was: 1.1.0-incubating)
   1.0.0-incubating

> Clean up code suggested by IDE analyzer
> ---
>
> Key: CARBONDATA-565
> URL: https://issues.apache.org/jira/browse/CARBONDATA-565
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.0.0-incubating
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-676) Correcting spelling mistakes and removed unnecessary methods

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-676:
---
Summary: Correcting spelling mistakes and removed unnecessary methods  
(was: Code clean)

> Correcting spelling mistakes and removed unnecessary methods
> 
>
> Key: CARBONDATA-676
> URL: https://issues.apache.org/jira/browse/CARBONDATA-676
> Project: CarbonData
>  Issue Type: Task
>Reporter: zhangshunyu
>Assignee: zhangshunyu
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> To clean up some code:
> Correct spelling mistakes.
> Remove unused functions.
> Iterate over the Array instead of transforming it to a List.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-676) Code clean

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-676:
---
Issue Type: Task  (was: Improvement)

> Code clean
> --
>
> Key: CARBONDATA-676
> URL: https://issues.apache.org/jira/browse/CARBONDATA-676
> Project: CarbonData
>  Issue Type: Task
>Reporter: zhangshunyu
>Assignee: zhangshunyu
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> To clean up some code:
> Correct spelling mistakes.
> Remove unused functions.
> Iterate over the Array instead of transforming it to a List.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-863) Support creation and deletion of dictionary files through RDD during alter add and drop

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-863:
---
Fix Version/s: (was: 1.1.0-incubating)

> Support creation and deletion of dictionary files through RDD during alter 
> add and drop
> ---
>
> Key: CARBONDATA-863
> URL: https://issues.apache.org/jira/browse/CARBONDATA-863
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, during alter add and drop column operations, dictionary files for 
> columns are added and dropped in a single thread, so the operation becomes 
> very slow as the number of columns increases.
> Instead, this operation should be done through an RDD, which makes optimal 
> use of the configured executor cores and scales performance with the number 
> of cores.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-854) Carbondata with Datastax / Cassandra

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-854:
---
Fix Version/s: (was: 1.1.0-incubating)

> Carbondata with Datastax / Cassandra
> 
>
> Key: CARBONDATA-854
> URL: https://issues.apache.org/jira/browse/CARBONDATA-854
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 1.1.0-incubating
> Environment: Datastax DSE 5.0 ( DSE analytics )
>Reporter: Sanoj MG
>Priority: Minor
>
> I am trying to get Carbondata working in a Datastax DSE 5.0 cluster. 
> An exception is thrown while trying to create a Carbondata table from the 
> spark shell. Below are the steps: 
> scala> import com.datastax.spark.connector._
> scala> import org.apache.spark.sql.SaveMode
> scala> import org.apache.spark.sql.CarbonContext
> scala> import org.apache.spark.sql.types._
> scala> val cc = new CarbonContext(sc, "cfs://127.0.0.1/opt/CarbonStore")
> scala> val df = 
> cc.read.parquet("file:///home/cassandra/testdata-30day/cassandra/zone.parquet")
> scala> df.write.format("carbondata").option("tableName", 
> "zone").option("compress", 
> "true").option("TempCSV","false").mode(SaveMode.Overwrite).save()
> The below exception is thrown and the carbondata table creation fails: 
> java.io.FileNotFoundException: /opt/CarbonStore/default/zone/Metadata/schema 
> (No such file or directory)
> at java.io.FileOutputStream.open0(Native Method)
> at java.io.FileOutputStream.open(FileOutputStream.java:270)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:133)
> at 
> org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:207)
> at 
> org.apache.carbondata.core.writer.ThriftWriter.open(ThriftWriter.java:84)
> at 
> org.apache.spark.sql.hive.CarbonMetastore.createTableFromThrift(CarbonMetastore.scala:293)
> at 
> org.apache.spark.sql.execution.command.CreateTable.run(carbonTableSchema.scala:163)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> at 
> org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
> at 
> org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
> at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> at 
> org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
> at 
> org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
> at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:139)
> at 
> org.apache.carbondata.spark.CarbonDataFrameWriter.saveAsCarbonFile(CarbonDataFrameWriter.scala:39)
> at 
> org.apache.spark.sql.CarbonSource.createRelation(CarbonDatasourceRelation.scala:109)
> at 
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:222)
> at 
> org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-778) Alter table support for complex type

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-778:
---
Fix Version/s: (was: 1.1.0-incubating)

> Alter table support for complex type
> 
>
> Key: CARBONDATA-778
> URL: https://issues.apache.org/jira/browse/CARBONDATA-778
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Priority: Minor
>
> Alter table needs to support adding and dropping complex type columns and 
> changing their data types.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-776) Alter table support for spark 1.6

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-776:
---
Fix Version/s: (was: 1.1.0-incubating)

> Alter table support for spark 1.6
> -
>
> Key: CARBONDATA-776
> URL: https://issues.apache.org/jira/browse/CARBONDATA-776
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Priority: Minor
>
> The alter feature needs to be supported for Spark 1.6.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-779) Alter table support for column group

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-779:
---
Fix Version/s: (was: 1.1.0-incubating)

> Alter table support for column group
> 
>
> Key: CARBONDATA-779
> URL: https://issues.apache.org/jira/browse/CARBONDATA-779
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Priority: Minor
>
> Alter table needs to support adding, dropping and changing the data type of 
> column groups.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-859) [Documentation] Alter Table - CHANGE DATA TYPE

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-859:
---
Fix Version/s: (was: 1.1.0-incubating)

> [Documentation] Alter Table - CHANGE DATA TYPE
> --
>
> Key: CARBONDATA-859
> URL: https://issues.apache.org/jira/browse/CARBONDATA-859
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Gururaj Shetty
>
> Documentation for CHANGE DATA TYPE
> Should include the following:
> Function/Description
> Syntax
> Parameter Description
> Usage Guidelines
> Example(s) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-679) Add examples read CarbonData file to dataframe in Spark 2.1 version

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-679:
---
Fix Version/s: (was: 1.1.0-incubating)

> Add examples read CarbonData file to dataframe in Spark 2.1 version
> ---
>
> Key: CARBONDATA-679
> URL: https://issues.apache.org/jira/browse/CARBONDATA-679
> Project: CarbonData
>  Issue Type: Improvement
>  Components: examples
>Reporter: Liang Chen
>Priority: Minor
>
> In examples/spark2, add examples that read a CarbonData file into a 
> dataframe in Spark 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-858) [Documentation] Alter Table - DROP COLUMNS

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-858:
---
Fix Version/s: (was: 1.1.0-incubating)

> [Documentation] Alter Table - DROP COLUMNS
> --
>
> Key: CARBONDATA-858
> URL: https://issues.apache.org/jira/browse/CARBONDATA-858
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Gururaj Shetty
>
> Documentation for DROP COLUMNS
> Should include the following:
> Function/Description
> Syntax
> Parameter Description
> Usage Guidelines
> Example(s) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-857) [Documentation] Alter Table - ADD COLUMNS

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-857:
---
Fix Version/s: (was: 1.1.0-incubating)

> [Documentation] Alter Table - ADD COLUMNS
> -
>
> Key: CARBONDATA-857
> URL: https://issues.apache.org/jira/browse/CARBONDATA-857
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Gururaj Shetty
>
> Documentation for ADD COLUMNS
> Should include the following:
> Function/Description
> Syntax
> Parameter Description
> Usage Guidelines
> Example(s) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-856) [Documentation] Alter Table - TABLE RENAME

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-856:
---
Fix Version/s: (was: 1.1.0-incubating)

> [Documentation] Alter Table -  TABLE RENAME
> ---
>
> Key: CARBONDATA-856
> URL: https://issues.apache.org/jira/browse/CARBONDATA-856
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Gururaj Shetty
>
> Documentation for TABLE RENAME
> Should include the following:
> Function/Description
> Syntax
> Parameter Description
> Usage Guidelines
> Example(s) 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-833) load data from dataframe,generater data row may be error when delimiterLevel1 or delimiterLevel2 is special character

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-833:
---
Fix Version/s: (was: 1.1.0-incubating)
   (was: 1.0.0-incubating)

> load data from dataframe,generater data row may be error when delimiterLevel1 
> or delimiterLevel2 is special character
> -
>
> Key: CARBONDATA-833
> URL: https://issues.apache.org/jira/browse/CARBONDATA-833
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating, 1.1.0-incubating
>Reporter: tianli
>Assignee: tianli
>   Original Estimate: 4h
>  Time Spent: 40m
>  Remaining Estimate: 3h 20m
>
> When loading data from a dataframe, the generated data rows may be wrong if 
> delimiterLevel1 or delimiterLevel2 is a special character, because both 
> delimiters were already converted by CarbonUtil.delimiterConverter() when 
> the carbonLoadModel was created, yet CarbonScalaUtil.getString directly uses 
> carbonLoadModel.getComplexDelimiterLevel1 and 
> carbonLoadModel.getComplexDelimiterLevel2:
> val delimiter = if (level == 1) {
> delimiterLevel1
>   } else {
> delimiterLevel2
>   }
>   val builder = new StringBuilder()
>   s.foreach { x =>
> builder.append(getString(x, serializationNullFormat, 
> delimiterLevel1,
> delimiterLevel2, timeStampFormat, dateFormat, level + 
> 1)).append(delimiter)
>   }
> This appends an extra \ character to primitive data when the data type is 
> complex.
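The extra escape character can be reproduced outside of Spark with a small sketch in Java (CarbonData core's implementation language). `delimiterConverter` below is a simplified stand-in for `CarbonUtil.delimiterConverter`, not the real implementation:

```java
// Simplified stand-in for CarbonUtil.delimiterConverter; not the real code.
class DelimiterEscapeSketch {
    // Escapes regex-special delimiters so they are safe for String.split.
    static String delimiterConverter(String delimiter) {
        switch (delimiter) {
            case "|": case "*": case ".": case "$":
            case "+": case "?": case "(": case ")":
                return "\\" + delimiter;
            default:
                return delimiter;
        }
    }

    static String serialize(String[] values, String delimiter) {
        return String.join(delimiter, values);
    }

    public static void main(String[] args) {
        String level1 = "$";                           // user-configured delimiter
        String converted = delimiterConverter(level1); // meant only for splitting
        // Bug pattern: the converted delimiter is also used while writing,
        // so the escape character leaks into the serialized row.
        System.out.println(serialize(new String[]{"1", "2", "3"}, converted));
    }
}
```

Splitting with the converted delimiter is correct; the defect is reusing the converted form when serializing complex values.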



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-780) Alter table support for compaction through sort step

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-780:
---
Fix Version/s: (was: 1.1.0-incubating)

> Alter table support for compaction through sort step
> 
>
> Key: CARBONDATA-780
> URL: https://issues.apache.org/jira/browse/CARBONDATA-780
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Alter table needs to support a compaction process where the complete data is 
> sorted again and then written to file.
> Currently, in the compaction process, data is given directly to the writer 
> step, where it is split into columns and written. But since columns are 
> sorted from left to right, dropping a column leaves the data unorganized, as 
> the dropped column's data is not considered during compaction. In these 
> scenarios the complete data needs to be sorted again and then submitted to 
> the writer step.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-767) Alter table support for carbondata

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-767:
---
Fix Version/s: (was: 1.1.0-incubating)

> Alter table support for carbondata
> --
>
> Key: CARBONDATA-767
> URL: https://issues.apache.org/jira/browse/CARBONDATA-767
> Project: CarbonData
>  Issue Type: New Feature
>Affects Versions: 1.1.0-incubating
>Reporter: Manish Gupta
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Currently in carbondata, once a table is created it becomes immutable. 
> Deleting or adding a column is not supported, so the same table and data 
> cannot be reused. To add more flexibility, alter table support needs to be 
> added to the carbondata system.
> Please refer the design document at below location.
> https://drive.google.com/open?id=0B1DnrpMgGOu9a3dBSzhqVlEwY2s



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-565) Clean up code suggested by IDE analyzer

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-565.

Resolution: Fixed

> Clean up code suggested by IDE analyzer
> ---
>
> Key: CARBONDATA-565
> URL: https://issues.apache.org/jira/browse/CARBONDATA-565
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.1.0-incubating
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-313) Update CarbonSource to use CarbonDatasourceHadoopRelation

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-313:
---
Fix Version/s: (was: 1.1.0-incubating)

> Update CarbonSource to use CarbonDatasourceHadoopRelation
> -
>
> Key: CARBONDATA-313
> URL: https://issues.apache.org/jira/browse/CARBONDATA-313
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Jacky Li
>
> Change CarbonSource to use CarbonDatasourceHadoopRelation only; remove the 
> extension of BaseRelation and extend from HadoopFsRelation only.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-314) Make CarbonContext to use standard Datasource strategy

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-314:
---
Fix Version/s: (was: 1.1.0-incubating)

> Make CarbonContext to use standard Datasource strategy
> --
>
> Key: CARBONDATA-314
> URL: https://issues.apache.org/jira/browse/CARBONDATA-314
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Jacky Li
>
> Move the dictionary strategy out of CarbonTableScan and make a separate 
> strategy for it.
> Then make CarbonContext use the standard datasource strategy for relation 
> creation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-312) Unify two datasource: CarbonDatasourceHadoopRelation and CarbonDatasourceRelation

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-312:
---
Fix Version/s: (was: 1.1.0-incubating)

> Unify two datasource: CarbonDatasourceHadoopRelation and 
> CarbonDatasourceRelation
> -
>
> Key: CARBONDATA-312
> URL: https://issues.apache.org/jira/browse/CARBONDATA-312
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Jacky Li
>
> Take CarbonDatasourceHadoopRelation as the target datasource definition; 
> after that, CarbonContext can use the standard Datasource strategy.
> Since CarbonHadoopFSRDD needs to be removed and it is used by 
> CarbonDatasourceHadoopRelation, we need to change the 
> CarbonDatasourceHadoopRelation.buildScan function to return CarbonScanRDD 
> instead of CarbonHadoopFSRDD; then CarbonHadoopFSRDD can be removed.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-309) Support two types of ReadSupport in CarbonRecordReader

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-309:
---
Fix Version/s: (was: 1.1.0-incubating)

> Support two types of ReadSupport in CarbonRecordReader
> --
>
> Key: CARBONDATA-309
> URL: https://issues.apache.org/jira/browse/CARBONDATA-309
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: spark-integration
>Reporter: Jacky Li
>
> CarbonRecordReader should support late decode based on the passed 
> Configuration.
> A config flag indicating late decode needs to be added to CarbonInputFormat 
> for this purpose. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-307) Support executor side scan using CarbonInputFormat

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-307:
---
Fix Version/s: (was: 1.1.0-incubating)

> Support executor side scan using CarbonInputFormat
> --
>
> Key: CARBONDATA-307
> URL: https://issues.apache.org/jira/browse/CARBONDATA-307
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Affects Versions: 0.1.0-incubating
>Reporter: Jacky Li
>
> Currently, there are two read path in carbon-spark module: 
> 1. CarbonContext => CarbonDatasourceRelation => CarbonScanRDD => QueryExecutor
> In this case, CarbonScanRDD uses CarbonInputFormat to get the split, and use 
> QueryExecutor for scan.
> 2. SqlContext => CarbonDatasourceHadoopRelation => CarbonHadoopFSRDD => 
> CarbonInputFormat(CarbonRecordReader) => QueryExecutor
> In this case, CarbonHadoopFSRDD uses CarbonInputFormat to do both get split 
> and scan
> Because of this, there is unnecessary duplicate code that needs to be 
> unified.
> The target approach should be:
> sqlContext/carbonContext => CarbonDatasourceHadoopRelation => CarbonScanRDD 
> =>  CarbonInputFormat(CarbonRecordReader) => QueryExecutor



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-303) 8. Add CarbonTableOutpuFormat to write data to carbon.

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-303:
---
Fix Version/s: (was: 1.1.0-incubating)

> 8. Add CarbonTableOutpuFormat to write data to carbon.
> --
>
> Key: CARBONDATA-303
> URL: https://issues.apache.org/jira/browse/CARBONDATA-303
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>
> Add CarbonTableOutpuFormat to write data to carbon. It should use the 
> DataProcessorStep interface to load the data.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-799) change word from currenr to current

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-799:
---
Fix Version/s: (was: 1.0.1-incubating)

> change word from currenr to current
> ---
>
> Key: CARBONDATA-799
> URL: https://issues.apache.org/jira/browse/CARBONDATA-799
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jarck
>Assignee: Jarck
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> change word from currenr to current



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-823) Refactory of data write step

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-823.

Resolution: Fixed
  Assignee: Jacky Li

> Refactory of data write step
> 
>
> Key: CARBONDATA-823
> URL: https://issues.apache.org/jira/browse/CARBONDATA-823
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-849) if alter table ddl is executed on non existing table, then error message is wrong.

2017-04-05 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-849.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> if alter table ddl is executed on non existing table, then error message is 
> wrong.
> --
>
> Key: CARBONDATA-849
> URL: https://issues.apache.org/jira/browse/CARBONDATA-849
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Reporter: ravikiran
>Assignee: ravikiran
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The error message shown when running alter on a non-existing table is: 
> Exception in thread "main" 
> org.apache.carbondata.spark.exception.MalformedCarbonCommandException: 
> Unsupported alter operation on hive table
> but this is not correct. 
> Hive blocks alter DDL on its tables, so Carbon should be consistent with 
> Hive.
> Correct message: 
> Operation not allowed: alter table name compact 'minor'



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-861) Improvements in query processing.

2017-04-05 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-861:
--

 Summary: Improvements in query processing.
 Key: CARBONDATA-861
 URL: https://issues.apache.org/jira/browse/CARBONDATA-861
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala
Priority: Minor


Following is the list of improvements done:

Remove multiple array creations and copies in the Dimension and measure chunk 
readers.
Simplify the logic for finding offsets of no-dictionary keys in the class 
SafeVariableLengthDimensionDataChunkStore.
Avoid byte array creation and copying for no-dictionary columns in the 
vectorized reader; instead directly send the length and offset to the vector.
Remove unnecessary decoder plan additions to the optimized plan, which can 
improve the codegen flow.
Update CompareTest to set the table blocksize to 32 MB in order to make use 
of small-sort optimization when doing takeOrdered in Spark.
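The offset-finding and copy-avoidance items can be illustrated with a hypothetical Java sketch. The layout assumed here (each variable-length value prefixed by a 2-byte length) mirrors the general idea only; the real CarbonData page format and SafeVariableLengthDimensionDataChunkStore code differ:

```java
import java.nio.ByteBuffer;

// Illustrative layout for a variable-length (no-dictionary) column page.
class NoDictOffsetsSketch {
    // Pack values back to back, each with a 2-byte length prefix.
    static byte[] pack(byte[][] values) {
        int size = 0;
        for (byte[] v : values) size += v.length + 2;
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] v : values) {
            buf.putShort((short) v.length);
            buf.put(v);
        }
        return buf.array();
    }

    // One pass records where each value's data starts. A vectorized reader
    // can then pass (offset, length) pairs to the column vector instead of
    // allocating and copying a byte[] per row.
    static int[] dataOffsets(byte[] data, int rows) {
        int[] offsets = new int[rows];
        int pos = 0;
        for (int i = 0; i < rows; i++) {
            offsets[i] = pos + 2;   // first data byte, after the length prefix
            short len = ByteBuffer.wrap(data, pos, 2).getShort();
            pos += 2 + len;
        }
        return offsets;
    }
}
```

Computing offsets once and slicing the shared buffer is what avoids the per-row array creation and copy mentioned above.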



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-850) Fix the comment definition issues of CarbonData thrift files

2017-04-04 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-850.

Resolution: Fixed

> Fix the comment definition issues of CarbonData thrift files
> 
>
> Key: CARBONDATA-850
> URL: https://issues.apache.org/jira/browse/CARBONDATA-850
> Project: CarbonData
>  Issue Type: Bug
>  Components: file-format
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Fix the comment definition issues of the CarbonData thrift files, to help 
> users more easily understand the CarbonData file format.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-838) Alter table add decimal column with default precision and scale is failing in parser.

2017-04-04 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-838.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Alter table add decimal column with default precision and scale is failing in 
> parser.
> -
>
> Key: CARBONDATA-838
> URL: https://issues.apache.org/jira/browse/CARBONDATA-838
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When we add a new decimal column without specifying scale and precision, 
> the alter table command fails in the parser.
> e.g., alter table test1 add columns(dcmlcol decimal)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-822) Add unsafe sort for bucketing feature

2017-03-26 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-822:
--

 Summary: Add unsafe sort for bucketing feature
 Key: CARBONDATA-822
 URL: https://issues.apache.org/jira/browse/CARBONDATA-822
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


Currently there is no unsafe sort when bucketing is enabled. To improve 
bucketing load performance, enable unsafe sort for bucketing as well.
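A minimal sketch of where the sort sits in a bucketed load, assuming hash-based bucket assignment. `Collections.sort` stands in for the unsafe (off-heap) sorter, and none of the names below are actual CarbonData APIs:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: rows are assigned to a bucket by hashing the bucket
// column, then each bucket's rows are sorted. The per-bucket sort is the
// step this issue proposes switching to the unsafe sorter.
class BucketLoadSketch {
    static Map<Integer, List<String>> bucketAndSort(List<String> keys,
                                                    int numBuckets) {
        Map<Integer, List<String>> buckets = new HashMap<>();
        for (String key : keys) {
            // floorMod keeps the bucket id non-negative for any hashCode.
            int bucket = Math.floorMod(key.hashCode(), numBuckets);
            buckets.computeIfAbsent(bucket, b -> new ArrayList<>()).add(key);
        }
        for (List<String> rows : buckets.values()) {
            Collections.sort(rows);  // stand-in for the unsafe sort step
        }
        return buckets;
    }
}
```

The point of the change is only the sorter implementation: bucket assignment stays the same, while the per-bucket sort moves off-heap to reduce GC pressure during load.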



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CARBONDATA-821) Remove Kettle related code and flow from carbon.

2017-03-26 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-821:
--

Assignee: Ravindra Pesala

> Remove Kettle related code and flow from carbon.
> 
>
> Key: CARBONDATA-821
> URL: https://issues.apache.org/jira/browse/CARBONDATA-821
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>
> Remove Kettle-related code and flow from carbon. It is difficult for 
> developers to handle all bugs and features in both flows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-821) Remove Kettle related code and flow from carbon.

2017-03-26 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-821:
--

 Summary: Remove Kettle related code and flow from carbon.
 Key: CARBONDATA-821
 URL: https://issues.apache.org/jira/browse/CARBONDATA-821
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Remove Kettle-related code and flow from carbon. It is becoming difficult for 
developers to handle all bugs and features in both flows.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-761) Dictionary server should not be shutdown after loading

2017-03-23 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15939841#comment-15939841
 ] 

Ravindra Pesala commented on CARBONDATA-761:


It is a duplicate of CARBONDATA-773.

> Dictionary server should not be shutdown after loading
> --
>
> Key: CARBONDATA-761
> URL: https://issues.apache.org/jira/browse/CARBONDATA-761
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Minor
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Code:
> CarbonTableSchema/LoadTable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-811) Refactor dictionary based result collector class

2017-03-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-811.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Refactor dictionary based result collector class
> 
>
> Key: CARBONDATA-811
> URL: https://issues.apache.org/jira/browse/CARBONDATA-811
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Srinath Thota
>Priority: Minor
> Fix For: 1.0.1-incubating
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Problem: For each batch, the result collector class fills all the class-level 
> variables; this may hurt performance.
> Solution: Fill them in the constructor so they are initialized only once.
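The refactor described above can be sketched in ordinary Java; the class and
method names below are illustrative assumptions, not CarbonData's actual API.
The point is that one-time metadata setup moves from the per-batch path into
the constructor.

```java
import java.util.Arrays;
import java.util.List;

// Hedged sketch of the refactor: metadata is computed once in the
// constructor instead of once per collected batch. Names are illustrative.
class ResultCollectorSketch {
    private final int[] columnOrder;  // initialized once, not per batch

    ResultCollectorSketch(int columnCount) {
        // One-time setup that previously ran on every collect() call.
        columnOrder = new int[columnCount];
        for (int i = 0; i < columnCount; i++) {
            columnOrder[i] = i;
        }
    }

    // Per-batch work no longer repeats the setup above.
    int collect(List<int[]> batch) {
        return batch.size() * columnOrder.length;  // e.g. total cells visited
    }

    public static void main(String[] args) {
        ResultCollectorSketch c = new ResultCollectorSketch(3);
        System.out.println(c.collect(Arrays.asList(new int[]{1, 2, 3},
                                                   new int[]{4, 5, 6})));  // 6
    }
}
```

The constructor runs once per query, so the cost is paid once regardless of
how many batches are collected.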



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-783) Loading data with Single Pass 'true' option is throwing an exception

2017-03-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-783:
---
Priority: Major  (was: Trivial)

> Loading data with Single Pass 'true' option is throwing an exception
> 
>
> Key: CARBONDATA-783
> URL: https://issues.apache.org/jira/browse/CARBONDATA-783
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0-incubating
> Environment: spark 2.1
>Reporter: Geetika Gupta
>Assignee: Ravindra Pesala
> Attachments: 7000_UniqData.csv
>
>
> I tried to create table using the following query:
> CREATE TABLE uniq_include_dictionary (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,Double_COLUMN2,DECIMAL_COLUMN2');
> Table creation was successful, but when I tried to load data into the table 
> it showed the following error:
> ERROR 16-03 13:41:32,354 - nioEventLoopGroup-8-2 
> java.lang.IndexOutOfBoundsException: readerIndex(64) + length(25) exceeds 
> writerIndex(80): UnpooledUnsafeDirectByteBuf(ridx: 64, widx: 80, cap: 80)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1161)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1155)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:694)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:702)
>   at 
> org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage.readData(DictionaryMessage.java:70)
>   at 
> org.apache.carbondata.core.dictionary.server.DictionaryServerHandler.channelRead(DictionaryServerHandler.java:59)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>   at java.lang.Thread.run(Thread.java:745)
> ERROR 16-03 13:41:32,355 - nioEventLoopGroup-8-2 exceptionCaught
> java.lang.IndexOutOfBoundsException: readerIndex(64) + length(25) exceeds 
> writerIndex(80): UnpooledUnsafeDirectByteBuf(ridx: 64, widx: 80, cap: 80)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1161)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1155)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:694)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:702)
>   at 
> org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage.readData(DictionaryMessage.java:70)
>   at 
> org.apache.carbondata.core.dictionary.server.DictionaryServerHandler.channelRead(DictionaryServerHandler.java:59)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPip

[jira] [Assigned] (CARBONDATA-783) Loading data with Single Pass 'true' option is throwing an exception

2017-03-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-783:
--

Assignee: Ravindra Pesala

> Loading data with Single Pass 'true' option is throwing an exception
> 
>
> Key: CARBONDATA-783
> URL: https://issues.apache.org/jira/browse/CARBONDATA-783
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.1.0-incubating
> Environment: spark 2.1
>Reporter: Geetika Gupta
>Assignee: Ravindra Pesala
>Priority: Trivial
> Attachments: 7000_UniqData.csv
>
>
> I tried to create table using the following query:
> CREATE TABLE uniq_include_dictionary (CUST_ID int,CUST_NAME 
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, 
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), 
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 
> double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,Double_COLUMN2,DECIMAL_COLUMN2');
> Table creation was successful, but when I tried to load data into the table 
> it showed the following error:
> ERROR 16-03 13:41:32,354 - nioEventLoopGroup-8-2 
> java.lang.IndexOutOfBoundsException: readerIndex(64) + length(25) exceeds 
> writerIndex(80): UnpooledUnsafeDirectByteBuf(ridx: 64, widx: 80, cap: 80)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1161)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1155)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:694)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:702)
>   at 
> org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage.readData(DictionaryMessage.java:70)
>   at 
> org.apache.carbondata.core.dictionary.server.DictionaryServerHandler.channelRead(DictionaryServerHandler.java:59)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
>   at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:652)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:575)
>   at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:489)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451)
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:140)
>   at 
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
>   at java.lang.Thread.run(Thread.java:745)
> ERROR 16-03 13:41:32,355 - nioEventLoopGroup-8-2 exceptionCaught
> java.lang.IndexOutOfBoundsException: readerIndex(64) + length(25) exceeds 
> writerIndex(80): UnpooledUnsafeDirectByteBuf(ridx: 64, widx: 80, cap: 80)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes0(AbstractByteBuf.java:1161)
>   at 
> io.netty.buffer.AbstractByteBuf.checkReadableBytes(AbstractByteBuf.java:1155)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:694)
>   at io.netty.buffer.AbstractByteBuf.readBytes(AbstractByteBuf.java:702)
>   at 
> org.apache.carbondata.core.dictionary.generator.key.DictionaryMessage.readData(DictionaryMessage.java:70)
>   at 
> org.apache.carbondata.core.dictionary.server.DictionaryServerHandler.channelRead(DictionaryServerHandler.java:59)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:367)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:353)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:346)
>   at 
> io.netty.channel.DefaultChannelPipeline$HeadConte

[jira] [Resolved] (CARBONDATA-805) Fix groupid,package name,Class name issues

2017-03-23 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-805.

   Resolution: Fixed
Fix Version/s: 1.1.0-incubating

> Fix groupid,package name,Class name issues
> --
>
> Key: CARBONDATA-805
> URL: https://issues.apache.org/jira/browse/CARBONDATA-805
> Project: CarbonData
>  Issue Type: Sub-task
>  Components: presto-integration
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Fix groupid,package name,Class name issues



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-809) Union with alias is returning wrong result.

2017-03-22 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-809:
--

 Summary: Union with alias is returning wrong result.
 Key: CARBONDATA-809
 URL: https://issues.apache.org/jira/browse/CARBONDATA-809
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Union with alias is returning wrong result.

Testcase 
{code}
SELECT t.c1 a FROM (select c1 from  carbon_table1 union all  select c1 from  
carbon_table2) t
{code}

The above query returns data from only one table, and the rows are duplicated.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-804) Update file structure info as per V3 format definition

2017-03-22 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-804.

Resolution: Fixed

> Update file structure info as per V3 format definition
> --
>
> Key: CARBONDATA-804
> URL: https://issues.apache.org/jira/browse/CARBONDATA-804
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs, file-format
>Affects Versions: 1.0.0-incubating
>Reporter: Liang Chen
>Assignee: Liang Chen
>Priority: Minor
> Fix For: 1.1.0-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Update file structure info as per the V3 format definition; master has merged 
> the new V3 format to further improve performance.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-803) Incorrect results returned by not equal to filter on dictionary column with numeric data type

2017-03-22 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-803.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Incorrect results returned by not equal to filter on dictionary column with 
> numeric data type
> -
>
> Key: CARBONDATA-803
> URL: https://issues.apache.org/jira/browse/CARBONDATA-803
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
> Fix For: 1.1.0-incubating, 1.0.1-incubating
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Whenever a not-equal-to filter is applied on a dictionary column with a 
> numeric data type, the cast added by the Spark plan is removed while creating 
> Carbon filters from the Spark filter. Due to this plan modification, 
> incorrect results are returned by Spark.
> Steps to reproduce the issue:
> 1. CREATE TABLE IF NOT EXISTS carbon(ID Int, date Timestamp, country String, 
> name String, phonetype String, serialname String, salary Int) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES('dictionary_include'='id')
> 2. LOAD DATA LOCAL INPATH '$csvFilePath' into table carbon
> 3. select Id from test_not_equal_to_carbon where id != '7'
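The effect of the dropped cast can be illustrated outside CarbonData: once the
int column value is no longer cast to match the string literal '7', the
not-equal predicate compares incompatible representations and holds for every
row. A minimal standalone sketch (names are illustrative, not the actual
filter code):

```java
// Minimal illustration (not CarbonData code) of why dropping the cast
// breaks a not-equal filter such as: where id != '7'.
class CastDropDemo {
    // With the cast applied, both sides are compared as ints.
    static boolean notEqualWithCast(int columnValue, String literal) {
        return columnValue != Integer.parseInt(literal);
    }

    // With the cast dropped, an Integer never equals a String, so the
    // predicate is true for every row and wrong rows are returned.
    static boolean notEqualWithoutCast(Object columnValue, Object literal) {
        return !columnValue.equals(literal);
    }

    public static void main(String[] args) {
        System.out.println(notEqualWithCast(7, "7"));     // false: row filtered out
        System.out.println(notEqualWithoutCast(7, "7"));  // true: row wrongly kept
    }
}
```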



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-793) Count with null values is giving wrong result.

2017-03-18 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-793:
--

 Summary: Count with null values is giving wrong result.
 Key: CARBONDATA-793
 URL: https://issues.apache.org/jira/browse/CARBONDATA-793
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala
Priority: Minor


If the data has null values, count on that column should not include them, 
but it is counting them now.
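The expected semantics can be stated with a small standalone example (plain
Java, not CarbonData internals): count over a column must skip nulls, while
count(*) counts every row.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Standalone illustration of the expected SQL count semantics (not
// CarbonData code): count(column) skips nulls, count(*) does not.
class CountSemantics {
    static long countColumn(List<String> values) {
        return values.stream().filter(Objects::nonNull).count();
    }

    public static void main(String[] args) {
        List<String> col = Arrays.asList("a", null, "b", null);
        System.out.println(col.size());        // count(*)   -> 4
        System.out.println(countColumn(col));  // count(col) -> 2
    }
}
```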



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-791) Exists queries of TPC-DS are failing in carbon

2017-03-17 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-791:
--

 Summary: Exists queries of TPC-DS are failing in carbon
 Key: CARBONDATA-791
 URL: https://issues.apache.org/jira/browse/CARBONDATA-791
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Exists queries are failing in Carbon. They are required for the TPC-DS tests.

Testcase to reproduce.

{code}
val df = sqlContext.sparkContext.parallelize(1 to 1000).map(x => (x+"", 
(x+100)+"")).toDF("c1", "c2")
df.write
  .format("carbondata")
  .mode(SaveMode.Overwrite)
  .option("tableName", "carbon")
  .save()
sql("select * from carbon where c1='200' and exists(select * from carbon)")
{code}

It fails in carbon.
 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CARBONDATA-791) Exists queries of TPC-DS are failing in carbon

2017-03-17 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-791:
--

Assignee: Ravindra Pesala

> Exists queries of TPC-DS are failing in carbon
> --
>
> Key: CARBONDATA-791
> URL: https://issues.apache.org/jira/browse/CARBONDATA-791
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>
> Exists queries are failing in Carbon. They are required for the TPC-DS tests.
> Testcase to reproduce.
> {code}
> val df = sqlContext.sparkContext.parallelize(1 to 1000).map(x => (x+"", 
> (x+100)+"")).toDF("c1", "c2")
> df.write
>   .format("carbondata")
>   .mode(SaveMode.Overwrite)
>   .option("tableName", "carbon")
>   .save()
> sql("select * from carbon where c1='200' and exists(select * from carbon)")
> {code}
> It fails in carbon.
>  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-747) Add simple performance test for spark2.1 carbon integration

2017-03-16 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-747.

Resolution: Fixed
  Assignee: Jacky Li

> Add simple performance test for spark2.1 carbon integration
> ---
>
> Key: CARBONDATA-747
> URL: https://issues.apache.org/jira/browse/CARBONDATA-747
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.1.0-incubating
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-787) Fixed Memory leak in Offheap Query + added statistics for V3

2017-03-16 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-787.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Fixed Memory leak in Offheap Query + added statistics for V3
> 
>
> Key: CARBONDATA-787
> URL: https://issues.apache.org/jira/browse/CARBONDATA-787
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.0.1-incubating
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Problem: Memory leak during off-heap query; statistics also needed for V3 
> during query.
> Solution: The data block iterator needs to free the memory occupied during 
> the query.
> Added statistics for the V3 format (number of valid pages and total number 
> of pages).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-786) Data mismatch if the data is loaded across blocklet groups

2017-03-16 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-786:
--

 Summary: Data mismatch if the data is loaded across blocklet groups
 Key: CARBONDATA-786
 URL: https://issues.apache.org/jira/browse/CARBONDATA-786
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


Data mismatch occurs if the data is loaded across blocklet groups and a filter 
is applied on the second column onwards.

Testcase to reproduce:

{code} 

CarbonProperties.getInstance()
  .addProperty("carbon.blockletgroup.size.in.mb", "16")
  .addProperty("carbon.enable.vector.reader", "true")
  .addProperty("enable.unsafe.sort", "true")

val rdd = sqlContext.sparkContext
  .parallelize(1 to 120, 4)
  .map { x =>
("city" + x % 8, "country" + x % 1103, "planet" + x % 10007, x.toString,
  (x % 16).toShort, x / 2, (x << 1).toLong, x.toDouble / 13, x.toDouble 
/ 11)
  }.map { x =>
  Row(x._1, x._2, x._3, x._4, x._5, x._6, x._7, x._8, x._9)
}

val schema = StructType(
  Seq(
StructField("city", StringType, nullable = false),
StructField("country", StringType, nullable = false),
StructField("planet", StringType, nullable = false),
StructField("id", StringType, nullable = false),
StructField("m1", ShortType, nullable = false),
StructField("m2", IntegerType, nullable = false),
StructField("m3", LongType, nullable = false),
StructField("m4", DoubleType, nullable = false),
StructField("m5", DoubleType, nullable = false)
  )
)

val input = sqlContext.createDataFrame(rdd, schema)
sql(s"drop table if exists testBigData")
input.write
  .format("carbondata")
  .option("tableName", "testBigData")
  .option("tempCSV", "false")
  .option("single_pass", "true")
  .option("dictionary_exclude", "id") // id is high cardinality column
  .mode(SaveMode.Overwrite)
  .save()
sql(s"select city, sum(m1) from testBigData " +
  s"where country='country12' group by city order by city").show()
{code}

The above code is supposed to return the following data, but it does not.
{code}
+-+---+
| city|sum(m1)|
+-+---+
|city0|544|
|city1|680|
|city2|816|
|city3|952|
|city4|   1088|
|city5|   1224|
|city6|   1360|
|city7|   1496|
+-+---+
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (CARBONDATA-785) Compilation error in core module util package.

2017-03-16 Thread Ravindra Pesala (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927894#comment-15927894
 ] 

Ravindra Pesala commented on CARBONDATA-785:


Please close it as invalid. It is an issue with the format jar availability in 
the snapshot repo. Now it is working.

> Compilation error in core module util package.
> --
>
> Key: CARBONDATA-785
> URL: https://issues.apache.org/jira/browse/CARBONDATA-785
> Project: CarbonData
>  Issue Type: Bug
>  Components: build
>Affects Versions: 1.1.0-incubating
> Environment: Spark 2.1
>Reporter: Vinod Rohilla
>Priority: Blocker
>
> Getting error while build creation.
> Error:
> [INFO] -
> [ERROR] COMPILATION ERROR : 
> [INFO] -
> [ERROR] 
> /opt/jenkins/workspace/CarbonData_Functional_Suite_New/incubator-carbondata/core/src/main/java/org/apache/carbondata/core/util/DataFileFooterConverterV3.java:[60,56]
>  cannot find symbol
>   symbol:   method getTime_stamp()
>   location: variable fileHeader of type 
> org.apache.carbondata.format.FileHeader
> [ERROR] 
> /opt/jenkins/workspace/CarbonData_Functional_Suite_New/incubator-carbondata/core/src/main/java/org/apache/carbondata/core/util/CarbonMetadataUtil.java:[903,15]
>  cannot find symbol
>   symbol:   method setTime_stamp(long)
>   location: variable fileHeader of type 
> org.apache.carbondata.format.FileHeader
> [INFO] 2 errors 
> Expected Result: Compilation error should not exist while build creation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-744) The property "spark.carbon.custom.distribution" should be changed to carbon.custom.block.distribution and should be part of CarbonProperties

2017-03-15 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-744.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> The property "spark.carbon.custom.distribution" should be changed to 
> carbon.custom.block.distribution and should be part of CarbonProperties
> --
>
> Key: CARBONDATA-744
> URL: https://issues.apache.org/jira/browse/CARBONDATA-744
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Trivial
> Fix For: 1.0.1-incubating
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> The property "spark.carbon.custom.distribution" should be part of 
> CarbonProperties.
> As per the naming style adopted in carbon, we should name the key 
> carbon.custom.distribution.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-770) Filter Query not null data mismatch issue

2017-03-15 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-770.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Filter Query not null data mismatch issue
> -
>
> Key: CARBONDATA-770
> URL: https://issues.apache.org/jira/browse/CARBONDATA-770
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.0.1-incubating
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Problem: A not-null filter query is selecting null values.
> Solution: While parsing the data based on data type, we do not handle the 
> int, double, float, and long data types; cases for these need to be added.
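A minimal sketch of the described fix follows; the method and case names are
assumptions for illustration, not the actual CarbonData parser. The point is
the added numeric-type cases, with unparsable values coming back as null so a
not-null filter can exclude them.

```java
// Hedged sketch of the fix: the datatype-based parser gains cases for the
// numeric types. Names are illustrative, not CarbonData's actual code.
class TypedValueParser {
    static Object parse(String value, String dataType) {
        if (value == null) return null;
        try {
            switch (dataType) {
                case "int":    return Integer.parseInt(value);
                case "long":   return Long.parseLong(value);     // added case
                case "float":  return Float.parseFloat(value);   // added case
                case "double": return Double.parseDouble(value); // added case
                default:       return value;                     // string types
            }
        } catch (NumberFormatException e) {
            // Unparsable numeric values become null, so an "is not null"
            // filter correctly excludes them.
            return null;
        }
    }
}
```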



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-752) creating complex type gives exception

2017-03-14 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-752.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> creating complex type gives exception
> -
>
> Key: CARBONDATA-752
> URL: https://issues.apache.org/jira/browse/CARBONDATA-752
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: spark 2.1
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Trivial
> Fix For: 1.0.1-incubating
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Using a complex type in create table gives an exception:
> spark.sql(
>   s"""
>  | CREATE TABLE carbon_table(
>  |shortField short,
>  |intField int,
>  |bigintField long,
>  |doubleField double,
>  |stringField string,
>  |timestampField timestamp,
>  |decimalField decimal(18,2),
>  |dateField date,
>  |charField char(5),
>  |floatField float,
>  |complexData array<string>
>  | )
>  | STORED BY 'CARBONDATA'
>  | TBLPROPERTIES('DICTIONARY_INCLUDE'='dateField, charField')
>""".stripMargin)
> It gives the following exception:
> Caused by: java.lang.RuntimeException: Unsupported data type: 
> ArrayType(StringType,true)
> at scala.sys.package$.error(package.scala:27)
> at 
> org.apache.carbondata.spark.util.DataTypeConverterUtil$.convertToCarbonTypeForSpark2(DataTypeConverterUtil.scala:61



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-424) Data Load will fail for badrecord and "bad_records_action" is fail

2017-03-14 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-424.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Data Load will fail for badrecord and "bad_records_action" is fail
> --
>
> Key: CARBONDATA-424
> URL: https://issues.apache.org/jira/browse/CARBONDATA-424
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, data-load, spark-integration
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
> Fix For: 1.0.1-incubating
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Whenever a bad record is found in the CSV file and BAD_RECORDS_ACTION is 
> FAIL, the data load will fail with an error message giving information about 
> the bad record that caused the failure.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-766) Size based blocklet for V3

2017-03-14 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-766.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Size based blocklet for V3
> --
>
> Key: CARBONDATA-766
> URL: https://issues.apache.org/jira/browse/CARBONDATA-766
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
> Fix For: 1.0.1-incubating
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Currently the number of pages per blocklet is based on a configured fixed 
> value. The problem with this approach is that in some cases the blocklet 
> size will be small, causing more IO. To avoid this we can have size-based 
> blocklets, where the number of pages that fit in a blocklet depends on the 
> configured size, so the amount of IO will be uniform.
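The grouping rule can be sketched as follows (illustrative code with assumed
names; the real writer works on encoded page sizes): pages accumulate into the
current blocklet until adding the next page would exceed the configured size.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of size-based blocklet packing (illustrative, not CarbonData's
// writer): cut a blocklet when the configured size would be exceeded,
// rather than after a fixed number of pages.
class BlockletPacker {
    static List<List<Integer>> pack(int[] pageSizes, int blockletLimit) {
        List<List<Integer>> blocklets = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        int currentSize = 0;
        for (int size : pageSizes) {
            if (!current.isEmpty() && currentSize + size > blockletLimit) {
                blocklets.add(current);  // close the current blocklet
                current = new ArrayList<>();
                currentSize = 0;
            }
            current.add(size);
            currentSize += size;
        }
        if (!current.isEmpty()) {
            blocklets.add(current);      // last, possibly smaller, blocklet
        }
        return blocklets;
    }

    public static void main(String[] args) {
        // Four 5-unit pages with a 10-unit limit -> two blocklets of two pages.
        System.out.println(pack(new int[]{5, 5, 5, 5}, 10).size());  // 2
    }
}
```

With this rule every closed blocklet is near the configured size, so each read
fetches a uniform amount of data.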



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-732) User unable to execute the select/Load query using thrift server.

2017-03-14 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-732.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> User unable to execute the select/Load query using thrift server. 
> --
>
> Key: CARBONDATA-732
> URL: https://issues.apache.org/jira/browse/CARBONDATA-732
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.0.0-incubating
> Environment: Spark 2.1
>Reporter: Vinod Rohilla
>Assignee: anubhav tarar
> Fix For: 1.0.1-incubating
>
> Attachments: LOG_FIle
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The result is not displayed to the user when a select/load query is run.
> Steps to reproduce:
> 1. Hit the query:
> 0: jdbc:hive2://localhost:1> select * from t4;
> Note: The cursor keeps blinking on beeline.
> 2: Logs on Thrift server:
> Error sending result 
> StreamResponse{streamId=/jars/carbondata_2.11-1.0.0-incubating-SNAPSHOT-shade-hadoop2.2.0.jar,
>  byteCount=19350001, 
> body=FileSegmentManagedBuffer{file=/opt/spark-2.1.0/carbonlib/carbondata_2.11-1.0.0-incubating-SNAPSHOT-shade-hadoop2.2.0.jar,
>  offset=0, length=19350001}} to /192.168.2.179:48291; closing connection
> java.lang.AbstractMethodError
>   at io.netty.util.ReferenceCountUtil.touch(ReferenceCountUtil.java:73)
>   at 
> io.netty.channel.DefaultChannelPipeline.touch(DefaultChannelPipeline.java:107)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:811)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:724)
>   at 
> io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:111)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:739)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:731)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:817)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:724)
>   at 
> io.netty.handler.timeout.IdleStateHandler.write(IdleStateHandler.java:305)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:739)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:802)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:815)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:795)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:832)
>   at 
> io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1032)
>   at 
> io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:296)
>   at 
> org.apache.spark.network.server.TransportRequestHandler.respond(TransportRequestHandler.java:194)
>   at 
> org.apache.spark.network.server.TransportRequestHandler.processStreamRequest(TransportRequestHandler.java:150)
>   at 
> org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:111)
>   at 
> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:119)
>   at 
> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>   at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
>   at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:3

[jira] [Created] (CARBONDATA-771) Dataloading fails in V3 format for TPC-DS data.

2017-03-14 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-771:
--

 Summary: Dataloading fails in V3 format for TPC-DS data.
 Key: CARBONDATA-771
 URL: https://issues.apache.org/jira/browse/CARBONDATA-771
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala
Priority: Minor


Dataloading fails in V3 format for TPC-DS data.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-769) Support Codegen in CarbonDictionaryDecoder

2017-03-14 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-769:
--

 Summary: Support Codegen in CarbonDictionaryDecoder
 Key: CARBONDATA-769
 URL: https://issues.apache.org/jira/browse/CARBONDATA-769
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala
Assignee: Ravindra Pesala


Support Codegen in CarbonDictionaryDecoder to leverage the whole-stage codegen 
performance of Spark 2.1.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-748) "between and" filter query is very slow

2017-03-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-748.

   Resolution: Fixed
 Assignee: Jarck
Fix Version/s: 1.0.1-incubating

> "between and" filter query is very slow
> ---
>
> Key: CARBONDATA-748
> URL: https://issues.apache.org/jira/browse/CARBONDATA-748
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jarck
>Assignee: Jarck
> Fix For: 1.0.1-incubating
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Hi,
> Currently, in the include and exclude filter case, when a dimension column
> does not have an inverted index we do a linear search. We can use binary
> search when the data for that column is sorted; to get this information we
> can check in the carbon table whether the user selected no inverted index
> for that column. If the user selected no inverted index while creating the
> column, the current code is fine; if not, the data will be sorted, so we can
> use binary search, which will improve the performance.
> Please raise a JIRA for this improvement.
> -Regards
> Kumar Vishal
> On Fri, Mar 3, 2017 at 7:42 PM, 马云  wrote:
> Hi Dev,
> I used carbondata version 0.2 on my local machine and found that the
> "between and" filter query is very slow.
> The root cause is the code below in IncludeFilterExecuterImpl.java.
> It takes about 20s in my test.
> The code's time complexity is O(n*m). I think it needs to be optimized;
> please confirm. Thanks.
>   private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
>       int numerOfRows) {
>     BitSet bitSet = new BitSet(numerOfRows);
>     if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
>       FixedLengthDimensionDataChunk fixedDimensionChunk =
>           (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
>       byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
>       long start = System.currentTimeMillis();
>       for (int k = 0; k < filterValues.length; k++) {
>         for (int j = 0; j < numerOfRows; j++) {
>           if (ByteUtil.UnsafeComparer.INSTANCE
>               .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
>                   filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
>             bitSet.set(j);
>           }
>         }
>       }
>       System.out.println("loop time: " + (System.currentTimeMillis() - start));
>     }
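The suggested binary-search optimization can be sketched as follows. This is a self-contained toy version with simplified types, not CarbonData's actual FixedLengthDimensionDataChunk API: when the chunk is sorted, each filter value is binary-searched for its first match and the contiguous run of equal rows is set, turning the O(n*m) scan into roughly O(m log n).

```java
import java.util.BitSet;

// Toy sketch of the suggested optimization, assuming the chunk holds
// fixed-length values in sorted order.
public class BinaryFilterSketch {

    // Unsigned byte comparison of the fixed-length value at `row` with `key`,
    // mirroring unsigned comparer semantics.
    static int compareRow(byte[] chunk, int len, int row, byte[] key) {
        for (int i = 0; i < len; i++) {
            int diff = (chunk[row * len + i] & 0xFF) - (key[i] & 0xFF);
            if (diff != 0) return diff;
        }
        return 0;
    }

    static BitSet filter(byte[] chunk, int len, int numRows, byte[][] filterValues) {
        BitSet bitSet = new BitSet(numRows);
        for (byte[] key : filterValues) {
            int low = 0, high = numRows - 1, first = -1;
            while (low <= high) {                        // find the first match
                int mid = (low + high) >>> 1;
                int cmp = compareRow(chunk, len, mid, key);
                if (cmp < 0) {
                    low = mid + 1;
                } else {
                    if (cmp == 0) first = mid;
                    high = mid - 1;
                }
            }
            if (first < 0) continue;                     // value not present
            int row = first;                             // sorted data: matching
            while (row < numRows && compareRow(chunk, len, row, key) == 0) {
                bitSet.set(row++);                       // rows are contiguous
            }
        }
        return bitSet;
    }

    public static void main(String[] args) {
        byte[] chunk = {1, 2, 2, 3, 5, 5};               // 6 sorted 1-byte rows
        System.out.println(filter(chunk, 1, 6, new byte[][]{{2}, {5}}));
        // prints {1, 2, 4, 5}
    }
}
```

The per-value search only pays off when there is no inverted index and the data is stored sorted, which is exactly the case described above.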



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-739) Avoid creating multiple instances of DirectDictionary in DictionaryBasedResultCollector

2017-03-10 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-739.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Avoid creating multiple instances of DirectDictionary in 
> DictionaryBasedResultCollector
> ---
>
> Key: CARBONDATA-739
> URL: https://issues.apache.org/jira/browse/CARBONDATA-739
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Reporter: Ravindra Pesala
>Assignee: Cao, Lionel
>Priority: Minor
> Fix For: 1.0.1-incubating
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Avoid creating multiple instances of DirectDictionary in 
> DictionaryBasedResultCollector.
> For every row, a DirectDictionary instance is created inside the
> DictionaryBasedResultCollector.collectData method.
> Please create a single instance per column and reuse it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-757) Big decimal optimization in store and processing

2017-03-10 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-757:
--

 Summary:  Big decimal optimization in store and processing
 Key: CARBONDATA-757
 URL: https://issues.apache.org/jira/browse/CARBONDATA-757
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala
Assignee: Ravindra Pesala


Currently a decimal is converted to bytes and written to the store in LV 
(length + value) format, and while reading, the bytes are read back in LV 
format and converted back to BigDecimal.

We can do the following to improve storage and processing:
1. If the decimal precision is less than 9, we can fit it in an int (4 bytes).
2. If the decimal precision is less than 18, we can fit it in a long (8 bytes).
3. If the decimal precision is more than 18, we can fit it in fixed-length 
bytes (the byte length can vary depending on the precision, but it is always 
fixed for a given column).
With this approach we do not need to store BigDecimal in LV format; we can 
store it in a fixed format, which reduces memory.
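The three cases above can be sketched as follows. This is an illustrative encoding under the stated precision thresholds, not CarbonData's actual implementation; method names are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical sketch of the fixed-width encoding proposed above: store only
// the unscaled value, sized by the column's declared precision (the scale is
// column metadata, so it need not be stored per value).
public class DecimalEncodingSketch {

    // Bytes needed per value for a given precision, following the thresholds
    // in the description: int below 9 digits, long below 18, else fixed bytes.
    static int storageBytes(int precision) {
        if (precision < 9) return 4;                       // fits an int
        if (precision < 18) return 8;                      // fits a long
        // fixed byte width derived from the precision (+1 byte for the sign)
        return (int) Math.ceil(precision * Math.log(10) / Math.log(2) / 8) + 1;
    }

    // Encode: keep only the unscaled digits (12345.67 at scale 2 -> 1234567).
    static long encode(BigDecimal value) {
        return value.unscaledValue().longValueExact();
    }

    // Decode: reattach the column's scale from metadata.
    static BigDecimal decode(long unscaled, int scale) {
        return BigDecimal.valueOf(unscaled, scale);
    }

    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("12345.67");       // precision 7, scale 2
        System.out.println(storageBytes(price.precision())); // prints 4
        System.out.println(decode(encode(price), 2));        // prints 12345.67
    }
}
```

Because every value of a column occupies the same number of bytes, no per-value length prefix is needed, which is where the memory saving comes from.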



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-743) Remove the redundant class CarbonFilters.scala

2017-03-08 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-743.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Remove the redundant class CarbonFilters.scala
> -
>
> Key: CARBONDATA-743
> URL: https://issues.apache.org/jira/browse/CARBONDATA-743
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Cao, Lionel
>Priority: Trivial
> Fix For: 1.0.1-incubating
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Remove the redundant class CarbonFilters.scala from the spark2 package.
> Right now there are two classes named CarbonFilters in carbondata:
> 1. Delete the CarbonFilters scala file from the spark-common package.
> 2. Move the CarbonFilters scala file from the spark2 package to the spark-common package.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-740) Add logger for rows processed while closing in AbstractDataLoadProcessorStep

2017-03-08 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-740.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Add logger for rows processed while closing in AbstractDataLoadProcessorStep
> 
>
> Key: CARBONDATA-740
> URL: https://issues.apache.org/jira/browse/CARBONDATA-740
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Cao, Lionel
>Priority: Trivial
> Fix For: 1.0.1-incubating
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Add a logger for the rows processed when closing AbstractDataLoadProcessorStep.
> It is good to print the total records processed while closing the step, so
> please log the rows processed in AbstractDataLoadProcessorStep.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-746) Support spark-sql CLI for spark2.1 carbon integration

2017-03-08 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-746.

Resolution: Fixed

> Support spark-sql CLI for spark2.1 carbon integration
> -
>
> Key: CARBONDATA-746
> URL: https://issues.apache.org/jira/browse/CARBONDATA-746
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Jacky Li
>Assignee: Jacky Li
> Fix For: 1.1.0-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-691) After Compaction records count are mismatched.

2017-03-06 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-691.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> After Compaction records count are mismatched.
> --
>
> Key: CARBONDATA-691
> URL: https://issues.apache.org/jira/browse/CARBONDATA-691
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, data-query, docs
>Affects Versions: 1.0.0-incubating
>Reporter: Babulal
>Assignee: sounak chakraborty
> Fix For: 1.0.1-incubating
>
> Attachments: createLoadcmd.txt, driverlog.txt
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Spark versions: Spark 1.6.2 and Spark 2.1
> After compaction the data shown is wrong.
> Create a table and load it 4 times (compaction threshold is 4,3).
> Load the same data 4 times; each load has 105 records, as attached in the file.
> +--------------------+------------+--------------------------+--------------------------+
> | SegmentSequenceId  |   Status   | Load Start Time          | Load End Time            |
> +--------------------+------------+--------------------------+--------------------------+
> | 3                  | Compacted  | 2017-02-01 14:07:51.922  | 2017-02-01 14:07:52.591  |
> | 2                  | Compacted  | 2017-02-01 14:07:33.481  | 2017-02-01 14:07:34.443  |
> | 1                  | Compacted  | 2017-02-01 14:07:23.495  | 2017-02-01 14:07:24.167  |
> | 0.1                | Success    | 2017-02-01 14:07:52.815  | 2017-02-01 14:07:57.201  |
> | 0                  | Compacted  | 2017-02-01 14:07:07.541  | 2017-02-01 14:07:11.983  |
> +--------------------+------------+--------------------------+--------------------------+
> 5 rows selected (0.021 seconds)
> 0: jdbc:hive2://8.99.61.4:23040> select count(*) from Comp_VMALL_DICTIONARY_INCLUDE_7;
> +-----------+
> | count(1)  |
> +-----------+
> | 1680      |
> +-----------+
> 1 row selected (4.468 seconds)
> 0: jdbc:hive2://8.99.61.4:23040> select count(imei) from Comp_VMALL_DICTIONARY_INCLUDE_7;
> +--------------+
> | count(imei)  |
> +--------------+
> | 1680         |
> +--------------+
> Expected: total records should be 420.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (CARBONDATA-738) Able to load dataframe with boolean type in a carbon table but with null values

2017-03-02 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-738.

   Resolution: Fixed
Fix Version/s: 1.0.1-incubating

> Able to load dataframe with boolean type in a carbon table but with null 
> values
> ---
>
> Key: CARBONDATA-738
> URL: https://issues.apache.org/jira/browse/CARBONDATA-738
> Project: CarbonData
>  Issue Type: Bug
>  Components: spark-integration
>Affects Versions: 1.0.0-incubating
> Environment: spark 1.6,spark 2.1
>Reporter: anubhav tarar
>Assignee: anubhav tarar
>Priority: Trivial
> Fix For: 1.0.1-incubating
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I created a dataframe with a boolean type as follows:
> case class People(name: String, occupation: String, consultant: Boolean)
> val people = List(People("sangeeta", "engineer", true), People("pallavi", 
> "consultant", true))
> val peopleRDD: RDD[People] = cc.sc.parallelize(people)
> val peopleDF: DataFrame = peopleRDD.toDF("name", "occupation", "id")
>  peopleDF.write
>   .format("carbondata")
>   .option("tableName", "carbon2")
>   .option("compress", "true")
>   .mode(SaveMode.Overwrite)
>   .save()
> cc.sql("SELECT * FROM carbon2").show()
> Currently the boolean type is not supported in CarbonData, but the table
> gets created and shows null values:
> +--------+----------+----+
> |    name|occupation|  id|
> +--------+----------+----+
> | pallavi|consultant|null|
> |sangeeta|  engineer|null|
> +--------+----------+----+
> For the boolean type it should throw an unsupported-type exception.
> There is a problem with the carbon dataframe writer.
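A minimal sketch of the requested validation, under the assumption that the writer checks field types before creating the table. The type names and the SUPPORTED set here are illustrative, not CarbonData's actual API.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the validation the reporter asks for: reject
// unsupported field types up front instead of silently writing null values.
public class SchemaValidationSketch {
    // Toy whitelist of writable types; "boolean" is deliberately absent.
    static final Set<String> SUPPORTED =
        Set.of("string", "int", "long", "double", "decimal", "timestamp");

    static void validate(List<String> fieldTypes) {
        for (String t : fieldTypes) {
            if (!SUPPORTED.contains(t)) {
                throw new UnsupportedOperationException(
                    "Unsupported data type in dataframe: " + t);
            }
        }
    }

    public static void main(String[] args) {
        try {
            // name: string, occupation: string, consultant: boolean
            validate(List.of("string", "string", "boolean"));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
            // prints: Unsupported data type in dataframe: boolean
        }
    }
}
```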



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CARBONDATA-743) Remove the redundant class CarbonFilters.scala

2017-03-02 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-743:
--

Assignee: Liang Chen

> Remove the redundant class CarbonFilters.scala
> -
>
> Key: CARBONDATA-743
> URL: https://issues.apache.org/jira/browse/CARBONDATA-743
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Liang Chen
>Priority: Trivial
>
> Remove the redundant class CarbonFilters.scala from the spark2 package.
> Right now there are two classes named CarbonFilters in carbondata:
> 1. Delete the CarbonFilters scala file from the spark-common package.
> 2. Move the CarbonFilters scala file from the spark2 package to the spark-common package.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-743) Remove the redundant class CarbonFilters.scala

2017-03-02 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-743:
--

 Summary: Remove the redundant class CarbonFilters.scala
 Key: CARBONDATA-743
 URL: https://issues.apache.org/jira/browse/CARBONDATA-743
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala
Priority: Trivial


Remove the redundant class CarbonFilters.scala from the spark2 package.

Right now there are two classes named CarbonFilters in carbondata:
1. Delete the CarbonFilters scala file from the spark-common package.
2. Move the CarbonFilters scala file from the spark2 package to the spark-common package.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (CARBONDATA-742) Add batch sort to improve the loading performance

2017-03-02 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-742:
---
Description: 
Current Problem:
The sort step is a major bottleneck because it is a blocking step: it must 
receive all the data and write the sort temp files to disk, and only after 
that can the data writer step start.

Solution: 
Make the sort step non-blocking so the data writer step does not wait for it.
Process the data in the sort step in batches sized to the in-memory capacity 
of the machine. Suppose the machine can allocate 4 GB to process data in 
memory; then the sort step can sort the data in batches of 2 GB and hand each 
batch to the data writer step. While the data writer step consumes one batch, 
the sort step receives and sorts the next. So all steps work continuously and 
there is no disk IO at all in the sort step.

Thus the data writer step never waits for the sort step: as soon as the sort 
step has sorted a batch in memory, the data writer can start writing it.
This can significantly improve performance.

Advantages:
Increases loading performance, as there is no intermediate IO and no blocking 
of the sort step.
There is no extra effort for compaction; the current flow can handle it.

Disadvantages:
The number of driver-side btrees will increase, so memory might increase, but 
it can be controlled by the current LRU cache implementation.

  was:
Hi,
Current Problem:
The sort step is a major bottleneck because it is a blocking step: it must 
receive all the data and write the sort temp files to disk, and only after 
that can the data writer step start.

Solution: 
Make the sort step non-blocking so the data writer step does not wait for it.
Process the data in the sort step in batches sized to the in-memory capacity 
of the machine. Suppose the machine can allocate 4 GB to process data in 
memory; then the sort step can sort the data in batches of 2 GB and hand each 
batch to the data writer step. While the data writer step consumes one batch, 
the sort step receives and sorts the next. So all steps work continuously and 
there is no disk IO at all in the sort step.

Thus the data writer step never waits for the sort step: as soon as the sort 
step has sorted a batch in memory, the data writer can start writing it.
This can significantly improve performance.

Advantages:
Increases loading performance, as there is no intermediate IO and no blocking 
of the sort step.
There is no extra effort for compaction; the current flow can handle it.

Disadvantages:
The number of driver-side btrees will increase, so memory might increase, but 
it can be controlled by the current LRU cache implementation.


> Add batch sort to improve the loading performance
> -
>
> Key: CARBONDATA-742
> URL: https://issues.apache.org/jira/browse/CARBONDATA-742
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>
> Current Problem:
> The sort step is a major bottleneck because it is a blocking step: it must
> receive all the data and write the sort temp files to disk, and only after
> that can the data writer step start.
> Solution: 
> Make the sort step non-blocking so the data writer step does not wait for it.
> Process the data in the sort step in batches sized to the in-memory capacity
> of the machine. Suppose the machine can allocate 4 GB to process data in
> memory; then the sort step can sort the data in batches of 2 GB and hand each
> batch to the data writer step. While the data writer step consumes one batch,
> the sort step receives and sorts the next. So all steps work continuously and
> there is no disk IO at all in the sort step.
> Thus the data writer step never waits for the sort step: as soon as the sort
> step has sorted a batch in memory, the data writer can start writing it.
> This can significantly improve performance.
> Advantages:
> Increases loading performance, as there is no intermediate IO and no blocking
> of the sort step.
> There is no extra effort for compaction; the current flow can handle it.
> Disadvantages:
> The number of driver-side btrees will increase, so memory might increase, but
> it can be controlled by the current LRU cache implementation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (CARBONDATA-742) Add batch sort to improve the loading performance

2017-03-02 Thread Ravindra Pesala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala reassigned CARBONDATA-742:
--

Assignee: Ravindra Pesala

> Add batch sort to improve the loading performance
> -
>
> Key: CARBONDATA-742
> URL: https://issues.apache.org/jira/browse/CARBONDATA-742
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>
> Current Problem:
> The sort step is a major bottleneck because it is a blocking step: it must
> receive all the data and write the sort temp files to disk, and only after
> that can the data writer step start.
> Solution: 
> Make the sort step non-blocking so the data writer step does not wait for it.
> Process the data in the sort step in batches sized to the in-memory capacity
> of the machine. Suppose the machine can allocate 4 GB to process data in
> memory; then the sort step can sort the data in batches of 2 GB and hand each
> batch to the data writer step. While the data writer step consumes one batch,
> the sort step receives and sorts the next. So all steps work continuously and
> there is no disk IO at all in the sort step.
> Thus the data writer step never waits for the sort step: as soon as the sort
> step has sorted a batch in memory, the data writer can start writing it.
> This can significantly improve performance.
> Advantages:
> Increases loading performance, as there is no intermediate IO and no blocking
> of the sort step.
> There is no extra effort for compaction; the current flow can handle it.
> Disadvantages:
> The number of driver-side btrees will increase, so memory might increase, but
> it can be controlled by the current LRU cache implementation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-742) Add batch sort to improve the loading performance

2017-03-02 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-742:
--

 Summary: Add batch sort to improve the loading performance
 Key: CARBONDATA-742
 URL: https://issues.apache.org/jira/browse/CARBONDATA-742
 Project: CarbonData
  Issue Type: Improvement
Reporter: Ravindra Pesala


Hi,
Current Problem:
The sort step is a major bottleneck because it is a blocking step: it must 
receive all the data and write the sort temp files to disk, and only after 
that can the data writer step start.

Solution: 
Make the sort step non-blocking so the data writer step does not wait for it.
Process the data in the sort step in batches sized to the in-memory capacity 
of the machine. Suppose the machine can allocate 4 GB to process data in 
memory; then the sort step can sort the data in batches of 2 GB and hand each 
batch to the data writer step. While the data writer step consumes one batch, 
the sort step receives and sorts the next. So all steps work continuously and 
there is no disk IO at all in the sort step.

Thus the data writer step never waits for the sort step: as soon as the sort 
step has sorted a batch in memory, the data writer can start writing it.
This can significantly improve performance.

Advantages:
Increases loading performance, as there is no intermediate IO and no blocking 
of the sort step.
There is no extra effort for compaction; the current flow can handle it.

Disadvantages:
The number of driver-side btrees will increase, so memory might increase, but 
it can be controlled by the current LRU cache implementation.
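The pipeline described above can be sketched with a bounded queue between the sort step and the writer step. This is a toy version with integer rows standing in for the 2 GB batches; the class and method names are illustrative, not CarbonData's actual code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of the non-blocking batch sort: the sort step sorts
// fixed-size in-memory batches and hands each one to the writer over a
// bounded queue, so neither step waits for the whole load.
public class BatchSortSketch {

    static List<Integer> load(List<Integer> rows, int batchSize) throws InterruptedException {
        BlockingQueue<List<Integer>> queue = new ArrayBlockingQueue<>(2);
        List<Integer> written = Collections.synchronizedList(new ArrayList<>());

        Thread writer = new Thread(() -> {          // data writer step
            try {
                List<Integer> batch;
                while (!(batch = queue.take()).isEmpty()) {
                    written.addAll(batch);          // each batch arrives sorted
                }
            } catch (InterruptedException ignored) { }
        });
        writer.start();

        List<Integer> batch = new ArrayList<>();    // sort step
        for (int row : rows) {
            batch.add(row);
            if (batch.size() == batchSize) {        // batch full: sort, hand off
                Collections.sort(batch);
                queue.put(batch);
                batch = new ArrayList<>();
            }
        }
        if (!batch.isEmpty()) {                     // flush the last partial batch
            Collections.sort(batch);
            queue.put(batch);
        }
        queue.put(new ArrayList<>());               // empty batch signals end of load
        writer.join();
        return written;
    }

    public static void main(String[] args) throws InterruptedException {
        // Two batches of four rows each; each batch is sorted independently.
        System.out.println(load(List.of(9, 3, 7, 1, 8, 2, 6, 4), 4));
        // prints [1, 3, 7, 9, 2, 4, 6, 8]
    }
}
```

Note that the output is sorted per batch, not globally, which is why each batch ends up as its own unit on the store side (hence the extra driver-side btrees mentioned under disadvantages).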



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-741) Remove the unnecessary classes from carbondata

2017-03-02 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-741:
--

 Summary: Remove the unnecessary classes from carbondata
 Key: CARBONDATA-741
 URL: https://issues.apache.org/jira/browse/CARBONDATA-741
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala
Priority: Trivial


Please remove following classes as it is not used now.

VectorChunkRowIterator
CarbonColumnVectorImpl



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

