[jira] [Assigned] (CARBONDATA-3830) Presto read support for complex columns

2020-08-30 Thread Kumar Vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kumar Vishal reassigned CARBONDATA-3830:


Assignee: Ajantha Bhat

> Presto read support for complex columns
> ---
>
> Key: CARBONDATA-3830
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3830
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, presto-integration
>Reporter: Akshay
>Assignee: Ajantha Bhat
>Priority: Minor
> Attachments: Presto Read Support.pdf
>
>  Time Spent: 33h 40m
>  Remaining Estimate: 0h
>
> This feature enables Presto to read complex columns from CarbonData files.
> Complex columns include array, map and struct.
> The attached design document covers only the array type.
> Map and struct types will be handled later.
>  
> PR - [https://github.com/apache/carbondata/pull/3773]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3830) Presto read support for complex columns

2020-08-30 Thread Kumar Vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kumar Vishal resolved CARBONDATA-3830.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Presto read support for complex columns
> ---
>
> Key: CARBONDATA-3830
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3830
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, presto-integration
>Reporter: Akshay
>Assignee: Ajantha Bhat
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: Presto Read Support.pdf
>
>  Time Spent: 33h 40m
>  Remaining Estimate: 0h
>
> This feature enables Presto to read complex columns from CarbonData files.
> Complex columns include array, map and struct.
> The attached design document covers only the array type.
> Map and struct types will be handled later.
>  
> PR - [https://github.com/apache/carbondata/pull/3773]





[jira] [Resolved] (CARBONDATA-3555) Refactor DataMapFilter to act as a filter provider.

2019-11-19 Thread Kumar Vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kumar Vishal resolved CARBONDATA-3555.
--
Fix Version/s: 2.0.0
 Assignee: (was: kunalkapoor#1)
   Resolution: Fixed

> Refactor DataMapFilter to act as a filter provider.
> ---
>
> Key: CARBONDATA-3555
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3555
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kunalkapoor#1
>Priority: Minor
> Fix For: 2.0.0
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>
> Refactor DataMapFilter to act as a filter provider, to be used in all the 
> APIs to get the expression and the filter resolver.
>  # All filter-resolving methods will be moved to DataMapFilter.
>  # Change existing internal interfaces to take DataMapFilter instead of 
> Expression.





[jira] [Resolved] (CARBONDATA-3454) Optimize the performance of select count(*) for index server

2019-09-16 Thread kumar vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3454.
--
Resolution: Fixed

> Optimize the performance of select count(*) for index server
> ---
>
> Key: CARBONDATA-3454
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3454
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
>  Time Spent: 13h 10m
>  Remaining Estimate: 0h
>
> Currently all the extended blocklets are being returned to the main driver in 
> the count(*) case. But all this information is not required for count(*), 
> therefore the optimal thing would be to send only the required info.
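A minimal sketch of the idea, with hypothetical names (CarbonData's actual classes differ): in the count(*) case each pruning task returns only a row count, and the driver sums the counts instead of receiving every extended blocklet.

```python
def prune_task(blocklets, count_only):
    """Hypothetical pruning task; blocklets is a list of per-blocklet metadata."""
    if count_only:
        # count(*) needs only the total row count, not the full metadata
        return sum(b["row_count"] for b in blocklets)
    return blocklets  # normal scans still need the full blocklet info

# two executors' worth of pruned blocklets
pruned = [
    [{"row_count": 100}, {"row_count": 250}],
    [{"row_count": 50}],
]
total = sum(prune_task(b, count_only=True) for b in pruned)
print(total)  # 400
```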





[jira] [Resolved] (CARBONDATA-3515) Limit local dictionary size to 10% of allowed blocklet size

2019-09-12 Thread kumar vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3515.
--
  Assignee: kumar vishal
Resolution: Fixed

> Limit local dictionary size to 10% of allowed blocklet size
> ---
>
> Key: CARBONDATA-3515
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3515
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Assignee: kumar vishal
>Priority: Major
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> Problem: currently the local dictionary max size is 2 GB, so for 
> varchar columns or long string columns the local dictionary can grow to 2 GB. 
> Since the local dictionary is stored in the blocklet, the blocklet size can 
> then exceed 2 GB even though the configured maximum blocklet size is 64 MB, 
> and in some places integer overflow happens during casting.
>  
> Solution: limit the local dictionary size to 10% of the maximum allowed 
> blocklet size.
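As a sketch of the fix (the function name and the 64 MB default are assumptions for illustration, not CarbonData's actual code), the dictionary cap can be derived from the configured blocklet size instead of a fixed 2 GB constant:

```python
def local_dictionary_limit_bytes(blocklet_size_mb: int) -> int:
    # Python ints do not overflow; in Java this multiplication must be done
    # in long arithmetic to avoid the integer overflow mentioned above.
    blocklet_bytes = blocklet_size_mb * 1024 * 1024
    return blocklet_bytes // 10  # cap at 10% of the allowed blocklet size

# with a 64 MB blocklet, the local dictionary is capped at ~6.4 MB
print(local_dictionary_limit_bytes(64))  # 6710886
```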





[jira] [Resolved] (CARBONDATA-3506) Alter table add, drop, rename and datatype change fails with hive compatible property

2019-09-12 Thread kumar vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3506.
--
Resolution: Fixed

> Alter table add, drop, rename and datatype change fails with hive compatible 
> property
> 
>
> Key: CARBONDATA-3506
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3506
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> 1. Alter table add, drop, rename and datatype change fails on a partition table 
> with the hive compatible property: 
> when hive.metastore.disallow.incompatible.col.type.changes is set to true, add 
> column or any other alter fails on a partition table in Spark 2.2 and above.
> 2. When a table has only two columns, one of which is the partition column, 
> allowing the non-partition column to be dropped would leave a table whose 
> columns are all partition columns, which is invalid and fails when the above 
> property is true. So block this operation.





[jira] [Resolved] (CARBONDATA-3505) Fixed drop database cascade issue when 2 database point to same location.

2019-09-05 Thread kumar vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3505.
--
Fix Version/s: 1.6.1
   Resolution: Fixed

> Fixed drop database cascade issue when 2 database point to same location.
> -
>
> Key: CARBONDATA-3505
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3505
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.1
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Steps to reproduce: 
>  # create database x location '/x/table1'
>  # create database x1 location '/x/table1'
>  # create table in x and x1
>  # drop database x cascade
>  # drop database x1 cascade.





[jira] [Resolved] (CARBONDATA-3487) wrong Input metrics (size/record) displayed in spark UI during insert into

2019-08-08 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3487.
--
Resolution: Fixed

> wrong Input metrics (size/record) displayed in spark UI during insert into
> --
>
> Key: CARBONDATA-3487
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3487
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Minor
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Create a carbon table and 
> insert huge data (2 billion rows) into the carbon table.
> Observe the metrics in the Spark UI: both the size and the record count in the 
> input metrics are wrong in the insert-into scenario.





[jira] [Resolved] (CARBONDATA-3452) select query failure when substring on dictionary column with join

2019-08-08 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3452.
--
Resolution: Fixed

> select query failure when substring on dictionary column with join
> --
>
> Key: CARBONDATA-3452
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3452
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> h1. select query failure when substring on dictionary column with join
> "select a.ch from (select substring(s1,1,2) as ch from t1) a join t2 h on 
> (a.ch = h.t2)"
> Problem: select query fails when substring is applied on a dictionary column 
> with a join.
> Cause: when dictionary include is present, the data type is updated from 
> string to int in the plan attribute, so substring was unresolved on the int 
> column. The join operation then tries to reference this unresolved attribute.
> Solution: skip updating the datatype if dictionary is included in the plan.
>  





[jira] [Resolved] (CARBONDATA-3449) Initialization of listeners in case of concurrent scenarios is not synchronized

2019-06-26 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3449.
--
Resolution: Fixed

> Initialization of listeners in case of concurrent scenarios is not synchronized
> --
>
> Key: CARBONDATA-3449
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3449
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3448) Wrong results in preaggregate query with spark adaptive execution

2019-06-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3448.
--
Resolution: Fixed

> Wrong results in preaggregate query with spark adaptive execution
> -
>
> Key: CARBONDATA-3448
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3448
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Problem: wrong results in a preaggregate query with Spark adaptive execution:
> Spark2TestQueryExecutor.conf.set(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key, 
> "true")
>  
> Cause: for preaggregate, the segment info is set into a thread-local. When 
> adaptive execution is enabled, Spark calls getInternalPartition on another 
> thread, where the updated segment conf is not set. Hence it does not use the 
> updated segments.
>  
> Solution: CarbonScanRdd already has the sessionInfo; use it instead of 
> taking the session info from the current thread.
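The root cause generalizes beyond Spark: thread-local state set on one thread is invisible to work scheduled on another. A minimal Python sketch of both the bug and the fix (names are illustrative; CarbonData's actual code is Scala/Java):

```python
import threading

session = threading.local()
session.segments = ["seg_0", "seg_1"]  # set on the main thread

result = {}

def read_from_threadlocal():
    # new thread: the attribute set on the main thread is invisible here
    result["tl"] = getattr(session, "segments", None)

def read_from_captured(captured_segments):
    # the fix: pass the state explicitly instead of re-reading thread-locals,
    # mirroring CarbonScanRdd keeping its own copy of the session info
    result["captured"] = captured_segments

t1 = threading.Thread(target=read_from_threadlocal)
t2 = threading.Thread(target=read_from_captured, args=(session.segments,))
for t in (t1, t2):
    t.start()
    t.join()

print(result["tl"])        # None -> the bug
print(result["captured"])  # ['seg_0', 'seg_1'] -> the fix
```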





[jira] [Resolved] (CARBONDATA-3427) Beautify DAG by showing less text

2019-06-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3427.
--
Resolution: Fixed

> Beautify DAG by showing less text
> -
>
> Key: CARBONDATA-3427
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3427
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: jiangmanhua
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3441) Aggregate queries are failing on Reading from Hive

2019-06-24 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3441.
--
Resolution: Fixed

> Aggregate queries are failing on Reading from Hive
> --
>
> Key: CARBONDATA-3441
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3441
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Minor
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Aggregate queries are failing on Reading from Hive





[jira] [Resolved] (CARBONDATA-3444) MV query fails for table having same col name and table name and other issues

2019-06-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3444.
--
Resolution: Fixed

> MV query fails for table having same col name and table name and other issues
> -
>
> Key: CARBONDATA-3444
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3444
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
>  Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> 1. MV creation fails in the join case for a table having a column name equal 
> to the table name.
> 2. MV query fails for a query with a cast expression present as a projection 
> in the CTAS query with an alias.
> Use the below test cases to reproduce the issue:
>  test("test cast expression with mv") {
>    sql("drop table IF EXISTS maintable")
>    sql("create table maintable (m_month bigint, c_code string, " +
>      "c_country smallint, d_dollar_value double, q_quantity double, u_unit smallint, " +
>      "b_country smallint, i_id int, y_year smallint) stored by 'carbondata'")
>    sql("insert into maintable select 10, 'xxx', 123, 456, 45, 5, 23, 1, 2000")
>    sql("drop datamap if exists da_cast")
>    sql("create datamap da_cast using 'mv' as select cast(floor((m_month +1000) / 900) * 900 - 2000 AS INT) as a, c_code as abc, m_month from maintable")
>    val df1 = sql("select cast(floor((m_month +1000) / 900) * 900 - 2000 AS INT) as a, c_code as abc from maintable")
>    val df2 = sql("select cast(floor((m_month +1000) / 900) * 900 - 2000 AS INT), c_code as abc from maintable")
>    val analyzed1 = df1.queryExecution.analyzed
>    assert(TestUtil.verifyMVDataMap(analyzed1, "da_cast"))
>  }
>
>  test("test mv query when the column names and table name same in join scenario") {
>    sql("drop table IF EXISTS price")
>    sql("drop table IF EXISTS quality")
>    sql("create table price(product string, price int) stored by 'carbondata'")
>    sql("create table quality(product string, quality string) stored by 'carbondata'")
>    sql("create datamap same_mv using 'mv' as select price.product, price.price, quality.product, quality.quality from price, quality where price.product = quality.product")
>    val df1 = sql("select price.product from price, quality where price.product = quality.product")
>    val analyzed1 = df1.queryExecution.analyzed
>    assert(TestUtil.verifyMVDataMap(analyzed1, "same_mv"))
>  }





[jira] [Resolved] (CARBONDATA-3445) In Aggregate query, CountStarPlan throws head of empty list error

2019-06-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3445.
--
Resolution: Fixed

> In Aggregate query, CountStarPlan throws head of empty list error
> -
>
> Key: CARBONDATA-3445
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3445
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3373) Optimize scenes with in numbers in SQL

2019-06-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3373.
--
Resolution: Fixed

> Optimize scenes with in numbers in SQL
> --
>
> Key: CARBONDATA-3373
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3373
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Reporter: zhxiaoping
>Priority: Critical
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> When a SQL query contains 'in (numbers)' and spark.sql.codegen.wholeStage is 
> false, the query is slow.
> The reason is that the CarbonScan row-level filter's time complexity is 
> O(n^2); we can replace the list with a hash set to improve query performance.
> SQL example: select * from xx where field in (1,2,3,4,5,6)
>  
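A quick sketch of why the hash-set swap helps (Python used for illustration; CarbonData's row-level filter is Java, but the complexity argument is the same): membership tests against a list cost O(n) each, while a hash set gives O(1) average lookups.

```python
import timeit

in_values = list(range(2000))    # the IN-list values, as a list
in_values_set = set(in_values)   # the same values, as a hash set
rows = list(range(10_000))       # rows flowing through the filter

# identical results, very different cost per membership test
list_time = timeit.timeit(lambda: [r for r in rows if r in in_values], number=1)
set_time = timeit.timeit(lambda: [r for r in rows if r in in_values_set], number=1)
print(set_time < list_time)  # True: O(1) hash lookups beat O(n) list scans
```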





[jira] [Assigned] (CARBONDATA-3373) Optimize scenes with in numbers in SQL

2019-06-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-3373:


Assignee: kumar vishal

> Optimize scenes with in numbers in SQL
> --
>
> Key: CARBONDATA-3373
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3373
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Reporter: zhxiaoping
>Assignee: kumar vishal
>Priority: Critical
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> When a SQL query contains 'in (numbers)' and spark.sql.codegen.wholeStage is 
> false, the query is slow.
> The reason is that the CarbonScan row-level filter's time complexity is 
> O(n^2); we can replace the list with a hash set to improve query performance.
> SQL example: select * from xx where field in (1,2,3,4,5,6)
>  





[jira] [Assigned] (CARBONDATA-3373) Optimize scenes with in numbers in SQL

2019-06-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-3373:


Assignee: (was: kumar vishal)

> Optimize scenes with in numbers in SQL
> --
>
> Key: CARBONDATA-3373
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3373
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core
>Reporter: zhxiaoping
>Priority: Critical
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> When a SQL query contains 'in (numbers)' and spark.sql.codegen.wholeStage is 
> false, the query is slow.
> The reason is that the CarbonScan row-level filter's time complexity is 
> O(n^2); we can replace the list with a hash set to improve query performance.
> SQL example: select * from xx where field in (1,2,3,4,5,6)
>  





[jira] [Created] (CARBONDATA-3447) Index Server Performance Improvement

2019-06-20 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3447:


 Summary: Index Server Performance Improvement
 Key: CARBONDATA-3447
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3447
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Reporter: kumar vishal
Assignee: kumar vishal


Problem:

When the number of splits is high, index server performance is slow compared 
to the old flow (driver caching). This is because more data is transferred over 
the network, causing a performance bottleneck.

Solution:
 # If the amount of data is small it can be sent over the network; when it 
grows, write it to a file and send only the file name, and the main driver 
will read the file and construct the input splits.
 # Use Snappy to compress the data, so the size of data transferred over the 
network/written to file is smaller and IO time won't impact performance.
 # In the main driver, pruning is done in multiple threads; add the same for 
the index executor, as the index executor will now do the pruning.
 # In the case of block cache there is no need to send the BlockletDetailInfo 
object, as it is large and can be constructed in the executor from the file 
footer.
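A sketch of points 1 and 2, with hypothetical names and using zlib in place of Snappy (Python's standard library has no Snappy binding; the shape of the size-threshold decision is the same):

```python
import os
import tempfile
import zlib

INLINE_LIMIT = 1024  # hypothetical threshold, in bytes

def executor_response(serialized_splits: bytes):
    """Index-server side: compress, then ship inline or spill to a file."""
    compressed = zlib.compress(serialized_splits)
    if len(compressed) <= INLINE_LIMIT:
        return ("inline", compressed)
    fd, path = tempfile.mkstemp(suffix=".splits")
    with os.fdopen(fd, "wb") as f:
        f.write(compressed)
    return ("file", path)

def driver_read(response):
    """Driver side: decompress inline bytes, or read back the spill file."""
    kind, payload = response
    if kind == "inline":
        return zlib.decompress(payload)
    with open(payload, "rb") as f:
        data = zlib.decompress(f.read())
    os.remove(payload)
    return data

small = b"a handful of input splits"
large = os.urandom(200_000)  # random bytes barely compress, so this spills
assert driver_read(executor_response(small)) == small
assert driver_read(executor_response(large)) == large
print(executor_response(small)[0])  # inline
```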





[jira] [Resolved] (CARBONDATA-3413) Arrow allocators give OutOfMemory error when tested with huge data

2019-06-12 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3413.
--
Resolution: Fixed

> Arrow allocators give an OutOfMemory error when tested with huge data
> -
>
> Key: CARBONDATA-3413
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3413
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Arrow allocators give an OutOfMemory error when tested with huge data.
>  
> Problem: OOM exception in Arrow with huge data.
> Cause: in ArrowConverter, the allocator is not closed.
> Solution: close the allocator in ArrowConverter.
> Also handle the problems in the test utility API.
>  
>  
>  
>  





[jira] [Resolved] (CARBONDATA-3410) Add UDF, Hex/Base64 SQL functions for binary

2019-06-12 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3410.
--
Resolution: Fixed

> Add UDF, Hex/Base64 SQL functions for binary
> 
>
> Key: CARBONDATA-3410
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3410
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Add UDF, Hex/Base64 SQL functions for binary





[jira] [Resolved] (CARBONDATA-3421) Create table without column with properties failed, but throw incorrect exception

2019-06-12 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3421.
--
Resolution: Fixed

> Create table without column with properties failed, but throw incorrect 
> exception
> -
>
> Key: CARBONDATA-3421
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3421
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Yahui Liu
>Priority: Minor
> Attachments: screenshot-1.png
>
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> 1. Create a table without columns but with correct tblproperties: create table 
> ll stored by 'carbondata' tblproperties('sort_columns'='');
> 2. Create table fails with the exception "Invalid table properties 
> sort_columns", which is incorrect. It should throw a correct exception, such 
> as "no schema specified".
> !screenshot-1.png!





[jira] [Resolved] (CARBONDATA-3394) Clean files taking lot of time to finish

2019-05-30 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3394.
--
Resolution: Fixed

> Clean files taking lot of time to finish
> 
>
> Key: CARBONDATA-3394
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3394
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Clean files takes a lot of time to finish, even though there are no 
> marked-for-delete segments.





[jira] [Assigned] (CARBONDATA-3394) Clean files taking lot of time to finish

2019-05-30 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-3394:


Assignee: (was: kumar vishal)

> Clean files taking lot of time to finish
> 
>
> Key: CARBONDATA-3394
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3394
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Clean files takes a lot of time to finish, even though there are no 
> marked-for-delete segments.





[jira] [Resolved] (CARBONDATA-3358) Support configurable decode for loading binary data, support base64 and Hex decode.

2019-05-30 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3358.
--
Resolution: Fixed

> Support configurable decode for loading binary data, support base64 and Hex 
> decode.
> ---
>
> Key: CARBONDATA-3358
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3358
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Support configurable decode for loading binary data, support base64 and Hex 
> decode.
> 1. support configurable decode for loading
> 2. test datamap: mv, preaggregate, timeseries, bloomfilter, lucene
> 3. test datamap and configurable decode





[jira] [Assigned] (CARBONDATA-3394) Clean files taking lot of time to finish

2019-05-30 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-3394:


Assignee: kumar vishal

> Clean files taking lot of time to finish
> 
>
> Key: CARBONDATA-3394
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3394
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: kumar vishal
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Clean files takes a lot of time to finish, even though there are no 
> marked-for-delete segments.





[jira] [Resolved] (CARBONDATA-3343) Support Compaction for Range Sort

2019-05-07 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3343.
--
Resolution: Fixed

> Support Compaction for Range Sort
> -
>
> Key: CARBONDATA-3343
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3343
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: MANISH NALLA
>Priority: Major
> Attachments: Support Compaction for Range.docx
>
>  Time Spent: 19h 50m
>  Remaining Estimate: 0h
>
> CarbonData supports compaction for all sort scopes based on task IDs, 
> i.e., we group the partitions (carbondata files) of different 
> segments which have the same taskId into one task and then compact. But this 
> would not be the correct way to handle compaction in the case of range 
> sort, where the data is divided into different ranges for different 
> segments.
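A toy illustration of the point above (hypothetical data, not CarbonData's actual structures): grouping range-sorted files by taskId can pair files holding different ranges, while grouping by range keeps each compacted output range-sorted.

```python
# files from two range-sorted segments; ranges need not align with task ids
files = [
    {"segment": 0, "task": 0, "range": (0, 50)},
    {"segment": 0, "task": 1, "range": (50, 100)},
    {"segment": 1, "task": 0, "range": (50, 100)},
    {"segment": 1, "task": 1, "range": (0, 50)},
]

def group_for_compaction(files, by):
    groups = {}
    for f in files:
        groups.setdefault(f[by], []).append(f)
    return groups

# grouping by task mixes ranges within one compaction task...
by_task = group_for_compaction(files, "task")
print(sorted(f["range"] for f in by_task[0]))  # [(0, 50), (50, 100)]

# ...while grouping by range merges the same range across segments
by_range = group_for_compaction(files, "range")
print(sorted(f["segment"] for f in by_range[(0, 50)]))  # [0, 1]
```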





[jira] [Resolved] (CARBONDATA-3345) Presto query in Carbondata-streaming failed

2019-05-07 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3345.
--
Resolution: Fixed

> Presto query in Carbondata-streaming failed
> ---
>
> Key: CARBONDATA-3345
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3345
> Project: CarbonData
>  Issue Type: Bug
>  Components: presto-integration
>Affects Versions: 1.5.2
>Reporter: zjy
>Priority: Blocker
>  Labels: features
> Attachments: image-2019-04-09-01-39-45-724.png, 
> image-2019-04-09-01-50-37-797.png, image-2019-04-09-01-51-11-120.png, 
> image-2019-04-09-01-51-53-115.png, image-2019-04-09-01-53-18-685.png, 
> image-2019-04-09-01-54-05-491.png
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> I use streaming to save switch syslog. Here's my table DDL:
> CREATE TABLE IF NOT EXISTS syslog(id LONG, device_id LONG, ip STRING, message 
> STRING, level SHORT, message_type CHAR(1), port_index INT, area_id LONG, 
> createdon TIMESTAMP) STORED AS carbondata TBLPROPERTIES 
> ('INVERTED_INDEX'='device_id,level,area_id','SORT_COLUMNS'='device_id,level,area_id,id','streaming'='true')
>  
> Here's a record example:
> !image-2019-04-09-01-39-45-724.png!
>  I mock ten thousand switches, each producing a record every 10 seconds over 
> a day.
> Early on, Presto queries returned correct results.
> However, as the data grew, the Presto query results differed from the 
> Spark SQL results.
>  
> !image-2019-04-09-01-50-37-797.png!
>  
> !image-2019-04-09-01-51-11-120.png!
>  
> !image-2019-04-09-01-51-53-115.png!
>  
>  
> !image-2019-04-09-01-53-18-685.png!
>  
> !image-2019-04-09-01-54-05-491.png!
>  
> I am looking forward to this being resolved, thanks!





[jira] [Resolved] (CARBONDATA-3359) Data mismatch issue for decimal column after delete operation

2019-05-02 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3359.
--
Resolution: Fixed

> Data mismatch issue for decimal column after delete operation
> -
>
> Key: CARBONDATA-3359
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3359
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> After the delete operation, the decimal column data is incorrect.
>  
> sql(
>   s"""create table decimal_table(smallIntField smallInt, intField int, 
>      bigIntField bigint, floatField float, doubleField double, 
>      decimalField decimal(25, 4), timestampField timestamp, dateField date, 
>      stringField string, varcharField varchar(10), charField char(10)) 
>      stored as carbondata""".stripMargin)
> sql(s"load data local inpath '$resourcesPath/decimalData.csv' into table 
> decimal_table")
> sql("drop table if exists decimal_table")
>  
>  
>  





[jira] [Resolved] (CARBONDATA-3341) Query is giving NULL in result

2019-04-15 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3341.
--
Resolution: Fixed

> Query is giving NULL in result
> --
>
> Key: CARBONDATA-3341
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3341
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> 1. set bad_records_action=force and 
> carbon.push.rowfilters.for.vector=true
> 2. create table t1(a bigint) stored by 'carbondata' 
> TBLPROPERTIES('sort_columns'='a');
> 3. insert into t1 select 'k';
> 4. insert into t1 select 1;
> 5. select * from t1 where a = 1 or a = 0;
>  
> +------+
> | a    |
> +------+
> | NULL |
> | 1    |
> +------+
>  





[jira] [Created] (CARBONDATA-3335) Fixed load and compaction failure after alter done in older version

2019-03-29 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3335:


 Summary: Fixed load and compaction failure after alter done in 
older version
 Key: CARBONDATA-3335
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3335
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


*No-sort load/compaction fails in the latest version when alter was done in an 
older version*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3333) Fixed No Sort Store Size issue and Compatibility issue after alter add column done in 1.1 and load in 1.5

2019-03-27 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-:


 Summary: Fixed No Sort Store Size issue and Compatibility issue 
after alter add column done in 1.1 and load in 1.5
 Key: CARBONDATA-
 URL: https://issues.apache.org/jira/browse/CARBONDATA-
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal


Issue 1: Load is failing in latest version with alter in older version

Issue 2: After PR#3140 store size got increased  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3328) Performance issue with merge small files distribution

2019-03-25 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3328:


 Summary: Performance issue with merge small files distribution
 Key: CARBONDATA-3328
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3328
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal


After PR#3154, in the merge-small-files case the split length was computed as 0, 
so all the files were merged; as a result, queries with merge small files 
enabled were slow.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3321) Improve Single/Concurrent query performance

2019-03-21 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3321:


 Summary: Improve Single/Concurrent query performance 
 Key: CARBONDATA-3321
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3321
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


When the number of segments is high, single/concurrent queries are slow for the 
reasons below:
 # Memory footprint is high, so GC is frequent, reducing query performance
 # Unsafe data map rows are converted to safe data map rows during pruning
 # Multi-threaded pruning is not supported for non-filter queries
 # Retrieval from unsafe data map rows is slower



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3302) code cleaning related to CarbonCreateTable command

2019-03-20 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3302.
--
Resolution: Fixed

> code cleaning related to CarbonCreateTable command
> --
>
> Key: CARBONDATA-3302
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3302
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Sujith Chacko
>Priority: Minor
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Extra check to validate whether the stream relation is not null. Moreover, 
> the condition can be optimized further: currently the condition first 
> validates whether the path is part of the S3 file system and then checks 
> whether the stream relation is not null. This check can be done first, since 
> the overall condition has to be evaluated for stream tables.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3297) Throw IndexOutOfBoundsException when creating table and drop table at the same time

2019-03-12 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3297.
--
Resolution: Fixed
  Assignee: kumar vishal

> Throw IndexOutOfBoundsException when creating table and drop table at the 
> same time
> ---
>
> Key: CARBONDATA-3297
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3297
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Chenjian Qiu
>Assignee: kumar vishal
>Priority: Major
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> java.lang.IndexOutOfBoundsException: 179
> at 
> scala.collection.mutable.ResizableArray$class.apply(ResizableArray.scala:43)
> at scala.collection.mutable.ArrayBuffer.apply(ArrayBuffer.scala:48)
> at 
> scala.collection.IndexedSeqOptimized$class.segmentLength(IndexedSeqOptimized.scala:195)
> at 
> scala.collection.mutable.ArrayBuffer.segmentLength(ArrayBuffer.scala:48)
> at scala.collection.GenSeqLike$class.prefixLength(GenSeqLike.scala:93)
> at scala.collection.AbstractSeq.prefixLength(Seq.scala:41)
> at 
> scala.collection.IndexedSeqOptimized$class.find(IndexedSeqOptimized.scala:50)
> at scala.collection.mutable.ArrayBuffer.find(ArrayBuffer.scala:48)
> at 
> org.apache.spark.sql.hive.CarbonFileMetastore.getTableFromMetadataCache(CarbonFileMetastore.scala:203)
> at org.apache.spark.sql.CarbonEnv$.getCarbonTable(CarbonEnv.scala:203)
> at org.apache.spark.sql.CarbonEnv$.getTablePath(CarbonEnv.scala:288)
> at 
> org.apache.spark.sql.execution.command.table.CarbonCreateTableCommand$$anonfun$1.apply(CarbonCreateTableCommand.scala:74)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3307) When creating a table without sort_columns and loading data into it, more carbondata files than expected are generated.

2019-03-12 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3307.
--
Resolution: Fixed
  Assignee: kumar vishal

> When creating a table without sort_columns and loading data into it, more 
> carbondata files than expected are generated.
> -
>
> Key: CARBONDATA-3307
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3307
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Shivam Goyal
>Assignee: kumar vishal
>Priority: Minor
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3280) SDK batch read failed

2019-01-30 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3280.
--
Resolution: Fixed

> SDK batch read failed
> -
>
> Key: CARBONDATA-3280
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3280
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> SDK batch read failed



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3235) AlterTableRename and PreAgg Datamap Fail Issue

2019-01-28 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-3235:
-
Description: 
h3. Alter Table Rename Table Fail
 * When the table rename succeeds in Hive but fails in the carbon data store, 
it throws an exception but does not roll back the rename in Hive.

h3.  

  was:
h3. Alter Table Rename Table Fail
 * When the table rename succeeds in Hive but fails in the carbon data store, 
it throws an exception but does not roll back the rename in Hive.

h3. Create-Preaggregate-Datamap Fail
 * When the (preaggregate) datamap schema is written but the table update fails
-> CarbonDropDataMapCommand.processMetadata() is called
-> dropDataMapFromSystemFolder() is called -> this is supposed to delete the 
folder on disk, but doesn't, as the datamap is not yet updated in the table, 
and it throws NoSuchDataMapException


> AlterTableRename and PreAgg Datamap Fail Issue
> --
>
> Key: CARBONDATA-3235
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3235
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Minor
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> h3. Alter Table Rename Table Fail
>  * When the table rename succeeds in Hive but fails in the carbon data 
> store, it throws an exception but does not roll back the rename in Hive.
> h3.  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3264) Support SORT_SCOPE in ALTER TABLE SET Command

2019-01-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3264.
--
Resolution: Fixed

> Support SORT_SCOPE in ALTER TABLE SET Command
> -
>
> Key: CARBONDATA-3264
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3264
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Minor
>  Time Spent: 5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-3267) Data loading is failing with OOM using range sort

2019-01-23 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3267:


 Summary: Data loading is failing with OOM using range sort
 Key: CARBONDATA-3267
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3267
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


h3. Problem:

Range sort is failing with OOM.
h3. Root cause:

UnsafeSortStorageMemory is not able to control the off-heap memory, so when 
huge data is loaded an OOM exception is thrown from 
UnsafeMemoryAllocator.allocate.
h3. Solution:

Added code to control sort storage memory: after sorting the rows, add the 
sorted records to sort storage memory only if memory is available, otherwise 
write them to disk.
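The spill decision described above can be sketched as follows. This is an illustrative model, not the actual CarbonData code: the names `SortStorage`, `memoryLimitBytes` and the size estimate are assumptions made for the example.

```scala
import scala.collection.mutable.ArrayBuffer

// Illustrative sketch: gate additions to in-memory sort storage on the
// available working memory; spill to "disk" (a buffer here) otherwise.
final class SortStorage(memoryLimitBytes: Long) {
  private val inMemory  = ArrayBuffer.empty[Array[AnyRef]]
  private val spilled   = ArrayBuffer.empty[Array[AnyRef]] // stands in for disk writes
  private var usedBytes = 0L

  // Crude per-row size estimate, purely for demonstration.
  private def estimateSize(row: Array[AnyRef]): Long = 64L * row.length

  def addSortedRow(row: Array[AnyRef]): Unit = {
    val size = estimateSize(row)
    if (usedBytes + size <= memoryLimitBytes) { // memory available: keep in memory
      inMemory += row
      usedBytes += size
    } else {
      spilled += row                            // otherwise write to disk
    }
  }

  def inMemoryCount: Int = inMemory.size
  def spilledCount: Int  = spilled.size
}
```

The point is that off-heap allocation is bounded up front, so no single load can push `UnsafeMemoryAllocator`-style allocations past the configured limit.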



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3267) Data loading is failing with OOM using range sort

2019-01-23 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-3267:
-
Description: 
h3. Problem:

Range sort is failing with OOM.
h3. Root cause:

UnsafeSortStorageMemory is not able to control the off-heap memory, so when 
huge data is loaded an OOM exception is thrown from 
UnsafeMemoryAllocator.allocate.
h3. Solution:

Control sort storage memory: after sorting the rows, add the sorted records to 
sort storage memory only if memory is available, otherwise write them to disk.

  was:
h3. Problem:

Range sort is failing with OOM.
h3. Root cause:

UnsafeSortStorageMemory is not able to control the off-heap memory, so when 
huge data is loaded an OOM exception is thrown from 
UnsafeMemoryAllocator.allocate.
h3. Solution:

Added code to control sort storage memory: after sorting the rows, add the 
sorted records to sort storage memory only if memory is available, otherwise 
write them to disk.


> Data loading is failing with OOM using range sort
> -
>
> Key: CARBONDATA-3267
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3267
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
>
> h3. Problem:
> Range sort is failing with OOM.
> h3. Root cause:
> UnsafeSortStorageMemory is not able to control the off-heap memory, so when 
> huge data is loaded an OOM exception is thrown from 
> UnsafeMemoryAllocator.allocate.
> h3. Solution:
> Control sort storage memory: after sorting the rows, add the sorted records 
> to sort storage memory only if memory is available, otherwise write them to 
> disk.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3257) Data load is in no-sort flow when version is upgraded even if sort columns are given. Also, describe formatted displays wrong sort scope after refresh.

2019-01-23 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3257.
--
Resolution: Fixed

> Data load is in no-sort flow when version is upgraded even if sort columns 
> are given. Also, describe formatted displays wrong sort scope after refresh.
> -
>
> Key: CARBONDATA-3257
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3257
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3237) optimize presto query time for dictionary include string column

2019-01-09 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3237.
--
Resolution: Fixed

> optimize presto query time for dictionary include string column
> ---
>
> Key: CARBONDATA-3237
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3237
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> optimize presto query time for dictionary include string column.
>  
> problem: currently, for each query, presto carbon creates a dictionary block 
> for string columns. This happens on every query, and if cardinality is high 
> the block takes more time to build. This is not required; we can look values 
> up using a normal dictionary lookup instead.
>  
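The "normal dictionary lookup" idea can be sketched as a plain array index by surrogate key, instead of materializing a dictionary block per query. The class and the 1-based key convention below are illustrative assumptions, not Presto or CarbonData APIs.

```scala
// Illustrative sketch: decode the dictionary once, then resolve string
// values by surrogate key with a plain array lookup per query.
final class SimpleDictionary(values: Array[String]) {
  // surrogate keys are 1-based in this sketch; 0 would denote null
  def lookup(surrogateKey: Int): String = values(surrogateKey - 1)
}

val dict = new SimpleDictionary(Array("red", "green", "blue"))
```

Building the decoded array once and reusing it avoids the per-query rebuild cost, which grows with cardinality.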



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3200) No-Sort Compaction

2019-01-09 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3200.
--
Resolution: Fixed

> No-Sort Compaction
> --
>
> Key: CARBONDATA-3200
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3200
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Major
>  Time Spent: 14h
>  Remaining Estimate: 0h
>
> When data is loaded with SORT_SCOPE as NO_SORT and then compacted, the data 
> still remains unsorted. This does not affect queries much, but the major 
> purpose of compaction is to better pack the data and improve query 
> performance.
>  
> Now, the expected behaviour of compaction is to sort the data, so that after 
> compaction query performance becomes better. The columns to sort upon are 
> provided by SORT_COLUMNS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3236) JVM Crash for insert into new table from old table

2019-01-09 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3236.
--
Resolution: Fixed

> JVM Crash for insert into new table from old table
> --
>
> Key: CARBONDATA-3236
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3236
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3235) AlterTableRename and PreAgg Datamap Fail Issue

2019-01-09 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3235.
--
Resolution: Fixed

> AlterTableRename and PreAgg Datamap Fail Issue
> --
>
> Key: CARBONDATA-3235
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3235
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Minor
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> h3. Alter Table Rename Table Fail
>  * When the table rename succeeds in Hive but fails in the carbon data 
> store, it throws an exception but does not roll back the rename in Hive.
> h3. Create-Preaggregate-Datamap Fail
>  * When the (preaggregate) datamap schema is written but the table update 
> fails
> -> CarbonDropDataMapCommand.processMetadata() is called
> -> dropDataMapFromSystemFolder() is called -> this is supposed to delete the 
> folder on disk, but doesn't, as the datamap is not yet updated in the table, 
> and it throws NoSuchDataMapException



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3201) SORT_SCOPE in LOAD_OPTIONS

2019-01-09 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3201.
--
Resolution: Fixed
  Assignee: Naman Rastogi

> SORT_SCOPE in LOAD_OPTIONS
> --
>
> Key: CARBONDATA-3201
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3201
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Major
>  Time Spent: 8h 10m
>  Remaining Estimate: 0h
>
> Prerequisite: 
> [CARBONDATA-3200|https://issues.apache.org/jira/projects/CARBONDATA/issues/CARBONDATA-3200]
>  
> If compaction always sorts the data, then we can take advantage of faster 
> loading speed: if we provide SORT_COLUMNS in the CREATE TABLE command, we can 
> still load some data with SORT_SCOPE as NO_SORT, which loads faster. During 
> off-peak time the user can then COMPACT the data, improving subsequent query 
> performance.
>  
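The precedence implied by a per-load SORT_SCOPE option can be sketched as below. The function name, option keys and the NO_SORT fallback are assumptions for illustration, not the exact CarbonData resolution logic.

```scala
// Illustrative sketch: a SORT_SCOPE given in LOAD options overrides the
// table-level property; otherwise the table property applies, with a
// NO_SORT fallback assumed here when neither is set.
def effectiveSortScope(tableProps: Map[String, String],
                       loadOptions: Map[String, String]): String =
  loadOptions.getOrElse("sort_scope",
    tableProps.getOrElse("sort_scope", "NO_SORT"))

val table = Map("sort_scope" -> "LOCAL_SORT")
```

So a table created with LOCAL_SORT can still accept fast NO_SORT loads during peak hours, then be compacted later.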



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3217) Optimize implicit filter expression performance by removing extra serialization

2019-01-04 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3217.
--
Resolution: Fixed
  Assignee: Manish Gupta

> Optimize implicit filter expression performance by removing extra 
> serialization
> ---
>
> Key: CARBONDATA-3217
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3217
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> # Currently all the filter values are serialized for every task, which 
> increases scheduler delay and thereby impacts query performance.
>  # For each task, deserialization happens twice on the executor side, which 
> is not required; once is sufficient.
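The single-deserialization idea can be sketched with a `lazy val`: the bytes are produced once on the driver side, and each task deserializes them exactly once no matter how often the values are used. The `TaskFilter` class is an illustrative stand-in, not CarbonData's filter API.

```scala
import java.io._

// Serialize an object graph to bytes once (driver side in the sketch).
def serialize(obj: AnyRef): Array[Byte] = {
  val bos = new ByteArrayOutputStream()
  val oos = new ObjectOutputStream(bos)
  oos.writeObject(obj); oos.close()
  bos.toByteArray
}

// Illustrative per-task holder: the lazy val guarantees a single
// deserialization however many times `values` is accessed.
final class TaskFilter(bytes: Array[Byte]) {
  var deserializations = 0
  lazy val values: Set[String] = {
    deserializations += 1
    new ObjectInputStream(new ByteArrayInputStream(bytes))
      .readObject().asInstanceOf[Set[String]]
  }
}

val bytes  = serialize(Set("blockId1", "blockId2"))
val filter = new TaskFilter(bytes)
```

Repeated accesses to `filter.values` hit the cached result, which is the behaviour the fix aims for instead of deserializing twice per task.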



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3212) Select * is failing with java.lang.NegativeArraySizeException in SDK flow

2019-01-02 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3212.
--
Resolution: Fixed

> Select * is failing with java.lang.NegativeArraySizeException in SDK flow
> -
>
> Key: CARBONDATA-3212
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3212
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.1
>Reporter: Shivam Goyal
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3218) Schema is not refreshing in presto which is changed in spark carbon.

2019-01-02 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3218.
--
Resolution: Fixed

> Schema is not refreshing in presto which is changed in spark carbon.
> 
>
> Key: CARBONDATA-3218
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3218
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> A schema updated in spark carbon is not reflected in presto, which results 
> in wrong query results in presto.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3195) Added validation for inverted index

2018-12-28 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3195.
--
Resolution: Fixed
  Assignee: Shardul Singh

> Added validation for inverted index
> ---
>
> Key: CARBONDATA-3195
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3195
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3192) Compaction Compatibilty Failure

2018-12-24 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3192.
--
Resolution: Fixed

> Compaction Compatibilty Failure
> ---
>
> Key: CARBONDATA-3192
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3192
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A table created, loaded and altered (column added) in version 1.5.1, then 
> refreshed, altered (the added column dropped), loaded and compacted with 
> varchar columns in the new version, gives an error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3186) NPE when all the records in a file is badrecord with action redirect/ignore

2018-12-23 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3186.
--
Resolution: Fixed

> NPE when all the records in a file is badrecord with action redirect/ignore
> ---
>
> Key: CARBONDATA-3186
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3186
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> *problem:* In the no_sort flow, the writer is opened early as there is no 
> blocking sort step. So, when all the records in a file go as bad records 
> with the redirect action in the converter step, the writer closes an empty 
> .carbondata file. When this empty carbondata file is queried, we get 
> multiple issues, including NPE.
> *solution:* When the file size is 0 bytes, do the following:
> a) If there is one data file and one index file -- delete the carbondata 
> file and avoid index file creation.
> b) If there are multiple data files and one index file (with a few data 
> files full of bad records) -- delete those carbondata files and remove them 
> from blockIndexInfoList, so the index file will not contain info about the 
> empty carbon files.
> c) In case direct write to store path is enabled -- delete the data file 
> from there and avoid writing the index file with that carbondata file's info.
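Cases (a) and (b) above amount to pruning zero-byte files before the index is written. A minimal sketch, with `BlockIndexInfo` and the function name as illustrative assumptions:

```scala
// Illustrative sketch: drop zero-byte data files from the block-index
// list so the index file never references empty carbondata files; the
// returned second list holds files to physically delete. If the kept
// list ends up empty (case a), index file creation is skipped entirely.
final case class BlockIndexInfo(fileName: String, sizeInBytes: Long)

def pruneEmptyBlocks(blocks: List[BlockIndexInfo]): (List[BlockIndexInfo], List[String]) = {
  val (kept, empty) = blocks.partition(_.sizeInBytes > 0)
  (kept, empty.map(_.fileName))
}

val blocks = List(
  BlockIndexInfo("part-0.carbondata", 1024),
  BlockIndexInfo("part-1.carbondata", 0)    // all rows were bad records
)
val (kept, toDelete) = pruneEmptyBlocks(blocks)
```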



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3179) DataLoad Failure in Map Data Type

2018-12-20 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3179.
--
Resolution: Fixed

> DataLoad Failure in Map Data Type
> -
>
> Key: CARBONDATA-3179
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3179
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Data Load failing for insert into table select * from table containing Map 
> datatype



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3187) Global Dictionary Support for Complex Map

2018-12-20 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3187.
--
Resolution: Fixed
  Assignee: MANISH NALLA

> Global Dictionary Support for Complex Map
> -
>
> Key: CARBONDATA-3187
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3187
> Project: CarbonData
>  Issue Type: Task
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2563) Explain query with Order By operator fires a Spark job, which increases explain query time

2018-12-12 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2563.
--
Resolution: Fixed
  Assignee: Sujith

> Explain query with Order By operator fires a Spark job, which increases 
> explain query time
> 
>
> Key: CARBONDATA-2563
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2563
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Sujith
>Priority: Minor
> Attachments: image-2018-05-29-20-02-58-129.png
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Create Table (hive or carbon )
> create table justtesthive( name string,age int)
> insert into justtesthive select 'babu',12;
>  
> WithCarbonSession means startCarbonThriftserver
>  
>  
> Without CarbonSession (with SparkSession)
> 0: jdbc:hive2://10.18.222.231:23040> explain select name from justtesthive 
> order by name;
> ++--+
> | plan |
> ++--+
> | == Physical Plan ==
> *Sort [name#15 ASC NULLS FIRST], true, 0
> +- Exchange rangepartitioning(name#15 ASC NULLS FIRST, 200)
>  +- HiveTableScan [name#15], HiveTableRelation `default`.`justtesthive`, 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [name#15, age#16] |
> ++--+
> 1 row selected (0.089 seconds)
> 0: jdbc:hive2://10.18.222.231:23040> 
>  
>  
>  
> With CarbonSession.
>  
> 0: jdbc:hive2://10.18.222.231:23040> explain select name from justtesthive 
> order by name;
> +--+--+
> | plan |
> +--+--+
> | == CarbonData Profiler ==
>  |
> | == Physical Plan ==
> *Sort [name#1867 ASC NULLS FIRST], true, 0
> +- Exchange rangepartitioning(name#1867 ASC NULLS FIRST, 200)
>  +- HiveTableScan [name#1867], HiveTableRelation `default`.`justtesthive`, 
> org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, [name#1867, age#1868] |
> +--+--+
> 2 rows selected (11.609 seconds)
>  
> Explain takes 10 seconds with CarbonSession because a Spark job is fired, 
> but for explain no job should be fired.
> !image-2018-05-29-20-02-58-129.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3005) Supporting Gzip as Column Compressor

2018-12-11 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3005.
--
Resolution: Fixed

> Supporting Gzip as Column Compressor
> 
>
> Key: CARBONDATA-3005
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3005
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Currently CarbonData uses Snappy as the default codec to compress its 
> columnar files; other than SNAPPY, carbondata supports zstd. This issue is 
> targeted to support:
> 1. Gzip compression codec.
> Benefits of Gzip are:
>  # Gzip offers reduced file size compared to other codecs like snappy, but 
> at the cost of processing speed.
>  # Gzip is suitable for users who have cold data, i.e. data which is stored 
> permanently and will be queried rarely.
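The size/CPU trade-off can be seen with the JDK's own gzip stream on a repetitive, column-like payload; this is a generic demonstration, not CarbonData's compressor integration.

```scala
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPOutputStream

// Compress a byte payload with gzip and return the compressed bytes.
def gzipCompress(data: Array[Byte]): Array[Byte] = {
  val bos = new ByteArrayOutputStream()
  val gz  = new GZIPOutputStream(bos)
  gz.write(data); gz.close()
  bos.toByteArray
}

// Repetitive data, as column pages often are: 10,000 bytes input.
val columnPage = ("carbondata" * 1000).getBytes("UTF-8")
val compressed = gzipCompress(columnPage)
```

For cold data that is written once and rarely read, the extra compression CPU is paid rarely while the storage saving is permanent.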



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3145) Avoid duplicate decoding for complex column pages while querying

2018-12-10 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3145.
--
Resolution: Fixed

> Avoid duplicate decoding for complex column pages while querying
> 
>
> Key: CARBONDATA-3145
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3145
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3143) Fix local dictionary issue for presto

2018-12-10 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3143.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Fix local dictionary issue for presto
> -
>
> Key: CARBONDATA-3143
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3143
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Fix local dictionary issue for presto



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3138) Random count mismatch in query in multi-thread block-pruning scenario

2018-11-29 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3138.
--
Resolution: Fixed

> Random count mismatch in query in multi-thread block-pruning scenario
> -
>
> Key: CARBONDATA-3138
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3138
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> problem: Random count mismatch in queries in the multi-thread block-pruning 
> scenario.
> cause: The existing prune method was not meant for multi-threading, as 
> synchronization was missing. Only in the implicit filter scenario, while 
> preparing the block ID list, synchronization was missing; hence pruning was 
> giving wrong results.
> solution: synchronize the implicit filter preparation, as prune is now 
> called from multiple threads.
>  
>  
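The race described above is the classic unsynchronized shared-list append; a minimal sketch of the fix guards the block-ID list with a lock so concurrent pruning threads cannot lose entries. All names here are illustrative.

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.mutable.ListBuffer

// Shared list built during (simulated) multi-threaded pruning.
val blockIds = ListBuffer.empty[String]
val lock     = new Object

// Guarded append: without the synchronized block, concurrent appends to
// a ListBuffer can drop entries, producing a wrong pruning result.
def addBlockId(id: String): Unit = lock.synchronized {
  blockIds += id
}

val pool = Executors.newFixedThreadPool(4)
(1 to 100).foreach { i =>
  pool.submit(new Runnable { def run(): Unit = addBlockId(s"block-$i") })
}
pool.shutdown()
pool.awaitTermination(10, TimeUnit.SECONDS)
```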



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3114) Remove Null Values for a Dictionary_Include Timestamp column for Range Filters

2018-11-22 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3114.
--
Resolution: Fixed

> Remove Null Values for a Dictionary_Include Timestamp column for Range Filters
> --
>
> Key: CARBONDATA-3114
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3114
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Minor
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Issue:
> Null values are not removed in the case of range filters if the column is a 
> dictionary and no_inverted_index column.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3115) Fix CodeGen error in preaggregate table and codegen display issue in oldstores

2018-11-22 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3115.
--
Resolution: Fixed

> Fix CodeGen error in preaggregate table and codegen display issue in oldstores
> --
>
> Key: CARBONDATA-3115
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3115
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Attachments: image-2018-11-21-20-28-38-226.png
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Issues:
>  * While querying a preaggregate table, a codegen error is displayed.
>  * In old stores, code is getting displayed while executing queries.
> !image-2018-11-21-20-28-38-226.png!





[jira] [Resolved] (CARBONDATA-3096) Wrong records size on the input metrics

2018-11-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3096.
--
Resolution: Fixed

> Wrong records size on the input metrics
> ---
>
> Key: CARBONDATA-3096
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3096
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Attachments: 3096.PNG
>
>  Time Spent: 11.5h
>  Remaining Estimate: 0h
>
> (1) The scanned record result size is taken from the default batch size; it 
> should be taken from the number of records actually scanned.
> h1. +*Steps to reproduce:*+
> spark.sql("DROP TABLE IF EXISTS person")
>  spark.sql("create table person (id int, name string) stored by 'carbondata'")
>  spark.sql("insert into person select 1,'a'")
>  spark.sql("select * from person").show(false)
> !3096.PNG!
>  
> (2) The intermediate page used to sort in adaptive encoding should be freed.





[jira] [Created] (CARBONDATA-3113) Fixed Local Dictionary Query Performance and Added reusable buffer for direct flow

2018-11-20 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3113:


 Summary: Fixed Local Dictionary Query Performance  and Added 
reusable buffer for direct flow
 Key: CARBONDATA-3113
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3113
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


1. Added a reusable buffer for the direct flow.
During a query, a byte array is created for each page of each column; when the 
number of columns is high this causes frequent minor GC and degrades query 
performance. Since pages are uncompressed one by one, the same buffer can be 
reused for all columns and resized on demand to the requested size.
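
The buffer-reuse scheme in point 1 can be sketched roughly as below; the class name and shape are illustrative, not CarbonData's actual API:

```java
// Illustrative sketch (not CarbonData's actual API): one byte[] is shared
// across all column pages of a query and grown only when a page requests
// more room than the current allocation, so per-page allocations and the
// resulting minor GCs are avoided.
class ReusableBuffer {
    private byte[] buffer = new byte[0];

    // Returns a buffer of at least requestedSize bytes, reusing the
    // previous allocation whenever it is already large enough.
    public byte[] getBuffer(int requestedSize) {
        if (buffer.length < requestedSize) {
            buffer = new byte[requestedSize];
        }
        return buffer;
    }
}
```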

2. Fixed a Local Dictionary performance issue.
Reverted #2895 and fixed the NPE by setting the local dictionary to null on the 
vector in the safe and unsafe VariableLengthDataChunkStore.





[jira] [Created] (CARBONDATA-3048) Added Lazy Loading For 2.2/2.1

2018-10-26 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3048:


 Summary: Added Lazy Loading For 2.2/2.1 
 Key: CARBONDATA-3048
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3048
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


Problem:

Currently, lazy loading is not implemented for direct fill in Spark 2.2/2.1. 
Because of this, when the data volume is huge and the number of columns is 
high, queries take more time to execute.

Solution:

Add lazy loading for Spark 2.2 and 2.1.
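
The lazy-loading idea can be illustrated with a small sketch (hypothetical names, not the actual CarbonData classes): the expensive column decode is deferred until the column is actually read.

```java
import java.util.function.Supplier;

// Illustrative lazy-loading sketch: a query that never touches the column
// never pays for decoding it; the first read triggers the decode and later
// reads reuse the decoded value.
final class LazyColumn<T> {
    private final Supplier<T> loader;
    private T value;

    LazyColumn(Supplier<T> loader) {
        this.loader = loader;
    }

    // Decode on first access only.
    T get() {
        if (value == null) {
            value = loader.get();
        }
        return value;
    }
}
```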





[jira] [Created] (CARBONDATA-3047) UnsafeMemoryManager fallback mechanism in case of memory not available

2018-10-26 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3047:


 Summary: UnsafeMemoryManager fallback mechanism in case of memory 
not available 
 Key: CARBONDATA-3047
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3047
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


Currently, when unsafe working memory is not available, UnsafeMemoryManager 
throws a MemoryException and kills the running task.
To make the system more forgiving for the user, a fallback to heap memory is 
now added when off-heap memory is not available.
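
The fallback pattern can be sketched as below; this is an illustration of the idea, not UnsafeMemoryManager's real implementation:

```java
import java.nio.ByteBuffer;

// Illustrative sketch: attempt an off-heap (direct) allocation first, and
// instead of failing the task when off-heap memory is exhausted, fall back
// to an on-heap buffer of the same size.
class FallbackAllocator {
    public ByteBuffer allocate(int size) {
        try {
            return ByteBuffer.allocateDirect(size); // off-heap attempt
        } catch (OutOfMemoryError offHeapExhausted) {
            return ByteBuffer.allocate(size);       // on-heap fallback
        }
    }
}
```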





[jira] [Resolved] (CARBONDATA-3015) Support lazy loading in carbon.

2018-10-26 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3015.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Support lazy loading in carbon. 
> 
>
> Key: CARBONDATA-3015
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3015
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Support lazy loading in carbon. 





[jira] [Resolved] (CARBONDATA-3014) Support inverted index and delete delta fillings to vector for direct fill vector

2018-10-26 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3014.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Support inverted index and delete delta fillings to vector for direct fill 
> vector
> -
>
> Key: CARBONDATA-3014
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3014
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> Support inverted index and delete delta fillings to vector for direct fill 
> vector





[jira] [Resolved] (CARBONDATA-3013) Support filter interface to allow prune the pages and fill the vector.

2018-10-26 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3013.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Support filter interface to allow prune the pages and fill the vector.
> --
>
> Key: CARBONDATA-3013
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3013
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> Support a filter interface to allow pruning the pages and filling the vector. 
> After pages are pruned through min/max metadata, the column pages are decoded 
> and the data is filled directly into the vector.





[jira] [Resolved] (CARBONDATA-3012) Support full scan queries for vector direct fill.

2018-10-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3012.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Support full scan queries for vector direct fill.
> -
>
> Key: CARBONDATA-3012
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3012
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 20h 20m
>  Remaining Estimate: 0h
>
> Add support for full scan queries, which fill the vector after decoding 
> the column page.





[jira] [Resolved] (CARBONDATA-3011) Add carbon property to configure vector based row pruning push down

2018-10-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-3011.
--
Resolution: Fixed

> Add carbon property to configure vector based row pruning push down
> ---
>
> Key: CARBONDATA-3011
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3011
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Add configuration in carbon to enable or disable row filter push down for 
> vector





[jira] [Assigned] (CARBONDATA-3011) Add carbon property to configure vector based row pruning push down

2018-10-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-3011:


Assignee: Ravindra Pesala

> Add carbon property to configure vector based row pruning push down
> ---
>
> Key: CARBONDATA-3011
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3011
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Add configuration in carbon to enable or disable row filter push down for 
> vector





[jira] [Assigned] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata

2018-10-16 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1516:


Assignee: kumar vishal

> Support pre-aggregate tables and timeseries in carbondata
> -
>
> Key: CARBONDATA-1516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Assignee: kumar vishal
>Priority: Major
> Fix For: 1.4.0
>
> Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData 
> Pre-aggregation Table_v1.1.pdf, CarbonData Pre-aggregation Table_v1.2.pdf, 
> CarbonData Pre-aggregation Table_v1.3.pdf
>
>
> Currently CarbonData has standard SQL capability on distributed data sets. 
> CarbonData should support pre-aggregated tables for timeseries and improve 
> query performance.





[jira] [Created] (CARBONDATA-3016) Refactor No Dictionary Dimension Column Query Processing Code

2018-10-16 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3016:


 Summary: Refactor No Dictionary Dimension Column Query Processing 
Code
 Key: CARBONDATA-3016
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3016
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


*Method In-lining Optimization*
The JIT will inline a method if its size is under 325 bytes of bytecode and it 
is called more than 10K times (default values). A private or static method is 
easier for the JIT to inline because no type-safety check is required; for 
protected/public methods the type check adds overhead, and the method will not 
behave as inlined.
Because of this, some refactoring is done for primitive no-dictionary data 
type columns. Earlier, ColumnPageWrapper.java handled query processing for all 
primitive no-dictionary data type columns; now separate classes are created 
for each data type, the hot methods are kept private, protected methods are 
overridden, and the remaining methods are added in the superclasses.
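
A rough illustration of the shape of this refactor (class names are hypothetical, not the actual CarbonData classes): each primitive type gets its own page class whose hot decode path is a tiny private method, keeping it well under the JIT's inlining size threshold and free of the type checks that hinder inlining of public virtual calls.

```java
// Hypothetical shape of the per-data-type refactor.
abstract class PrimitivePageReader {
    abstract long getLong(int rowId);
}

final class IntPageReader extends PrimitivePageReader {
    private final int[] page;

    IntPageReader(int[] page) {
        this.page = page;
    }

    // Tiny private hot method: trivially inlinable by the JIT, with no
    // type-safety check as required for protected/public virtual calls.
    private long decode(int rowId) {
        return page[rowId];
    }

    @Override
    long getLong(int rowId) {
        return decode(rowId);
    }
}
```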





[jira] [Updated] (CARBONDATA-3006) Carbon Store Size Optimization and Query Performance Improvement

2018-10-15 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-3006:
-
Summary: Carbon Store Size Optimization and Query Performance Improvement  
(was: Carbon Store Size Optimization and Scan Query Performance Improvement)

> Carbon Store Size Optimization and Query Performance Improvement
> 
>
> Key: CARBONDATA-3006
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3006
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Priority: Major
>
> *String/Varchar Datatype Store Size Optimization:*
> Currently the length is stored as Short/Int for the String/Varchar data 
> type, which increases the store size. To reduce it, adaptive encoding is 
> applied to the length part irrespective of String/Varchar type, so there is 
> no separate handling for String versus Varchar during processing.
> *String/Varchar datatype query processing optimization:*
> Currently, to process the String/Varchar data type during a query, offsets 
> (positions of data) are calculated and data is fetched by position. This 
> causes many cache-line misses and degrades query performance.
> To handle this, for full scan queries with no inverted index, data is 
> fetched linearly to avoid cache-line misses.
> *Adaptive encoding for Global/Direct/Local dictionary columns*
> Currently Global/Direct/Local dictionary values are stored in binary format 
> and only snappy compression is applied. As the dictionary values are of 
> Integer data type, they can be adaptively stored with a data type smaller 
> than Integer.
> Added adaptive encoding for global/direct dictionary columns to reduce the 
> store size.
> *Method In-lining Optimization*
> The JIT will inline a method if its size is under 325 bytes of bytecode and 
> it is called more than 10K times (default values). A private or static 
> method is easier for the JIT to inline because no type-safety check is 
> required; for protected/public methods the type check adds overhead, and 
> the method will not behave as inlined.
> Because of this, some refactoring is done for primitive no-dictionary data 
> type columns. Earlier, ColumnPageWrapper.java handled query processing for 
> all primitive no-dictionary data type columns; now separate classes are 
> created for each data type, the hot methods are kept private, protected 
> methods are overridden, and the remaining methods are added in the 
> superclasses.





[jira] [Created] (CARBONDATA-3006) Carbon Store Size Optimization and Scan Query Performance Improvement

2018-10-15 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-3006:


 Summary: Carbon Store Size Optimization and Scan Query Performance 
Improvement
 Key: CARBONDATA-3006
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3006
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


*String/Varchar Datatype Store Size Optimization:*
Currently the length is stored as Short/Int for the String/Varchar data type, 
which increases the store size. To reduce it, adaptive encoding is applied to 
the length part irrespective of String/Varchar type, so there is no separate 
handling for String versus Varchar during processing.

*String/Varchar datatype query processing optimization:*
Currently, to process the String/Varchar data type during a query, offsets 
(positions of data) are calculated and data is fetched by position. This 
causes many cache-line misses and degrades query performance.
To handle this, for full scan queries with no inverted index, data is fetched 
linearly to avoid cache-line misses.

*Adaptive encoding for Global/Direct/Local dictionary columns*
Currently Global/Direct/Local dictionary values are stored in binary format 
and only snappy compression is applied. As the dictionary values are of 
Integer data type, they can be adaptively stored with a data type smaller 
than Integer.
Added adaptive encoding for global/direct dictionary columns to reduce the 
store size.
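
As a rough sketch of the adaptive idea (an illustrative helper, not the encoder's real API): the narrowest width that can hold a page's maximum surrogate value is chosen instead of always writing four bytes per value.

```java
// Illustrative width selection for dictionary surrogate keys, which are
// non-negative integers: store each value in the narrowest primitive that
// fits the page's maximum value instead of a full 4-byte int.
final class AdaptiveDictWidth {
    static int bytesPerValue(int maxSurrogate) {
        if (maxSurrogate <= 0xFF) {
            return 1; // fits in an unsigned byte
        }
        if (maxSurrogate <= 0xFFFF) {
            return 2; // fits in an unsigned short
        }
        return 4;     // full int
    }
}
```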

*Method In-lining Optimization*
The JIT will inline a method if its size is under 325 bytes of bytecode and it 
is called more than 10K times (default values). A private or static method is 
easier for the JIT to inline because no type-safety check is required; for 
protected/public methods the type check adds overhead, and the method will not 
behave as inlined.
Because of this, some refactoring is done for primitive no-dictionary data 
type columns. Earlier, ColumnPageWrapper.java handled query processing for all 
primitive no-dictionary data type columns; now separate classes are created 
for each data type, the hot methods are kept private, protected methods are 
overridden, and the remaining methods are added in the superclasses.





[jira] [Resolved] (CARBONDATA-2594) Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column

2018-10-04 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2594.
--
Resolution: Fixed

> Incorrect logic when set 'Encoding.INVERTED_INDEX' for each dimension column
> 
>
> Key: CARBONDATA-2594
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2594
> Project: CarbonData
>  Issue Type: Improvement
>  Components: spark-integration
>Reporter: Zhichao  Zhang
>Assignee: Jacky Li
>Priority: Minor
> Fix For: 1.5.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> All non-sort dimension columns are set as 'Encoding.INVERTED_INDEX' columns. 
> This is wrong: only the columns defined in 'SORT_COLUMN' and not in 
> 'NO_INVERTED_INDEX' need to be set as 'Encoding.INVERTED_INDEX' columns.





[jira] [Created] (CARBONDATA-2992) Fixed Between Query Data Mismatch issue for timestamp data type

2018-10-04 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2992:


 Summary: Fixed Between Query Data Mismatch issue for timestamp 
data type
 Key: CARBONDATA-2992
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2992
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


*Problem:*
Between query gives wrong results.
*Root cause:*
For a timestamp column, when the filter is given in yyyy-MM-dd format instead 
of yyyy-MM-dd HH:mm:ss format, a cast is added. CastExpressionOptimization 
uses a SimpleDateFormat object to parse the filter value, which fails because 
the filter value does not match the full pattern.
*Solution:*
Use Spark's DateTimeUtils.stringToTime method, as Spark already handles this 
scenario.
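
The root cause can be reproduced with plain SimpleDateFormat: a formatter built for the full timestamp pattern rejects a date-only literal. (Spark's DateTimeUtils.stringToTime accepts both forms, which is why delegating to it fixes the issue.)

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

// Minimal reproduction of the parse failure described above: the full
// "yyyy-MM-dd HH:mm:ss" pattern cannot parse a date-only filter literal.
final class DateOnlyParse {
    static boolean parses(String value) {
        try {
            new SimpleDateFormat("yyyy-MM-dd HH:mm:ss").parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }
}
```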





[jira] [Assigned] (CARBONDATA-2978) JVM crashes when data inserted from one table to other table with unsafe true

2018-09-28 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-2978:


Assignee: kumar vishal

> JVM crashes when data inserted from one table to other table with unsafe true
> -
>
> Key: CARBONDATA-2978
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2978
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: kumar vishal
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> JVM crashes when data inserted from one table to other table with unsafe true





[jira] [Resolved] (CARBONDATA-2978) JVM crashes when data inserted from one table to other table with unsafe true

2018-09-28 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2978.
--
Resolution: Fixed

> JVM crashes when data inserted from one table to other table with unsafe true
> -
>
> Key: CARBONDATA-2978
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2978
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> JVM crashes when data inserted from one table to other table with unsafe true





[jira] [Assigned] (CARBONDATA-2978) JVM crashes when data inserted from one table to other table with unsafe true

2018-09-28 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-2978:


Assignee: Ravindra Pesala  (was: kumar vishal)

> JVM crashes when data inserted from one table to other table with unsafe true
> -
>
> Key: CARBONDATA-2978
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2978
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> JVM crashes when data inserted from one table to other table with unsafe true





[jira] [Resolved] (CARBONDATA-2970) Basic queries like drop table and load are not working in ViewFS

2018-09-27 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2970.
--
Resolution: Fixed

> Basic queries like drop table and load are not working in ViewFS
> 
>
> Key: CARBONDATA-2970
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2970
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When the default FS is set to ViewFS, drop table and load fail with the 
> following exception:
> org.apache.carbondata.spark.exception.ProcessMetaDataException: operation 
> failed for default.tb: Dropping table default.tb failed: Acquire table lock 
> failed after retry, please try after some time
>  at 
> org.apache.spark.sql.execution.command.MetadataProcessOpeation$class.throwMetadataException(package.scala:52)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.throwMetadataException(package.scala:86)
>  at 
> org.apache.spark.sql.execution.command.table.CarbonDropTableCommand.processMetadata(CarbonDropTableCommand.scala:157)
>  at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:90)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:71)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:245)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:177)
>  





[jira] [Resolved] (CARBONDATA-2968) Single pass load fails 2nd time in Spark submit execution due to port binding error

2018-09-26 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2968.
--
Resolution: Fixed

> Single pass load fails 2nd time in Spark submit execution due to port binding 
> error
> ---
>
> Key: CARBONDATA-2968
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2968
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2962) Even after carbon file is copied to targetfolder(local/hdfs), carbon files is not deleted from temp directory

2018-09-26 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2962.
--
Resolution: Fixed

> Even after carbon file is copied to targetfolder(local/hdfs), carbon files is 
> not deleted from temp directory
> -
>
> Key: CARBONDATA-2962
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2962
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2960) SDK reader not working without projection columns

2018-09-25 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2960.
--
Resolution: Fixed

> SDK reader not working without projection columns
> -
>
> Key: CARBONDATA-2960
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2960
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2954) Fix error when create external table command fired when path already exists

2018-09-24 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2954.
--
Resolution: Fixed

> Fix error when create external table command fired when path already exists
> ---
>
> Key: CARBONDATA-2954
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2954
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Shardul Singh
>Assignee: Shardul Singh
>Priority: Minor
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2958) Compaction with CarbonProperty 'carbon.enable.page.level.reader.in.compaction' enabled fails as Compressor is null

2018-09-24 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2958.
--
Resolution: Fixed

> Compaction with CarbonProperty 
> 'carbon.enable.page.level.reader.in.compaction' enabled fails as Compressor 
> is null
> --
>
> Key: CARBONDATA-2958
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2958
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2950) Alter table add columns fails for hive table in carbon session for spark version above 2.1

2018-09-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2950.
--
Resolution: Fixed

> Alter table add columns fails for hive table in carbon session for spark 
> version above 2.1
> --
>
> Key: CARBONDATA-2950
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2950
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Spark does not support adding columns in Spark 2.1, but it is supported in 
> 2.2 and above.
> When add column is fired for a Hive table in a carbon session on a Spark 
> version above 2.1, it throws an "unsupported operation on hive table" error.





[jira] [Resolved] (CARBONDATA-2953) Dataload fails when sort column is given, and query returns null value from another session

2018-09-21 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2953.
--
Resolution: Fixed

> Dataload fails when sort column is given, and query returns null value from 
> another session
> ---
>
> Key: CARBONDATA-2953
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2953
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> # When data load is done with sort columns, it fails with the following 
> exception:
> java.lang.ClassCastException: java.lang.Integer cannot be cast to [B
>  at 
> org.apache.carbondata.processing.sort.sortdata.IntermediateSortTempRowComparator.compare(IntermediateSortTempRowComparator.java:71)
>  at 
> org.apache.carbondata.processing.loading.sort.unsafe.holder.UnsafeInmemoryHolder.compareTo(UnsafeInmemoryHolder.java:71)
>  at 
> org.apache.carbondata.processing.loading.sort.unsafe.holder.UnsafeInmemoryHolder.compareTo(UnsafeInmemoryHolder.java:26)
>  at java.util.PriorityQueue.siftUpComparable(PriorityQueue.java:656)
>  at java.util.PriorityQueue.siftUp(PriorityQueue.java:647)
>  at java.util.PriorityQueue.offer(PriorityQueue.java:344)
>  at java.util.PriorityQueue.add(PriorityQueue.java:321)
>  at 
> org.apache.carbondata.processing.loading.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger.startSorting(UnsafeSingleThreadFinalSortFilesMerger.java:129)
>  at 
> org.apache.carbondata.processing.loading.sort.unsafe.merger.UnsafeSingleThreadFinalSortFilesMerger.startFinalMerge(UnsafeSingleThreadFinalSortFilesMerger.java:94)
>  at 
> org.apache.carbondata.processing.loading.sort.impl.UnsafeParallelReadMergeSorterImpl.sort(UnsafeParallelReadMergeSorterImpl.java:110)
>  at 
> org.apache.carbondata.processing.loading.steps.SortProcessorStepImpl.execute(SortProcessorStepImpl.java:55)
>  at 
> org.apache.carbondata.processing.loading.steps.DataWriterProcessorStepImpl.execute(DataWriterProcessorStepImpl.java:112)
>  at 
> org.apache.carbondata.processing.loading.DataLoadExecutor.execute(DataLoadExecutor.java:51)
>  at 
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD$$anon$1.<init>(NewCarbonDataLoadRDD.scala:212)
>  at 
> org.apache.carbondata.spark.rdd.NewCarbonDataLoadRDD.internalCompute(NewCarbonDataLoadRDD.scala:188)
>  at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:78)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>  # When two sessions are running in parallel, follow the below steps in 
> session1:
>  ## drop table
>  ## create table
>  ## load data to table
>  # Follow the below step in session2:
>  ## query the table (select * from table limit 1); the query returns a null 
> result instead of the proper result.





[jira] [Resolved] (CARBONDATA-2889) Support Decoder based fall back mechanism in Local Dictionary

2018-09-10 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2889.
--
Resolution: Fixed

> Support Decoder based fall back mechanism in Local Dictionary
> -
>
> Key: CARBONDATA-2889
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2889
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
>  Time Spent: 9h 20m
>  Remaining Estimate: 0h
>
> Currently, when fallback is initiated for a column page in case of local 
> dictionary, we keep both the encoded data
> and the actual data in memory, then form the new column page without 
> dictionary encoding, and only at the end free the encoded column page.
> Because of this, the offheap memory footprint increases.
>  
> We can reduce the offheap memory footprint using a decoder-based fallback 
> mechanism.
> This means there is no need to keep the actual data along with the encoded 
> data in the encoded column page. We can keep only the encoded data; to form a 
> new column page, get the dictionary keys from the encoded column page by 
> uncompressing it, use the local dictionary generator to map the keys back to 
> actual data, put the actual data in the newly created column page, compress 
> it again, and give it to the consumer for writing the blocklet. 
>  
> The above process may slow down loading, but it reduces the memory 
> footprint. So we can provide a property that decides whether to take the 
> current fallback procedure or the decoder-based fallback mechanism during 
> fallback.
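The decoder-based fallback described above can be sketched roughly as follows. This is a minimal illustration with hypothetical names (the class, field, and method names below are not CarbonData's actual API): only the dictionary-encoded keys are held in memory, and the plain column page is rebuilt on demand by decoding through the local dictionary.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of decoder-based fallback (hypothetical names): keep only the
// dictionary-encoded keys in memory and rebuild the plain page during
// fallback instead of holding both copies the whole time.
public class DecoderFallbackSketch {

    // Local dictionary: surrogate key -> actual value (index 0 unused here).
    static final List<String> LOCAL_DICTIONARY =
        Arrays.asList(null, "apple", "banana", "cherry");

    // Decode the encoded page (surrogate keys) back into actual values using
    // the local dictionary, producing the dictionary-free column page.
    static String[] fallbackDecode(int[] encodedKeys) {
        String[] actual = new String[encodedKeys.length];
        for (int i = 0; i < encodedKeys.length; i++) {
            actual[i] = LOCAL_DICTIONARY.get(encodedKeys[i]);
        }
        return actual;
    }

    public static void main(String[] args) {
        int[] encodedPage = {1, 3, 2, 1};                 // only keys held in memory
        String[] plainPage = fallbackDecode(encodedPage); // rebuilt during fallback
        System.out.println(Arrays.toString(plainPage));   // [apple, cherry, banana, apple]
    }
}
```

In the real flow the rebuilt page would then be compressed again and handed to the consumer for blocklet writing; the trade-off is extra decode CPU during fallback in exchange for a smaller offheap footprint.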





[jira] [Resolved] (CARBONDATA-2895) [Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp files) scenario.

2018-09-05 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2895.
--
Resolution: Fixed

> [Batch-sort]Query result mismatch with Batch-sort in save to disk (sort temp 
> files) scenario.
> -
>
> Key: CARBONDATA-2895
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2895
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Problem: Query result mismatch with batch sort in the save-to-disk (sort temp 
> files) scenario.
> Scenario:
> a) Configure batch sort but give a batch size larger than 
> UnsafeMemoryManager.INSTANCE.getUsableMemory().
> b) Load data that is greater than the batch size. Observe that the unsafe 
> memory manager saves to disk, as it cannot process one batch in memory.
> c) So the load happens in 2 batches. 
> d) When querying, the result has more data rows than expected.
> Root cause:
> For each batch, createSortDataRows() will be called.
> Files saved to disk during sorting of the previous batch were considered for 
> this batch.
> Solution:
> Files saved to disk during sorting of the previous batch should not be 
> considered for this batch.
> Hence use the batch id as the rangeID field of the sort temp files,
> so getFilesToMergeSort() will select files of only this batch.
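The fix described above can be sketched as follows. The `getFilesToMergeSort` name comes from the issue text; the `SortTempFile` class and field names are hypothetical stand-ins for illustration, not CarbonData's actual types: each sort temp file carries the batch id as its range id, and merge selection filters on it, so leftovers from a previous batch are never merged into the current one.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the batchID-as-rangeID fix (hypothetical types): merge selection
// keeps only the sort temp files produced by the current batch.
public class BatchSortMergeSketch {

    static class SortTempFile {
        final String name;
        final int rangeId; // set to the id of the batch that wrote the file
        SortTempFile(String name, int rangeId) {
            this.name = name;
            this.rangeId = rangeId;
        }
    }

    // Select only the current batch's files for the final merge sort,
    // ignoring files left on disk by earlier batches.
    static List<SortTempFile> getFilesToMergeSort(List<SortTempFile> onDisk,
                                                  int currentBatchId) {
        List<SortTempFile> selected = new ArrayList<>();
        for (SortTempFile f : onDisk) {
            if (f.rangeId == currentBatchId) {
                selected.add(f);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<SortTempFile> onDisk = new ArrayList<>();
        onDisk.add(new SortTempFile("batch1_part0.sorttemp", 1)); // leftover from batch 1
        onDisk.add(new SortTempFile("batch2_part0.sorttemp", 2));
        onDisk.add(new SortTempFile("batch2_part1.sorttemp", 2));
        // Only batch 2's two files are selected; the leftover is skipped.
        System.out.println(getFilesToMergeSort(onDisk, 2).size()); // 2
    }
}
```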





[jira] [Resolved] (CARBONDATA-2898) Double boundary condition and clear datamaps are not working properly

2018-08-30 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2898.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Double boundary condition and clear datamaps are not working properly
> -
>
> Key: CARBONDATA-2898
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2898
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
>  # DataMaps are not cleared properly, as a temp table is created for each 
> request. 
>  # In double-value boundary cases, loading fails because carbon does not 
> handle infinity properly.
>  # Added validation that sort columns cannot be used while inferring the 
> schema.





[jira] [Resolved] (CARBONDATA-2887) Filters in complex datatypes are not working in carbon using fileformat

2018-08-29 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2887.
--
Resolution: Fixed
  Assignee: Ravindra Pesala

> Filters in complex datatypes are not working in carbon using fileformat
> ---
>
> Key: CARBONDATA-2887
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2887
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Filters in complex datatypes are not working in carbon using fileformat.





[jira] [Assigned] (CARBONDATA-2885) Broadcast Issue and Small file distribution Issue

2018-08-27 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-2885:


Assignee: Babulal

> Broadcast Issue and Small file distribution Issue
> -
>
> Key: CARBONDATA-2885
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2885
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
>Priority: Major
>
> Carbon relation size is calculated wrongly (always 0) for external 
> tables.
> Root cause: the table status file is not present for external tables.
>  
>  





[jira] [Resolved] (CARBONDATA-2885) Broadcast Issue and Small file distribution Issue

2018-08-27 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2885.
--
Resolution: Fixed

> Broadcast Issue and Small file distribution Issue
> -
>
> Key: CARBONDATA-2885
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2885
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
>Priority: Major
>
> Carbon relation size is calculated wrongly (always 0) for external 
> tables.
> Root cause: the table status file is not present for external tables.
>  
>  





[jira] [Resolved] (CARBONDATA-2872) Support standard spark's FileFormat interface in carbon

2018-08-24 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2872.
--
Resolution: Fixed

> Support standard spark's FileFormat interface in carbon
> ---
>
> Key: CARBONDATA-2872
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2872
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Support Spark FileFormat Interface in Carbon.pdf
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Current carbondata has deep integration with spark to provide performance 
> optimizations and also supports features like compaction, IUD, data maps, 
> metadata management etc. This type of integration forces users to use a 
> CarbonSession instance even for read and write operations.
> For users who want standard spark datasource integration to read and write 
> data, carbon should support the FileFormat interface exposed by spark.





[jira] [Resolved] (CARBONDATA-2817) Thread Leak in Update and in No sort flow

2018-08-08 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2817.
--
Resolution: Fixed
  Assignee: Babulal

> Thread Leak in Update and in No sort flow
> -
>
> Key: CARBONDATA-2817
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2817
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
>Priority: Major
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> After an update finishes, loading threads (input, process, convert, sort, 
> etc.) are still alive.
>  
> "Thread-36" #172 daemon prio=5 os_prio=0 tid=0x7fd021eba000 nid=0x17136 
> waiting on condition [0x7fd01000]
>  java.lang.Thread.State: TIMED_WAITING (sleeping)
>  at java.lang.Thread.sleep(Native Method)
>  at 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep$1.run(AbstractDataLoadProcessorStep.java:81)
> "Thread-35" #171 daemon prio=5 os_prio=0 tid=0x01a7f000 nid=0x17135 
> waiting on condition [0x7fd01101]
>  java.lang.Thread.State: TIMED_WAITING (sleeping)
>  at java.lang.Thread.sleep(Native Method)
>  at 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep$1.run(AbstractDataLoadProcessorStep.java:81)
> "Thread-34" #170 daemon prio=5 os_prio=0 tid=0x01e4 nid=0x17134 
> waiting on condition [0x7fd019aa9000]
>  java.lang.Thread.State: TIMED_WAITING (sleeping)
>  at java.lang.Thread.sleep(Native Method)
>  at 
> org.apache.carbondata.processing.loading.AbstractDataLoadProcessorStep$1.run(AbstractDataLoadProcessorStep.java:81)
>  
> "NoSortDataWriterPool:tbl_data_event_41_carbon_nosort" #96 prio=5 
> os_prio=0 tid=0x00f2e800 nid=0x129a2 waiting on condition 
> [0x7fd0197a6000]
>  java.lang.Thread.State: WAITING (parking)
>  at sun.misc.Unsafe.park(Native Method)
>  - parking to wait for <0x0006a35989a0> (a 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>  at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>  at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>  at 
> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  
>  
>  





[jira] [Created] (CARBONDATA-2807) Fixed data load performance issue with more number of records

2018-07-30 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2807:


 Summary: Fixed data load performance issue with more number of 
records
 Key: CARBONDATA-2807
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2807
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


**Problem:** Data loading takes more time when the number of records is high.
**Root cause:** As the number of records is high, the intermediate merger takes 
more time.
**Solution:** The check of the number of files present in the file list was 
done inside a synchronized block; because of this, each intermediate merge 
request took some time, and when the number of records is high this impacted 
overall data loading performance.
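The contention pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions (class and method names are hypothetical, not CarbonData's actual code): the critical section is kept to the cheap list mutation and size check, and the expensive merge runs outside the lock, so concurrent intermediate-merge requests are not serialized behind each other.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of narrowing the synchronized block (hypothetical names): snapshot
// the pending file list under the lock, but do the long-running merge work
// after releasing it.
public class IntermediateMergerSketch {
    private final List<String> pendingFiles = new ArrayList<>();
    private final int mergeThreshold;

    IntermediateMergerSketch(int mergeThreshold) {
        this.mergeThreshold = mergeThreshold;
    }

    // Returns the batch handed to the merger, or null if no merge was triggered.
    List<String> addFile(String file) {
        List<String> toMerge = null;
        synchronized (pendingFiles) {
            // Only the cheap mutation and size check hold the lock.
            pendingFiles.add(file);
            if (pendingFiles.size() >= mergeThreshold) {
                toMerge = new ArrayList<>(pendingFiles); // snapshot under the lock
                pendingFiles.clear();
            }
        }
        if (toMerge != null) {
            mergeOutsideLock(toMerge); // expensive work, done without holding the lock
        }
        return toMerge;
    }

    private void mergeOutsideLock(List<String> files) {
        // Placeholder for the actual intermediate merge of sort temp files.
    }

    public static void main(String[] args) {
        IntermediateMergerSketch merger = new IntermediateMergerSketch(2);
        System.out.println(merger.addFile("f1") == null); // below threshold: no merge
        System.out.println(merger.addFile("f2").size());  // threshold hit: merges 2 files
    }
}
```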





[jira] [Resolved] (CARBONDATA-2775) Adaptive encoding fails for Unsafe OnHeap if, target data type is SHORT_INT

2018-07-29 Thread kumar vishal (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2775.
--
Resolution: Fixed

> Adaptive encoding fails for Unsafe OnHeap if, target data type is SHORT_INT
> ---
>
> Key: CARBONDATA-2775
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2775
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>





