[GitHub] [carbondata] ajantha-bhat commented on pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


ajantha-bhat commented on pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#issuecomment-801641696


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


Indhumathi27 commented on a change in pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#discussion_r596564072



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestPartitionWithMV.scala
##
@@ -748,6 +748,22 @@ class TestPartitionWithMV extends QueryTest with BeforeAndAfterAll with BeforeAn
 sql("drop table if exists partitionone")
   }
 
+  test("test partition on MV with sort column") {
+sql("drop table if exists partitionone")
+sql("create table if not exists partitionone (ts timestamp, " +
+"metric STRING, tags_id STRING, value DOUBLE) partitioned by (ts1 timestamp,ts2 timestamp) stored as carbondata TBLPROPERTIES ('SORT_COLUMNS'='metric,ts2')")

Review comment:
   All the other test cases in TestPartitionWithMV are already without sort 
columns, and I also tested this manually. It works.









[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


ajantha-bhat commented on a change in pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#discussion_r596560888



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/view/rewrite/TestPartitionWithMV.scala
##
@@ -748,6 +748,22 @@ class TestPartitionWithMV extends QueryTest with BeforeAndAfterAll with BeforeAn
 sql("drop table if exists partitionone")
   }
 
+  test("test partition on MV with sort column") {
+sql("drop table if exists partitionone")
+sql("create table if not exists partitionone (ts timestamp, " +
+"metric STRING, tags_id STRING, value DOUBLE) partitioned by (ts1 timestamp,ts2 timestamp) stored as carbondata TBLPROPERTIES ('SORT_COLUMNS'='metric,ts2')")

Review comment:
   I think the MV re-arrange code above was added for the case where the 
partition column is not a sort column.
   
   Can you please add one more test case, the same as this one but without 
sort columns in the create table statement, and test it?
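
For readers following the thread, the ordering concern can be illustrated with a small sketch. It assumes only what the code comment in the PR states, namely that an MV partition table keeps its partition columns at the end. The class and method names are hypothetical and are not CarbonData APIs.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of CarbonData: illustrates moving
// partition columns to the end of a column list.
final class PartitionColumnOrder {

  private PartitionColumnOrder() { }

  // Returns the columns with every partition column moved to the end.
  // The relative order inside both groups is kept unchanged.
  static List<String> partitionColumnsLast(List<String> columns,
      List<String> partitionColumns) {
    Set<String> partitions = new HashSet<>(partitionColumns);
    List<String> result = new ArrayList<>();
    // First pass: non-partition columns in their original order.
    for (String column : columns) {
      if (!partitions.contains(column)) {
        result.add(column);
      }
    }
    // Second pass: partition columns in their original order.
    for (String column : columns) {
      if (partitions.contains(column)) {
        result.add(column);
      }
    }
    return result;
  }
}
```

With the hypothetical input order (metric, ts2, ts, tags_id, value, ts1) and partition columns (ts1, ts2), the helper returns (metric, ts, tags_id, value, ts2, ts1). Whether CarbonData actually produces such an input order for a sort column that is also a partition column is an assumption made only for this illustration.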









[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


Indhumathi27 commented on a change in pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#discussion_r59666



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -181,7 +183,13 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
   if (isNotReArranged) {
 // Re-arrange the catalog table schema and output for partition relation
 logicalPartitionRelation =
-  getReArrangedSchemaLogicalRelation(reArrangedIndex, logicalPartitionRelation)
+  if (carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable.isMV) {
+// For MV partition table, partition columns will be at the end. Re-arrange

Review comment:
   done









[GitHub] [carbondata] Indhumathi27 commented on pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


Indhumathi27 commented on pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#issuecomment-801628181


   LGTM







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#issuecomment-801600676


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5579/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#issuecomment-801600161


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3813/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4102: [CARBONDATA-4144] During compaction, the segment lock of SI table is not released in abnormal scenarios.

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4102:
URL: https://github.com/apache/carbondata/pull/4102#issuecomment-801584412


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3812/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4102: [CARBONDATA-4144] During compaction, the segment lock of SI table is not released in abnormal scenarios.

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4102:
URL: https://github.com/apache/carbondata/pull/4102#issuecomment-801582139


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5578/
   







[GitHub] [carbondata] jack86596 commented on pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


jack86596 commented on pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#issuecomment-801552836


   retest this please







[GitHub] [carbondata] jack86596 commented on a change in pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


jack86596 commented on a change in pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#discussion_r596494501



##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -898,15 +897,8 @@ public SegmentFile getSegmentFile() {
* @return
*/
   public List getIndexCarbonFiles() {
-Map indexFiles = getIndexFiles();
-Set files = new HashSet<>();
-for (Map.Entry entry: indexFiles.entrySet()) {
-  Path path = new Path(entry.getKey());
-  files.add(entry.getKey());
-  if (entry.getValue() != null) {
-files.add(new Path(path.getParent(), entry.getValue()).toString());
-  }
-}
+List indexFiles = getIndexFiles();

Review comment:
   done









[GitHub] [carbondata] liuhe0702 commented on a change in pull request #4102: [CARBONDATA-4144] During compaction, the segment lock of SI table is not released in abnormal scenarios.

2021-03-17 Thread GitBox


liuhe0702 commented on a change in pull request #4102:
URL: https://github.com/apache/carbondata/pull/4102#discussion_r596482836



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/load/Compactor.scala
##
@@ -93,6 +93,9 @@ object Compactor {
 segmentToSegmentTimestampMap, null,
 forceAccessSegment, isCompactionCall = true,
 isLoadToFailedSISegments = false)
+if (segmentLocks.isEmpty) {

Review comment:
   done









[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4107: [WIP] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#issuecomment-801460677


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5577/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4107: [WIP] Query with SI after add partition based on location on partition table gives incorrect results

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4107:
URL: https://github.com/apache/carbondata/pull/4107#issuecomment-801458583


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3811/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#issuecomment-801419839


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3810/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#issuecomment-801406211


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5576/
   







[jira] [Commented] (CARBONDATA-4150) Information about indexed datamap

2021-03-17 Thread Mahesh Raju Somalaraju (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-4150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303630#comment-17303630
 ] 

Mahesh Raju Somalaraju commented on CARBONDATA-4150:


Hi Suyash Yadhav,

An index datamap is a data structure that can be used to accelerate certain 
queries on a table. Currently, CarbonData supports three types of indexes:
 # *BloomFilter Index*: A space-efficient probabilistic data structure that is 
used to test whether an element is a member of a set.
 # *Lucene Index*: A high-performance, full-featured text search engine.
 # *Secondary Index [SI]*: Secondary index tables, which hold blocklets, are 
created as indexes and managed internally by CarbonData as child tables. Users 
can create a secondary index based on the column position in the main table 
(recommended for right columns), and queries should filter on that column to 
improve filter query performance.
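
To make the BloomFilter description above concrete, here is a minimal sketch of the membership test such a structure provides. It is an illustration only and not CarbonData's implementation; the class name, the hash scheme, and the sizes are assumptions chosen for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Illustrative Bloom filter, not CarbonData's implementation.
final class SimpleBloomFilter {
  private final BitSet bits;
  private final int size;
  private final int hashCount;

  SimpleBloomFilter(int size, int hashCount) {
    this.bits = new BitSet(size);
    this.size = size;
    this.hashCount = hashCount;
  }

  // Derives the i-th bit position from two simple base hashes
  // (double hashing). The hash constants are arbitrary.
  private int position(String value, int i) {
    int h1 = 17;
    int h2 = 31;
    for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
      h1 = h1 * 31 + b;
      h2 = h2 * 131 + b;
    }
    return Math.floorMod(h1 + i * h2, size);
  }

  // Sets every bit position derived from the value.
  void add(String value) {
    for (int i = 0; i < hashCount; i++) {
      bits.set(position(value, i));
    }
  }

  // false means the value was definitely never added;
  // true means it was possibly added (false positives can occur).
  boolean mightContain(String value) {
    for (int i = 0; i < hashCount; i++) {
      if (!bits.get(position(value, i))) {
        return false;
      }
    }
    return true;
  }
}
```

The useful property for query pruning is the one-sided error: a negative answer is always correct, so data for which the filter answers "no" can be skipped safely, while a positive answer still has to be verified by reading the data.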

You can refer to the links below for detailed usage information. Hope this 
helps; if you need any further information, please reply.

[https://github.com/apache/carbondata/blob/master/docs/index/bloomfilter-index-guide.md]

[https://github.com/apache/carbondata/blob/master/docs/index/lucene-index-guide.md]

[https://github.com/apache/carbondata/blob/master/docs/index/secondary-index-guide.md]

 

Thanks & Regards

Mahesh Raju S

> Information about indexed datamap
> -
>
> Key: CARBONDATA-4150
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4150
> Project: CarbonData
>  Issue Type: Wish
>  Components: core
>Affects Versions: 2.0.1
> Environment: apache 2.0.1 spark 2.4.5 hadoop 2.7.2
>Reporter: suyash yadav
>Priority: Critical
> Fix For: 2.0.1
>
>
> Hi Team,
>  
> We would like to know detailed information about indexed datamap and possible 
> use cases for this datamap.
> So please help us in getting answer to below queries:-
>  
> 1) What is an indexed datamap and related use cases.
> 2) how it is to be used,
> 3) any reference documents
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4101: [WIP] Fix Writing Segment Min max with all blocks of a segment

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4101:
URL: https://github.com/apache/carbondata/pull/4101#issuecomment-801240380


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3809/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4101: [WIP] Fix Writing Segment Min max with all blocks of a segment

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4101:
URL: https://github.com/apache/carbondata/pull/4101#issuecomment-801229887


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5575/
   







[GitHub] [carbondata] Indhumathi27 commented on pull request #4101: [WIP] Test

2021-03-17 Thread GitBox


Indhumathi27 commented on pull request #4101:
URL: https://github.com/apache/carbondata/pull/4101#issuecomment-801131745


   retest this please







[GitHub] [carbondata] maheshrajus commented on pull request #4108: [CARBONDATA-4153] Fix DoNot Push down not equal to filter with Cast on SI

2021-03-17 Thread GitBox


maheshrajus commented on pull request #4108:
URL: https://github.com/apache/carbondata/pull/4108#issuecomment-801009688


   LGTM







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [WIP] Fix various concurrent issues with clean files

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-800966836


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5574/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4109: [WIP] Fix various concurrent issues with clean files

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109#issuecomment-800965166


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3808/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#issuecomment-800954845


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbon_PR_Builder_2.4.5/3807/
   







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


CarbonDataQA2 commented on pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#issuecomment-800952347


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12444/job/ApacheCarbonPRBuilder2.3/5573/
   







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4102: [CARBONDATA-4144] During compaction, the segment lock of SI table is not released in abnormal scenarios.

2021-03-17 Thread GitBox


akashrn5 commented on a change in pull request #4102:
URL: https://github.com/apache/carbondata/pull/4102#discussion_r595866555



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/secondaryindex/load/Compactor.scala
##
@@ -93,6 +93,9 @@ object Compactor {
 segmentToSegmentTimestampMap, null,
 forceAccessSegment, isCompactionCall = true,
 isLoadToFailedSISegments = false)
+if (segmentLocks.isEmpty) {

Review comment:
   Please add a log here saying that loading the compacted segment into the 
SI table failed because the segment lock could not be acquired on the specific segment.
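
A rough sketch of the kind of check and log message being requested is shown below. It is not the code from the PR: the class, the method names, the parameters, and the use of java.util.logging are all assumptions made for illustration, and the real Compactor is written in Scala.

```java
import java.util.List;
import java.util.logging.Logger;

// Hypothetical illustration of "log when no segment lock was acquired".
final class SegmentLockCheck {
  private static final Logger LOGGER =
      Logger.getLogger(SegmentLockCheck.class.getName());

  private SegmentLockCheck() { }

  // Builds the message so that it names both the segment and the SI table.
  static String failureMessage(String segmentId, String indexTable) {
    return "Loading of compacted segment " + segmentId + " into SI table "
        + indexTable + " failed: could not acquire the segment lock";
  }

  // Returns true when at least one lock is held and loading may continue.
  // Otherwise logs the failure and returns false.
  static boolean canLoadCompactedSegment(List<String> segmentLocks,
      String segmentId, String indexTable) {
    if (segmentLocks.isEmpty()) {
      LOGGER.severe(failureMessage(segmentId, indexTable));
      return false;
    }
    return true;
  }
}
```

Naming the segment and the index table in the message is the design point: without them, a failed compaction on one SI table is hard to tell apart from another in the logs.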









[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4105: [CARBONDATA-4148] Reindex failed when SI has stale carbonindexmerge file

2021-03-17 Thread GitBox


Indhumathi27 commented on a change in pull request #4105:
URL: https://github.com/apache/carbondata/pull/4105#discussion_r595857079



##
File path: 
core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -898,15 +897,8 @@ public SegmentFile getSegmentFile() {
* @return
*/
   public List getIndexCarbonFiles() {
-Map indexFiles = getIndexFiles();
-Set files = new HashSet<>();
-for (Map.Entry entry: indexFiles.entrySet()) {
-  Path path = new Path(entry.getKey());
-  files.add(entry.getKey());
-  if (entry.getValue() != null) {
-files.add(new Path(path.getParent(), entry.getValue()).toString());
-  }
-}
+List indexFiles = getIndexFiles();

Review comment:
   Can you return a Set from getIndexFiles itself?
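
One possible reading of this suggestion is sketched below: the de-duplication done by the removed loop moves into the method that produces the file names, so callers receive a Set directly. The class and method names are hypothetical. The sketch assumes '/'-separated paths and a map from index file path to merge file name, as the removed loop suggests, and it uses plain string handling instead of the Hadoop Path class used in the original code.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the actual SegmentFileStore code.
final class IndexFileCollector {

  private IndexFileCollector() { }

  // Collects each index file path plus, when present, the path of its
  // merge file in the same directory. The Set removes duplicates, for
  // example when several index files share one merge file.
  static Set<String> collectIndexFiles(Map<String, String> indexToMergeFile) {
    Set<String> files = new LinkedHashSet<>();
    for (Map.Entry<String, String> entry : indexToMergeFile.entrySet()) {
      String indexFile = entry.getKey();
      files.add(indexFile);
      if (entry.getValue() != null) {
        // Parent directory including the trailing '/', or "" if none.
        String parent = indexFile.substring(0, indexFile.lastIndexOf('/') + 1);
        files.add(parent + entry.getValue());
      }
    }
    return files;
  }
}
```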









[GitHub] [carbondata] vikramahuja1001 opened a new pull request #4109: [WIP] Fix various concurrent issues with clean files

2021-03-17 Thread GitBox


vikramahuja1001 opened a new pull request #4109:
URL: https://github.com/apache/carbondata/pull/4109


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   







[jira] [Updated] (CARBONDATA-4153) DoNot Push down 'not equal to' filter with Cast on SI

2021-03-17 Thread Indhumathi Muthumurugesh (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi Muthumurugesh updated CARBONDATA-4153:
-
Description: 
A NOT EQUAL TO filter on an SI index column should not be pushed down to the 
SI table.

Currently, where x!='2' is not pushed down to SI, but where x!=2 is pushed 
down to SI.
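
The difference between the two filters can be sketched with a miniature expression tree. The sketch assumes that comparing a string column with an integer literal wraps the column in a cast, so that a check looking only for a bare column misses it. All classes below are hypothetical stand-ins, not Spark or CarbonData classes, and this is not the actual fix.

```java
// Hypothetical miniature expression tree for illustration only.
final class SiPushDownCheck {

  private SiPushDownCheck() { }

  interface Expr { }

  static final class Column implements Expr {
    final String name;
    Column(String name) { this.name = name; }
  }

  static final class Literal implements Expr {
    final Object value;
    Literal(Object value) { this.value = value; }
  }

  static final class Cast implements Expr {
    final Expr child;
    final String targetType;
    Cast(Expr child, String targetType) {
      this.child = child;
      this.targetType = targetType;
    }
  }

  static final class EqualTo implements Expr {
    final Expr left;
    final Expr right;
    EqualTo(Expr left, Expr right) {
      this.left = left;
      this.right = right;
    }
  }

  static final class Not implements Expr {
    final Expr child;
    Not(Expr child) { this.child = child; }
  }

  // Unwraps any number of nested casts around an expression.
  static Expr stripCasts(Expr expr) {
    Expr current = expr;
    while (current instanceof Cast) {
      current = ((Cast) current).child;
    }
    return current;
  }

  // True for Not(EqualTo(..)) where either side is a column,
  // with or without a cast around it.
  static boolean isNotEqualToOnColumn(Expr filter) {
    if (!(filter instanceof Not)) {
      return false;
    }
    Expr child = ((Not) filter).child;
    if (!(child instanceof EqualTo)) {
      return false;
    }
    EqualTo equalTo = (EqualTo) child;
    return stripCasts(equalTo.left) instanceof Column
        || stripCasts(equalTo.right) instanceof Column;
  }

  // A 'not equal to' filter is never pushed down to the SI table.
  static boolean canPushDownToSi(Expr filter) {
    return !isNotEqualToOnColumn(filter);
  }
}
```

In this sketch both x!='2' (no cast) and x!=2 (cast around the column) are rejected for push down, because the cast is stripped before the column check.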

> DoNot Push down 'not equal to' filter with Cast on SI
> -
>
> Key: CARBONDATA-4153
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4153
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
>
> A NOT EQUAL TO filter on an SI index column should not be pushed down to the 
> SI table.
> Currently, where x!='2' is not pushed down to SI, but where x!=2 is pushed 
> down to SI.





[jira] [Created] (CARBONDATA-4153) DoNot Push down 'not equal to' filter with Cast on SI

2021-03-17 Thread Indhumathi Muthumurugesh (Jira)
Indhumathi Muthumurugesh created CARBONDATA-4153:


 Summary: DoNot Push down 'not equal to' filter with Cast on SI
 Key: CARBONDATA-4153
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4153
 Project: CarbonData
  Issue Type: Bug
Reporter: Indhumathi Muthumurugesh








[jira] [Updated] (CARBONDATA-4102) Add UT and FT to improve coverage of SI module.

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-4102:
-
Issue Type: Test  (was: Bug)

> Add UT and FT to improve coverage of SI module.
> ---
>
> Key: CARBONDATA-4102
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4102
> Project: CarbonData
>  Issue Type: Test
>Reporter: Nihal kumar ojha
>Priority: Major
> Fix For: 2.1.1
>
>  Time Spent: 17h 10m
>  Remaining Estimate: 0h
>
> Add UT and FT to improve coverage of SI module and also remove dead or unused 
> code if exists.





[jira] [Updated] (CARBONDATA-3991) File system could not set modified time because don't override the settime function

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-3991:
-
Fix Version/s: (was: 2.1.1)

> File system could not set modified time because don't override the settime 
> function
> ---
>
> Key: CARBONDATA-3991
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3991
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.1
>Reporter: jingpan xiong
>Priority: Major
>  Time Spent: 17h 10m
>  Remaining Estimate: 0h
>
> File systems such as S3 and Alluxio don't override the settime function, 
> which causes problems for update and create MV. Setting the modified time 
> raises no exception when it fails and may leave a null value in the modified 
> time. This bug may cause multi-tenant and data consistency problems.





[jira] [Commented] (CARBONDATA-3908) When a carbon segment is added through the alter add segments query, then it is not accounting the added carbon segment values.

2021-03-17 Thread Ajantha Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303162#comment-17303162
 ] 

Ajantha Bhat commented on CARBONDATA-3908:
--

[https://github.com/apache/carbondata/pull/4001]

 

> When a carbon segment is added through the alter add segments query, then it 
> is not accounting the added carbon segment values.
> ---
>
> Key: CARBONDATA-3908
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3908
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 2.0.0
> Environment: FI cluster and opensource cluster.
>Reporter: Prasanna Ravichandran
>Priority: Major
> Fix For: 2.1.1
>
>
> When a carbon segment is added through the alter add segments query, the 
> values in the added carbon segment are not accounted for. A count(*) on the 
> added segment always shows 0.
> Test queries:
> drop table if exists uniqdata;
> CREATE TABLE uniqdata (cust_id int,cust_name String,active_emui_version 
> string, dob timestamp, doj timestamp, bigint_column1 bigint,bigint_column2 
> bigint,decimal_column1 decimal(30,10), decimal_column2 
> decimal(36,36),double_column1 double, double_column2 double,integer_column1 
> int) stored as carbondata;
> load data inpath 'hdfs://hacluster/BabuStore/Data/2000_UniqData.csv' into 
> table uniqdata 
> options('fileheader'='cust_id,cust_name,active_emui_version,dob,doj,bigint_column1,bigint_column2,decimal_column1,decimal_column2,double_column1,double_column2,integer_column1','bad_records_action'='force');
> --hdfs dfs -mkdir /uniqdata-carbon-segment;
> --hdfs dfs -cp /user/hive/warehouse/uniqdata/Fact/Part0/Segment_0/* 
> /uniqdata-carbon-segment/
> Alter table uniqdata add segment options 
> ('path'='hdfs://hacluster/uniqdata-carbon-segment/','format'='carbon');
> select count(*) from uniqdata;--4000 expected as one load of 2000 records 
> happened and same segment is added again;
> set carbon.input.segments.default.uniqdata=1;
> select count(*) from uniqdata;--2000 expected - it should just show the 
> records count of added segments;
> CONSOLE:
> /> set carbon.input.segments.default.uniqdata=1;
> +-++
> | key | value |
> +-++
> | carbon.input.segments.default.uniqdata | 1 |
> +-++
> 1 row selected (0.192 seconds)
> /> select count(*) from uniqdata;
> INFO : Execution ID: 1734
> +---+
> | count(1) |
> +---+
> | 2000 |
> +---+
> 1 row selected (4.036 seconds)
> /> set carbon.input.segments.default.uniqdata=2;
> +-++
> | key | value |
> +-++
> | carbon.input.segments.default.uniqdata | 2 |
> +-++
> 1 row selected (0.088 seconds)
> /> select count(*) from uniqdata;
> INFO : Execution ID: 1745
> +---+
> | count(1) |
> +---+
> | 2000 |
> +---+
> 1 row selected (6.056 seconds)
> /> set carbon.input.segments.default.uniqdata=3;
> +-++
> | key | value |
> +-++
> | carbon.input.segments.default.uniqdata | 3 |
> +-++
> 1 row selected (0.161 seconds)
> /> select count(*) from uniqdata;
> INFO : Execution ID: 1753
> +---+
> | count(1) |
> +---+
> | 0 |
> +---+
> 1 row selected (4.875 seconds)
> /> show segments for table uniqdata;
> +----+---------+-------------------------+-----------------+-----------+-----------+------------+-------------+
> | ID | Status  | Load Start Time         | Load Time Taken | Partition | Data Size | Index Size | File Format |
> +----+---------+-------------------------+-----------------+-----------+-----------+------------+-------------+
> | 4  | Success | 2020-07-17 16:01:53.673 | 5.579S          | {}        | 269.10KB  | 7.21KB     | columnar_v3 |
> | 3  | Success | 2020-07-17 16:00:24.866 | 0.578S          | {}        | 88.55KB   | 1.81KB     | columnar_v3 |
> | 2  | Success | 2020-07-17 15:07:54.273 | 0.642S          | {}        | 36.72KB   | NA         | orc         |
> | 1  | Success | 2020-07-17 15:03:59.767 | 0.564S          | {}        | 89.26KB   | NA         | parquet     |
> | 0  | Success | 2020-07-16 12:44:32.095 | 4.484S          | {}        | 88.55KB   | 1.81KB     | columnar_v3 |
> +----+---------+-------------------------+-----------------+-----------+-----------+------------+-------------+
> Expected result: Records added by adding a carbon segment should be considered.
> Actual result: Records added by adding a carbon segment are not considered.




[jira] [Updated] (CARBONDATA-3880) How to start JDBC service in distributed index

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-3880:
-
Fix Version/s: (was: 2.1.1)

>  How to start JDBC service in distributed index
> ---
>
> Key: CARBONDATA-3880
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3880
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: li
>Priority: Major
>
> How to start JDBC service in distributed index





[jira] [Resolved] (CARBONDATA-3880) How to start JDBC service in distributed index

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat resolved CARBONDATA-3880.
--
Resolution: Not A Problem

>  How to start JDBC service in distributed index
> ---
>
> Key: CARBONDATA-3880
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3880
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: li
>Priority: Major
>
> How to start JDBC service in distributed index





[jira] [Updated] (CARBONDATA-3643) Insert array('')/array() into Struct column will result in array(null), which is inconsistent with Parquet

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-3643:
-
Fix Version/s: (was: 2.1.1)
   2.2.0

> Insert array('')/array() into Struct column will result in 
> array(null), which is inconsistent with Parquet
> --
>
> Key: CARBONDATA-3643
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3643
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.6.1, 2.0.0
>Reporter: Xingjun Hao
>Priority: Minor
> Fix For: 2.2.0
>
>
>  
> {code:java}
> // Write the same row to a Parquet table and a CarbonData table, then compare.
> sql("create table datatype_struct_parquet(price struct<b:array<string>>) stored as parquet")
> sql("insert into table datatype_struct_parquet values(named_struct('b', array('')))")
> sql("create table datatype_struct_carbondata(price struct<b:array<string>>) stored as carbondata")
> sql("insert into datatype_struct_carbondata select * from datatype_struct_parquet")
> checkAnswer(sql("SELECT * FROM datatype_struct_carbondata"), sql("SELECT * FROM datatype_struct_parquet"))
> // Reported test output:
> // !== Correct Answer - 1 ==   == Spark Answer - 1 ==
> // ![[WrappedArray()]]         [[WrappedArray(null)]]
> {code}
>  
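The mismatch reported above turns on three values that are easy to confuse: an empty array, an array holding one empty string, and an array holding one null. The following is a minimal plain-Scala sketch, with no Spark or CarbonData involved; `ArrayValuesSketch` and its members are hypothetical names used only for illustration. It shows only that the three values are distinct under ordinary sequence equality, so a row-by-row comparison of results would be expected to treat them as different.

```scala
// Hypothetical illustration; not taken from the CarbonData code base.
object ArrayValuesSketch {
  // An array with no elements.
  val emptyArray: Seq[String] = Seq.empty
  // An array with exactly one element, the empty string.
  val emptyStringElement: Seq[String] = Seq("")
  // An array with exactly one element, which is null.
  val nullElement: Seq[String] = Seq(null)

  // True only when all three values are pairwise different.
  def allDistinct: Boolean =
    emptyArray != emptyStringElement &&
      emptyArray != nullElement &&
      emptyStringElement != nullElement

  def main(args: Array[String]): Unit = {
    println(s"sizes: ${emptyArray.size}, ${emptyStringElement.size}, ${nullElement.size}")
    println(s"all distinct: $allDistinct")
  }
}
```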





[jira] [Updated] (CARBONDATA-4137) Refactor CarbonDataSourceScan without Spark Filter

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-4137:
-
Fix Version/s: (was: 2.2.0)
   2.1.1

> Refactor CarbonDataSourceScan without Spark Filter
> --
>
> Key: CARBONDATA-4137
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4137
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: David Cai
>Priority: Major
> Fix For: 2.1.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-4137) Refactor CarbonDataSourceScan without Spark Filter

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-4137:
-
Fix Version/s: (was: 2.1.1)
   2.2.0

> Refactor CarbonDataSourceScan without Spark Filter
> --
>
> Key: CARBONDATA-4137
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4137
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: David Cai
>Priority: Major
> Fix For: 2.2.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[GitHub] [carbondata] akashrn5 commented on pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


akashrn5 commented on pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#issuecomment-800846438


   LGTM, @ajantha-bhat please review this once as you have worked on rearrange 
logic in insert optimization. Please see if there is any impact.







[GitHub] [carbondata] akashrn5 commented on pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


akashrn5 commented on pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#issuecomment-800846101


   @Indhumathi27 please change description with example as discussed







[GitHub] [carbondata] akashrn5 commented on a change in pull request #4106: [CARBONDATA-4147] Fix re-arrange schema in logical relation on MV partition table having sort column

2021-03-17 Thread GitBox


akashrn5 commented on a change in pull request #4106:
URL: https://github.com/apache/carbondata/pull/4106#discussion_r595752950



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonInsertIntoCommand.scala
##
@@ -181,7 +183,13 @@ case class CarbonInsertIntoCommand(databaseNameOp: Option[String],
       if (isNotReArranged) {
         // Re-arrange the catalog table schema and output for partition relation
         logicalPartitionRelation =
-          getReArrangedSchemaLogicalRelation(reArrangedIndex, logicalPartitionRelation)
+          if (carbonLoadModel.getCarbonDataLoadSchema.getCarbonTable.isMV) {
+            // For MV partition table, partition columns will be at the end. Re-arrange

Review comment:
   As discussed, please add a comment here with an example, so that reviewers and 
developers understand why only MV needs to be handled separately.
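As a rough illustration of the kind of example requested here, the sketch below moves partition columns to the end of a column list in plain Scala. It is not CarbonData's re-arrange logic: `Column`, `ReArrangeSketch`, `movePartitionColumnsToEnd`, and the starting column order are hypothetical and chosen only for illustration. The column names come from the test table added in this PR (`SORT_COLUMNS` of `metric,ts2`, partitioned by `ts1` and `ts2`).

```scala
// Hypothetical illustration only; not CarbonData's actual re-arrange logic.
final case class Column(name: String, dataType: String)

object ReArrangeSketch {
  // Move partition columns to the end of the column list.
  // Non-partition columns keep their relative order; partition columns
  // are appended in the order in which they were declared.
  def movePartitionColumnsToEnd(
      columns: Seq[Column],
      partitionColumns: Seq[String]): Seq[Column] = {
    val byName = columns.map(c => c.name -> c).toMap
    val others = columns.filterNot(c => partitionColumns.contains(c.name))
    val partition = partitionColumns.flatMap(byName.get)
    others ++ partition
  }

  def main(args: Array[String]): Unit = {
    // Assumed starting layout with the sort columns (metric, ts2) first.
    val sortColumnsFirst = Seq(
      Column("metric", "string"),
      Column("ts2", "timestamp"),
      Column("ts", "timestamp"),
      Column("tags_id", "string"),
      Column("value", "double"),
      Column("ts1", "timestamp"))

    val reArranged = movePartitionColumnsToEnd(sortColumnsFirst, Seq("ts1", "ts2"))
    // Prints: metric, ts, tags_id, value, ts1, ts2
    println(reArranged.map(_.name).mkString(", "))
  }
}
```

In this sketch a sort column that is also a partition column (`ts2`) ends up in the trailing partition group, which is the situation the new test case exercises.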









[jira] [Updated] (CARBONDATA-3816) Support Float and Decimal in the Merge Flow

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-3816:
-
Fix Version/s: (was: 2.1.1)
   2.2.0

> Support Float and Decimal in the Merge Flow
> ---
>
> Key: CARBONDATA-3816
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3816
> Project: CarbonData
>  Issue Type: New Feature
>  Components: data-load
>Affects Versions: 2.0.0
>Reporter: Xingjun Hao
>Priority: Major
> Fix For: 2.2.0
>
>
> We don't support the FLOAT and DECIMAL datatypes in the CDC Flow.





[jira] [Updated] (CARBONDATA-3615) Show metacache shows the index server index-dictionary files when data loaded after index server disabled using set command

2021-03-17 Thread Ajantha Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajantha Bhat updated CARBONDATA-3615:
-
Fix Version/s: (was: 2.1.1)
   2.2.0

> Show metacache shows the index server index-dictionary files when data loaded 
> after index server disabled using set command
> ---
>
> Key: CARBONDATA-3615
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3615
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Vikram Ahuja
>Priority: Minor
> Fix For: 2.2.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Show metacache shows the index server index-dictionary files when data loaded 
> after index server disabled using set command
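> +-------------+---------+-------------------------+-----------------+
> | Field       | Size    | Comment                 | Cache Location  |
> +-------------+---------+-------------------------+-----------------+
> | Index       | 0 B     | 0/2 index files cached  | DRIVER          |
> | Dictionary  | 0 B     |                         | DRIVER          |
> *| Index       | 1.5 KB  | 2/2 index files cached  | INDEX SERVER    |*
> *| Dictionary  | 0 B     |                         | INDEX SERVER    |*
> +-------------+---------+-------------------------+-----------------+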





[jira] [Commented] (CARBONDATA-3615) Show metacache shows the index server index-dictionary files when data loaded after index server disabled using set command

2021-03-17 Thread Ajantha Bhat (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303133#comment-17303133
 ] 

Ajantha Bhat commented on CARBONDATA-3615:
--

[~vikramahuja_]: please check and close the issue if it is already handled.

> Show metacache shows the index server index-dictionary files when data loaded 
> after index server disabled using set command
> ---
>
> Key: CARBONDATA-3615
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3615
> Project: CarbonData
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.0.0
>Reporter: Vikram Ahuja
>Priority: Minor
> Fix For: 2.1.1
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Show metacache shows the index server index-dictionary files when data loaded 
> after index server disabled using set command
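> +-------------+---------+-------------------------+-----------------+
> | Field       | Size    | Comment                 | Cache Location  |
> +-------------+---------+-------------------------+-----------------+
> | Index       | 0 B     | 0/2 index files cached  | DRIVER          |
> | Dictionary  | 0 B     |                         | DRIVER          |
> *| Index       | 1.5 KB  | 2/2 index files cached  | INDEX SERVER    |*
> *| Dictionary  | 0 B     |                         | INDEX SERVER    |*
> +-------------+---------+-------------------------+-----------------+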


