[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r509038981

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ##

@@ -1009,6 +1009,10 @@ object CommonUtil {
   private def convertSparkComplexTypeToCarbonObject(data: AnyRef, objectDataType: DataType): AnyRef = {
+    if (data == null && (objectDataType.isInstanceOf[ArrayType]

Review comment:
ok

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r509026576

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ##

@@ -1009,6 +1009,10 @@ object CommonUtil {
   private def convertSparkComplexTypeToCarbonObject(data: AnyRef, objectDataType: DataType): AnyRef = {
+    if (data == null && (objectDataType.isInstanceOf[ArrayType]

Review comment:
Check with objectDataType.isComplexType instead.
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503099172

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala ##

@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }

+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+    sql("create table TORCSource(name string,col array,fee int) STORED AS orc")
+    sql("insert into TORCSource values('karan',null,2)")
+    sql("create table TCarbon(name string, col array,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+    sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+    val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+    if(!"Success".equalsIgnoreCase(result)) {
+      assert(false)
+    }
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+  }
+

Review comment:
Please handle and verify four scenarios by comparing the results with ORC:
a) local sort insert with null data in a complex-type column
b) global sort insert with null data in a complex-type column
c) same as point a), with `carbon.enable.bad.record.handling.for.insert` set to `true`
d) same as point b), with `carbon.enable.bad.record.handling.for.insert` set to `true`
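The four scenarios requested above are the cross product of two settings, so they can be enumerated rather than written out by hand. The following is only a sketch that runs without Spark or CarbonData: `ScenarioMatrixSketch`, `Scenario`, and `createCarbonTable` are hypothetical names, the `local_sort` value is assumed to be the counterpart of the `global_SORT` value in the quoted test, and the column type is passed in as a parameter because the array element type is not visible in the quoted diff.

```scala
object ScenarioMatrixSketch {
  // One test scenario: a sort scope plus the bad-record-handling flag.
  final case class Scenario(sortScope: String, badRecordHandling: Boolean)

  // Property name taken from the review comment.
  val BadRecordProperty = "carbon.enable.bad.record.handling.for.insert"

  // Yields the scenarios in the order a), b), c), d) from the comment:
  // the flag varies in the outer loop, the sort scope in the inner loop.
  val scenarios: Seq[Scenario] = for {
    badRecords <- Seq(false, true)
    sortScope <- Seq("local_sort", "global_sort") // "local_sort" is an assumed value
  } yield Scenario(sortScope, badRecords)

  // Builds the CREATE TABLE statement for one scenario.
  // colType is supplied by the caller; the real element type is not shown in the diff.
  def createCarbonTable(scenario: Scenario, colType: String): String =
    s"create table TCarbon(name string, col $colType, fee int) STORED AS carbondata " +
      s"TBLPROPERTIES ('SORT_COLUMNS'='name','SORT_SCOPE'='${scenario.sortScope}')"
}
```

In the real test each scenario would presumably set `BadRecordProperty`, run the insert from TORCSource, and then compare a select on TCarbon with the same select on TORCSource, for instance through an answer-comparison helper if the test base class provides one.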
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503097655

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/util/CommonUtil.scala ##

@@ -1011,38 +1011,46 @@ object CommonUtil {
       objectDataType: DataType): AnyRef = {
     objectDataType match {
       case _: ArrayType =>
-        val arrayDataType = objectDataType.asInstanceOf[ArrayType]
-        val arrayData = data.asInstanceOf[UnsafeArrayData]
-        val size = arrayData.numElements()
-        val childDataType = arrayDataType.elementType
-        val arrayChildObjects = new Array[AnyRef](size)
-        var i = 0
-        while (i < size) {
-          arrayChildObjects(i) = convertSparkComplexTypeToCarbonObject(arrayData.get(i,
-            childDataType), childDataType)
-          i = i + 1
+        if (data == null) {

Review comment:
After line 1011, add a single check: if the data type is array, struct, or map and the data is null, return null. That avoids changing many lines.
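The suggestion above, one guard before the type match instead of a null check inside every branch, can be sketched as follows. This is not the code from the pull request: the data types are hypothetical stand-ins for the Spark SQL types so that the snippet runs without Spark, and `NullGuardSketch`, `isComplex`, and `convert` are illustrative names only.

```scala
// Hypothetical stand-ins for the Spark SQL data types, so the sketch runs without Spark.
sealed trait DataType
case object StringType extends DataType
case object IntType extends DataType
final case class ArrayType(elementType: DataType) extends DataType
final case class StructType(fields: Seq[(String, DataType)]) extends DataType
final case class MapType(keyType: DataType, valueType: DataType) extends DataType

object NullGuardSketch {
  // True for the three complex types named in the review comment.
  def isComplex(dataType: DataType): Boolean = dataType match {
    case _: ArrayType | _: StructType | _: MapType => true
    case _ => false
  }

  // Simplified converter: the guard sits before the type match, so the
  // per-type branches below never receive a null complex value.
  def convert(data: AnyRef, dataType: DataType): AnyRef =
    if (data == null && isComplex(dataType)) {
      null
    } else {
      dataType match {
        case ArrayType(elementType) =>
          // Arrays are modelled as Seq here; children are converted recursively.
          data.asInstanceOf[Seq[AnyRef]].map(convert(_, elementType))
        case _ => data
      }
    }
}
```

With the guard in place, the existing branch bodies can stay untouched, which is the point of the comment: the diff shrinks to a few added lines.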
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3979: [Carbondata-3954] Fix insertion from ORC table into carbon table when sort scope is global sort
ajantha-bhat commented on a change in pull request #3979:
URL: https://github.com/apache/carbondata/pull/3979#discussion_r503096192

## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/allqueries/InsertIntoCarbonTableTestCase.scala ##

@@ -67,6 +67,21 @@ class InsertIntoCarbonTableTestCase extends QueryTest with BeforeAndAfterAll {
   }

+  test("insert from orc-select columns with columns having null values and sort scope as global sort") {
+    sql("drop table if exists TORCSource")
+    sql("drop table if exists TCarbon")
+    sql("create table TORCSource(name string,col array,fee int) STORED AS orc")
+    sql("insert into TORCSource values('karan',null,2)")
+    sql("create table TCarbon(name string, col array,fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name','TABLE_BLOCKSIZE'='128','TABLE_BLOCKLET_SIZE'='128','SORT_SCOPE'='global_SORT')")
+    sql("insert overwrite table TCarbon select name,col,fee from TORCSource")
+    val result = sql("show segments for table TCarbon").collect()(0).get(1).toString()
+    if(!"Success".equalsIgnoreCase(result)) {

Review comment:
Please run a select query and compare the ORC and carbon table results.