[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4045: ci_test

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4045:
URL: https://github.com/apache/carbondata/pull/4045#issuecomment-813839896


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5128/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4045: ci_test

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4045:
URL: https://github.com/apache/carbondata/pull/4045#issuecomment-813837968


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3377/
   






[GitHub] [carbondata] ydvpankaj99 commented on pull request #4045: ci_test

2021-04-05 Thread GitBox


ydvpankaj99 commented on pull request #4045:
URL: https://github.com/apache/carbondata/pull/4045#issuecomment-813833910


   retest this please






[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4113: [CARBONDATA-4161] Describe complex columns

2021-04-05 Thread GitBox


Indhumathi27 commented on a change in pull request #4113:
URL: https://github.com/apache/carbondata/pull/4113#discussion_r607518121



##
File path: docs/ddl-of-carbondata.md
##
@@ -646,6 +647,28 @@ CarbonData DDL statements are documented here,which includes:
 
 ## TABLE MANAGEMENT  
 
+### DESCRIBE COMMAND
+
+Describe column of table and visualize its structure with child fields.
+  ```
+  DESCRIBE COLUMN fieldname[.nestedFieldNames] ON [db_name.]table_name;
+
+  Example: DESCRIBE COLUMN channelsId ON carbonTable;
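+  +----------------------------+---------+-------+
+  |col_name                    |data_type|comment|
+  +----------------------------+---------+-------+
+  |channelsId                  |map      |null   |
+  |## Children of channelsId:  |         |       |
+  |key                         |string   |null   |
+  |value                       |string   |null   |
+  +----------------------------+---------+-------+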
+  ```
+
+This command is used to display short version of table columns.

Review comment:
   ```suggestion
   This command is used to display short version of table complex columns.
   ```








[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4034: [CARBONDATA-4091] support prestosql 333 integartion with carbon

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4034:
URL: https://github.com/apache/carbondata/pull/4034#issuecomment-813833357


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3375/
   






[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #4113: [CARBONDATA-4161] Describe complex columns

2021-04-05 Thread GitBox


Indhumathi27 commented on a change in pull request #4113:
URL: https://github.com/apache/carbondata/pull/4113#discussion_r607512702



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonSparkSqlParserUtil.scala
##
@@ -744,6 +747,52 @@ object CarbonSparkSqlParserUtil {
     CarbonAlterTableAddColumnCommand(alterTableAddColumnsModel)
   }
 
+
+  def describeColumn(
+      databaseNameOp: Option[String],
+      tableName: String,
+      inputFields: java.util.List[String]
+  ): CarbonDescribeColumnCommand = {
+    val sparkSession = SparkSQLUtil.getSparkSession
+    validateTableExists(databaseNameOp, tableName, sparkSession)
+    val relation = CarbonEnv
+      .getInstance(sparkSession)
+      .carbonMetaStore
+      .lookupRelation(databaseNameOp, tableName)(sparkSession)
+      .asInstanceOf[CarbonRelation]
+    val tableSchema = StructType.fromAttributes(relation.output)
+    val carbonTable = relation.carbonTable
+    val inputColumn = tableSchema.find(_.name.equalsIgnoreCase(inputFields.get(0)))
+    if (!inputColumn.isDefined) {
+      throw new MalformedCarbonCommandException(
+        s"${inputFields.get(0)} not present in schema of table: $tableName")
+    }
+    CarbonDescribeColumnCommand(
+      carbonTable,
+      inputFields,
+      inputColumn.get
+    )
+  }
+
+  def describeShort(
+      databaseNameOp: Option[String],
+      tableName: String
+  ): CarbonDescribeShortCommand = {
+    val sparkSession = SparkSQLUtil.getSparkSession
+    validateTableExists(databaseNameOp, tableName, sparkSession)
Review comment:
   The code duplicated between describeShort and describeColumn can be moved
to a common method.
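
   A minimal sketch of what such a shared helper could look like. It is
illustrative only: `Relation`, `catalog`, `resolveRelation` and the returned
strings are hypothetical stand-ins so that the example compiles without Spark
or CarbonData on the classpath. The real code would keep using
validateTableExists and the CarbonEnv metastore lookup.

```scala
// Hypothetical stand-in for CarbonRelation; not a CarbonData type.
final case class Relation(database: Option[String], table: String, columns: Seq[String])

object DescribeSketch {
  // Hypothetical catalog; the real code consults the CarbonEnv metastore.
  private val catalog = Map("carbonTable" -> Seq("channelsId", "name"))

  // Shared step: check that the table exists and resolve its relation once.
  def resolveRelation(database: Option[String], table: String): Relation = {
    val columns = catalog.getOrElse(table,
      throw new IllegalArgumentException(s"Table not found: $table"))
    Relation(database, table, columns)
  }

  // Column-specific logic only: find the requested column, case-insensitively.
  def describeColumn(database: Option[String], table: String, field: String): String = {
    val relation = resolveRelation(database, table)
    val column = relation.columns.find(_.equalsIgnoreCase(field)).getOrElse(
      throw new IllegalArgumentException(s"$field not present in schema of table: $table"))
    s"describe column $column on ${relation.table}"
  }

  // Short describe reuses the same lookup instead of repeating it.
  def describeShort(database: Option[String], table: String): String = {
    val relation = resolveRelation(database, table)
    s"describe short ${relation.table}: ${relation.columns.mkString(", ")}"
  }
}
```

   With this shape, each describe method contains only the logic that is
specific to it, and the table validation and lookup live in one place.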

##
File path: integration/spark/src/main/scala/org/apache/spark/sql/parser/CarbonExtensionSpark2SqlParser.scala
##
@@ -27,7 +28,26 @@ import org.apache.spark.sql.catalyst.plans.logical._
 class CarbonExtensionSpark2SqlParser extends CarbonSpark2SqlParser {
 
   override protected lazy val extendedSparkSyntax: Parser[LogicalPlan] =
-    loadDataNew | alterTableAddColumns | explainPlan
+    loadDataNew | alterTableAddColumns | explainPlan | describeColumn | describeShort

Review comment:
   I think there is no need to define these here, since
CarbonExtensionSpark2SqlParser extends CarbonSpark2SqlParser.

##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonDescribeFormattedCommand.scala
##
@@ -370,3 +373,134 @@ private[sql] case class CarbonDescribeFormattedCommand(
 
   override protected def opName: String = "DESC FORMATTED"
 }
+
+case class CarbonDescribeColumnCommand(
+    carbonTable: CarbonTable,
+    inputFieldNames: java.util.List[String],
+    field: StructField)
+  extends MetadataCommand {
+
+  override val output: Seq[Attribute] = Seq(
+    // Column names are based on Hive.
+    AttributeReference("col_name", StringType, nullable = false,
+      new MetadataBuilder().putString("comment", "name of the column").build())(),
+    AttributeReference("data_type", StringType, nullable = false,
+      new MetadataBuilder().putString("comment", "data type of the column").build())(),
+    AttributeReference("comment", StringType, nullable = true,
+      new MetadataBuilder().putString("comment", "comment of the column").build())()
+  )
+
+  override def processMetadata(sparkSession: SparkSession): Seq[Row] = {
+    setAuditTable(carbonTable)
+    var results = Seq[(String, String, String)]()
+    var currField = field
+    val inputFieldsIterator = inputFieldNames.iterator()
+    var inputColumn = inputFieldsIterator.next()
+    while (results.size == 0) {
+      breakable {
+        if (currField.dataType.typeName.equalsIgnoreCase(CarbonCommonConstants.ARRAY)) {
+          if (inputFieldsIterator.hasNext) {
+            val nextField = inputFieldsIterator.next()
+            if (!nextField.equalsIgnoreCase("item")) {
+              throw handleException(nextField, currField.name, carbonTable.getTableName)
+            }
+            currField = StructField("item", currField.dataType.asInstanceOf[ArrayType].elementType)
+            inputColumn += "." + currField.name
+            break()
+          }
+          val colComment = currField.getComment().getOrElse("null")
+          results = Seq((inputColumn,
+            currField.dataType.typeName, currField.getComment().getOrElse("null")),
+            ("## Children of " + inputColumn + ":  ", "", ""))
+          results ++= Seq(("item", currField.dataType.asInstanceOf[ArrayType]
+            .elementType.simpleString, colComment))

Review comment:
   Will the colComment given for the parent column also be displayed when
describing child columns?








[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4034: [CARBONDATA-4091] support prestosql 333 integartion with carbon

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4034:
URL: https://github.com/apache/carbondata/pull/4034#issuecomment-813831709


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5126/
   






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4110: [CARBONDATA-4158]Secondary Index as a coarse-grain datamap and use them for Presto queries

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4110:
URL: https://github.com/apache/carbondata/pull/4110#issuecomment-813822392


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5127/
   






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4110: [CARBONDATA-4158]Secondary Index as a coarse-grain datamap and use them for Presto queries

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4110:
URL: https://github.com/apache/carbondata/pull/4110#issuecomment-813821344


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3376/
   






[GitHub] [carbondata] ajantha-bhat commented on pull request #4034: [CARBONDATA-4091] support prestosql 333 integartion with carbon

2021-04-05 Thread GitBox


ajantha-bhat commented on pull request #4034:
URL: https://github.com/apache/carbondata/pull/4034#issuecomment-813776191


   retest this please






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813382532


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5125/
   






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116#issuecomment-813379819


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3374/
   






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4110: [CARBONDATA-4158]Secondary Index as a coarse-grain datamap and use them for Presto queries

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4110:
URL: https://github.com/apache/carbondata/pull/4110#issuecomment-813362851


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5124/
   






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4110: [CARBONDATA-4158]Secondary Index as a coarse-grain datamap and use them for Presto queries

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4110:
URL: https://github.com/apache/carbondata/pull/4110#issuecomment-813362446


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3373/
   






[GitHub] [carbondata] nihal0107 opened a new pull request #4116: [CARBONDATA-4162] Leverage Secondary Index till segment level with Spark plan rewrite

2021-04-05 Thread GitBox


nihal0107 opened a new pull request #4116:
URL: https://github.com/apache/carbondata/pull/4116


### Why is this PR needed?
In the existing architecture, if the parent (main) table and the SI table
don’t have the same valid segments, we disable the SI table. From the next
query onwards, we scan and prune only the parent table until the next load or
REINDEX command is triggered (these commands bring the parent and SI table
segments back in sync). Because of this, queries take longer to return results
while SI is disabled.

### What changes were proposed in this PR?
   1. Instead of disabling the SI table (when the parent and child table
segments are not in sync), we prune on the SI table for all of its valid
segments (segments with status success, marked for update, and load partial
success); the rest of the segments are pruned by the parent table.
   2. Different SI tables may now contain different numbers of segments. In
that case, the best-fit SI table is identified based on segment count. If more
than one SI table contains the same segment count, the best-fit SI table is
identified based on the current design.
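
The two changes above can be illustrated with a small sketch. It is not the
PR's implementation: `SiTable`, `bestFit` and `splitSegments` are hypothetical
names, segments are modelled as plain string ids, and "segment count" is
assumed to mean the number of SI segments that are also valid in the parent
table. The tie-break "based on the current design" is not modelled.

```scala
object SiPruningSketch {
  // Hypothetical model: an SI table is just a name plus its valid segment ids.
  final case class SiTable(name: String, validSegments: Set[String])

  // Choose the SI table that covers the most parent segments.
  // Ties are left to the existing selection logic and are not modelled here.
  def bestFit(parentSegments: Set[String], candidates: Seq[SiTable]): Option[SiTable] =
    if (candidates.isEmpty) None
    else Some(candidates.maxBy(si => (si.validSegments intersect parentSegments).size))

  // Split the parent segments into (pruned via the SI table, pruned by the parent table).
  def splitSegments(parentSegments: Set[String], si: SiTable): (Set[String], Set[String]) =
    parentSegments.partition(si.validSegments.contains)
}
```

For example, with parent segments 0 to 3 and an SI table holding segments 0 to
2, segments 0 to 2 would be pruned through the SI table and segment 3 through
the parent table.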
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- Yes
   
   
   






Re: [DISCUSSION] Support alter schema for complex types

2021-04-05 Thread akshay_nuthala
Handled comments



--
Sent from: 
http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/


Re: Support SI at Segment level

2021-04-05 Thread Nihal
Hi All,
Thanks for your input and suggestions.

For now, we will support leveraging SI at the segment level only with SQL
plan rewrite (as already mentioned in this thread and in the design document).

Work is going on in parallel to support SI as a datamap (without plan
rewrite), which will be at the table level.
This work is independent of the existing property "isSITableEnabled",
as mentioned in the design doc or PR 4110.
Also, there is no other major conflict or dependency between the two designs,
so we can safely handle both pieces of work in parallel.

We are planning to extend the datamap SI to the segment level
later (once the PR is merged). I will create a separate JIRA ticket to track
this work.


Regards
Nihal kumar ojha





Re: Looking to contribute to carbondata

2021-04-05 Thread Liang Chen
Hi Pratyaksh

Welcome to the Apache CarbonData community. We will discuss with you and help
you quickly become familiar with CarbonData.
One suggestion: please first join the dev mailing list and check the quick
start document.

Regards
Liang

Pratyaksh Sharma wrote
> Hi everyone,
> 
> I am looking to contribute to this project. I tried going through the
> jiras
> but could not find any jira with label 'newBie' or something similar. So
> just wanted to check if we have any such label that a new contributor can
> use to search basic tasks and get started?
> 
> If not, can someone point me to some appropriate jira so that I may pick
> it
> up? Any leads are appreciated.
> 
> My jira id - pratyakshsharma.







[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4115: [CARBONDATA-4160] Alter carbon schema by adding single-level complex columns(array/struct)

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4115:
URL: https://github.com/apache/carbondata/pull/4115#issuecomment-813232378


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbon_PR_Builder_2.4.5/3372/
   






[GitHub] [carbondata] CarbonDataQA2 commented on pull request #4115: [CARBONDATA-4160] Alter carbon schema by adding single-level complex columns(array/struct)

2021-04-05 Thread GitBox


CarbonDataQA2 commented on pull request #4115:
URL: https://github.com/apache/carbondata/pull/4115#issuecomment-813232084


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12602/job/ApacheCarbonPRBuilder2.3/5123/
   

