[jira] [Resolved] (CARBONDATA-4325) Documentation Issue in Github Link: https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md and fix partition table creation with df issue

2022-03-04 Thread Indhumathi (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi resolved CARBONDATA-4325.

Fix Version/s: 2.3.0
   Resolution: Fixed

> Documentation Issue in Github Link: 
> https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md
>  and fix partition table creation with df issue
> 
>
> Key: CARBONDATA-4325
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4325
> Project: CarbonData
>  Issue Type: Bug
>  Components: docs
>Reporter: PURUJIT CHAUGULE
>Priority: Minor
> Fix For: 2.3.0
>
> Attachments: 
> Partition_Table_Creation_Fail_With_Spatial_Index_Property.png
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> *Scenario 1:*
> [https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md]
>  :
>  * Under _*Supported Options*_, mention all supported table properties. The 
> following supported table properties are not mentioned in the document:
>  ** bucketNumber
>  ** bucketColumns
>  ** streaming
>  ** timestampformat
>  ** dateformat
>  ** SPATIAL_INDEX
>  ** SPATIAL_INDEX_type
>  ** SPATIAL_INDEX_sourcecolumns
>  ** SPATIAL_INDEX_originLatitude
>  ** SPATIAL_INDEX_gridSize
>  ** SPATIAL_INDEX_conversionRatio
>  ** SPATIAL_INDEX_class
> *Scenario 2:*
> _Partition table creation using a Spark DataFrame fails when spatial-index 
> properties are set._
> Queries:
> import org.apache.spark.sql.SaveMode
> import org.apache.spark.sql.types.{LongType, StructField, StructType}
> val geoSchema = StructType(Seq(
>   StructField("timevalue", LongType, nullable = true),
>   StructField("longitude", LongType, nullable = false),
>   StructField("latitude", LongType, nullable = false)))
> val geoDf = sqlContext.read
>   .option("delimiter", ",")
>   .option("header", "true")
>   .schema(geoSchema)
>   .csv("hdfs://hacluster/geodata/geodata.csv")
> sql("drop table if exists source_index_df").show()
> geoDf.write
>   .format("carbondata")
>   .option("tableName", "source_index_df")
>   .option("partitionColumns", "timevalue")
>   .option("SPATIAL_INDEX", "mygeohash")
>   .option("SPATIAL_INDEX.mygeohash.type", "geohash")
>   .option("spatial_index.MyGeoHash.sourcecolumns", "longitude, latitude")
>   .option("SPATIAL_INDEX.MyGeoHash.originLatitude", "39.832277")
>   .option("SPATIAL_INDEX.mygeohash.gridSize", "50")
>   .option("spatial_index.mygeohash.conversionRatio", "100")
>   .option("spatial_index.mygeohash.CLASS", "org.apache.carbondata.geo.GeoHashIndex")
>   .mode(SaveMode.Overwrite)
>   .save()
>  
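A possible workaround for Scenario 2, sketched here as an untested assumption: create the partitioned table with the spatial-index properties through the SQL path first (the TBLPROPERTIES syntax below follows the CarbonData spatial-index guide; table and column names are taken from the report above), then append the DataFrame instead of passing the spatial-index keys as DataFrameWriter options.

```scala
// Hypothetical workaround sketch (untested): define the partitioned table and
// its spatial-index TBLPROPERTIES via SQL, then append the DataFrame into it.
import org.apache.spark.sql.SaveMode

sql(
  """CREATE TABLE IF NOT EXISTS source_index_df (
    |  longitude LONG,
    |  latitude LONG
    |)
    |PARTITIONED BY (timevalue LONG)
    |STORED AS carbondata
    |TBLPROPERTIES (
    |  'SPATIAL_INDEX' = 'mygeohash',
    |  'SPATIAL_INDEX.mygeohash.type' = 'geohash',
    |  'SPATIAL_INDEX.mygeohash.sourcecolumns' = 'longitude, latitude',
    |  'SPATIAL_INDEX.mygeohash.originLatitude' = '39.832277',
    |  'SPATIAL_INDEX.mygeohash.gridSize' = '50',
    |  'SPATIAL_INDEX.mygeohash.conversionRatio' = '100',
    |  'SPATIAL_INDEX.mygeohash.class' = 'org.apache.carbondata.geo.GeoHashIndex')""".stripMargin)

// Append rather than Overwrite, so the save does not recreate the table and
// drop the TBLPROPERTIES set above.
geoDf.write
  .format("carbondata")
  .option("tableName", "source_index_df")
  .mode(SaveMode.Append)
  .save()
```

This sidesteps the DataFrameWriter option path that the attached screenshot shows failing; whether the option path itself should accept these keys is the subject of the fix.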



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (CARBONDATA-4325) Documentation Issue in Github Link: https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md and fix partition table creation with df issue

2022-03-04 Thread PURUJIT CHAUGULE (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PURUJIT CHAUGULE updated CARBONDATA-4325:
-
Attachment: Partition_Table_Creation_Fail_With_Spatial_Index_Property.png






[jira] [Updated] (CARBONDATA-4325) Documentation Issue in Github Link: https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md and fix partition table creation with df issue

2022-03-04 Thread PURUJIT CHAUGULE (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PURUJIT CHAUGULE updated CARBONDATA-4325:
-
Description: 
*Scenario 1:*

[https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md]
 :
 * Under _*Supported Options*_, mention all supported table properties. The 
following supported table properties are not mentioned in the document:
 ** bucketNumber
 ** bucketColumns
 ** streaming
 ** timestampformat
 ** dateformat
 ** SPATIAL_INDEX
 ** SPATIAL_INDEX_type
 ** SPATIAL_INDEX_sourcecolumns
 ** SPATIAL_INDEX_originLatitude
 ** SPATIAL_INDEX_gridSize
 ** SPATIAL_INDEX_conversionRatio
 ** SPATIAL_INDEX_class

*Scenario 2:*

_Partition table creation using a Spark DataFrame fails when spatial-index 
properties are set._

Queries:

import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.types.{LongType, StructField, StructType}

val geoSchema = StructType(Seq(
  StructField("timevalue", LongType, nullable = true),
  StructField("longitude", LongType, nullable = false),
  StructField("latitude", LongType, nullable = false)))
val geoDf = sqlContext.read
  .option("delimiter", ",")
  .option("header", "true")
  .schema(geoSchema)
  .csv("hdfs://hacluster/geodata/geodata.csv")

sql("drop table if exists source_index_df").show()
geoDf.write
  .format("carbondata")
  .option("tableName", "source_index_df")
  .option("partitionColumns", "timevalue")
  .option("SPATIAL_INDEX", "mygeohash")
  .option("SPATIAL_INDEX.mygeohash.type", "geohash")
  .option("spatial_index.MyGeoHash.sourcecolumns", "longitude, latitude")
  .option("SPATIAL_INDEX.MyGeoHash.originLatitude", "39.832277")
  .option("SPATIAL_INDEX.mygeohash.gridSize", "50")
  .option("spatial_index.mygeohash.conversionRatio", "100")
  .option("spatial_index.mygeohash.CLASS", "org.apache.carbondata.geo.GeoHashIndex")
  .mode(SaveMode.Overwrite)
  .save()

 

  was:
[https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md]
 :
 * Under _*SUPPORTED Options,*_ mention all supported Table Properties. 
Following are list of supported Table Properties not mentioned in the document:

 ** bucketNumber
 ** bucketColumns
 ** streaming
 ** timestampformat
 ** dateformat
 ** SPATIAL_INDEX
 ** SPATIAL_INDEX_type
 ** SPATIAL_INDEX_sourcecolumns
 ** SPATIAL_INDEX_originLatitude
 ** SPATIAL_INDEX_gridSize
 ** SPATIAL_INDEX_conversionRatio
 ** SPATIAL_INDEX_class







[jira] [Updated] (CARBONDATA-4325) Documentation Issue in Github Link: https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md and fix partition table creation with df issue

2022-03-04 Thread PURUJIT CHAUGULE (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PURUJIT CHAUGULE updated CARBONDATA-4325:
-
Summary: Documentation Issue in Github Link: 
https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md
 and fix partition table creation with df issue  (was: Documentation Issue in 
Github Link: 
https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)






[jira] [Resolved] (CARBONDATA-4326) mv created in beeline not hitting in sql/shell and vice versa if both beeline and sql/shell are running in parallel

2022-03-04 Thread Indhumathi (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Indhumathi resolved CARBONDATA-4326.

Fix Version/s: 2.3.0
   Resolution: Fixed

> mv created in beeline not hitting in sql/shell and vice versa if both beeline 
> and sql/shell are running in parallel
> ---
>
> Key: CARBONDATA-4326
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4326
> Project: CarbonData
>  Issue Type: Bug
>Reporter: SHREELEKHYA GAMPA
>Priority: Major
> Fix For: 2.3.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> [Steps]:
> When an MV is created in spark-shell/spark-sql on a table created using a 
> Spark DataFrame, an EXPLAIN query hits the MV in spark-shell/spark-sql but 
> does not hit it in spark-beeline. Likewise, when the MV is created in 
> spark-beeline, the query hits the MV in spark-beeline but does not hit it in 
> spark-shell/spark-sql. The issue occurs when both sessions are running in 
> parallel during MV creation; after restarting the spark-shell/spark-beeline 
> sessions, the query hits the MV in both.
> Queries. Table created using a Spark DataFrame:
> import org.apache.spark.sql.SaveMode
> import org.apache.spark.sql.types.{LongType, StructField, StructType}
> val geoSchema = StructType(Seq(
>   StructField("timevalue", LongType, nullable = true),
>   StructField("longitude", LongType, nullable = false),
>   StructField("latitude", LongType, nullable = false)))
> val geoDf = sqlContext.read.option("delimiter", ",").option("header", "true")
>   .schema(geoSchema).csv("hdfs://hacluster/geodata/geodata.csv")
> sql("drop table if exists source_index_df").show()
> geoDf.write
>   .format("carbondata")
>   .option("tableName", "source_index_df")
>   .mode(SaveMode.Overwrite)
>   .save()
> Queries for the MV created in spark-shell:
> sql("CREATE MATERIALIZED VIEW datamap_mv1 as select latitude,longitude from source_index_df group by latitude,longitude").show()
> sql("explain select latitude,longitude from source_index_df group by latitude,longitude").show(100, false)
> Queries for the MV created in spark-beeline/spark-sql:
> CREATE MATERIALIZED VIEW datamap_mv1 as select latitude,longitude from source_index_df group by latitude,longitude;
> explain select latitude,longitude from source_index_df group by latitude,longitude;
> [Expected Result]: The MV created in beeline should be hit in sql/shell, and 
> vice versa, when beeline and sql/shell are running in parallel.
> [Actual Issue]: The MV created in beeline is not hit in sql/shell, and vice 
> versa, when beeline and sql/shell are running in parallel.
>  
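One way to check in each session whether the rewrite happened, sketched under the assumption (untested here) that a rewritten CarbonData plan mentions the MV table name in the EXPLAIN output; names are taken from the report above:

```scala
// Hedged sketch: collect the EXPLAIN output and look for the MV name.
// Run the same check in both spark-shell and spark-beeline sessions to see
// which one still fails to pick up the MV created in the other session.
val plan = sql(
    "explain select latitude, longitude from source_index_df " +
    "group by latitude, longitude")
  .collect()
  .map(_.getString(0))
  .mkString("\n")

if (plan.contains("datamap_mv1"))
  println("query is rewritten to use the MV")
else
  println("MV not hit in this session")
```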


