Hi

Please start a new mailing list thread for your topic.
Please also provide the cardinality of every column.

For a high-cardinality column, the system does not apply dictionary encoding:
-------------------------------------------------------
##threshold to identify whether high cardinality column
#high.cardinality.threshold=1000000
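
As a rough illustration of the rule above, here is a minimal Python sketch. It is not CarbonData's actual implementation: the function name, the sample column cardinalities, and the choice to treat "high" as strictly greater than the threshold are all assumptions made for the example. Only the threshold value comes from the property shown above.

```python
# Simplified illustration of the rule described above; NOT CarbonData's real code.
# Threshold value taken from the high.cardinality.threshold property shown above.
HIGH_CARDINALITY_THRESHOLD = 1_000_000


def uses_dictionary(cardinality: int, threshold: int = HIGH_CARDINALITY_THRESHOLD) -> bool:
    """Return True if a column with this many distinct values keeps dictionary encoding.

    Assumption: "high cardinality" means strictly more distinct values than the threshold.
    """
    return cardinality <= threshold


# Hypothetical cardinalities, for illustration only.
column_cardinality = {"country": 200, "phonetype": 50, "name": 5_000_000}

# Map each column to the encoding the simplified rule would pick.
encoding = {
    column: "dictionary" if uses_dictionary(count) else "no dictionary"
    for column, count in column_cardinality.items()
}

for column, choice in encoding.items():
    print(f"{column}: {choice}")
```

This is why the cardinality of each column matters: under this simplified rule, a column such as "name" would lose dictionary encoding only if its distinct-value count exceeds the threshold.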

Regards
Liang


simafengyun wrote
> Hi DEV,
> 
> I created a table with the SQL below:
> 
>     cc.sql(""" 
>            CREATE TABLE IF NOT EXISTS t3 
>            (ID Int, date Timestamp, country String, 
>            name String, phonetype String, serialname String, salary Int, 
>            name1 String, name2 String, name3 String, name4 String, name5 String,
>            name6 String, name7 String, name8 String 
>            ) 
>            STORED BY 'carbondata' 
>            """) 
> 
> After loading data into this table, I found that the dimension columns "name" and
> "name7" both have no dictionary encoding.
> Column "name" has no inverted index, but column "name7" has one.
> Questions:
> 1. Why do they have no dictionary encoding by default, and why do some have no
> inverted index?
> 2. Is there any document that describes these loading strategies?
> 3. The dimension column "name" has no inverted index; is its data still
> sorted in the DataChunk2 blocklet?
> 4. As far as I know, dimension column data is usually sorted and stored in the
> DataChunk2 blocklet.
>  In which cases is dimension column data not sorted in the DataChunk2
> blocklet, other than when the user specifies no inverted index for the column?
> 
> 
> 5. As far as I know, the first column of the MDK key is always sorted in the
> DataChunk2 blocklet, so why not set isExplicitSorted to true?





--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Re-DISCUSSION-Initiating-Apache-CarbonData-1-1-0-incubating-Release-tp9672p9687.html
