[ https://issues.apache.org/jira/browse/CARBONDATA-888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961952#comment-15961952 ]
Sanoj MG commented on CARBONDATA-888:
-------------------------------------

Can this be assigned to me? I have already made the code changes and would like to create a PR.

> Dictionary include / exclude option in dataframe writer
> -------------------------------------------------------
>
>                 Key: CARBONDATA-888
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-888
>             Project: CarbonData
>          Issue Type: Improvement
>          Components: spark-integration
>    Affects Versions: 1.2.0-incubating
>         Environment: HDP 2.5, Spark 1.6
>            Reporter: Sanoj MG
>            Priority: Minor
>             Fix For: 1.2.0-incubating
>
>
> While creating a CarbonData table from a dataframe, it is currently not
> possible to specify which columns need to be included in or excluded from the
> dictionary. An option is required to specify this, as below:
> df.write.format("carbondata")
>   .option("tableName", "test")
>   .option("compress","true")
>   .option("dictionary_include","incol1,intcol2")
>   .option("dictionary_exclude","stringcol1,stringcol2")
>   .mode(SaveMode.Overwrite)
>   .save()
> We have a lot of integer columns that are dimensions. dataframe.save is used to
> quickly create tables instead of writing DDLs, and it would be nice to have
> this feature for running POCs.
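For illustration only, here is a rough sketch of how the two proposed writer options might be turned into table properties. It assumes the writer builds a CREATE TABLE statement internally and passes the column lists through as the DICTIONARY_INCLUDE and DICTIONARY_EXCLUDE table properties used in CarbonData DDL. `DictionaryOptions`, `parseColumns`, and `tableProperties` are hypothetical names made up for this sketch, not existing CarbonData code, and the actual change may look quite different.

```scala
// Hypothetical helper: NOT part of CarbonData. It only illustrates one way
// the proposed options could be mapped to a TBLPROPERTIES clause.
object DictionaryOptions {

  // Split a comma-separated option value into trimmed, non-empty column names.
  def parseColumns(value: Option[String]): Seq[String] =
    value.toSeq.flatMap(_.split(",").toSeq).map(_.trim).filter(_.nonEmpty)

  // Build a TBLPROPERTIES clause from the writer options, or return an empty
  // string when neither dictionary option is set.
  def tableProperties(options: Map[String, String]): String = {
    val include = parseColumns(options.get("dictionary_include"))
    val exclude = parseColumns(options.get("dictionary_exclude"))

    // A column cannot be both included in and excluded from the dictionary.
    val overlap = include.intersect(exclude)
    val dupes = overlap.mkString(", ")
    require(overlap.isEmpty, s"Columns listed in both options: $dupes")

    val props = Seq("DICTIONARY_INCLUDE" -> include, "DICTIONARY_EXCLUDE" -> exclude)
      .filter { case (_, cols) => cols.nonEmpty }
      .map { case (key, cols) =>
        val joined = cols.mkString(",")
        s"'$key'='$joined'"
      }

    if (props.isEmpty) "" else props.mkString("TBLPROPERTIES (", ", ", ")")
  }

  def main(args: Array[String]): Unit = {
    // Same option values as in the issue description.
    val options = Map(
      "tableName" -> "test",
      "compress" -> "true",
      "dictionary_include" -> "incol1,intcol2",
      "dictionary_exclude" -> "stringcol1,stringcol2"
    )
    println(tableProperties(options))
  }
}
```

Rejecting a column that appears in both lists is a design choice of this sketch; the real writer could equally let one option take precedence.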