Re: Feature Proposal: CarbonCli tool

2018-09-05 Thread Jacky Li
Reply inline. > On Sep 5, 2018, at 3:46 PM, ravipesala wrote: > > > Hi, > > I have the following doubts and suggestions for this tool. > > 1. In which module are you planning to keep this tool? Ideally, it should be > under a tools folder, and going forward we can add more tools like this under > it. Sure, I w

Re: Feature Proposal: CarbonCli tool

2018-09-05 Thread ravipesala
Hi, I have the following doubts and suggestions for this tool. 1. In which module are you planning to keep this tool? Ideally, it should be under a tools folder, and going forward we can add more tools like this under it. 2. Which file schema are you printing? Are you randomly choosing the file to r

Re: Feature Proposal: CarbonCli tool

2018-09-05 Thread xuchuanyin
-c,--column column to print statistics --- Do we support multiple columns, and if so, how are they specified? I think the current short names for the CLI options are not clear. Until the tool is stable, I recommend using the full names instead of the short names. For example, -b for --tblProperties is not suitable... --

Re: Feature Proposal: CarbonCli tool

2018-09-04 Thread Jacky Li
For the “summary” command, I just pick the first carbondata file and read the schema from its header. The intention here is just to show one schema, assuming the schema is the same in all data files in this folder. If there is a need to validate the schema in all files, we can add a “validate” command.
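
A minimal sketch of the two behaviours described above, not the actual CarbonCli code: picking the first carbondata file under a folder (including nested segment folders), and a possible "validate" pass that compares every file's schema against that first one. The function names are hypothetical, the `.carbondata` file suffix is an assumption, and `read_schema` is a caller-supplied stand-in for whatever reads the schema from a file header.

```python
import os
from typing import Callable, Iterator, List, Optional


def iter_carbondata_files(folder: str) -> Iterator[str]:
    """Yield paths of *.carbondata files under folder, in a deterministic order."""
    for root, dirs, files in os.walk(folder):
        dirs.sort()  # visit nested folders (e.g. segments) in sorted order
        for name in sorted(files):
            if name.endswith(".carbondata"):  # assumed data-file suffix
                yield os.path.join(root, name)


def first_carbondata_file(folder: str) -> Optional[str]:
    """Return the file a 'summary' would read its schema from, or None."""
    return next(iter_carbondata_files(folder), None)


def validate_schemas(folder: str,
                     read_schema: Callable[[str], object]) -> List[str]:
    """Return the files whose schema differs from the first file's schema.

    read_schema is a hypothetical callable supplied by the caller; it
    stands in for reading the schema from a data file's header.
    """
    paths = list(iter_carbondata_files(folder))
    if not paths:
        return []
    reference = read_schema(paths[0])
    return [p for p in paths[1:] if read_schema(p) != reference]
```

An empty result from `validate_schemas` would mean the single schema shown by "summary" is representative of the whole folder.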

Re: Feature Proposal: CarbonCli tool

2018-09-04 Thread xuchuanyin
In the above example, you specify one directory and get two segments, but it only shows one schema. I thought the number of schemas would be the same as the number of data directories. Since you mentioned that nested folders can be supported, what if the schemas in these files are not the same? Another pr

Feature Proposal: CarbonCli tool

2018-09-04 Thread Jacky Li
Hi All, When I am tuning carbon performance, I very often want to check the metadata in carbon files without launching spark shell or sql. To do that, I am writing a tool to print the metadata information of a given data folder. Currently, I am planning to do it like this: usage: Carbon