Re: Re: Re: Optimize Order By + Limit Query

2017-03-29 Thread Ravindra Pesala
Hi, it comes with many limitations: 1. It cannot work for dictionary columns, as there is no guarantee that dictionary allocation is in sorted order. 2. It cannot work for no-inverted-index columns. 3. It cannot work for measures. Moreover, you mentioned that it can reduce IO, but I don't

Re:Re: Re: Optimize Order By + Limit Query

2017-03-29 Thread 马云
Hi Ravindra, yes, let Carbon do the sorting if the order by column is not the first column. Its sorting efficiency is very high, since the dimension data in a blocklet is stored sorted. So Carbon can use merge sort + topN to get N rows from each block. In addition, the biggest difference is that it
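The "merge sort + topN" idea in the message above can be sketched as follows. This is only an illustration, not CarbonData code: it assumes each block is an in-memory list already sorted ascending, and the function name `top_n_from_sorted_blocks` is hypothetical.

```python
import heapq
from itertools import islice


def top_n_from_sorted_blocks(blocks, n):
    """Return the n smallest values across blocks that are each sorted ascending.

    Hypothetical helper for illustration; plain lists stand in for blocks.
    """
    # Only the first n values of any sorted block can reach the global top n,
    # so each block is cut to n values before merging.
    heads = (islice(block, n) for block in blocks)
    # heapq.merge lazily performs a k-way merge of the sorted inputs;
    # islice stops the merge as soon as n values have been produced.
    return list(islice(heapq.merge(*heads), n))


if __name__ == "__main__":
    blocks = [[1, 4, 9], [2, 3, 10], [0, 8]]
    print(top_n_from_sorted_blocks(blocks, 4))
```

Because the merge is lazy, at most n values per block are read and the merge stops after n outputs, which is the kind of saving the limit push-down is aiming for.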

Re: Load data into carbondata executors distributed unevenly

2017-03-29 Thread Ravindra Pesala
Hi, it seems the attachments are missing. Can you attach them again? Regards, Ravindra. On 30 March 2017 at 08:02, a wrote: > Hello! > > *Test result:* > When I load csv data into carbondata table 3 times, the executors > distributed unevenly. My purpose >

Load data into carbondata executors distributed unevenly

2017-03-29 Thread a
Hello! Test result: When I load csv data into a carbondata table 3 times, the executors are distributed unevenly. My purpose is one task per node, but the result is that some nodes have 2 tasks and some nodes have no task. See the load data 1.png, data 2.png, data 3.png. The carbondata data.PNG is the data

Re: [DISCUSSION]: (New Feature) Streaming Ingestion into CarbonData

2017-03-29 Thread Aniket Adnaik
Hi Jacky, Thanks for your comments. I guess I should have uploaded it in Google Doc format instead of PDF; somehow Google Docs messes up all the diagrams if I copy and paste, and I have not figured out a way to fix it. Anyway, I apologize for the inconvenience to those who wanted to add in-line comments in the

Re: [DISCUSSION]: (New Feature) Streaming Ingestion into CarbonData

2017-03-29 Thread Aniket Adnaik
Hi Liang, Thanks, please see my comments to your questions. 2. Whether to support compaction for streaming ingested data to add index, or not? AA>> Yes, eventually we would need streaming data files to be compacted into the regular read-optimized CarbonData format. Triggering of compaction can be

Re: Re: Optimize Order By + Limit Query

2017-03-29 Thread Ravindra Pesala
Hi, you mean Carbon does the sorting if the order by column is not the first column and provides only the limit values to Spark. But Spark is already doing the same job: it just sorts each partition and gets the top values out of it. You can reduce the table_blocksize to get better sort performance, as Spark

Re: [DISCUSSION]: (New Feature) Streaming Ingestion into CarbonData

2017-03-29 Thread Jacky Li
Hi Aniket, comments inline. I have also put some review comments in the PDF here: https://drive.google.com/file/d/0B5vjWGChUwXdSUV0OTFkTGE4am8/view?usp=sharing > On March 29, 2017, at 7:10 AM, Aniket Adnaik

[jira] [Created] (CARBONDATA-832) Data loading is failing with duplicate header column in csv file

2017-03-29 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-832: --- Summary: Data loading is failing with duplicate header column in csv file Key: CARBONDATA-832 URL: https://issues.apache.org/jira/browse/CARBONDATA-832

Re: carbondata hive

2017-03-29 Thread Sea
set hive.mapred.supports.subdirectories=true; set mapreduce.input.fileinputformat.input.dir.recursive=true; -- Original -- From: "261810726";<261810...@qq.com>; Date: Wed, Mar 29, 2017 07:42 PM To: "dev"; Subject: Re:

Re: carbondata hive

2017-03-29 Thread Sea
Hi, fengyun: please follow the steps in https://github.com/cenyuhai/incubator-carbondata/blob/CARBONDATA-727/integration/hive/hive-guide.md -- Original -- From: "";<1141982...@qq.com>; Date: Wed, Mar 29, 2017 09:21 AM To:

[jira] [Created] (CARBONDATA-831) can't run PerfTest example

2017-03-29 Thread sehriff (JIRA)
sehriff created CARBONDATA-831: -- Summary: can't run PerfTest example Key: CARBONDATA-831 URL: https://issues.apache.org/jira/browse/CARBONDATA-831 Project: CarbonData Issue Type: Bug

Re: [DISCUSSION]: (New Feature) Streaming Ingestion into CarbonData

2017-03-29 Thread Liang Chen
Hi Aniket, Thanks for your great contribution. The feature of ingesting streaming data into carbondata would be very useful for some real-time query scenarios. Some inputs from my side: 1. I agree with approach 2 for the streaming file format; query performance must be ensured. 2. Whether

Re: [DISCUSSION] Initiating Apache CarbonData-1.1.0 incubating Release

2017-03-29 Thread Henry Saputra
Sure, let's do one more release. +1 On Mon, Mar 27, 2017 at 2:58 AM, manish gupta wrote: > +1 > > Regards > Manish Gupta > > On Mon, Mar 27, 2017 at 2:41 PM, Kumar Vishal > wrote: > > > +1 > > -Regards > > Kumar Vishal > > > > On Mar 27,

[jira] [Created] (CARBONDATA-830) Incorrect schedule for NewCarbonDataLoadRDD

2017-03-29 Thread Weizhong (JIRA)
Weizhong created CARBONDATA-830: --- Summary: Incorrect schedule for NewCarbonDataLoadRDD Key: CARBONDATA-830 URL: https://issues.apache.org/jira/browse/CARBONDATA-830 Project: CarbonData Issue