Re: [DISCUSSION] Optimize the properties documentation or comments

2018-12-15 Thread Liang Chen
+1

Regards
Liang

xubo245 wrote
> Optimize the properties documentation or comments:
> Some properties have no documentation or comments, which makes them hard
> for users to understand.
> We should add documentation or comments for these properties.
> 
> Unify documentation:
> Some properties have no documentation or comments in code, such as in
> org.apache.carbondata.core.constants.CarbonCommonConstants, but they do
> have some documentation or comments in the .md files, so we should unify
> them.
> 
> JIRA:
> https://issues.apache.org/jira/browse/CARBONDATA-3170
> 
> 
> 
> --
> Sent from:
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/







Re: [Discussion] Make 'no_sort' as default sort_scope and keep sort_columns as 'empty' by default

2018-12-15 Thread Liang Chen
Hi

First, let me check that I understand your proposal. You mean:
1, If the user defines "sort_columns=columns": all behavior stays the same
as it is now, with no change. (Most users set this option when creating a
carbondata table.)
2, If the user doesn't define "sort_columns": the current default behavior
is that all the dimension columns are selected for sort_columns and
sort_scope is local_sort. *You propose to change this default behavior to
use no_sort, right?*

If yes, I agree with this proposal, and I propose removing the "empty
sort_columns" option. *It would be easier for users to understand: if
sort_columns is defined, use local_sort; if sort_columns is not defined,
use no_sort.*
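
To make the rule above concrete, here is a minimal sketch in Python. It is
only an illustration of the proposed defaults, not CarbonData code:
resolve_sort_defaults is a hypothetical helper name, and keeping an
explicitly given sort_scope unchanged is an assumption taken from the
original proposal quoted below.

```python
from typing import Optional, Sequence, Tuple


def resolve_sort_defaults(
    sort_columns: Optional[Sequence[str]] = None,
    sort_scope: Optional[str] = None,
) -> Tuple[Tuple[str, ...], str]:
    """Hypothetical helper: return the effective (sort_columns, sort_scope).

    Illustrates the proposed defaults only; this is not CarbonData API.
    """
    # Proposed default: no sort columns unless the user names some.
    columns = tuple(sort_columns or ())

    if sort_scope is not None:
        # Assumption: an explicitly given sort_scope is kept as-is.
        scope = sort_scope.lower()
    elif columns:
        # sort_columns defined, sort_scope omitted: use local_sort.
        scope = "local_sort"
    else:
        # Nothing defined: use no_sort.
        scope = "no_sort"
    return columns, scope
```

With this sketch, a table created without any sort options resolves to no
sort columns and no_sort, while a table that only sets sort_columns resolves
to local_sort.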

Regards
Liang


Ajantha Bhat wrote
> Hi all,
> Currently in carbondata, we have 'local_sort' as default sort_scope and by
> default, all the dimension columns are selected for sort_columns.
> This will slow down the data loading.
> *To give users the best performance with default values,*
> we can change sort_scope to 'no_sort' and stop using all dimensions for
> sort_columns by default.
> Also, if sort_columns is specified but sort_scope is not specified by the
> user, sort_scope needs to be implicitly treated as 'local_sort'.
> These default values also apply to carbonsession, spark file format
> and SDK. (All will have the same behavior.)
> 
> With these changes, below are the performance results of TPCH queries on
> 500GB data:
> 
> 
> 
> * Load time is improved by nearly 4 times.
> * Total query time across all queries is improved. (50% of queries are
>   faster with no_sort; the other 50% are slightly degraded or the same.
>   Overall performance is better.)
>
> Also, while making this change, I found a few major issues in the existing
> code in the 'no_sort' and empty sort_columns flow. I have fixed those as
> well.
> Below are the issues found:
> 
> 
> 
> 
> [CARBONDATA-3162] Range filters don't remove null values for no_sort
> direct dictionary dimension columns.
> [CARBONDATA-3163] If a table has a different time format, no_sort column
> data goes as bad records (null) for the second table when it is loaded
> after the first table.
> [CARBONDATA-3164] During no_sort, an exception that happens at the
> converter step does not reach the user. The same problem exists in the SDK
> and spark file format flows.
> Also fixed multiple test case issues.
> I have already opened a PR for fixing these issues.
> https://github.com/apache/carbondata/pull/2966
> 
> Let me know if you have any suggestions about these changes.
> 
> Thanks,
> Ajantha







Re: [carbondata-presto enhancements] support reading pre-aggregate table in presto

2018-12-15 Thread Liang Chen
Hi xm_zzc

Good questions.
As you know, pre-aggregate tables use the datamap and require changing the
query plan, so this will be considered after 1.5.2.

Until now, presto integration has been an alpha feature. 1.5.2 will include
a number of optimizations and will change presto integration to a formal
feature (supported for deployment in production).

Regards
Liang


xm_zzc wrote
> Hi all:
>   Do we plan to support reading pre-aggregate tables in presto?




