Re: [DISCUSSION] Distributed Index Cache Server

2019-02-12 Thread manishgupta88
+1 1. Add the impacted areas to the design document. 2. If any executor goes down, update the index cache to executor mapping in the driver accordingly. 3. Even though the cache would be divided based on index files, the minimum unit of cache needs to be fixed. Example: 1 segment cache should belong to

Re: [VOTE] Apache CarbonData 1.5.2(RC2) release

2019-02-01 Thread manishgupta88
+1 Regards Manish Gupta -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [DISCUSS] Move to gitbox as per ASF infra team mail

2019-01-06 Thread manishgupta88
+1 Regards Manish Gupta

Re: [Discussion] Make 'no_sort' as default sort_scope and keep sort_columns as 'empty' by default

2018-12-17 Thread manishgupta88
Hi Ajantha, +1 for the proposal. 1. I agree with Liang to remove the empty SORT_COLUMNS option. This will give the user more clarity about the property behavior. If configured, we use LOCAL_SORT; otherwise we use NO_SORT. Internally you can keep any behavior as per the implementation; it need not be e

Re: [DISCUSSION] Complex Delimiter support as per Hive format

2018-12-07 Thread manishgupta88
+1 We should modify the delimiters as per Hive. Also update the documentation as per the change. Regards Manish Gupta

Re: [Discussion]Alter table column rename feature

2018-12-07 Thread manishgupta88
+1 We already have a DDL for data type change and the same can be used for renaming a column. The DDL is the same as that of Hive. Regards Manish Gupta

RE: [SUGGESTION]Support compaction no_sort

2018-12-05 Thread manishgupta88
Hi Xuchuanyin, the scope of this feature is to SORT the data during compaction when the data was loaded using the NO_SORT option. There are a few users who want to maximize data load speed and then fine-tune the data further during off-peak time (time when system is leas

Re: Throw NullPointerException occasionally when query from stream table

2018-11-15 Thread manishgupta88
Hi xm_zzc, as I can see from the logs and the code flow, the hand-off code clears the cache from executor code while the exception is thrown from driver code during query. You are getting the exception because you are using local mode, wherein executor and driver are in the same JVM. We will check on th

Re: [VOTE] Apache CarbonData 1.5.0(RC2) release

2018-10-10 Thread manishgupta88
+1 Regards Manish Gupta

Re: [DISCUSSION] Optimizing the writing of min max for a column

2018-09-17 Thread manishgupta88
Hi Xuchuanyin, the idea you have mentioned is good and correct. But I feel that the current implementation behavior is better for the following reasons. 1. The code is easier to understand with the current implementation. Looking at the thrift anyone can understand the design and come to

Re: [DISCUSSION] Optimizing the writing of min max for a column

2018-09-17 Thread manishgupta88
Hi Dev, after discussion with PMC members the property name has been changed to 'carbon.minmax.allowed.byte.count'. The updated configuration is as follows. Default value: 200 bytes (100 characters). Max value: 1000 bytes (500 characters). Min value: 10 bytes (5 characters). Regards Manish Gupta
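
The bounds listed in this message can be illustrated with a minimal Python sketch. Only the property name and the three values (200, 1000, 10) come from the message; the helper name resolve_byte_count and the fallback to the default for missing, non-numeric, or out-of-range values are assumptions made for illustration, not CarbonData's confirmed behavior.

```python
# Property name and bounds as quoted in the message above.
MINMAX_PROPERTY = "carbon.minmax.allowed.byte.count"
DEFAULT_BYTE_COUNT = 200   # 100 characters
MIN_BYTE_COUNT = 10        # 5 characters
MAX_BYTE_COUNT = 1000      # 500 characters


def resolve_byte_count(properties):
    """Hypothetical helper: pick the byte count to use for min/max writing.

    Assumption: a missing, non-numeric, or out-of-range value falls back
    to the default. The real fallback behavior is not stated in the thread.
    """
    raw = properties.get(MINMAX_PROPERTY)
    if raw is None:
        return DEFAULT_BYTE_COUNT
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return DEFAULT_BYTE_COUNT
    if MIN_BYTE_COUNT <= value <= MAX_BYTE_COUNT:
        return value
    return DEFAULT_BYTE_COUNT
```

For example, resolve_byte_count({"carbon.minmax.allowed.byte.count": "500"}) keeps the configured 500, while a value of "5" or "2000" lies outside the stated range and, under the assumption above, resolves to 200.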

Re: [DISCUSSION] Remove BTree related code

2018-08-23 Thread manishgupta88
+1 I agree with the idea of removing the B-Tree code as it is not being used now. Regards Manish Gupta

Re: [DISCUSSION] Support Standard Spark's FileFormat interface in Carbondata

2018-08-23 Thread manishgupta88
+1 Regards Manish Gupta

[DISCUSSION] Support Map type for complex type columns

2018-08-16 Thread manishgupta88
Hi All, I am working on supporting complex type map columns. Please find below the scope for the same. *Scope:* 1. Create Table DDL support for complex map type. 2. Support loading of data for complex map type column [SDK, DataLoad DDL]. 3. Design to consider n-level nested support for map values

Re: [VOTE] Apache CarbonData 1.4.1(RC1) release

2018-08-01 Thread manishgupta88
Agree with Liang. -1 Regards Manish Gupta

Re: [Discussion] Blocklet DataMap caching in driver

2018-06-21 Thread manishgupta88
Thanks Ravi for the feedback. I completely agree with you that we need to develop the second solution ASAP. Please find my response below for your queries. 1. what if the query comes on noncached columns, will it start read from disk in driver side for minmax ? - If query is on a non-cached colu

Re: Re: COMPACT error

2018-04-30 Thread manishgupta88
Hi, we are not able to reproduce this issue here, and from the attached logs the exact root cause of the exception is not clear. 1. Can you please check the disk space on the machines where you are loading the data? 2. Is this issue always reproducible? If yes, please share your sample data and s

Re: COMPACT error

2018-04-26 Thread manishgupta88
Hi, From the exception it seems there is some problem while processing the data in the writer step. Can you please share the complete executor and driver logs so we can get some idea of the exact issue? Regards Manish Gupta

Re: Change the 'comment' content for column when execute command 'desc formatted table_name'

2018-04-25 Thread manishgupta88
I agree with Liang. We can modify the complete describe formatted command display and show the detailed information as suggested by Liang. Liang, we can make a small change to your suggestion. As we are displaying the information to the user, we should not include underscores (_) in the property names

Re: Cannot seek after EOF

2018-04-16 Thread manishgupta88
Hi, can you please provide more details on the issue? 1. Steps to reproduce the issue. 2. Which CarbonData version you are using. Regards Manish Gupta

Re: insert carbondata table failed

2017-09-18 Thread manishgupta88
Hi Feng, You can also refer to the links below, in which Spark users have tried to resolve this issue by making changes in the configuration. This might help you. https://stackoverflow.com/questions/28901123/why-do-spark-jobs-fail-with-org-apache-spark-shuffle-metadatafetchfailedexceptio https://

Re: 2 problems with loading data into a carbon table

2017-08-29 Thread manishgupta88
Hi Marek, From the logs it seems that this is a bug in the code. You can raise a JIRA to track the issue. Regards Manish Gupta -- View this message in context: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/2-problems-with-loading-data-into-a-carbon-table-tp20819p208

Re: method not found issue when creating table

2017-08-23 Thread manishgupta88
Hi Lionel, The Carbon table creation flow is executed on the driver side; executors do not participate in the creation of a carbon table. From the logs it seems that the spark-catalyst jar is missing, which is generally placed under the $SPARK_HOME/jars OR $SPARK_HOME/lib directory. Please check if spark jars direct

Re: carbon data performance doubts

2017-07-20 Thread manishgupta88
No, Dictionary_Exclude is supported only for String data type columns. Regards Manish Gupta -- View this message in context: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/carbon-data-performance-doubts-tp18438p18559.html Sent from the Apache CarbonData Dev Mailing List

Re: carbon data performance doubts

2017-07-19 Thread manishgupta88
Hi Swapnil, Please find my answers inline. 1. What is the use of *carbon.number.of.cores* property and how is it different from spark's executor cores? - carbon.number.of.cores is used for reading the footer and header of the carbondata file during query execution. Spark executor cores is a proper