Re: Implement delete and update feature in carbondata SDK.

2020-06-22 Thread xubo245
+1. This is a necessary requirement for users. Suggestion: change CarbonSDKUID to a more common name. -- Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [Discussion]SDK support to load data from parquet, ORC, CSV, Avro and JSON file.

2020-06-22 Thread xubo245
+1. If the CarbonData SDK can load data from Parquet, ORC, CSV, Avro and JSON files, it will be more convenient for users to use CarbonData. It saves every user from having to parse the different file formats and convert them to the CarbonData format in their own code. The CarbonData SDK can refer to the spark-sql implementation, but

Re: [VOTE] Apache CarbonData 2.0.0(RC2) release

2020-05-02 Thread xubo245
-1! Why isn't PyCarbon listed among the key features and improvements? PyCarbon provides a Python interface for users to use CarbonData from Python code: https://issues.apache.org/jira/browse/CARBONDATA-3254 It includes: 1. PySDK: a Python interface to read and write CarbonData 2. Integrating deep learning

[DISCUSSION] PyCarbon: provide python interface for users to use CarbonData by python code

2019-11-23 Thread xubo245
the number of S3 API calls. But it's not easy for them to use Carbon from Java/Scala/C++, so it's better to provide a Python interface for these users to use CarbonData from Python code. We have already worked on this feature for several months in https://github.com/xubo245/pycarbon *Goals: 1. Apache CarbonData

Re: sql parser use antlr4?

2019-07-22 Thread xubo245
Not yet. -- Original message -- From: "melin li"; Sent: Monday, July 22, 2019, 12:12 AM; To: "dev"; Subject: sql parser use antlr4? sql parser use antlr4?

Re: Apache CarbonData 2 RoadMap Feedback

2019-07-18 Thread xubo245
There are some problems when users handle AI data. For example, it's very slow when users upload or download lots of images to or from S3: it takes about 10 hours to upload 10 million images (40GB) to S3 using 1 thread. AI developers also want to manage structured data and unstructured data for

[Discuss] CarbonData supports binary data type

2019-04-11 Thread xubo245
CarbonData supports binary data type

Version | Changes | Owner | Date
0.1 | Init doc for supporting binary data type | Xubo | 2019-4-10

Background: Binary is a basic data type and is widely used in various scenarios, so it's better to support the binary data type in CarbonData. Download data from

[Collection] Collect the requirement of hive + CarbonData

2019-03-07 Thread xubo245
Dear all, Hive is a popular data warehouse software in the big data domain. It would be better to enhance Hive + CarbonData, which will make it convenient for Hive users to read CarbonData. CarbonData supported Hive before, and the Hive test cases can run in CarbonData-1.5.2, but the hive module is very old and not

Re: [discussion] Open check code style of example module

2018-12-29 Thread xubo245
It's OK, we can use // scalastyle:off println

[DISCUSSION] Optimize the properties documentation or comments

2018-12-13 Thread xubo245
Optimize the properties documentation or comments: Some properties have no documentation or comments, which makes them hard for users to understand. We should add documentation or comments for these properties. Unify documentation: Some properties have no documentation or comments in code such as

Enable non dynamic configuration can be configured dynamically

2018-12-13 Thread xubo245
Enable non-dynamic configuration to be configured dynamically. Only 29 properties can be configured dynamically; many properties can't. We should analyze the related properties: which ones can be configured dynamically? And then support them. It will be more

Re: [carbondata-presto enhancements] support reading carbon SDK writer output in presto

2018-12-10 Thread xubo245
+1. It would be better if we could unify "carbon" and "carbondata": SparkCarbonFileFormat uses carbon and SparkCarbonTableFormat uses carbondata. The SDK should support both transactional and non-transactional tables. DataFrame should also support the different types of carbon data.

Re: [Discussion]Alter table column rename feature

2018-12-10 Thread xubo245
+1. Carbon already supports RENAME TABLE; it would be better if Carbon could also support changing a column's name and data type. Can we support syntax like this? ALTER TABLE table_name CHANGE [COLUMN] col_old_name col_new_name [column_type]; column_type is optional; the default is to keep the same data type as the old column

Re: [DISCUSSION] Complex Delimiter support as per Hive format

2018-12-10 Thread xubo245
Why are there two mails? http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/DISCUSSION-Complex-Delimiter-support-as-per-Hive-format-td69879.html

Re: [Discussion] Support transactional table in SDK

2018-12-10 Thread xubo245
+1 for supporting transactional tables in the SDK. The SDK should be able to read transactional tables written by carbonSession.

Re: SDK support LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE

2018-12-10 Thread xubo245
Do different data types affect performance? Have you tested with a long string column?

Re: [DISCUSSION] Complex Delimiter support as per Hive format

2018-12-10 Thread xubo245
'\001' and '\002' are invisible characters, so strings usually won't contain them. But sometimes a string will contain ¥, # and other visible characters.

Re: [DISCUSSION] Complex Delimiter support as per Hive format

2018-12-10 Thread xubo245
CSDK also uses '\001' and '\002' for Array; I think that is better and more common across different scenarios.
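
As an illustration of the delimiter choice discussed in these replies, here is a minimal sketch of splitting a nested value on '\001' (outer level) and '\002' (inner level). The class and method names are hypothetical and are not CarbonData or CSDK APIs; which character maps to which nesting level is also an assumption made only for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of CarbonData: shows why invisible
// delimiters are convenient when the data itself may contain '#', '$', etc.
final class ComplexDelimiterSketch {
    // Assumed mapping for this sketch: \001 separates outer elements,
    // \002 separates the inner elements of each outer element.
    static final String LEVEL_1 = "\001";
    static final String LEVEL_2 = "\002";

    // Splits a raw string into a two-level list using the delimiters above.
    static List<List<String>> parseNested(String raw) {
        List<List<String>> result = new ArrayList<>();
        for (String outer : raw.split(Pattern.quote(LEVEL_1), -1)) {
            // limit -1 keeps trailing empty elements instead of dropping them
            result.add(Arrays.asList(outer.split(Pattern.quote(LEVEL_2), -1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // The visible '#' inside "a#b" is ordinary data and is not split on.
        System.out.println(parseNested("a#b\002c\001d"));
    }
}
```

Because the delimiters are control characters, a visible character such as '#' in the data passes through untouched, which is the point made in the reply above.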

Re: [ANNOUNCE] Bo Xu as new Apache CarbonData committer

2018-12-07 Thread xubo245
Thanks all. I am very glad that the Apache CarbonData PMC invited me to be a committer. I will continue to work hard to contribute to the Apache CarbonData community. Thank you! Best wishes! Xubo

SDK support LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE

2018-12-06 Thread xubo245
When users use the SDK and want to use the local dictionary, they can't use LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE because the SDK only supports local_dictionary_threshold and local_dictionary_enable. So we should support LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE in the SDK, then use

Re: [ANNOUNCE] Apache CarbonData 1.5.1 release

2018-12-05 Thread xubo245
Nice

Re: [ANNOUNCE] Apache CarbonData 1.5.1 release

2018-12-05 Thread xubo245
Please update CarbonData-1.5.1 in http://carbondata.apache.org

Re: [VOTE] Apache CarbonData 1.5.1(RC2) release

2018-12-03 Thread xubo245
This bug can be fixed in the next version (1.5.2).

Re: [DISCUSSION] CarbonData to support spark 2.4 version in next carbon version

2018-12-03 Thread xubo245
Are there any limitations on supporting Spark-2.4?

[DISCUSSION] CarbonData to support spark 2.4 version in next carbon version

2018-12-03 Thread xubo245
Hi all, Spark released spark-2.4 more than one month ago. CarbonData should start to support spark-2.4. I want to develop this and have raised a JIRA for it: https://issues.apache.org/jira/browse/CARBONDATA-3144 Is it OK?

Re: [VOTE] Apache CarbonData 1.5.1(RC2) release

2018-12-02 Thread xubo245
@jackylk @ravipesala @KanakaKumar @kunal642 https://github.com/apache/carbondata/pull/2940 is a bug, please check it.

Re: Is HDFS is mandatory for Carbon Data while creating table?

2018-12-02 Thread xubo245
Not mandatory. CarbonData supports the local file system, HDFS and S3 (Huawei OBS). There are some examples, for instance: ./carbondata/examples/spark2/src/main/scala/org/apache/carbondata/examples/CarbonSessionExample.scala

Re: Carbondata support flink feature

2018-11-26 Thread xubo245
+1. Some users require this feature; we can implement it.

Re: Streaming Analytics Meetup [24th November 2018 @ Radisson BLU, Bangalore]

2018-11-24 Thread xubo245
Is this a repeat?

Re: Streaming Analytics Meetup [24th November 2018 @ Radisson BLU, Bangalore]

2018-11-24 Thread xubo245
The list of topics is blank.

Re: Streaming Analytics Meetup [24th November 2018 @ Radisson BLU, Bangalore]

2018-11-24 Thread xubo245
Is this a repeat?

Re: Streaming Analytics Meetup [24th November 2018 @ Radisson BLU, Bangalore]

2018-11-24 Thread xubo245
It's great. Is there a live link for this meetup? Can you share the docs/slides/video after the meetup?

Re: [Discussion] Support concurrent read for CSDK

2018-11-24 Thread xubo245
I moved readAllParallel to https://github.com/xubo245/carbondata/tree/CARBONDATA-3094_cocurrentReadBackupreadAllParallel, so it is not included in this PR. After the discussion, I will raise a new PR for it.

Re: [Discussion] How to improve C++ SDK performance?

2018-11-23 Thread xubo245
Hi, does anyone have a good suggestion for it? I want to improve its performance.

[Enhancement] Read schema support S3 in SDK/CSDK

2018-11-23 Thread xubo245
Hi all, SDK/CSDK don't support reading a schema from S3, which limits how users can use SDK/CSDK. For example, some users save data in S3 and want to read the schema from that data with SDK/CSDK, but it throws an exception. So we should support reading the schema from S3. Thank you.

Re: [Discussion] Refactor dynamic configuration

2018-11-23 Thread xubo245
I raised a PR for it and wrote a demo: https://github.com/apache/carbondata/pull/2914 Please check it. If the demo is OK, I will change the other properties in this PR.

Re: [Discussion] Add which test framework for C++ SDK?

2018-11-23 Thread xubo245
Hi, does anyone have a good suggestion for it? If not, we will start integrating googletest for CSDK.

Re: [Proposal] Thoughts on general guidelines to follow in Apache CarbonData community

2018-11-23 Thread xubo245
+1 for 1, 2, 3, 4, 5, 6, 8, 9 and 10. For 7, can we run some tests automatically, including performance tests for common functions? It's great for the CarbonData community. Can we arrange for someone, or a manager, to manage all JIRAs and PRs and urge reviewers to review quickly? Things will become slower after adding these

Re: [Help] Collect minor bugs or small requirements/features

2018-11-23 Thread xubo245
Maybe some developers have unfinished low-priority bugs/issues; it would be nice if you shared them with new users.

Re: [Help] Collect CarbonData Related Materials

2018-11-23 Thread xubo245
In China, it's not convenient to access the internet outside the country. So I downloaded some videos from YouTube and uploaded them to Tencent Video in China, where Chinese users can access them. If there are materials in other countries, like India, please tell me or send them to me. Thank you very much.

[Help] Collect CarbonData Related Materials

2018-11-23 Thread xubo245
rent language), slides and others. I have collected some CarbonData learning materials in https://github.com/xubo245/CarbonDataLearning/blob/master/docs/learningMaterials/CarbonData%20Learning%20Materials.md. If you find other related materials, please tell me and share them in the comments. A

[Help] Collect minor bugs or small requirements/features

2018-11-23 Thread xubo245
Hi all, recently we had an Apache CarbonData & Spark meetup in Shenzhen. Many new users want to learn CarbonData or Spark, and we can guide new users to contribute code to CarbonData. So we can collect minor bugs or small requirements/features for them, and then they can learn and fix

Re: [proposal] Parallelize block pruning of default datamap in driver for filter query processing.

2018-11-23 Thread xubo245
+1. Will parallelizing block pruning affect the SDK/CSDK reader? Please check. SDK and CSDK need to keep the sequence/order of the carbon files.

Re: [VOTE] Apache CarbonData 1.5.1(RC1) release

2018-11-22 Thread xubo245
-1. The parent version of the mv module is incorrect; Jonathan.Wei will raise a PR to fix it this week.

Re: [Feature Proposal] Proposal for offline and DDL local dictionary support

2018-11-11 Thread xubo245
The SDK already supports the local dictionary: org.apache.carbondata.sdk.file.CarbonWriterBuilder#localDictionaryThreshold and org.apache.carbondata.sdk.file.CarbonWriterBuilder#enableLocalDictionary. But it doesn't support LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE. I think we should support them. There

Re: [Discuss] Removing search mode

2018-11-11 Thread xubo245
+1. There have been some random errors in CI recently, and search mode gives only a small performance improvement over non-search mode, including for filter queries.

Re: [Discussion] Refactor dynamic configuration

2018-11-01 Thread xubo245
OK, this is better:

    public static final Property CARBON_BAD_RECORDS_ACTION = Property.buildStringProperty()
        .name("carbon.bad.records.action")
        .default("FAIL")
        .doc("keep the same description as .md file")
        .dynamic(true)
        .build()

I will raise
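
The builder-style definition proposed in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions: the Property class, its Builder, and all method names here are hypothetical and are not taken from the CarbonData code base. One deliberate difference from the snippet in the mail: `default` is a reserved word in Java and cannot be used as a method name, so the sketch uses `defaultValue` instead.

```java
import java.util.Objects;

// Hypothetical sketch of a typed, self-describing property definition.
// Not CarbonData's actual implementation.
final class Property<T> {
    private final String name;
    private final T defaultValue;
    private final String doc;
    private final boolean dynamic;

    private Property(Builder<T> b) {
        this.name = Objects.requireNonNull(b.name, "name is required");
        this.defaultValue = b.defaultValue;
        this.doc = b.doc;
        this.dynamic = b.dynamic;
    }

    // Entry point mirroring the name used in the mail.
    static Builder<String> buildStringProperty() {
        return new Builder<>();
    }

    String name() { return name; }
    T defaultValue() { return defaultValue; }
    String doc() { return doc; }
    boolean isDynamic() { return dynamic; }

    static final class Builder<T> {
        private String name;
        private T defaultValue;
        private String doc = "";
        private boolean dynamic;  // false unless explicitly enabled

        Builder<T> name(String name) { this.name = name; return this; }
        // "default" is reserved in Java, hence "defaultValue".
        Builder<T> defaultValue(T value) { this.defaultValue = value; return this; }
        Builder<T> doc(String doc) { this.doc = doc; return this; }
        Builder<T> dynamic(boolean dynamic) { this.dynamic = dynamic; return this; }
        Property<T> build() { return new Property<>(this); }
    }
}

final class PropertyDemo {
    static final Property<String> CARBON_BAD_RECORDS_ACTION = Property.buildStringProperty()
        .name("carbon.bad.records.action")
        .defaultValue("FAIL")
        .doc("keep the same description as the .md file")
        .dynamic(true)
        .build();

    public static void main(String[] args) {
        System.out.println(CARBON_BAD_RECORDS_ACTION.name()
            + " dynamic=" + CARBON_BAD_RECORDS_ACTION.isDynamic());
    }
}
```

Keeping the name, default, documentation and dynamic flag in one definition means the documentation and the "can this be set dynamically?" answer cannot drift apart from the constant itself, which appears to be the motivation of the thread.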

[Discussion] Do we need implement one reader read with different filter many times?

2018-11-01 Thread xubo245
Some users want to build one reader and then read with different filters many times. But CarbonSDK only supports adding a filter before build and then reading with that filter; users can't change the filter after build. Do we need to implement one reader that can read with different filters many times?
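
To make the question concrete, here is a minimal sketch of the "build once, read many times with different filters" shape being asked about. It is not the CarbonSDK reader API: ReusableReaderSketch and its read method are hypothetical, and an in-memory list stands in for carbon files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only: the filter is an argument of read(),
// not something fixed when the reader is built.
final class ReusableReaderSketch<R> {
    private final List<R> rows;

    ReusableReaderSketch(List<R> rows) {
        this.rows = new ArrayList<>(rows);  // defensive copy of the source rows
    }

    // Each call applies its own filter; the reader itself is unchanged.
    List<R> read(Predicate<? super R> filter) {
        List<R> matched = new ArrayList<>();
        for (R row : rows) {
            if (filter.test(row)) {
                matched.add(row);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        ReusableReaderSketch<Integer> reader =
            new ReusableReaderSketch<>(Arrays.asList(1, 2, 3, 4, 5, 6));
        System.out.println(reader.read(x -> x % 2 == 0));
        System.out.println(reader.read(x -> x > 4));
    }
}
```

In a real reader the cost saved would be the setup work done at build time (for example opening files and loading metadata), which this in-memory sketch does not model.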

[Discussion] Add which test framework for C++ SDK?

2018-11-01 Thread xubo245
For the C++ SDK of CarbonData, we need to add a test framework to manage test cases, including unit tests. So which test framework should we add for the C++ SDK? We should discuss it here. I researched this before and found that googletest is a popular test framework; we can try it. Are there any other, better test frameworks?

Re: [Discussion] Refactor dynamic configuration

2018-11-01 Thread xubo245
The annotation mainly provides a literal explanation of whether the parameter is dynamically configurable. What's more, an exception will be thrown if @CarbonProperty is added and the parameter can't be dynamically configured. If @CarbonProperty is not added to a parameter, no exception is thrown and the setting also won't take effect.
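
The behaviour described in this reply can be sketched with a runtime annotation and reflection. Everything below is an assumption made for illustration: @SketchProperty, SketchConstants and SessionSketch are hypothetical names, the key "sketch.static.key" is made up, and the reading of the three cases (apply, throw, silently ignore) is one interpretation of the mail, not CarbonData's actual code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an annotation like @CarbonProperty.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SketchProperty {
    boolean dynamicConfigurable() default false;
}

final class SketchConstants {
    @SketchProperty(dynamicConfigurable = true)
    static final String BAD_RECORDS_ACTION = "carbon.bad.records.action";

    @SketchProperty  // annotated, but not dynamically configurable
    static final String STATIC_ONLY = "sketch.static.key";

    static final String UNANNOTATED = "sketch.unannotated.key";
}

final class SessionSketch {
    private final Map<String, String> values = new HashMap<>();

    // Returns true if the value was applied, false if it was silently ignored.
    boolean set(String key, String value) {
        for (Field field : SketchConstants.class.getDeclaredFields()) {
            if (!Modifier.isStatic(field.getModifiers()) || field.getType() != String.class) {
                continue;
            }
            Object declaredKey;
            try {
                declaredKey = field.get(null);  // static field, no instance needed
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (!key.equals(declaredKey)) {
                continue;
            }
            SketchProperty meta = field.getAnnotation(SketchProperty.class);
            if (meta == null) {
                return false;  // not annotated: no exception, no effect
            }
            if (!meta.dynamicConfigurable()) {
                throw new IllegalArgumentException(key + " cannot be configured dynamically");
            }
            values.put(key, value);
            return true;
        }
        return false;  // unknown key: treated like an unannotated one in this sketch
    }

    String get(String key) {
        return values.get(key);
    }
}
```

The silent-ignore case is the one the thread seems to find problematic, since a user gets no signal that the setting had no effect.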

[Discussion] Refactor dynamic configuration

2018-10-30 Thread xubo245
*Background:* In CarbonData, there are many configurations in org.apache.carbondata.core.constants.CarbonCommonConstants, org.apache.carbondata.core.constants.CarbonV3DataFormatConstants, org.apache.carbondata.core.constants.CarbonLoadOptionConstants, and so on. Which one can be dynamic

Re: [Discussion] CarbonReader performance improvement

2018-10-30 Thread xubo245
test name -- Original -- From: "xubo245";<601450...@qq.com>; Sent: Tuesday, Oct 30, 2018 8:15 PM To: "dev"; Subject: Re: [Discussion] CarbonReader performance improvement 1. There are some users who want to use a filter and have