Re: Fwd: test mail list if work fine:

2016-06-24 Thread Ravindra Pesala
Got it Thank you On Fri, 24 Jun 2016 16:33 Jean-Baptiste Onofré, wrote: > Got it ;) > > On 06/24/2016 01:03 PM, Liang Big data wrote: > > send it again! > > > > > > -- Forwarded message -- > > From: Liang Big data > > Date: 2016-06-24 16:22 GMT+05:30 > > Subject: test mail list

Re: Error at "create table" in spark-shell when following the quick start guide

2016-06-30 Thread Ravindra Pesala
Hi Yangwei, It seems the user does not have permission to create files inside the store path (/mnt/resource/opt/cloudera/parcels/CDH-5.6.1-1.cdh5.6.1.p0.3/lib/spark/carbondata/store) you provided. Please make sure the user has read/write permissions on the store path. Regards, Ravindra. On 30 June 2016

Re: Problem creating a table in carbondata

2016-06-30 Thread Ravindra Pesala
Hi, This seems to be a common error when configuring the MySQL Hive metastore. It may be an issue with the charset of the database. Please go through the links below to solve the issue. https://qnalist.com/questions/206026/help-regarding-mysql-setup-for-metastore http://www.programering.com/a/MTMygDNwATk.html Regards,

Re: please help me See a bug

2016-07-06 Thread Ravindra Pesala
Hi, If the csv file is present in HDFS, then please provide the absolute HDFS path, like hdfs://host:port/tmp/iteblog.csv Thanks, Ravindra. On Thu, 7 Jul 2016 08:23 杨卫, wrote: > I can successful create table iteblog3 ,but I load data > fail. > -rwxrwxrwx. 1 hdfs hdfs 161 Jul 4 22:52 /tmp/iteblog.csv >

Re: Dimension vs. Measure Column

2016-07-14 Thread Ravindra Pesala
Hi Ahmed, Dimensions are usually the fields that cannot be aggregated and that you want to track, like Country, Product etc. Whereas measures, as the name suggests, are those fields that can be measured, aggregated, or used for mathematical operations, like Quantity, Price etc. By default all
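The dimension/measure distinction above can be illustrated with a toy aggregation (plain Python, not CarbonData code): the dimension is what you group on, the measure is what you aggregate.

```python
from collections import defaultdict

# Toy rows: "country" is a dimension, "quantity" is a measure.
rows = [
    {"country": "US", "product": "A", "quantity": 3},
    {"country": "US", "product": "B", "quantity": 2},
    {"country": "IN", "product": "A", "quantity": 7},
]

# Group by the dimension, aggregate the measure.
totals = defaultdict(int)
for row in rows:
    totals[row["country"]] += row["quantity"]

print(dict(totals))  # {'US': 5, 'IN': 7}
```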

Re: [Help]carbondata cannot create table

2016-07-20 Thread Ravindra Pesala
Hi zhong, I guess you started the thrift server from Spark, so Carbon cannot work because Spark's default thrift server uses HiveContext, not CarbonContext. I will soon add a script to start the thrift server in carbon. Until then, you can use the following script to start the thrift server of carbon. bin/

Re: [Exception] Table is locked for updation

2016-07-26 Thread Ravindra Pesala
Hi, Are you getting this exception continuously for every load? Usually it occurs when you try to load data concurrently into the same table. So please make sure that no other instance of carbon is running and that no data load on the same table is happening. Check if any locks are created under sys

Re: load data error

2016-07-30 Thread Ravindra Pesala
Hi, The exception says the input path '/opt/incubator-carbondata/sample.csv' does not exist. So please check the following: 1. Whether the sample.csv file is present in the location '/opt/incubator-carbondata/' 2. Whether you are running Spark in local mode or cluster mode. (If it is running in cl

Re: load data error

2016-08-01 Thread Ravindra Pesala
l.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala) > > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > > at > > > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > > at > > > sun.reflect.DelegatingMethodAcc

Re: issues:load local scv file occur exception

2016-08-04 Thread Ravindra Pesala
Hi, Why are you setting carbon.kettle.home=/opt/data-integration? It is supposed to be /processing/carbonplugins, right? It seems `/opt/data-integration` has some other plugins as well; that is why it is throwing this exception. Please keep only carbonplugins as the kettle home. Thanks & Regards, Ravindra.

Re: Reg, Running CarbonData on Windows

2016-08-08 Thread Ravindra Pesala
Hi Ahmed, We did not give much priority to running carbondata on windows in the initial phase. It is doable, but it may require some changes to the code. We may support it in future. You are welcome to contribute if you want to add the support :) Thanks & Regards, Ravindra. On Tue, 9 Aug 2016 4:44 am Ahmed

Re: Towards 0.1.0-incubating release

2016-08-09 Thread Ravindra Pesala
Hi JB, Regarding point 2, currently the assembly module creates only a single jar file of size 18 MB (it includes carbondata and its dependencies, excluding the Spark and Hadoop dependencies). So how should we provide the binary distribution? Thanks, Ravi. On 9 August 2016 at 10:29, Jean-Baptiste Onofré

Fwd: load data fail

2016-08-12 Thread Ravindra Pesala
-- Forwarded message -- From: Ravindra Pesala Date: 12 August 2016 at 12:45 Subject: Re: load data fail To: dev Hi, Are you getting this exception continuously for every load? Usually it occurs when you try to load the data concurrently to the same table. So please make sure

Re: Open discussion and Vote: What kind of JIRA issue events need send mail to dev@carbondata.incubator.apache.org

2016-08-18 Thread Ravindra Pesala
+1 for option 2 On 18 August 2016 at 13:32, jarray wrote: > option 2 is better > > > > > Regards > jarray > On 08/18/2016 15:57, Liang Big data wrote: > Hi all > > Please discuss and vote, do you think what kind of JIRA issue events need > send mails to dev@carbondata.incubator.apache.org? > > O

Re: [VOTE] Apache CarbonData 0.1.0-incubating release

2016-08-19 Thread Ravindra Pesala
+1 Thanks, Ravi, On Sat, 20 Aug 2016 8:05 am jarray, wrote: > +1 > > > Regards > jarray > On 08/20/2016 02:57, Jean-Baptiste Onofré wrote: > Hi all, > > I submit the first CarbonData release to your vote. > > Release Notes: > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12

Re: Open discussion and Vote: What kind of JIRA issue events need send mail to dev@carbondata.incubator.apache.org

2016-08-19 Thread Ravindra Pesala
+1 for option 2 (Henry proposed solution). Thanks, Ravi. On Sat, 20 Aug 2016 7:54 am chenliang613, wrote: > Agree for modified Option2 which be proposed by Henry. > We will implement it as per the solution. > > Thanks for all of you participated in discussion and vote. > > Regards > Liang > > >

Re: [VOTE] Apache CarbonData 0.1.0-incubating release

2016-08-22 Thread Ravindra Pesala
Hi, Yes JB, we will avoid this binary file in the source. Actually we are using this binary only for running test cases. We will modify the test case to generate the binary file on the fly and delete it once done. Thanks, Ravi. On 22 August 2016 at 18:10, Jean-Baptiste Onofré wrote: > +1 (binding) >

Installation guide updated for yarn deployment

2016-08-26 Thread Ravindra Pesala
Cwiki is updated with yarn and cluster deployment. Please review it. https://cwiki.apache.org/confluence/display/CARBONDATA/Installation+and+Configuration -- Thanks & Regards, Ravi

Re: [Exception] a thrift related problem occured when trying 0.0.1 release version

2016-08-26 Thread Ravindra Pesala
Hi William, It may be because you are using an old carbon store. Please try using a new store path. There were changes in thrift, so an old store won't work on this release. Thanks & Regards, Ravi On 26 August 2016 at 21:05, Zen Wellon wrote: > Hi, guys > > Congratulations for the first stable version

Re: A warning when loading data

2016-08-29 Thread Ravindra Pesala
Hi, Are you getting this exception continuously for every load? Usually it occurs when you try to load data concurrently into the same table. So please make sure that no other instance of carbon is running and that no data load on the same table is happening. Check if any locks are created under sys

Re: A warning when loading data

2016-08-29 Thread Ravindra Pesala
ad data, and I'm sure no other carbon is running because I use my > personal dev spark-cluster, I've also tried to recreate a new table, but > it's still there.. > > 2016-08-29 18:11 GMT+08:00 Ravindra Pesala : > > > Hi, > > > > Are you getting this excep

Re: A warning when loading data

2016-08-29 Thread Ravindra Pesala
I don't think it's raised by lockfile, because I've tried to recreate a > new table with a totally different name. However, I'll check it tomorrow. > > 2016-08-29 23:09 GMT+08:00 Ravindra Pesala : > > > Hi, > > > > Did you check if any locks are cre

Re: Change for DELETE SEGMENT FROM TABLE syntax

2016-09-23 Thread Ravindra Pesala
Hi Jay, Here, if you use the carbon SQL parser first, then there won't be any issue; but if you use the hive parser first, then this issue arises. I think the same case exists even with the 'Load' command, so how is that handled here? Also, the delete commands you mentioned do not look proper. DELETE FROM

Re: [Discuss]Set block_size for table on table level

2016-09-25 Thread Ravindra Pesala
+1 At the same time, the max and min block sizes should be restricted and validated while creating the table. On 26 September 2016 at 07:36, Zhangshunyu wrote: > Purpose: > To configure block file size for each table on column level, so that each > table could has its own blocksize. > My solution: > Add a new

Re: [VOTE] Apache CarbonData 0.1.1-incubating release

2016-09-27 Thread Ravindra Pesala
+1 Thanks & Regards, Ravindra. On Tue, 27 Sep 2016 02:53 Jihong Ma, wrote: > +1 binding. > > Thanks. > > Jenny > > -Original Message- > From: Liang Big data [mailto:chenliang6...@gmail.com] > Sent: Monday, September 26, 2016 2:22 PM > To: dev@carbondata.incubator.apache.org > Subject: R

Re: intellij compiling issue

2016-09-28 Thread Ravindra Pesala
Hi , Please have a look into following jira issue to solve this problem. https://issues.apache.org/jira/browse/CARBONDATA-42 Regards, Ravindra. On 28 September 2016 at 04:33, Qingqing Zhou wrote: > Hi, > > I follow the build instruction and get the lastest carbon built > successfully: > > [INF

Re: intellij compiling issue

2016-09-28 Thread Ravindra Pesala
> > On 28 September 2016, at 5:15 PM, Ravindra Pesala wrote: > > > > Hi, > > > > Please have a look into the following jira issue to solve this problem. > > https://issues.apache.org/jira/browse/CARBONDATA-42 > > > > Regards, > > Ravindra. > > > > On 28

Re: [discussion]When table properties is repeated it only set the last one

2016-09-28 Thread Ravindra Pesala
+1 for option 1 On Thu, 29 Sep 2016 02:52 Jihong Ma, wrote: > Would prefer error out, vote for option 1. > > Jenny > > -Original Message- > From: Zhangshunyu [mailto:zhangshunyu1...@126.com] > Sent: Wednesday, September 28, 2016 12:11 AM > To: dev@carbondata.incubator.apache.org > Subjec

Re: [discussion]When table properties is repeated it only set the last one

2016-09-28 Thread Ravindra Pesala
guess we better stick to the hive behavior to avoid future problems. Regards, Ravindra. On 29 September 2016 at 08:04, Aniket Adnaik wrote: > +1 for option-1- should throw exception.. > Regards, > Aniket > > On 28 Sep 2016 7:01 p.m., "Ravindra Pesala" wrote: > > &

Discussion regarding design of data load after kettle removal.

2016-10-08 Thread Ravindra Pesala
Hi All, Removing kettle from carbondata is necessary, as this legacy kettle framework has become an overhead to carbondata. This discussion is regarding the design of carbon load without kettle. The main interface for data loading here is DataLoadProcessorStep. /** * This base interface for data lo

Re: Discussion regarding design of data load after kettle removal.

2016-10-09 Thread Ravindra Pesala
gt; -Regards > Kumar Vishal > > On Sat, Oct 8, 2016 at 3:30 PM, Ravindra Pesala > wrote: > > > Hi All, > > > > > > Removing kettle from carbondata is necessary as this legacy kettle > > framework become overhead to carbondata.This discussion is regarding the &g

Re: Discussion regarding design of data load after kettle removal.

2016-10-10 Thread Ravindra Pesala
Hi Jacky, https://drive.google.com/open?id=0B4TWTVbFSTnqeElyWko5NDlBZkdxS3NrMW1PZndzMG5ZM2Y0 1. Yes it calls child step to execute and apply its logic to return iterator just like spark sql. For CarbonOutputFormat it will use RecordBufferedWriterIterator and collects the data in batches. https

Discussion(New feature) regarding single pass data loading solution.

2016-10-11 Thread Ravindra Pesala
Hi All, This discussion is regarding the single pass data load solution. Currently data is loaded into carbon in 2 passes/jobs: 1. Generate the global dictionary using a spark job. 2. Encode the data with dictionary values and create carbondata files. This 2-pass solution has many disadvantages, like it nee
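The 2-pass flow described above can be sketched in a few lines (a hypothetical simplification in plain Python, not CarbonData's actual job code): pass 1 scans the data to build the global dictionary, pass 2 encodes every value with its surrogate key.

```python
data = ["china", "usa", "china", "india", "usa"]

# Pass 1: generate the global dictionary (sorted so surrogate keys are deterministic).
dictionary = {v: i for i, v in enumerate(sorted(set(data)))}

# Pass 2: encode the data with the dictionary values.
encoded = [dictionary[v] for v in data]

print(dictionary)  # {'china': 0, 'india': 1, 'usa': 2}
print(encoded)     # [0, 2, 0, 1, 2]
```

The single-pass proposal is about removing the first full scan; the dictionary would instead be built while encoding.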

Re: Discussion regarding design of data load after kettle removal.

2016-10-12 Thread Ravindra Pesala
Hi Jacky, 1. Yes. It is better to keep all sorting logic in one step so other types of sorts can be implemented easily. I will update the design. 2. EncoderProcessorStep can do dictionary encoding and convert no-dictionary and complex types to a byte[] representation. Here the encoding interface

Re: Discussion(New feature) regarding single pass data loading solution.

2016-10-13 Thread Ravindra Pesala
rporate online dictionary update, use a lock mechanism to sync up > > should serve the purpose. > > > > In another words, generating global dictionary is an optional step, only > > triggered when needed, not a default step as we do currently. > > > > Jihong > > >

Re: Subscribe mailing list

2016-10-14 Thread Ravindra Pesala
Hi, Please send mail to dev-subscr...@carbondata.incubator.apache.org to subscribe mailing list. Thanks, Ravi. On 14 October 2016 at 11:45, Anurag Srivastava wrote: > Hello , > > I want add my mail in your mailing list. > > -- > *Thanks®ards* > > > *Anurag Srivastava**Software Consultant* >

Re: Discussion(New feature) regarding single pass data loading solution.

2016-10-14 Thread Ravindra Pesala
thout dictionary? > >> > > >> > My thought is we can provide a tool to generate global dictionary > using > >> > sample data set, so the initial global dictionaries is available > before > >> > normal data loading. We shall be able

Re: Discussion(New feature) regarding single pass data loading solution.

2016-10-14 Thread Ravindra Pesala
bal dictionaries is available before > > normal data loading. We shall be able to perform encoding based on that, > > we only need to handle occasionally adding entries while loading. For > > columns specified with global dictionary encoding, but dictionary is not > > pl

Re: Discussion(New feature) regarding single pass data loading solution.

2016-10-15 Thread Ravindra Pesala
stributed map, and leveraging KV > store is overkill if simply just for dictionary generation. > > > > Regards. > > > > Jihong > > > > -Original Message- > > From: Ravindra Pesala [mailto:ravi.pes...@gmail.com] > > Sent: Friday, October 14,

Re: Discussion(New feature) Support Complex Data Type: Map in Carbon Data

2016-10-16 Thread Ravindra Pesala
Hi Vimal, The design doc looks clear; can you also add the file format storage design for the map datatype? Regards, Ravi. On 17 October 2016 at 07:43, Liang Chen wrote: > Hi Vimal > > Thank you started the discussion. > For keys of Map data only can be primitive, can you list these type which > will be s

Re: [Discussion] Support String Trim For Table Level or Col Level

2016-10-17 Thread Ravindra Pesala
Hi Lionx, Can you give more details on this feature? Are you talking about trim() function while querying? Or trim the data while loading to carbon? Regards, Ravi. On 17 October 2016 at 12:56, 向志强 wrote: > Hi all, > We are trying to support string trim feature in carbon. > The feature will be

Re: Discussion(New feature) regarding single pass data loading solution.

2016-10-18 Thread Ravindra Pesala
t; the map. I suggest you to investigate further to understand the implication > and effort. > > We all understand We couldn't afford any inconsistency on dictionary, that > means we couldn't decode the data back correctly. correctness is even more > critical compared to perf

Re: please vote and comment: remove thrift solution

2016-10-24 Thread Ravindra Pesala
Hi David, I guess keeping the generated code in the apache github may not be a good solution; I am not even sure whether it is acceptable to keep generated code in apache. I prefer to decouple the thrift code compilation from the main build and provide a separate profile to do thrift compilation and upload

Re: List the supported datatypes in carbondata

2016-10-28 Thread Ravindra Pesala
Hi, Following are the supported datatypes in carbon. string,int, integer,tinyint,short,long,bigint,numeric,double,decimal,timestamp,array,struct. Regards, Ravi On 28 October 2016 at 11:49, Swati wrote: > Hi, > > I would like to know about the datatypes which are supported by carbondata > as I

Re: Load Array and Struct data into carbon

2016-11-04 Thread Ravindra Pesala
Hi, It is more or less the same as how we load data into hive. Please have a look at ComplexTypeExample.scala in the examples package. It is self-explanatory. Regards, Ravi. On Fri, Nov 4, 2016, 12:31 PM Pallavi Singh wrote: > Hi, > > > Carbon Supports Array and Struct data, so can you please elaborate

Re: List of File Formats supported to Load Data

2016-11-04 Thread Ravindra Pesala
Hi, At present we support only the CSV format, but it is not limited to the comma (,) delimiter; we can use any delimiter here. There is also a provision to load data from any datasource using the data frame save functionality. Please have a look at the writeDataframe method inside the ExampleUtils class

Re: List the supported datatypes in carbondata

2016-11-04 Thread Ravindra Pesala
Yes, we need to have a document with the supported datatypes of carbon data. We listed this gap and are working towards it. The following are the datatypes we should support now. We had better follow the datatype syntax supported by hive, as we use their parser. (string, int/integer, smallint, bigint, floa

Re: RE: [VOTE] Apache CarbonData 0.2.0-incubating release

2016-11-09 Thread Ravindra Pesala
+1 On Thu, Nov 10, 2016, 7:07 AM Jay <2550062...@qq.com> wrote: > +1 > > > Regards > Jay > > > -- Original Message -- > From: "Jihong Ma";; > Sent: Thursday, 10 November 2016, 7:58 AM > To: "dev@carbondata.incubator.apache.org"< > dev@carbondata.incubator.apache.org>; "chenliang...@apache

Single Pass Data Load Design

2016-11-13 Thread Ravindra Pesala
Hi All, Please find the proposed solutions for single pass data load. https://docs.google.com/document/d/1_sSN9lccCZo4E_X3pNP5PchQACqif3AOXKTuG-YJAcc/edit?usp=sharing -- Thanks & Regards, Ravindra

Re: [Vote] Please provide valuable feedback's and vote for Like filter query performance optimization

2016-11-14 Thread Ravindra Pesala
+1 On Mon, Nov 14, 2016, 3:54 PM sujith chacko wrote: > Hi liang, > Yes, its for high cardinality columns. > Thanks, > Sujith > > On Nov 14, 2016 2:01 PM, "Liang Chen" wrote: > > > Hi > > > > I have one query : for no dictionary columns which are high cardinality > > like phone number, Whether

Re: Please vote and advise on building thrift files

2016-11-16 Thread Ravindra Pesala
+1 for proposal 1 On 17 November 2016 at 08:23, Xiaoqiao He wrote: > +1 for proposal 1. > > On Thu, Nov 17, 2016 at 10:31 AM, ZhuWilliam > wrote: > > > +1 for proposal 1 . > > > > Auto generated code should not be added to project. Also most the of time > > ,people who dive into carbondata may

Re: CarbonData propose major version number increment for next version (to 1.0.0)

2016-11-24 Thread Ravindra Pesala
+1 On Thu, Nov 24, 2016, 10:37 PM manish gupta wrote: > +1 > > Regards > Manish Gupta > > On Thu, Nov 24, 2016 at 7:30 PM, Kumar Vishal > wrote: > > > +1 > > > > -Regards > > Kumar Vishal > > > > On Thu, Nov 24, 2016 at 2:41 PM, Raghunandan S < > > carbondatacontributi...@gmail.com> wrote: > >

Re: Using DataFrame to write carbondata file cause no table found error

2016-11-25 Thread Ravindra Pesala
Hi, In Append mode, the carbon table is supposed to be created beforehand; otherwise the load fails as the table does not exist. In Overwrite mode the carbon table is created (dropped first if it already exists) and the data is loaded. But in your case, for overwrite mode, it creates the table but it says table not fo

[New Feature] Adding bucketed table feature to Carbondata

2016-11-27 Thread Ravindra Pesala
Hi All, The bucketing concept is based on hash partitioning the bucketed column as per the configured number of buckets. Records with the same bucketed column value always go to the same bucket. Physically, each bucket is a file (or files) in the table directory. Advantages: A bucketed table is a useful feature to do the map
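The hashing idea can be sketched as follows (illustrative Python; `bucket_for` and `NUM_BUCKETS` are made-up names, not CarbonData internals). Equal bucket-column values always hash to the same bucket, which is what lets two tables bucketed the same way be joined bucket-by-bucket.

```python
NUM_BUCKETS = 4

def bucket_for(value: str) -> int:
    # A stable hash so the same value always maps to the same bucket number.
    h = 0
    for ch in value:
        h = (h * 31 + ord(ch)) & 0x7FFFFFFF
    return h % NUM_BUCKETS

# Route records to their buckets; duplicates of "user_1" land together.
records = ["user_1", "user_2", "user_1", "user_3"]
buckets = {}
for r in records:
    buckets.setdefault(bucket_for(r), []).append(r)
```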

[improvement] Support unsafe in-memory sort in carbondata

2016-11-27 Thread Ravindra Pesala
Hi All, In the current carbondata system the loading performance is not so encouraging, since we need to sort the data at the executor level for data loading. Carbondata collects a batch of data and sorts it before dumping to temporary files, and finally it does a merge sort from those temporary files to finis
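The batch-sort-then-merge flow described above can be sketched with in-memory lists standing in for the temporary spill files (an assumed simplification, not CarbonData's sort code):

```python
import heapq

incoming = [5, 1, 4, 2, 8, 7, 3, 6]
BATCH_SIZE = 3

# Sort each collected batch before "spilling" it to a temporary run.
spilled = [sorted(incoming[i:i + BATCH_SIZE])
           for i in range(0, len(incoming), BATCH_SIZE)]

# Final merge sort across all spilled runs produces globally sorted data.
merged = list(heapq.merge(*spilled))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8]
```

The unsafe in-memory proposal is about doing the batch sort on off-heap memory to reduce GC pressure; the merge structure stays the same.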

Re: [New Feature] Adding bucketed table feature to Carbondata

2016-11-27 Thread Ravindra Pesala
atacontributi...@gmail.com> wrote: > How is this different from partitioning? > On Sun, 27 Nov 2016 at 11:21 PM, Ravindra Pesala > wrote: > > > Hi All, > > > > Bucketing concept is based on the hash partition the bucketed column as > per > > configured bucket

Re: Question about RLE support in CarbonData

2016-11-30 Thread Ravindra Pesala
Hi, Here some encodings are done at the field level and some at the blocklet (batch of column data) level. DICTIONARY encoding is done at the field level, and this FieldConverter only encodes data at the field level. RLE is applied at the blocklet level, so it is applied while writing the
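A minimal run-length encoder over a blocklet-sized batch illustrates the level at which RLE operates (a toy sketch, not CarbonData's writer code): it sees a whole column batch at once, which is why it cannot live in a per-field converter.

```python
def rle_encode(values):
    # Collapse consecutive equal values into (value, run_length) pairs.
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

blocklet = [7, 7, 7, 9, 9, 7]  # one column's values for a blocklet
print(rle_encode(blocklet))  # [[7, 3], [9, 2], [7, 1]]
```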

Re: Why INT type is stored like BIGINT?

2016-12-04 Thread Ravindra Pesala
Hi, Since we use delta compression for measure types in carbondata, it stores the data with the smallest datatype that fits the values in the blocklet. So it does not matter whether you declare INT or BIGINT in carbondata files; it always uses the smallest datatype to store. Regards, Ravi On 4 December 2016 at 13:28, S
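The "smallest datatype" idea can be sketched as picking the smallest integer width that covers the blocklet's value range after subtracting its minimum (a hedged sketch; the real format handles more cases):

```python
def smallest_width_bytes(values):
    # Delta compression: store value - min(values), so only the span matters.
    base = min(values)
    span = max(values) - base
    for width, limit in ((1, 1 << 8), (2, 1 << 16), (4, 1 << 32)):
        if span < limit:
            return width
    return 8

# A BIGINT column whose blocklet values are close together still
# needs only 1 byte per value after delta compression.
print(smallest_width_bytes([1_000_000, 1_000_005, 1_000_200]))  # 1
```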

Re: About measure in carbon

2016-12-04 Thread Ravindra Pesala
Hi, These constants are used for converting data to the respective datatype while loading into carbondata. They are not required for the short or int types, as we store them as bigint. Regards, Ravi On 4 December 2016 at 12:40, Sea <261810...@qq.com> wrote: > Hi, all: > I find the following codes in carbon C

Re: About hive integration

2016-12-04 Thread Ravindra Pesala
Hi, Yes, we have plans for integrating carbondata with the hive engine, but it is not high priority work now, so we will take up this task gradually. Any contributions towards it are welcome. Regards, Ravi On 4 December 2016 at 12:30, Sea <261810...@qq.com> wrote: > Hi, all: > Now carbondata

Re: select return error when filter string column in where clause

2016-12-05 Thread Ravindra Pesala
Hi, Please provide the table schema, load command and sample data to reproduce this issue; you may create a JIRA for it. Regards, Ravi On 6 December 2016 at 07:05, Lu Cao wrote: > Hi Dev team, > I have loaded some data into carbondata table. But when I put the id > column(String type) in where c

Re: [Discussion] Some confused properties

2016-12-08 Thread Ravindra Pesala
Hi, Carbon takes the store location from CarbonContext and sets it in CarbonProperties as carbon.storelocation, so it is not required to add the store location in the properties file. And carbon.ddl.base.hdfs.url is not a mandatory property; it is just used when the load path is provided with a prefix, then it appends

Re: [DISCUSSION] CarbonData loading solution discussion

2016-12-15 Thread Ravindra Pesala
+1 to have separate output formats, now user can have flexibility to choose as per scenario. On Fri, Dec 16, 2016, 2:47 AM Jihong Ma wrote: > > It is great idea to have separate OutputFormat for regular Carbon data > files, index files as well as meta data files, For instance: dictionary > file,

Re: Dictionary file is locked for updation

2016-12-27 Thread Ravindra Pesala
Hi, It seems the store path location is taking default location. Did you set the store location properly? Which spark version you are using? Regards, Ravindra On Tue, Dec 27, 2016, 1:38 PM 251469031 <251469...@qq.com> wrote: > Hi Kumar, > > > thx to your repley, the full logs is as follows: >

Re: Float Data Type Support in carbondata Querry

2016-12-27 Thread Ravindra Pesala
Hi, Carbon is supposed to return float data when you use the float data type. Please check whether you are converting the data to float or not in the ScannedResultCollector implementation classes. Regards, Ravindra On 27 December 2016 at 20:23, Rahul Kumar wrote: > Hello Ravindra, > > I am worki

Re: CatalystAnalysy

2016-12-27 Thread Ravindra Pesala
Have you used 'mvn clean'? On 28 December 2016 at 07:18, rahulforallp wrote: > hey QiangCai, > thank you for your reply . i have spark 1.6.2. and also tried with > -Dspark.version=1.6.2 . But result is same . Still i am getting same > exception. > > Is this exception possibe if i have different

Re: carbon shell is not working with spark 2.0 version

2017-01-03 Thread Ravindra Pesala
Yes, it is not working because the support is not yet added. Right now it is a low priority task, as users can directly use spark-shell to create a CarbonSession and execute queries. On 4 January 2017 at 12:40, anubhavtarar wrote: > carbon shell is not working with spark 2.0 version > here are the

Re: carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread Ravindra Pesala
Hi, I did not understand the issue, what is the error it throws? On 4 January 2017 at 10:03, Anubhav Tarar wrote: > here is the script ./bin/spark-submit --conf > spark.sql.hive.thriftServer.singleSession=true --class > org.apache.carbondata.spark.thriftserver.CarbonThriftServer > /opt/spark-2.

Re: why there is a table name option in carbon source format?

2017-01-03 Thread Ravindra Pesala
you can directly use the other sql create table command like in 1.6. CREATE TABLE IF NOT EXISTS t3 (ID Int, date Timestamp, country String, name String, phonetype String, serialname char(10), salary Int) STORED BY 'carbondata' On 4 January 2017 at 10:02, Anubhav Tarar wrote: > exactly my point

Re: Select query is not working.

2017-01-05 Thread Ravindra Pesala
Hi, It's an issue; we are working on the fix. On 5 January 2017 at 17:26, Anurag Srivastava wrote: > Hello, > > I have taken latest code at today (5/01/2017) and build code with spark > 1.6. After that I put the latest jar in carbonlib in spark and start thrift > server. > > When I have started

Re: TestCase failed

2017-01-10 Thread Ravindra Pesala
Hi, Please make sure the store path of "flightdb2" is given properly inside the CarbonInputMapperTest class. Please provide the complete stack trace of the error. On 10 January 2017 at 17:54, 彭 wrote: > Hi,all: > Recently, i meet a failed TestCase, Is there anyone know it? > http://apache-carbon

Re: Unable to Assign Jira to me

2017-01-13 Thread Ravindra Pesala
Please provide Jira user name and mail id. We will add you as a contributor so that you can assign issues to yourself. On Fri, Jan 13, 2017, 16:49 Anurag Srivastava wrote: > Hello Team, > > I am working on JIRA [CARBONDATA-542] and want to assign this JIRA to me. > But I am not able to assign it

Re: Re: Failed to APPEND_FILE, hadoop.hdfs.protocol.AlreadyBeingCreatedException

2017-01-20 Thread Ravindra Pesala
Hi, Please use "mvn clean -DskipTests -Pspark-1.5 -Dspark.version=1.5.2 -Phadoop-2.7.2 package" Regards, Ravindra On 20 January 2017 at 15:42, manish gupta wrote: > Can you try compiling with hadoop-2.7.2 version and use it and let us know > if the issue still persists. > > "mvn package -Dsk

Re: [VOTE] Apache CarbonData 1.0.0-incubating release (RC2)

2017-01-20 Thread Ravindra Pesala
+1 Done sanity for all major features, it is fine. Regards, Ravindra. On Sat, Jan 21, 2017, 07:51 Liang Chen wrote: > +1(binding) > > I checked: > - name contains incubating > - disclaimer exists > - signatures and hash correct > - NOTICE good > - LICENSE is good > - Source files have ASF heade

Re: store location can't be found

2017-02-03 Thread Ravindra Pesala
Hi Mars, Please try creating carbonsession with storepath as follow. val carbon = SparkSession.builder().config(sc.getConf). getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store ") Regards, Ravindra. On 4 February 2017 at 08:12, Mars Xu wrote: > Hello All, > I met a problem o

Re: Aggregate performace

2017-02-08 Thread Ravindra Pesala
Hi, The performance depends on the query plan. When you submit a query like [Select attributeA, count(*) from tableB group by attributeA], spark asks carbon to give only the attributeA column. So Carbon reads only the attributeA column from all files and sends the result to spark to aggreg

Re: query exception: Path is not a file when carbon 1.0.0

2017-02-08 Thread Ravindra Pesala
Hi, This exception is actually ignored in the class SegmentUpdateStatusManager at line number 696. This exception does not create any problem. Usually this exception won't be printed in any server logs, as we ignore it. Maybe in spark-shell it is printed; we will look into it. Regards, Ravindra.

Re: Discussion about getting excution duration about a query when using sparkshell+carbondata

2017-02-08 Thread Ravindra Pesala
Hi Libis, The spark-sql CLI is not supported by carbondata. Why don't you use the carbon thrift server and beeline? It is the same as the spark-sql CLI and it gives the execution time for each query. Start the carbondata thrift server script: bin/spark-submit --class org.apache.carbondata.spark.thriftserver.CarbonT

Re: data lost when loading data from csv file to carbon table

2017-02-14 Thread Ravindra Pesala
Hi, Please set carbon.badRecords.location in carbon.properties and check any bad records are added to that location. Regards, Ravindra. On 14 February 2017 at 15:24, Yinwei Li <251469...@qq.com> wrote: > Hi all, > > > I met an data lost problem when loading data from csv file to carbon > tab

Re: data lost when loading data from csv file to carbon table

2017-02-14 Thread Ravindra Pesala
gt; and the following are bad record logs: > > > INFO 15-02 09:43:24,393 - [Executor task launch > worker-0][partitionID:_1g_web_sales_d59af854-773c-429c-b7e6-031d602fe2be] > Total copy time (ms) to copy file /tmp/1039730591739247/0/_1g/ > web_sales/Fact/Part0/Segment_0/0/

Introducing V3 format.

2017-02-15 Thread Ravindra Pesala
Problems in the current format: 1. IO read is slower since it needs to do multiple seeks on the file to read column blocklets. The current blocklet size is 12, so it needs to read multiple times from the file to scan the data of that column. Alternatively we can increase the blocklet size, but it suf

Re: Introducing V3 format.

2017-02-15 Thread Ravindra Pesala
Please find the thrift file in below location. https://drive.google.com/open?id=0B4TWTVbFSTnqZEdDRHRncVItQ242b1NqSTU2b2g4dkhkVDRj On 15 February 2017 at 17:14, Ravindra Pesala wrote: > Problems in current format. > 1. IO read is slower since it needs to go for multiple seeks on the fil

Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
> but it occured an Exception: java.lang.RuntimeException: > carbon.kettle.home is not set > > > the configuration in my carbon.properties is: > carbon.kettle.home=/opt/spark-2.1.0/carbonlib/carbonplugins, but it seems > not work. > > > how can I solve this problem. >

Re: Introducing V3 format.

2017-02-15 Thread Ravindra Pesala
ber of false positive blocks will improve the > >>filter query performance. Separating uncompression of data from reader > >>layer will improve the overall query performance. > >> > >>-Regards > >>Kumar Vishal > >> > >>On Wed, Feb 15, 2017 a

Re: Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
Hi Yinwei, Thank you for pointing out the issue, I will check with TPC-DS data and verify the data load with new flow. Regards, Ravindra. On 16 February 2017 at 09:35, QiangCai wrote: > Maybe you can check PR594, it will fix a bug which will impact the result > of > loading. > > > > -- > View

Re: Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
Hi Yinwei, Can you provide create table scripts for both the tables store_returns and web_sales. Regards, Ravindra. On 16 February 2017 at 10:07, Ravindra Pesala wrote: > Hi Yinwei, > > Thank you for pointing out the issue, I will check with TPC-DS data and > verify the data l

Re: data lost when loading data from csv file to carbon table

2017-02-15 Thread Ravindra Pesala
'DICTIONARY_INCLUDE'='ws_sold_date_sk, ws_sold_time_sk, ws_ship_date_sk, > ws_item_sk, ws_bill_customer_sk, ws_bill_cdemo_sk, ws_bill_hdemo_sk, > ws_bill_addr_sk, ws_ship_customer_sk, ws_ship_cdemo_sk, ws_ship_hdemo_sk, > ws_ship_addr_sk, ws_web_page_sk, ws_web_site_sk, ws_ship_mode_sk

Re: 回复: data lost when loading data from csv file to carbon table

2017-02-16 Thread Ravindra Pesala
Hi QiangCai, PR594 fix does not solve the data lost issue, it fixes the data mismatch in some cases. Regards, Ravindra. On 16 February 2017 at 09:35, QiangCai wrote: > Maybe you can check PR594, it will fix a bug which will impact the result > of > loading. > > > > -- > View this message in co

Re: whether carbondata can be used in hive on spark?

2017-02-16 Thread Ravindra Pesala
Hi, We have so far integrated only with Spark, not yet with Hive, so CarbonData cannot be used in Hive on Spark at this moment. Regards, Ravindra. On 16 February 2017 at 14:35, wangzheng <18031...@qq.com> wrote: > we use cdh5.7, it removes the thrift server of Spark, so Spark SQL is not

Re: question about the order between original values and its encoded values

2017-02-16 Thread Ravindra Pesala
Hi, Yes, it works because we are sorting the column values before assigning dictionary values to them. So it can work only if you have loaded the data only once (i.e. there is no incremental load). If you do an incremental load and some more dictionary values are added to the store, then there is no gua
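The behaviour described above can be sketched with a hypothetical helper (plain Python, not CarbonData's actual dictionary generator): the first load assigns ids in sorted order of the values, but a later load can only append new values after the existing ids, so the id order no longer matches the value order.

```python
def build_dictionary(values, existing=None):
    """Assign dictionary ids: sorted order on first load, appended
    after the highest existing id on incremental loads."""
    d = dict(existing) if existing else {}
    new_vals = sorted(set(values) - d.keys())
    for v in new_vals:
        d[v] = len(d) + 1
    return d

# First load: ids follow the sort order of the values.
d1 = build_dictionary(["berlin", "amsterdam", "cairo"])
# d1 == {"amsterdam": 1, "berlin": 2, "cairo": 3} -> order-preserving

# Incremental load adds "athens" with the next free id, breaking order:
# "athens" sorts before "berlin" but gets a larger id.
d2 = build_dictionary(["athens"], existing=d1)
print(d2["athens"])  # 4
```

This is why comparisons on encoded ids are only safe against a store built by a single load.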

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-17 Thread Ravindra Pesala
Hi Xiaoqiao, Does the problem still exist? Can you try with a clean build using the "mvn clean -DskipTests -Pspark-1.6 package" command? Regards, Ravindra. On 16 February 2017 at 08:36, Xiaoqiao He wrote: > hi Liang Chen, > > Thanks for your help. It is true that I installed and configured carbondata on

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-20 Thread Ravindra Pesala
> at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:360)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala

Re: [ANNOUNCE] Hexiaoqiao as new Apache CarbonData committer

2017-02-20 Thread Ravindra Pesala
Congratulations Hexiaoqiao. Regards, Ravindra. On 21 February 2017 at 10:15, Xiaoqiao He wrote: > Hi PPMC, Liang, > > It is my honor that receive the invitation, and very happy to have chance > that participate to build CarbonData community also. I will keep > contributing to Apache CarbonData

Re: Exception throws when I load data using carbondata-1.0.0

2017-02-21 Thread Ravindra Pesala
Hi, Please create the carbon context as follows. val cc = new CarbonContext(sc, storeLocation) Here storeLocation is hdfs://hacluster/tmp/carbondata/carbon.store in your case. Regards, Ravindra On 21 February 2017 at 08:30, Ravindra Pesala wrote: > Hi, > > How did you create Carb

Re: carbondata performance test under benchmark tpc-ds

2017-02-21 Thread Ravindra Pesala
Hi, We are working on the TPC-H performance report now and have improved performance with the new format; we have already raised PRs (584 and 586) for the same. They are still under review and will be merged soon. Once these PRs are merged we will start verifying TPC-DS performance as well. Regards, Ravindra.

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-26 Thread Ravindra Pesala
Hi, I feel there are more disadvantages than advantages in this approach. In your current scenario you want to set dictionary only for columns which are used as filters, but the usage of dictionary is not limited to filters: it can reduce the store size and improve aggregation queries. I
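The aggregation benefit mentioned above comes from late decoding, which can be illustrated with a small sketch (plain Python, not CarbonData code): the group-by runs entirely on the compact dictionary ids, and only the handful of result keys are decoded at the end instead of every row.

```python
from collections import Counter

rows = ["china", "india", "china", "france", "india", "china"] * 1000

# Dictionary-encode the column once: each string stored as a small id.
dictionary = {v: i for i, v in enumerate(sorted(set(rows)))}
encoded = [dictionary[v] for v in rows]

# Late decoding: aggregate on the compact ids, then decode only the
# few distinct result keys at the very end.
counts_by_id = Counter(encoded)
reverse = {i: v for v, i in dictionary.items()}
result = {reverse[i]: c for i, c in counts_by_id.items()}
print(result)  # {'china': 3000, 'france': 1000, 'india': 2000}
```

The store-size benefit follows the same logic: 6000 small integers take far less space than 6000 repeated strings, which is why dropping dictionary encoding by default would hurt both size and aggregation speed.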

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-27 Thread Ravindra Pesala
> data size will increase. Late decoding is one of the main advantages; no > dictionary column aggregation will be slower. Filter queries will suffer as > in case of a dictionary column we compare on the byte-packed value, while in case > of no dictionary it is on the actual value. >

Re: Block B-tree loading failed

2017-02-28 Thread Ravindra Pesala
Hi, Have you loaded data freshly and tried to execute the query? Or are you trying to query an old store you have already loaded? Regards, Ravindra. On 28 February 2017 at 17:20, ericzgy <1987zhangguang...@163.com> wrote: > Now when I load data into CarbonData table using spark1.6.2 and > carbond

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Ravindra Pesala
Hi Likun, You mentioned that if the user does not specify dictionary columns then by default those are chosen as no-dictionary columns. But we have many disadvantages, as I mentioned in the above mail, if you keep no dictionary as the default. We initially introduced no-dictionary columns to handle high ca

Re: [DISCUSS] For the dimension default should be no dictionary

2017-02-28 Thread Ravindra Pesala
columns. I feel > preventing such misusage is important in order to encourage more users to > use carbondata. > > Any suggestion on solving this issue? > > > Regards, > Likun > > > > On 28 February 2017 at 10:20 PM, Ravindra Pesala wrote: > > Hi Likun, > >
