Got it
Thank you
On Fri, 24 Jun 2016 16:33 Jean-Baptiste Onofré, wrote:
> Got it ;)
>
> On 06/24/2016 01:03 PM, Liang Big data wrote:
> > send it again!
> >
> >
> > -- Forwarded message --
> > From: Liang Big data
> > Date: 2016-06-24 16:22 GMT+05:30
> > Subject: test mail list
Hi Yangwei,
It seems the user does not have permission to create files inside the store
path (/mnt/resource/opt/cloudera/parcels/CDH-5.6.1-1.cdh5.6.1.
p0.3/lib/spark/carbondata/store) you provided. Please make sure the user
has read/write permissions to the store path.
Regards,
Ravindra.
On 30 June 2016
Hi,
It seems to be a common error while configuring the MySQL Hive metastore. It may
be an issue with the charset of the database. Please go through the below links
to solve the issue.
https://qnalist.com/questions/206026/help-regarding-mysql-setup-for-metastore
http://www.programering.com/a/MTMygDNwATk.html
Regards,
Hi,
If the csv file is present in HDFS then please provide an absolute HDFS path like
hdfs://host:port/tmp/iteblog.csv
Thanks,
Ravindra.
On Thu, 7 Jul 2016 08:23 杨卫, wrote:
> I can successfully create table iteblog3, but loading data
> fails.
> -rwxrwxrwx. 1 hdfs hdfs 161 Jul 4 22:52 /tmp/iteblog.csv
>
Hi Ahmed,
Dimensions are usually the fields that cannot be aggregated and that you want
to track, like Country, Product etc. Whereas measures, as the name
suggests, are those fields that can be measured, aggregated, or used for
mathematical operations, like Quantity, Price etc.
By default all
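The distinction above can be sketched in a few lines of plain Scala (a purely illustrative example with a hypothetical Sale record, not CarbonData code): group rows by a dimension, aggregate a measure.

```scala
// Illustrative only: "country"/"product" act as dimensions (grouped, not
// aggregated), while "quantity"/"price" act as measures (aggregated).
case class Sale(country: String, product: String, quantity: Int, price: Double)

val sales = Seq(
  Sale("US", "Phone", 2, 300.0),
  Sale("US", "Laptop", 1, 900.0),
  Sale("CN", "Phone", 3, 280.0)
)

// Group by the dimension, sum the measure.
val totalQtyByCountry: Map[String, Int] =
  sales.groupBy(_.country).map { case (c, rows) => c -> rows.map(_.quantity).sum }
// totals per country, e.g. US -> 3
```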
Hi zhong,
I guess you started the thrift server from Spark, so Carbon cannot work
because Spark's default thrift server uses HiveContext, not CarbonContext. I will
soon add a script to start the thrift server in carbon.
In the meantime you can use the following script to start the thrift server of
carbon.
bin/
Hi,
Are you getting this exception continuously for every load? Usually it
occurs when you try to load data concurrently into the same table. So
please make sure that no other instance of carbon is running and that no
data load on the same table is in progress.
Check if any locks are created under sys
Hi,
The exception says the input path '/opt/incubator-carbondata/sample.csv' does not
exist. So please make sure of the following things:
1. Whether the sample.csv file is present in the location
'/opt/incubator-carbondata/'
2. Are you running Spark in local mode or cluster mode? (If it is
running in cl
l.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> >
> sun.reflect.DelegatingMethodAcc
Hi,
Why are you setting carbon.kettle.home=/opt/data-integration? It is supposed
to be /processing/carbonplugins, right?
It seems `/opt/data-integration` has some other plugins as well, which is why it
is throwing this exception. Please keep only carbonplugins as the kettle home.
Thanks & Regards,
Ravindra.
Hi Ahmed,
We did not give much preference to running carbondata on Windows in the initial
phase. It is doable but it may require some changes to the code. We may support
it in the future. You are welcome to contribute if you want to add the support :)
Thanks & Regards,
Ravindra.
On Tue, 9 Aug 2016 4:44 am Ahmed
Hi JB,
Regarding point 2, currently the assembly module creates only a single jar file
of size 18 MB (it includes carbondata and its dependencies, excluding Spark
and Hadoop dependencies). So how should we provide the binary distribution?
Thanks,
Ravi.
On 9 August 2016 at 10:29, Jean-Baptiste Onofré
-- Forwarded message --
From: Ravindra Pesala
Date: 12 August 2016 at 12:45
Subject: Re: load data fail
To: dev
Hi,
Are you getting this exception continuously for every load? Usually it
occurs when you try to load data concurrently into the same table. So
please make sure
+1 for option 2
On 18 August 2016 at 13:32, jarray wrote:
> option 2 is better
>
>
>
>
> Regards
> jarray
> On 08/18/2016 15:57, Liang Big data wrote:
> Hi all
>
> Please discuss and vote: what kind of JIRA issue events need to
> send mails to dev@carbondata.incubator.apache.org?
>
> O
+1
Thanks,
Ravi,
On Sat, 20 Aug 2016 8:05 am jarray, wrote:
> +1
>
>
> Regards
> jarray
> On 08/20/2016 02:57, Jean-Baptiste Onofré wrote:
> Hi all,
>
> I submit the first CarbonData release to your vote.
>
> Release Notes:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12
+1 for option 2 (Henry proposed solution).
Thanks,
Ravi.
On Sat, 20 Aug 2016 7:54 am chenliang613, wrote:
> Agree on the modified Option 2 which was proposed by Henry.
> We will implement it as per the solution.
>
> Thanks for all of you participated in discussion and vote.
>
> Regards
> Liang
>
>
>
Hi,
Yes JB. We will avoid this binary file in the source. Actually we are using
this binary only for running testcases. We will modify this testcase to
generate the binary file on the fly and delete it once done.
Thanks,
Ravi.
On 22 August 2016 at 18:10, Jean-Baptiste Onofré wrote:
> +1 (binding)
>
Cwiki is updated with yarn and cluster deployment. Please review it.
https://cwiki.apache.org/confluence/display/CARBONDATA/Installation+and+Configuration
--
Thanks & Regards,
Ravi
Hi William,
It may be because you are using an old carbon store. Please try using a new
store path. There were changes in thrift, so the old store won't work on this
release.
Thanks & Regards,
Ravi
On 26 August 2016 at 21:05, Zen Wellon wrote:
> Hi, guys
>
> Congratulations for the first stable version
Hi,
Are you getting this exception continuously for every load? Usually it
occurs when you try to load data concurrently into the same table. So
please make sure that no other instance of carbon is running and that no
data load on the same table is in progress.
Check if any locks are created under sys
ad data, and I'm sure no other carbon is running because I use my
> personal dev spark-cluster, I've also tried to recreate a new table, but
> it's still there..
>
> 2016-08-29 18:11 GMT+08:00 Ravindra Pesala :
>
> > Hi,
> >
> > Are you getting this excep
I don't think it's raised by lockfile, because I've tried to recreate a
> new table with a totally different name. However, I'll check it tomorrow.
>
> 2016-08-29 23:09 GMT+08:00 Ravindra Pesala :
>
> > Hi,
> >
> > Did you check if any locks are cre
Hi Jay,
Here if you use carbonsqlparser first then there won't be any issue. But if
you use the hive parser first then this issue arises. But I think there is the
same case even with the 'Load' command; how is that handled here?
Also, the delete commands you mentioned do not look proper.
DELETE FROM
+1
At the same time, the max and min block sizes should be restricted and validated
while creating the table.
On 26 September 2016 at 07:36, Zhangshunyu wrote:
> Purpose:
> To configure the block file size for each table at column level, so that each
> table could have its own block size.
> My solution:
> Add a new
+1
Thanks & Regards,
Ravindra.
On Tue, 27 Sep 2016 02:53 Jihong Ma, wrote:
> +1 binding.
>
> Thanks.
>
> Jenny
>
> -Original Message-
> From: Liang Big data [mailto:chenliang6...@gmail.com]
> Sent: Monday, September 26, 2016 2:22 PM
> To: dev@carbondata.incubator.apache.org
> Subject: R
Hi ,
Please have a look into following jira issue to solve this problem.
https://issues.apache.org/jira/browse/CARBONDATA-42
Regards,
Ravindra.
On 28 September 2016 at 04:33, Qingqing Zhou
wrote:
> Hi,
>
> I followed the build instructions and got the latest carbon built
> successfully:
>
> [INF
>
> > On 28 September 2016, at 5:15 PM, Ravindra Pesala wrote:
> >
> > Hi ,
> >
> > Please have a look into following jira issue to solve this problem.
> > https://issues.apache.org/jira/browse/CARBONDATA-42
> >
> > Regards,
> > Ravindra.
> >
> > On 28
+1 for option 1
On Thu, 29 Sep 2016 02:52 Jihong Ma, wrote:
> Would prefer error out, vote for option 1.
>
> Jenny
>
> -Original Message-
> From: Zhangshunyu [mailto:zhangshunyu1...@126.com]
> Sent: Wednesday, September 28, 2016 12:11 AM
> To: dev@carbondata.incubator.apache.org
> Subjec
guess we better stick to the hive behavior to avoid future problems.
Regards,
Ravindra.
On 29 September 2016 at 08:04, Aniket Adnaik
wrote:
> +1 for option-1- should throw exception..
> Regards,
> Aniket
>
> On 28 Sep 2016 7:01 p.m., "Ravindra Pesala" wrote:
>
> &
Hi All,
Removing kettle from carbondata is necessary as this legacy kettle
framework has become overhead to carbondata. This discussion is regarding the
design of carbon load without kettle.
The main interface for data loading here is DataLoadProcessorStep.
*/***
* * This base interface for data lo
> -Regards
> Kumar Vishal
>
> On Sat, Oct 8, 2016 at 3:30 PM, Ravindra Pesala
> wrote:
>
> > Hi All,
> >
> >
> > Removing kettle from carbondata is necessary as this legacy kettle
> > framework become overhead to carbondata.This discussion is regarding the
&g
Hi Jacky,
https://drive.google.com/open?id=0B4TWTVbFSTnqeElyWko5NDlBZkdxS3NrMW1PZndzMG5ZM2Y0
1. Yes, it calls the child step to execute and apply its logic to return an
iterator, just like spark sql. For CarbonOutputFormat it will use
RecordBufferedWriterIterator and collect the data in batches.
https
Hi All,
This discussion is regarding the single pass data load solution.
Currently data is loaded into carbon in 2 passes/jobs:
1. Generating the global dictionary using a spark job.
2. Encoding the data with dictionary values and creating carbondata files.
This 2 pass solution has many disadvantages like it nee
Hi Jacky,
1. Yes. It is better to keep all sorting logic in one step so other types
of sorts can be implemented easily. I will update the design.
2. EncoderProcessorStep can do dictionary encoding and convert
no-dictionary and complex types to byte[] representation.
Here encoding interface
rporate online dictionary update, use a lock mechanism to sync up
> > should serve the purpose.
> >
> > In another words, generating global dictionary is an optional step, only
> > triggered when needed, not a default step as we do currently.
> >
> > Jihong
> >
>
Hi,
Please send mail to dev-subscr...@carbondata.incubator.apache.org to
subscribe mailing list.
Thanks,
Ravi.
On 14 October 2016 at 11:45, Anurag Srivastava wrote:
> Hello ,
>
> I want add my mail in your mailing list.
>
> --
> *Thanks & Regards*
>
>
> *Anurag Srivastava**Software Consultant*
>
thout dictionary?
> >> >
> >> > My thought is we can provide a tool to generate global dictionary
> using
> >> > sample data set, so the initial global dictionaries is available
> before
> >> > normal data loading. We shall be able
bal dictionaries is available before
> > normal data loading. We shall be able to perform encoding based on that,
> > we only need to handle occasionally adding entries while loading. For
> > columns specified with global dictionary encoding, but dictionary is not
> > pl
stributed map, and leveraging KV
> store is overkill if simply just for dictionary generation.
> >
> > Regards.
> >
> > Jihong
> >
> > -Original Message-
> > From: Ravindra Pesala [mailto:ravi.pes...@gmail.com]
> > Sent: Friday, October 14,
Hi Vimal,
Design doc looks clear, can you also add file format storage design for map
datatype.
Regards,
Ravi.
On 17 October 2016 at 07:43, Liang Chen wrote:
> Hi Vimal
>
> Thank you started the discussion.
> For keys of Map data only can be primitive, can you list these type which
> will be s
Hi Lionx,
Can you give more details on this feature?
Are you talking about a trim() function while querying? Or trimming the data
while loading into carbon?
Regards,
Ravi.
On 17 October 2016 at 12:56, 向志强 wrote:
> Hi all,
> We are trying to support string trim feature in carbon.
> The feature will be
t; the map. I suggest you to investigate further to understand the implication
> and effort.
>
> We all understand We couldn't afford any inconsistency on dictionary, that
> means we couldn't decode the data back correctly. correctness is even more
> critical compared to perf
Hi David,
I guess keeping the generated code in the apache github may not be a good
solution; I am not even sure whether it is acceptable to keep generated
code in apache.
I prefer to decouple the thrift code compilation from the main build and
provide a separate profile to do thrift compilation and upload
Hi,
The following are the supported datatypes in carbon:
string, int, integer, tinyint, short, long, bigint, numeric, double,
decimal, timestamp, array, struct.
Regards,
Ravi
On 28 October 2016 at 11:49, Swati wrote:
> Hi,
>
> I would like to know about the datatypes which are supported by carbondata
> as I
Hi,
It is more or less the same as how we load data into hive. Please have a look at
ComplexTypeExample.scala in the examples package. It is self-explanatory.
Regards,
Ravi.
On Fri, Nov 4, 2016, 12:31 PM Pallavi Singh
wrote:
> Hi,
>
>
> Carbon Supports Array and Struct data, so can you please elaborate
Hi,
At present we support only the CSV format, but it is not limited to only the
comma (,) delimiter. We can use any delimiter here.
There is also a provision to load data from any datasource using the data
frame save functionality. Please have a look at the writeDataframe method
inside the ExampleUtils class
Yes, we need to have a document with the supported datatypes of carbon
data. We have listed this gap and are working towards it.
The following are the datatypes we should support now. We had better follow the
datatype syntax supported by hive as we use their parser.
(string, int/integer, smallint, bigint, floa
+1
On Thu, Nov 10, 2016, 7:07 AM Jay <2550062...@qq.com> wrote:
> +1
>
>
> Regards
> Jay
>
>
> -- Original Message --
> From: "Jihong Ma";;
> Sent: Thursday, November 10, 2016, 7:58 AM
> To: "dev@carbondata.incubator.apache.org"<
> dev@carbondata.incubator.apache.org>; "chenliang...@apache
Hi All,
Please find the proposed solutions for single pass data load.
https://docs.google.com/document/d/1_sSN9lccCZo4E_X3pNP5PchQACqif3AOXKTuG-YJAcc/edit?usp=sharing
--
Thanks & Regards,
Ravindra
+1
On Mon, Nov 14, 2016, 3:54 PM sujith chacko
wrote:
> Hi liang,
> Yes, its for high cardinality columns.
> Thanks,
> Sujith
>
> On Nov 14, 2016 2:01 PM, "Liang Chen" wrote:
>
> > Hi
> >
> > I have one query : for no dictionary columns which are high cardinality
> > like phone number, Whether
+1 for proposal 1
On 17 November 2016 at 08:23, Xiaoqiao He wrote:
> +1 for proposal 1.
>
> On Thu, Nov 17, 2016 at 10:31 AM, ZhuWilliam
> wrote:
>
> > +1 for proposal 1 .
> >
> > Auto generated code should not be added to project. Also most the of time
> > ,people who dive into carbondata may
+1
On Thu, Nov 24, 2016, 10:37 PM manish gupta
wrote:
> +1
>
> Regards
> Manish Gupta
>
> On Thu, Nov 24, 2016 at 7:30 PM, Kumar Vishal
> wrote:
>
> > +1
> >
> > -Regards
> > Kumar Vishal
> >
> > On Thu, Nov 24, 2016 at 2:41 PM, Raghunandan S <
> > carbondatacontributi...@gmail.com> wrote:
> >
Hi,
In Append mode, the carbon table is supposed to be created beforehand, otherwise
the load fails as the table does not exist.
In Overwrite mode the carbon table would be created (it is dropped if it already
exists) and the data loaded. But in your case for overwrite mode it creates
the table but it says table not fo
Hi All,
The bucketing concept is based on hash partitioning the bucketed column as per
the configured number of buckets. Records with the same bucketed column value
always go to the same bucket. Physically each bucket is a file/files in the
table directory.
Advantages
Bucketed table is useful feature to do the map
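A minimal Scala sketch of the hash bucketing described above (assumed semantics for illustration; not CarbonData's actual implementation):

```scala
// Rows with the same bucketed-column value always land in the same bucket,
// because the bucket is a pure function of the value and the bucket count.
def bucketFor(columnValue: String, numBuckets: Int): Int =
  Math.floorMod(columnValue.hashCode, numBuckets) // non-negative bucket id

val a = bucketFor("customer_42", 8)
val b = bucketFor("customer_42", 8)
// Same value -> same bucket every time, which is what makes bucket-wise
// (map-side) joins possible when both tables use the same bucket count.
```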
Hi All,
In the current carbondata system, loading performance is not so encouraging
since we need to sort the data at the executor level for data loading.
Carbondata collects a batch of data and sorts it before dumping to the temporary
files, and finally it does a merge sort from those temporary files to finis
atacontributi...@gmail.com> wrote:
> How is this different from partitioning?
> On Sun, 27 Nov 2016 at 11:21 PM, Ravindra Pesala
> wrote:
>
> > Hi All,
> >
> > Bucketing concept is based on the hash partition the bucketed column as
> per
> > configured bucket
Hi,
Here some encodings can be done at each field level and some can be done at
the blocklet (batch of column data) level. So DICTIONARY encoding is done at
each field level and this FieldConverter only encodes data at the field
level.
RLE is applied at the blocklet level, so it is applied while writing the
Hi,
Since we use delta compression for measure types in carbondata, it stores
the data with the least datatype as per the values in the blocklet. So it does
not matter whether we store INT or BIGINT in carbondata files; it always uses
the least datatype to store.
Regards,
Ravi
On 4 December 2016 at 13:28, S
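The "least datatype" idea above can be sketched like this (assumed logic, purely for illustration; not the actual CarbonData code):

```scala
// Given the min/max of a blocklet's measure values, pick the narrowest
// integer type wide enough to hold them.
def leastDatatype(min: Long, max: Long): String =
  if (min >= Byte.MinValue && max <= Byte.MaxValue) "byte"
  else if (min >= Short.MinValue && max <= Short.MaxValue) "short"
  else if (min >= Int.MinValue && max <= Int.MaxValue) "int"
  else "long"

// Declaring the column INT or BIGINT makes no difference to storage:
// only the actual value range in the blocklet matters.
val small = leastDatatype(0L, 100L)     // fits in a byte
val large = leastDatatype(0L, 100000L)  // needs an int
```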
Hi,
These constants are used for converting data to the respective datatype while
loading into carbondata. It is not required for short or int types as we
store them as bigint.
Regards,
Ravi
On 4 December 2016 at 12:40, Sea <261810...@qq.com> wrote:
> Hi, all:
> I find the following codes in carbon C
Hi,
Yes, we have plans for integrating carbondata into the hive engine, but it is
not our high priority work now, so we will take up this task gradually. Any
contributions towards it are welcome.
Regards,
Ravi
On 4 December 2016 at 12:30, Sea <261810...@qq.com> wrote:
> Hi, all:
> Now carbondata
Hi,
Please provide the table schema, load command and sample data to reproduce this
issue; you may create a JIRA for it.
Regards,
Ravi
On 6 December 2016 at 07:05, Lu Cao wrote:
> Hi Dev team,
> I have loaded some data into carbondata table. But when I put the id
> column(String type) in where c
Hi,
Carbon takes the store location from CarbonContext and sets it in
CarbonProperties as carbon.storelocation, so it is not required to add the
store location in the properties file. And carbon.ddl.base.hdfs.url is not a
mandatory property; it is just used when the load path is provided with a
prefix, then it appends
+1 to have separate output formats; now the user can have the flexibility to
choose as per the scenario.
On Fri, Dec 16, 2016, 2:47 AM Jihong Ma wrote:
>
> It is great idea to have separate OutputFormat for regular Carbon data
> files, index files as well as meta data files, For instance: dictionary
> file,
Hi,
It seems the store path location is taking the default location. Did you set
the store location properly? Which Spark version are you using?
Regards,
Ravindra
On Tue, Dec 27, 2016, 1:38 PM 251469031 <251469...@qq.com> wrote:
> Hi Kumar,
>
>
> thx to your repley, the full logs is as follows:
>
Hi,
From carbon it is supposed to return float data when you use the float data
type. Please check whether you are converting the data to float or not
in the ScannedResultCollector implementation classes.
Regards,
Ravindra
On 27 December 2016 at 20:23, Rahul Kumar wrote:
> Hello Ravindra,
>
> I am worki
Have you used 'mvn clean'?
On 28 December 2016 at 07:18, rahulforallp wrote:
> hey QiangCai,
> thank you for your reply . i have spark 1.6.2. and also tried with
> -Dspark.version=1.6.2 . But result is same . Still i am getting same
> exception.
>
> Is this exception possibe if i have different
Yes, it is not working because the support is not yet added. Right now it
is a low priority task as the user can directly use spark-shell to create a
carbonsession and execute the queries.
On 4 January 2017 at 12:40, anubhavtarar wrote:
> carbon shell is not working with spark 2.0 version
> here are the
Hi,
I did not understand the issue, what is the error it throws?
On 4 January 2017 at 10:03, Anubhav Tarar wrote:
> here is the script ./bin/spark-submit --conf
> spark.sql.hive.thriftServer.singleSession=true --class
> org.apache.carbondata.spark.thriftserver.CarbonThriftServer
> /opt/spark-2.
You can directly use the other sql create table command like in 1.6.
CREATE TABLE IF NOT EXISTS t3
(ID Int, date Timestamp, country String,
name String, phonetype String, serialname char(10), salary Int)
STORED BY 'carbondata'
On 4 January 2017 at 10:02, Anubhav Tarar wrote:
> exactly my point
Hi,
It's an issue; we are working on the fix.
On 5 January 2017 at 17:26, Anurag Srivastava wrote:
> Hello,
>
> I have taken latest code at today (5/01/2017) and build code with spark
> 1.6. After that I put the latest jar in carbonlib in spark and start thrift
> server.
>
> When I have started
Hi,
Please make sure the store path of "flightdb2" is given properly inside the
CarbonInputMapperTest class.
Please provide complete stack trace of error.
On 10 January 2017 at 17:54, 彭 wrote:
> Hi,all:
> Recently, i meet a failed TestCase, Is there anyone know it?
> http://apache-carbon
Please provide your Jira user name and mail id. We will add you as a contributor
so that you can assign issues to yourself.
On Fri, Jan 13, 2017, 16:49 Anurag Srivastava wrote:
> Hello Team,
>
> I am working on JIRA [CARBONDATA-542] and want to assign this JIRA to me.
> But I am not able to assign it
Hi,
Please use
"mvn clean -DskipTests -Pspark-1.5 -Dspark.version=1.5.2 -Phadoop-2.7.2
package"
Regards,
Ravindra
On 20 January 2017 at 15:42, manish gupta wrote:
> Can you try compiling with hadoop-2.7.2 version and use it and let us know
> if the issue still persists.
>
> "mvn package -Dsk
+1
Done sanity for all major features, it is fine.
Regards,
Ravindra.
On Sat, Jan 21, 2017, 07:51 Liang Chen wrote:
> +1(binding)
>
> I checked:
> - name contains incubating
> - disclaimer exists
> - signatures and hash correct
> - NOTICE good
> - LICENSE is good
> - Source files have ASF heade
Hi Mars,
Please try creating carbonsession with storepath as follow.
val carbon = SparkSession.builder().config(sc.getConf).
getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store ")
Regards,
Ravindra.
On 4 February 2017 at 08:12, Mars Xu wrote:
> Hello All,
> I met a problem o
Hi,
The performance depends on the query plan. When you submit a query
like [Select attributeA, count(*) from tableB group by attributeA], in
case of spark it asks carbon to give only the attributeA column. So Carbon
reads only the attributeA column from all files and sends the result to spark
to aggreg
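A toy illustration of that column pruning (simplified, assumed layout; not CarbonData's reader code): with a columnar store, the group-by query touches only the projected column.

```scala
// Column-oriented layout: each column is stored (and read) independently.
val columns: Map[String, Seq[String]] = Map(
  "attributeA" -> Seq("x", "y", "x"),
  "attributeB" -> Seq("1", "2", "3")
)

// "select attributeA, count(*) ... group by attributeA" needs only one column.
val counts: Map[String, Int] =
  columns("attributeA").groupBy(identity).map { case (v, vs) => v -> vs.size }
// counts: x -> 2, y -> 1, computed without ever reading attributeB
```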
Hi,
This exception is actually ignored in class SegmentUpdateStatusManager at line
number 696. This exception does not create any problem. Usually this
exception won't be printed in any server logs as we are ignoring it. Maybe
spark-shell is printing it; we will look into it.
Regards,
Ravindra.
Hi Libis,
The spark-sql CLI is not supported by carbondata.
Why don't you use the carbon thrift server and beeline? It is also the same as
the spark-sql CLI and it gives the execution time for each query.
Script to start the carbondata thrift server:
bin/spark-submit --class
org.apache.carbondata.spark.thriftserver.CarbonT
Hi,
Please set carbon.badRecords.location in carbon.properties and check whether
any bad records are added to that location.
Regards,
Ravindra.
On 14 February 2017 at 15:24, Yinwei Li <251469...@qq.com> wrote:
> Hi all,
>
>
> I met an data lost problem when loading data from csv file to carbon
> tab
gt; and the following are bad record logs:
>
>
> INFO 15-02 09:43:24,393 - [Executor task launch
> worker-0][partitionID:_1g_web_sales_d59af854-773c-429c-b7e6-031d602fe2be]
> Total copy time (ms) to copy file /tmp/1039730591739247/0/_1g/
> web_sales/Fact/Part0/Segment_0/0/
Problems in the current format:
1. IO read is slower since it needs to do multiple seeks on the file to
read column blocklets. The current size of a blocklet is 12, so it needs to
read multiple times from the file to scan the data of that column.
Alternatively we can increase the blocklet size but it suf
Please find the thrift file in below location.
https://drive.google.com/open?id=0B4TWTVbFSTnqZEdDRHRncVItQ242b1NqSTU2b2g4dkhkVDRj
On 15 February 2017 at 17:14, Ravindra Pesala wrote:
> Problems in current format.
> 1. IO read is slower since it needs to go for multiple seeks on the fil
> but it occured an Exception: java.lang.RuntimeException:
> carbon.kettle.home is not set
>
>
> the configuration in my carbon.properties is:
> carbon.kettle.home=/opt/spark-2.1.0/carbonlib/carbonplugins, but it seems
> not work.
>
>
> how can I solve this problem.
>
ber of false positive blocks will improve the
> >>filter query performance. Separating uncompression of data from reader
> >>layer will improve the overall query performance.
> >>
> >>-Regards
> >>Kumar Vishal
> >>
> >>On Wed, Feb 15, 2017 a
Hi Yinwei,
Thank you for pointing out the issue. I will check with the TPC-DS data and
verify the data load with the new flow.
Regards,
Ravindra.
On 16 February 2017 at 09:35, QiangCai wrote:
> Maybe you can check PR594, it will fix a bug which will impact the result
> of
> loading.
>
>
>
> --
> View
Hi Yinwei,
Can you provide create table scripts for both the tables store_returns and
web_sales.
Regards,
Ravindra.
On 16 February 2017 at 10:07, Ravindra Pesala wrote:
> Hi Yinwei,
>
> Thank you for pointing out the issue, I will check with TPC-DS data and
> verify the data l
'DICTIONARY_INCLUDE'='ws_sold_date_sk, ws_sold_time_sk, ws_ship_date_sk,
> ws_item_sk, ws_bill_customer_sk, ws_bill_cdemo_sk, ws_bill_hdemo_sk,
> ws_bill_addr_sk, ws_ship_customer_sk, ws_ship_cdemo_sk, ws_ship_hdemo_sk,
> ws_ship_addr_sk, ws_web_page_sk, ws_web_site_sk, ws_ship_mode_sk
Hi QiangCai,
The PR594 fix does not solve the data loss issue; it fixes a data mismatch in
some cases.
Regards,
Ravindra.
On 16 February 2017 at 09:35, QiangCai wrote:
> Maybe you can check PR594, it will fix a bug which will impact the result
> of
> loading.
>
>
>
> --
> View this message in co
Hi,
We have so far integrated only with Spark, not yet with Hive. So
carbondata cannot be used in Hive on Spark at this moment.
Regards,
Ravindra.
On 16 February 2017 at 14:35, wangzheng <18031...@qq.com> wrote:
> we use cdh5.7, it remove the thriftserver of spark, so sparksql is not
Hi,
Yes, it works because we are sorting the column values before assigning
dictionary values to them. So it can work only if you have loaded the data
only once (meaning there is no incremental load). If you do an incremental
load and some more dictionary values are added to the store then there is no
gua
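The sorted-dictionary property described above can be sketched as follows (illustrative only; not the actual dictionary generator):

```scala
// Assigning ids to *sorted* distinct values makes dictionary-id order agree
// with value order -- a guarantee that breaks once an incremental load has
// to append new values with higher ids regardless of their sort position.
def buildDictionary(values: Seq[String]): Map[String, Int] =
  values.distinct.sorted.zipWithIndex.toMap

val dict = buildDictionary(Seq("cherry", "apple", "banana"))
// After a single load, id order matches lexicographic order:
// apple -> 0, banana -> 1, cherry -> 2
```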
Hi Xiaoqiao,
Does the problem still exist?
Can you try a clean build with the "mvn clean -DskipTests -Pspark-1.6
package" command.
Regards,
Ravindra.
On 16 February 2017 at 08:36, Xiaoqiao He wrote:
> hi Liang Chen,
>
> Thank for your help. It is true that i install and configure carbondata on
t; > at
> > org.apache.spark.sql.execution.command.LoadTable.
> run(carbonTableSchema.scala:360)
> > at
> > org.apache.spark.sql.execution.ExecutedCommand.
> sideEffectResult$lzycompute(commands.scala:58)
> > at
> > org.apache.spark.sql.execution.ExecutedCommand.
> sideEffectResult(commands.scala
Congratulations Hexiaoqiao.
Regards,
Ravindra.
On 21 February 2017 at 10:15, Xiaoqiao He wrote:
> Hi PPMC, Liang,
>
> It is my honor that receive the invitation, and very happy to have chance
> that participate to build CarbonData community also. I will keep
> contributing to Apache CarbonData
Hi,
Please create the carbon context as follows.
val cc = new CarbonContext(sc, storeLocation)
Here storeLocation is hdfs://hacluster/tmp/carbondata/carbon.store in your case.
Regards,
Ravindra
On 21 February 2017 at 08:30, Ravindra Pesala wrote:
> Hi,
>
> How did you create Carb
Hi,
We are working on the TPC-H performance report now, and have improved the
performance with the new format. We have already raised PRs (584 and 586) for
the same; they are still under review and will be merged soon. Once these
PRs are merged we will start verifying the TPC-DS performance as well.
Regar
Hi,
I feel there are more disadvantages than advantages in this approach. In
your current scenario you want to set dictionary only for columns which are
used as filters, but the usage of dictionary is not limited to
filters; it can reduce the store size and improve aggregation queries.
I
> data size will increase. Late decoding is one of main advantage, no
> > dictionary column aggregation will be slower. Filter query will suffer as
> > in case of dictionary column we are comparing on byte pack value, in case
> > of no dictionary it will be on actual value.
> >
Hi,
Have you loaded data freshly and tried to execute the query? Or are you
trying to query an old store you have already loaded?
Regards,
Ravindra.
On 28 February 2017 at 17:20, ericzgy <1987zhangguang...@163.com> wrote:
> Now when I load data into CarbonData table using spark1.6.2 and
> carbond
Hi Likun,
You mentioned that if the user does not specify dictionary columns then by
default those are chosen as no dictionary columns.
But we have many disadvantages as I mentioned in the above mail if you keep no
dictionary as default. We initially introduced no dictionary columns
to handle high ca
columns. I feel
> preventing such misusage is important in order to encourage more users to
> use carbondata.
>
> Any suggestion on solving this issue?
>
>
> Regards,
> Likun
>
>
> > On 28 February 2017, at 10:20 PM, Ravindra Pesala wrote:
> >
> > Hi Likun,
> >
>