Re: [Discussion] Simplify the deployment of CarbonData

2016-12-25 Thread Liang Chen
Hi

Thank you for starting a good discussion.

For 1 and 2, I agree. The 1.0.0 version will support them.
For 3: we need to keep the parameter so that users can specify Carbon's store
location. If users don't specify the carbon store location, we can use the
default location you suggested: "spark.sql.warehouse.dir" (Spark 2) or
"hive.metastore.warehouse.dir" (Spark 1). A rough sketch of that fallback is
below.

Regards
Liang

QiangCai wrote
> Hi all,
>
>   I suggest simplifying the deployment of CarbonData as follows.
>   1. Remove the Kettle dependency completely: no need to deploy the
> "carbonplugins" folder on each node, and no need to set "carbon.kettle.home".
>   2. Remove the carbon.properties file from the executor side; pass the
> CarbonData configuration to the executor side from the driver side.
>   3. Use "spark.sql.warehouse.dir" (Spark 2) or
> "hive.metastore.warehouse.dir" (Spark 1) instead of "carbon.storelocation".
>
>   So in the future we will just need to deploy the CarbonData jars in
> cluster mode.
>
>   What's your opinion?
> 
> Best Regards 
> David Cai







Re: [jira] [Created] (CARBONDATA-559) Job failed at last step

2016-12-25 Thread Liang Chen
Copied the following information from Apache JIRA.
--
Hi Lionel,

The global dictionary is generated successfully, but the data loading graph
is not started because the Kettle home on the executor side is not set
properly, as shown in the logs:

INFO 23-12 16:58:47,461 - Executor task launch worker-4
{carbon.graph.rowset.size=10, carbon.enable.quick.filter=false,
carbon.number.of.cores=4, carbon.sort.file.buffer.size=20,
carbon.kettle.home=$/carbonlib/carbonplugins,

The Carbon properties are read from the
/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/conf/carbon.properties
path.

Either of the suggested solutions below can work:
a. Correct the Kettle home path in carbon.properties and try again (a sketch
follows this list).
b. Use the flow without Kettle (please refer to the examples).
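
For illustration, the corrected entry in carbon.properties might look like
the following; the absolute path here is an assumption (the "$" in the logged
value suggests an unresolved environment variable), so point it at wherever
the carbonplugins folder is actually deployed on each node:

# carbon.properties on every executor node (illustrative path)
carbon.kettle.home=/opt/cloudera/parcels/CDH-5.8.0-1.cdh5.8.0.p0.42/lib/spark/carbonlib/carbonplugins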
Thanks
Babu
  
Cao, Lionel added a comment:
Thank you Babu! It works!


Lu Cao wrote
> Hi team,
> Could you help look into this issue?
> I have attached the log in the Jira ticket.
> 
> Thanks & Best Regards,
> Lionel
> 
> On Fri, Dec 23, 2016 at 5:47 PM, Cao, Lionel (JIRA) jira@ wrote:
> 
>> Cao, Lionel created CARBONDATA-559:
>> --
>>
>>  Summary: Job failed at last step
>>  Key: CARBONDATA-559
>>  URL:
>> https://issues.apache.org/jira/browse/CARBONDATA-559
>>  Project: CarbonData
>>   Issue Type: Bug
>>   Components: core
>> Affects Versions: 0.2.0-incubating
>>  Environment: carbon version: branch-0.2
>> hadoop 2.4.0
>> spark 1.6.0
>> OS centOS
>> Reporter: Cao, Lionel
>>
>>
>> Hi team,
>> My job always fails at the last step:
>> it says the 'yarn' user doesn't have write access to the target data
>> path (storeLocation).
>> But I tested twice with 1 row of data, and both succeeded. Could you help
>> look into the log? Please refer to the attachment.
>> Search 'access=WRITE' to see the exception.
>> Search 'Exception' for other exceptions.
>>
>> thanks,
>> Lionel







Re: Int accepting range of BigInt

2016-12-25 Thread Anurag Srivastava
Hello everyone,

I am working on resolving the JIRA bug
https://issues.apache.org/jira/browse/CARBONDATA-370.
While resolving it, I found that when I create a table with Int as one of
the columns and then run DESC on the table, the Int column is shown as
BigInt.

So the Int column is created as BigInt in CarbonData.

Query:

cc.sql("""
   CREATE TABLE IF NOT EXISTS t3
   (ID BigInt, date Timestamp, country String,
   name String, phonetype String, serialname char(10), salary Int)
   STORED BY 'carbondata'
   """)



Output:

+------------+-----------+---------+
|  col_name  | data_type | comment |
+------------+-----------+---------+
| id         | bigint    |         |
| date       | timestamp |         |
| country    | string    |         |
| name       | string    |         |
| phonetype  | string    |         |
| serialname | string    |         |
| salary     | bigint    |         |
+------------+-----------+---------+


My question is: to resolve the issue of out-of-range values being accepted
for Int, should we create the Int column as Int rather than BigInt?
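
A minimal sketch of the symptom, reusing the CarbonContext cc from above
(the table name and value are made up for illustration):

cc.sql("CREATE TABLE IF NOT EXISTS t4 (id Int, name String) STORED BY 'carbondata'")
cc.sql("DESC t4").show()
// Today this reports id as bigint, so a value such as 3000000000
// (outside Int's range of -2147483648 to 2147483647) is accepted;
// after the fix, an Int column should reject it.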

Thanks and Regards
Anurag Srivastava

On Fri, Nov 4, 2016 at 11:58 AM, manish gupta wrote:

> Hi Pallavi,
>
> This is a bug in the system. We store Int as BigInt, but values beyond the
> integer range should not be allowed for storage in an integer datatype
> column. In my opinion, you can raise a JIRA issue for this.
>
> Regards
> Manish Gupta
>
> On Thu, Nov 3, 2016 at 11:06 PM, Kumar Vishal wrote:
>
> > Hi Pallavi,
> > Currently in Carbon, the int data type is stored as bigint. Can
> > you please re-upload the image?
> >
> > -Regards
> > Kumar Vishal
> >
> > On Thu, Nov 3, 2016 at 10:09 AM, Pallavi Singh wrote:
> >
> > > Hi,
> > >
> > > I would like to know why, when we execute the query
> > > create table employee(id int, name String) stored by 'carbondata';
> > > the table created is:
> > >
> > > [image: Inline image 1]
> > >
> > > Here the Int is converted into BigInt, and the column accepts values
> > > beyond Int's range (-2147483648 to 2147483647).
> > > --
> > > Regards | Pallavi Singh
> > >
> > >
> >
>



-- 
Thanks

Anurag Srivastava
Software Consultant
Knoldus Software LLP
India - US - Canada


[Discussion] Simplify the deployment of CarbonData

2016-12-25 Thread QiangCai
Hi all,

  I suggest simplifying the deployment of CarbonData as follows.
  1. Remove the Kettle dependency completely: no need to deploy the
"carbonplugins" folder on each node, and no need to set "carbon.kettle.home".
  2. Remove the carbon.properties file from the executor side; pass the
CarbonData configuration to the executor side from the driver side (a sketch
follows below).
  3. Use "spark.sql.warehouse.dir" (Spark 2) or
"hive.metastore.warehouse.dir" (Spark 1) instead of "carbon.storelocation".

  So in the future we will just need to deploy the CarbonData jars in
cluster mode.
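
For point 2, a minimal sketch of one possible mechanism, relying on Spark
shipping "spark."-prefixed properties from the driver to every executor (the
key name below is hypothetical, not an existing CarbonData property):

import org.apache.spark.SparkConf

// Driver side: carry a Carbon setting inside the Spark configuration,
// which Spark propagates to executors automatically.
val conf = new SparkConf()
  .set("spark.carbon.sort.file.buffer.size", "20") // hypothetical key

// Executor side: read it back from the propagated configuration
// instead of a local carbon.properties file, e.g.
//   org.apache.spark.SparkEnv.get.conf.get("spark.carbon.sort.file.buffer.size")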

  What's your opinion?

Best Regards 
David Cai





[jira] [Created] (CARBONDATA-563) Select Queries are not working with spark 1.6.2.

2016-12-25 Thread Babulal (JIRA)
Babulal created CARBONDATA-563:
--

 Summary: Select Queries are not working with spark 1.6.2.
 Key: CARBONDATA-563
 URL: https://issues.apache.org/jira/browse/CARBONDATA-563
 Project: CarbonData
  Issue Type: Bug
  Components: core, data-query
Affects Versions: 0.2.0-incubating
Reporter: Babulal


Create a carbon table:
create table x (a int, b string) stored by 'carbondata'

Load data into the carbon table.

Run the query: select count(*) from x;

It fails with:
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to
org.apache.spark.sql.catalyst.InternalRow

A log snapshot is attached.
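
For reference, a minimal reproduction sketch on Spark 1.6 with CarbonContext
(sc is an existing SparkContext; the store location and CSV path are made up
for illustration):

import org.apache.spark.sql.CarbonContext

val cc = new CarbonContext(sc, "hdfs:///tmp/carbon.store")
cc.sql("create table x (a int, b string) stored by 'carbondata'")
cc.sql("load data inpath 'hdfs:///tmp/x.csv' into table x")
cc.sql("select count(*) from x").show() // throws the ClassCastException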






