Re: carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread anubhavtarar
Have a look at these logs:

CREATE TABLE Bug212(int string)USING org.apache.spark.sql.CarbonSource
OPTIONS("bucketnumber"="1", "bucketcolumns"="String","tableName"="t100");

Error:
org.apache.carbondata.spark.exception.MalformedCarbonCommandException: Table
default.t 100 can not be created without key columns. Please use
DICTIONARY_INCLUDE or DICTIONARY_EXCLUDE to set at least one key column if
all specified columns are numeric types (state=,code=0)

Two minutes later:

CREATE TABLE Bug211(int int)USING org.apache.spark.sql.CarbonSource
OPTIONS("bucketnumber"="1", "bucketcolumns"="String","tableName"="t 100");

This is a blocking issue for Spark 2.
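For reference, one way to satisfy the "at least one key column" requirement
named in the error above is to declare a dictionary column explicitly. A
minimal, hedged sketch in spark-shell, assuming a CarbonSession named
`carbon` and the documented DICTIONARY_INCLUDE table property (table and
column names here are hypothetical):

// Sketch only: an all-numeric table needs at least one key (dictionary)
// column, so "id" is marked DICTIONARY_INCLUDE.
carbon.sql("""
  CREATE TABLE IF NOT EXISTS t101 (id INT, salary INT)
  STORED BY 'carbondata'
  TBLPROPERTIES ('DICTIONARY_INCLUDE'='id')
""")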




--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbon-thrift-server-for-spark-2-0-showing-unusual-behaviour-tp5384p5444.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: why there is a table name option in carbon source format?

2017-01-03 Thread Anubhav Tarar
Yes, but if the user wants to create a table through CarbonSession, having a
separate tableName option is a kind of bug. In addition, the tableName option
also allows creating a table with spaces in its name; see the logs:

CREATE TABLE Bug212(int string)USING org.apache.spark.sql.CarbonSource
OPTIONS("bucketnumber"="1", "bucketcolumns"="int","tableName"="t 100");
+---------+--+
| Result  |
+---------+--+
+---------+--+

The table t 100 is present in HDFS.

This should not happen.


On Wed, Jan 4, 2017 at 12:56 PM, Ravindra Pesala 
wrote:

> You can directly use the other SQL create table command, as in 1.6:
>
> CREATE TABLE IF NOT EXISTS t3
> (ID Int, date Timestamp, country String,
> name String, phonetype String, serialname char(10), salary Int)
> STORED BY 'carbondata'
>
>
> On 4 January 2017 at 10:02, Anubhav Tarar 
> wrote:
>
> > Exactly my point: if the table name in the CREATE TABLE statement and
> > the table name in the carbon source option are different, consider this
> > example:
> >
> > 0: jdbc:hive2://localhost:1> CREATE TABLE testing2(String
> string)USING
> > org.apache.spark.sql.CarbonSource OPTIONS("bucketnumber"="1",
> > "bucketcolumns"="String",tableName=" testing1");
> >
> > then the table that gets created in HDFS is testing1, not testing2,
> > which is quite confusing from the user's side.
> >
> > On Wed, Jan 4, 2017 at 8:33 AM, QiangCai  wrote:
> >
> > > For Spark 2, when using SparkSession to create a carbon table, the
> > > tableName option is needed to create the carbon schema in the store
> > > location folder. It is better to use CarbonSession to create carbon
> > > tables now.
> > >
> > >
> > >
> > > --
> > > View this message in context: http://apache-carbondata-
> > > mailing-list-archive.1130556.n5.nabble.com/why-there-is-a-
> > > table-name-option-in-carbon-source-format-tp5385p5420.html
> > > Sent from the Apache CarbonData Mailing List archive mailing list
> archive
> > > at Nabble.com.
> > >
> >
> >
> >
> > --
> > Thanks and Regards
> >
> > Anubhav Tarar
> > Software Consultant
> > Knoldus Software LLP
> > mob: 8588915184
> >
>
>
>
> --
> Thanks & Regards,
> Ravi
>



-- 
Thanks and Regards

Anubhav Tarar
Software Consultant
Knoldus Software LLP
mob: 8588915184


Re: carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread Anubhav Tarar
See these logs:
jdbc:hive2://localhost:1>  CREATE TABLE Bug2217559156(int int)USING
org.apache.spark.sql.CarbonSource;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (1.809 seconds)
Most of the time it does not throw any exception for any query, but
sometimes it works.


On Wed, Jan 4, 2017 at 12:54 PM, Ravindra Pesala 
wrote:

> Hi,
>
> I did not understand the issue, what is the error it throws?
>
> On 4 January 2017 at 10:03, Anubhav Tarar 
> wrote:
>
> > Here is the script: ./bin/spark-submit --conf
> > spark.sql.hive.thriftServer.singleSession=true --class
> > org.apache.carbondata.spark.thriftserver.CarbonThriftServer
> > /opt/spark-2.0.0-bin-hadoop2.7/carbonlib/carbondata_2.11-1.
> > 0.0-incubating-SNAPSHOT-shade-hadoop2.2.0.jar
> > hdfs://localhost:54310/opt/carbonStore
> >
> > It works fine only about 1 out of 10 times.
> >
> > On Wed, Jan 4, 2017 at 8:28 AM, QiangCai  wrote:
> >
> > > Can you show the JDBCServer startup script?
> > >
> > >
> > >
> > > --
> > > View this message in context: http://apache-carbondata-
> > > mailing-list-archive.1130556.n5.nabble.com/carbon-thrift-
> > > server-for-spark-2-0-showing-unusual-behaviour-tp5384p5419.html
> > > Sent from the Apache CarbonData Mailing List archive mailing list
> archive
> > > at Nabble.com.
> > >
> >
> >
> >
> > --
> > Thanks and Regards
> >
> > Anubhav Tarar
> > Software Consultant
> > Knoldus Software LLP
> > mob: 8588915184
> >
>
>
>
> --
> Thanks & Regards,
> Ravi
>



-- 
Thanks and Regards

Anubhav Tarar
Software Consultant
Knoldus Software LLP
mob: 8588915184


Re: why there is a table name option in carbon source format?

2017-01-03 Thread Ravindra Pesala
You can directly use the other SQL create table command, as in 1.6:

CREATE TABLE IF NOT EXISTS t3
(ID Int, date Timestamp, country String,
name String, phonetype String, serialname char(10), salary Int)
STORED BY 'carbondata'


On 4 January 2017 at 10:02, Anubhav Tarar  wrote:

> Exactly my point: if the table name in the CREATE TABLE statement and the
> table name in the carbon source option are different, consider this
> example:
>
> 0: jdbc:hive2://localhost:1> CREATE TABLE testing2(String string)USING
> org.apache.spark.sql.CarbonSource OPTIONS("bucketnumber"="1",
> "bucketcolumns"="String",tableName=" testing1");
>
> then the table that gets created in HDFS is testing1, not testing2, which
> is quite confusing from the user's side.
>
> On Wed, Jan 4, 2017 at 8:33 AM, QiangCai  wrote:
>
> > For Spark 2, when using SparkSession to create a carbon table, the
> > tableName option is needed to create the carbon schema in the store
> > location folder. It is better to use CarbonSession to create carbon
> > tables now.
> >
> >
> >
> > --
> > View this message in context: http://apache-carbondata-
> > mailing-list-archive.1130556.n5.nabble.com/why-there-is-a-
> > table-name-option-in-carbon-source-format-tp5385p5420.html
> > Sent from the Apache CarbonData Mailing List archive mailing list archive
> > at Nabble.com.
> >
>
>
>
> --
> Thanks and Regards
>
> Anubhav Tarar
> Software Consultant
> Knoldus Software LLP
> mob: 8588915184
>



-- 
Thanks & Regards,
Ravi


Re: carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread Ravindra Pesala
Hi,

I did not understand the issue, what is the error it throws?

On 4 January 2017 at 10:03, Anubhav Tarar  wrote:

> Here is the script: ./bin/spark-submit --conf
> spark.sql.hive.thriftServer.singleSession=true --class
> org.apache.carbondata.spark.thriftserver.CarbonThriftServer
> /opt/spark-2.0.0-bin-hadoop2.7/carbonlib/carbondata_2.11-1.
> 0.0-incubating-SNAPSHOT-shade-hadoop2.2.0.jar
> hdfs://localhost:54310/opt/carbonStore
>
> It works fine only about 1 out of 10 times.
>
> On Wed, Jan 4, 2017 at 8:28 AM, QiangCai  wrote:
>
> > Can you show the JDBCServer startup script?
> >
> >
> >
> > --
> > View this message in context: http://apache-carbondata-
> > mailing-list-archive.1130556.n5.nabble.com/carbon-thrift-
> > server-for-spark-2-0-showing-unusual-behaviour-tp5384p5419.html
> > Sent from the Apache CarbonData Mailing List archive mailing list archive
> > at Nabble.com.
> >
>
>
>
> --
> Thanks and Regards
>
> Anubhav Tarar
> Software Consultant
> Knoldus Software LLP
> mob: 8588915184
>



-- 
Thanks & Regards,
Ravi


Re: carbon shell is not working with spark 2.0 version

2017-01-03 Thread Ravindra Pesala
Yes, it is not working because the support has not been added yet. Right now
it is a low-priority task, as users can directly use spark-shell to create a
CarbonSession and execute queries, for example as sketched below.
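For reference, a minimal, hedged sketch of what that looks like in
spark-shell, assuming the CarbonData jar is on the classpath and reusing the
HDFS store path mentioned earlier in this thread:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.CarbonSession._

// Sketch only: getOrCreateCarbonSession comes from the CarbonSession
// implicits; sc is the SparkContext that spark-shell already provides.
val carbon = SparkSession
  .builder()
  .config(sc.getConf)
  .getOrCreateCarbonSession("hdfs://localhost:54310/opt/carbonStore")

carbon.sql("SHOW TABLES").show()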

On 4 January 2017 at 12:40, anubhavtarar  wrote:

> carbon shell is not working with spark 2.0 version
> here are the logs
>
> java.lang.ClassNotFoundException: org.apache.spark.repl.carbon.Main
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:686)
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
> at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.
> scala:124)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>
>
> --
> View this message in context: http://apache-carbondata-
> mailing-list-archive.1130556.n5.nabble.com/carbon-shell-is-
> not-working-with-spark-2-0-version-tp5436.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>



-- 
Thanks & Regards,
Ravi


carbon shell is not working with spark 2.0 version

2017-01-03 Thread anubhavtarar
carbon shell is not working with spark 2.0 version
here are the logs

java.lang.ClassNotFoundException: org.apache.spark.repl.carbon.Main
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:686)
at
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at
org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbon-shell-is-not-working-with-spark-2-0-version-tp5436.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: Failed to APPEND_FILE, hadoop.hdfs.protocol.AlreadyBeingCreatedException

2017-01-03 Thread Harsh Sharma
A minor update to the email: the carbon table uniqdata was created with the
query below.

CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION
string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1
int)  STORED BY 'org.apache.carbondata.format' TBLPROPERTIES
("TABLE_BLOCKSIZE"= "256 MB");


Thank You


Best Regards,
Harsh Sharma
Sr. Software Consultant
harshs...@gmail.com
Skype: khandal60
+91-8447307237

On Wed, Jan 4, 2017 at 10:48 AM, Harsh Sharma  wrote:

> Hello Team,
> We performed the following scenario,
>
>
>    - Created a hive table named uniqdatahive as below,
>
> CREATE TABLE uniqdatahive (CUST_ID int,CUST_NAME
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp,
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10),
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2
> double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> LINES TERMINATED BY '\n' STORED AS TEXTFILE;
>
>
>    - Created a carbon table named uniqdata as below,
>
>
> CREATE TABLE uniqdatahive1 (CUST_ID int,CUST_NAME
> String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp,
> BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10),
> DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2
> double,INTEGER_COLUMN1 int)  STORED BY 'org.apache.carbondata.format'
> TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
>
>
>    - Loaded the 2013 records into the hive table uniqdatahive as below,
>
>
> LOAD DATA inpath
> 'hdfs://hadoop-master:54311/data/2000_UniqData_tabdelm.csv' OVERWRITE INTO
> TABLE uniqdatahive;
>
>
>    - Inserted around 7000 records into the carbon table with 3 runs of the
>    following query,
>
>
> insert into table uniqdata select * from uniqdatahive;
>
>
>    - After 7000 records were stored in the carbon table, the following
>    exception began to appear on every run of the insert query,
>
>
> 0: jdbc:hive2://hadoop-master:1> insert into table uniqdata select *
> from uniqdatahive;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 1 in stage 42.0 failed 4 times, most recent failure: Lost task 1.3 in
> stage 42.0 (TID 125, hadoop-slave-1):
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
> Failed to APPEND_FILE /user/hive/warehouse/carbon.
> store/hivetest/uniqdata/Metadata/04936ba3-dbb6-45c2-8858-d9ed864034c1.dict
> for DFSClient_NONMAPREDUCE_-1303665369_59 on 192.168.2.130 because
> DFSClient_NONMAPREDUCE_-1303665369_59 is already the current lease holder.
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.
> recoverLeaseInternal(FSNamesystem.java:2882)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(
> FSNamesystem.java:2683)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.
> appendFileInt(FSNamesystem.java:2982)
> at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.
> appendFile(FSNamesystem.java:2950)
> at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.
> append(NameNodeRpcServer.java:654)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSi
> deTranslatorPB.append(ClientNamenodeProtocolServerSi
> deTranslatorPB.java:421)
> at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$
> ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.
> java)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(
> ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(
> UserGroupInformation.java:1657)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
>
> After this we are unable to insert more than 7000 records into the carbon
> table from the hive table.
>
>
> Thank You
>
>
> Best Regards,
> Harsh Sharma
> Sr. Software Consultant
> harshs...@gmail.com
> Skype: khandal60
> +91-8447307237
>


Failed to APPEND_FILE, hadoop.hdfs.protocol.AlreadyBeingCreatedException

2017-01-03 Thread Harsh Sharma
Hello Team,
We performed the following scenario,


   - Created a hive table named uniqdatahive as below,

CREATE TABLE uniqdatahive (CUST_ID int,CUST_NAME
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp,
BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10),
DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2
double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' STORED AS TEXTFILE;


   - Created a carbon table named uniqdata as below,


CREATE TABLE uniqdatahive1 (CUST_ID int,CUST_NAME
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp,
BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10),
DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2
double,INTEGER_COLUMN1 int)  STORED BY 'org.apache.carbondata.format'
TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");


   - Loaded the 2013 records into the hive table uniqdatahive as below,


LOAD DATA inpath
'hdfs://hadoop-master:54311/data/2000_UniqData_tabdelm.csv' OVERWRITE INTO
TABLE uniqdatahive;


   - Inserted around 7000 records into the carbon table with 3 runs of the
   following query,


insert into table uniqdata select * from uniqdatahive;


   - After 7000 records were stored in the carbon table, the following
   exception began to appear on every run of the insert query,


0: jdbc:hive2://hadoop-master:1> insert into table uniqdata select *
from uniqdatahive;
Error: org.apache.spark.SparkException: Job aborted due to stage failure:
Task 1 in stage 42.0 failed 4 times, most recent failure: Lost task 1.3 in
stage 42.0 (TID 125, hadoop-slave-1):
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException):
Failed to APPEND_FILE
/user/hive/warehouse/carbon.store/hivetest/uniqdata/Metadata/04936ba3-dbb6-45c2-8858-d9ed864034c1.dict
for DFSClient_NONMAPREDUCE_-1303665369_59 on 192.168.2.130 because
DFSClient_NONMAPREDUCE_-1303665369_59 is already the current lease holder.
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:2882)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2683)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:2982)
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:2950)
at
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:654)
at
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:421)
at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)

After this we are unable to insert more than 7000 records into the carbon
table from the hive table.


Thank You


Best Regards,
Harsh Sharma
Sr. Software Consultant
harshs...@gmail.com
Skype: khandal60
+91-8447307237


[jira] [Created] (CARBONDATA-589) carbon spark shell is not working with spark 2.0

2017-01-03 Thread anubhav tarar (JIRA)
anubhav tarar created CARBONDATA-589:


 Summary: carbon spark shell is not working with spark 2.0
 Key: CARBONDATA-589
 URL: https://issues.apache.org/jira/browse/CARBONDATA-589
 Project: CarbonData
  Issue Type: Bug
  Components: build
Affects Versions: 1.0.0-incubating
Reporter: anubhav tarar
Priority: Minor


carbon shell is not working with spark 2.0 version 
here are the logs

java.lang.ClassNotFoundException: org.apache.spark.repl.carbon.Main
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:225)
at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:686)
at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: carbontable compact throw err

2017-01-03 Thread geda
Hello, 1 and 2 are OK; 3 throws the error.



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbontable-compact-throw-err-tp5382p5430.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread Anubhav Tarar
Here is the script: ./bin/spark-submit --conf
spark.sql.hive.thriftServer.singleSession=true --class
org.apache.carbondata.spark.thriftserver.CarbonThriftServer
/opt/spark-2.0.0-bin-hadoop2.7/carbonlib/carbondata_2.11-1.0.0-incubating-SNAPSHOT-shade-hadoop2.2.0.jar
hdfs://localhost:54310/opt/carbonStore

It works fine only about 1 out of 10 times.

On Wed, Jan 4, 2017 at 8:28 AM, QiangCai  wrote:

> Can you show the JDBCServer startup script?
>
>
>
> --
> View this message in context: http://apache-carbondata-
> mailing-list-archive.1130556.n5.nabble.com/carbon-thrift-
> server-for-spark-2-0-showing-unusual-behaviour-tp5384p5419.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>



-- 
Thanks and Regards

Anubhav Tarar
Software Consultant
Knoldus Software LLP
mob: 8588915184


Re: why there is a table name option in carbon source format?

2017-01-03 Thread Anubhav Tarar
Exactly my point: if the table name in the CREATE TABLE statement and the
table name in the carbon source option are different, consider this example:

0: jdbc:hive2://localhost:1> CREATE TABLE testing2(String string)USING
org.apache.spark.sql.CarbonSource OPTIONS("bucketnumber"="1",
"bucketcolumns"="String",tableName=" testing1");

then the table that gets created in HDFS is testing1, not testing2, which is
quite confusing from the user's side.

On Wed, Jan 4, 2017 at 8:33 AM, QiangCai  wrote:

> For Spark 2, when using SparkSession to create a carbon table, the
> tableName option is needed to create the carbon schema in the store
> location folder. It is better to use CarbonSession to create carbon
> tables now.
>
>
>
> --
> View this message in context: http://apache-carbondata-
> mailing-list-archive.1130556.n5.nabble.com/why-there-is-a-
> table-name-option-in-carbon-source-format-tp5385p5420.html
> Sent from the Apache CarbonData Mailing List archive mailing list archive
> at Nabble.com.
>



-- 
Thanks and Regards

Anubhav Tarar
Software Consultant
Knoldus Software LLP
mob: 8588915184


Re: how to make carbon run faster

2017-01-03 Thread QiangCai
Hi,
The total memory of the cluster is 16GB * 40 = 640GB, and the total size of
the data files is (600/1024)GB * 60 ≈ 35GB, so I suggest loading all the
data files (60 days) at once.

By the way, loading a small data file (in total size) in each data load will
generate small carbon files.
You can add "carbon.enable.auto.load.merge=true" to the carbon.properties
file to enable the compaction feature; a programmatic alternative is
sketched below.
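As an illustration only: the same property can be set at runtime through
CarbonProperties (CarbonData's property store) before triggering a load,
assuming the core jar is on the classpath:

import org.apache.carbondata.core.util.CarbonProperties

// Hedged sketch: equivalent to adding
// carbon.enable.auto.load.merge=true to the carbon.properties file.
CarbonProperties.getInstance()
  .addProperty("carbon.enable.auto.load.merge", "true")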



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/how-to-make-carbon-run-faster-tp5305p5427.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: carbontable compact throw err

2017-01-03 Thread QiangCai
You can check the following and show the results.

1)
select * from test limit 1

2)
show segments for table test limit 1000

3)
alter table test compact 'major'

It would be better to provide more log info.



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbontable-compact-throw-err-tp5382p5426.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: Dictionary file is locked for Updation, unable to Load

2017-01-03 Thread QiangCai
I think you can have a look at this mailing list thread:
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Dictionary-file-is-locked-for-updation-td5076.html

Have a look at the following guide and pay attention to the carbon.properties file:

https://cwiki.apache.org/confluence/display/CARBONDATA/Cluster+deployment+guide


For Spark yarn cluster mode:
1. Both the driver side and the executor side need the same carbon.properties file.
2. Set carbon.lock.type=HDFSLOCK (a programmatic alternative is sketched after this list).
3. Set carbon.properties.filepath:
spark.executor.extraJavaOptions
-Dcarbon.properties.filepath=
spark.driver.extraJavaOptions  
-Dcarbon.properties.filepath=
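For illustration only, a hedged sketch of setting the lock type
programmatically on the driver; in yarn cluster mode the carbon.properties
route above is still needed so the executors pick it up:

import org.apache.carbondata.core.util.CarbonProperties

// Sketch only: sets the lock type for the current JVM; equivalent to
// carbon.lock.type=HDFSLOCK in the carbon.properties file.
CarbonProperties.getInstance()
  .addProperty("carbon.lock.type", "HDFSLOCK")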



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/Dictionary-file-is-locked-for-Updation-unable-to-Load-tp5359p5422.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: why there is a table name option in carbon source format?

2017-01-03 Thread QiangCai
For Spark 2, when using SparkSession to create a carbon table, the tableName
option is needed to create the carbon schema in the store location folder.
It is better to use CarbonSession to create carbon tables now; a sketch of
the two paths follows.
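For illustration, a hedged sketch of the two create paths discussed in this
thread; the table and column names are hypothetical, and `spark`/`carbon`
stand for a plain SparkSession and a CarbonSession respectively:

// Path 1: plain SparkSession + CarbonSource. The tableName option decides
// which schema folder is created in the store location.
spark.sql("""
  CREATE TABLE t200 (id INT, name STRING)
  USING org.apache.spark.sql.CarbonSource
  OPTIONS ("tableName"="t200")
""")

// Path 2: CarbonSession DDL. The table name comes from the statement itself
// and no tableName option is involved.
carbon.sql("""
  CREATE TABLE t201 (id INT, name STRING)
  STORED BY 'carbondata'
""")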



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/why-there-is-a-table-name-option-in-carbon-source-format-tp5385p5420.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread QiangCai
Can you show the JDBCServer startup script?



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbon-thrift-server-for-spark-2-0-showing-unusual-behaviour-tp5384p5419.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


[jira] [Created] (CARBONDATA-588) cleanup WriterCompressModel

2017-01-03 Thread Jihong MA (JIRA)
Jihong MA created CARBONDATA-588:


 Summary: cleanup WriterCompressModel
 Key: CARBONDATA-588
 URL: https://issues.apache.org/jira/browse/CARBONDATA-588
 Project: CarbonData
  Issue Type: Improvement
  Components: core
Reporter: Jihong MA
Assignee: Jihong MA
Priority: Minor
 Fix For: 1.0.0-incubating


A separate compression type field is unnecessary and error-prone, as it is
already captured in the compressionFinder abstraction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CARBONDATA-587) Able to use reserved keyword in Carbon Table Commands

2017-01-03 Thread anubhav tarar (JIRA)
anubhav tarar created CARBONDATA-587:


 Summary: Able to use reserved keyword in Carbon Table Commands
 Key: CARBONDATA-587
 URL: https://issues.apache.org/jira/browse/CARBONDATA-587
 Project: CarbonData
  Issue Type: Bug
Affects Versions: 1.0.0-incubating
 Environment: cluster
Reporter: anubhav tarar
Priority: Minor


I am able to use reserved keywords in Carbon table commands:

CREATE TABLE Bug221755915(int int)USING org.apache.spark.sql.CarbonSource;
+---------+--+
| Result  |
+---------+--+
+---------+--+

jdbc:hive2://localhost:1>  CREATE TABLE null(int int)USING 
org.apache.spark.sql.CarbonSource;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.267 seconds)

There is no check on identifiers in CarbonData.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


why there is a table name option in carbon source format?

2017-01-03 Thread anubhavtarar
Why is there a tableName option in the carbon source format when we already
give the table name in the CREATE TABLE command?

Here are the logs:
0: jdbc:hive2://localhost:1> CREATE TABLE testing2(String string)USING
org.apache.spark.sql.CarbonSource OPTIONS("bucketnumber"="1",
"bucketcolumns"="String",tableName=" ");



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/why-there-is-a-table-name-option-in-carbon-source-format-tp5385.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


carbon thrift server for spark 2.0 showing unusual behaviour

2017-01-03 Thread anubhavtarar
The carbon thrift server randomly fails to parse carbon queries with Spark 2.

here are the logs

: jdbc:hive2://localhost:1> CREATE TABLE Bug221755915(int int)USING
org.apache.spark.sql.CarbonSource;
+---------+--+
| Result  |
+---------+--+
+---------+--+
No rows selected (0.216 seconds)

Can anyone help?




--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbon-thrift-server-for-spark-2-0-showing-unusual-behaviour-tp5384.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


carbontable compact throw err

2017-01-03 Thread geda
Hello:
In spark-shell, using CarbonContext,
cc.sql("ALTER TABLE test COMPACT 'MINOR'")
throws an error.
How can it be solved?


7/01/03 18:52:32 INFO CarbonDataRDDFactory$: main Acquired the compaction
lock for table test
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 0
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 1
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 2
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 3
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 4
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 5
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 6
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 7
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 8
17/01/03 18:52:32 INFO CarbonDataRDDFactory$: main loads identified for
merge is 9
17/01/03 18:52:32 INFO Compactor$: pool-26-thread-1 spark.executor.instances
property is set to =20
17/01/03 18:52:32 INFO BlockBTreeBuilder: pool-26-thread-1
Total Number Rows In BTREE: 1
17/01/03 18:52:32 INFO BlockBTreeBuilder: pool-26-thread-1
Total Number Rows In BTREE: 1
17/01/03 18:52:32 INFO BlockBTreeBuilder: pool-26-thread-1
Total Number Rows In BTREE: 1
17/01/03 18:52:32 INFO BlockBTreeBuilder: pool-26-thread-1
Total Number Rows In BTREE: 1
17/01/03 18:52:32 ERROR CarbonDataRDDFactory$: main Exception in compaction
thread java.io.IOException: java.lang.NullPointerException
17/01/03 18:52:32 ERROR CarbonDataRDDFactory$: main Exception in compaction
thread java.io.IOException: java.lang.NullPointerException



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/carbontable-compact-throw-err-tp5382.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: [jira] [Created] (CARBONDATA-586) Create table with 'Char' data type but it workes as 'String' data type

2017-01-03 Thread kumarvishal09
Hi,
Currently in carbon, char is stored as string, and there is no validation of
the length of the string; see the sketch below.
-Regards
Kumar Vishal
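For illustration, a hedged repro sketch of that behaviour; the table name is
hypothetical and `carbon` stands for a CarbonSession:

// Sketch only: char(5) is stored as string, so no length check applies.
carbon.sql("CREATE TABLE char_check (name char(5)) STORED BY 'carbondata'")

// A 10-character value: based on the description above, this is expected
// to load without any length-validation error.
carbon.sql("INSERT INTO TABLE char_check SELECT 'abcdefghij'")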



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/jira-Created-CARBONDATA-586-Create-table-with-Char-data-type-but-it-workes-as-String-data-type-tp5370p5380.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


Re: how to make carbon run faster

2017-01-03 Thread Jay
Checked.
One big possible reason is too many small files; compact first, then we can
try.


regards
Jay




------------------ Original Message ------------------
From: "??" (sender name garbled in the archive)
Date: 2017-01-03 (Tue) 3:29 PM
To: "dev"
Subject: Re: how to make carbon run faster



17/01/03 15:20:17 STATISTIC QueryStatisticsRecorderImpl: Time taken for
Carbon Optimizer to optimize: 2
17/01/03 15:20:23 STATISTIC DriverQueryStatisticsRecorderImpl: Print query
statistic for query id: 6047652460423237
+--------+----------------------+------------------+------------+
| Module | Operation Step       | Total Query Cost | Query Cost |
+--------+----------------------+------------------+------------+
| Driver | Load blocks driver   |                  |       5518 |
| Part   | Block allocation     |             5536 |         17 |
|        | Block identification |                  |          1 |
+--------+----------------------+------------------+------------+

I found the slowest task's stderr:

task_id 6047652460423237_7: load_blocks_time=0, load_dictionary_time=4,
scan_blocks_time=48, total_executor_time=1650, scan_blocks_num=9,
total_blocklet/valid_scan_blocklet=9, result_size=3

Then how do I debug why total_executor_time is 1650?
This one is much faster:
task_id 6047652460423237_19: load_blocks_time=0, load_dictionary_time=4,
scan_blocks_time=12, total_executor_time=107, scan_blocks_num=9,
total_blocklet/valid_scan_blocklet=9, result_size=2

1650 vs 107? And scan_blocks_num is the same, 9.




2017-01-03 14:35 GMT+08:00 Jay <2550062...@qq.com>:

> In carbon.properties, set enable.query.statistics=true;
> then the query details can be obtained, and you can check them.
>
>
> regards
> Jay
>
>
>
>
> ------------------ Original Message ------------------
> From: "??" (sender name garbled in the archive)
> Date: 2017-01-03 (Tue) 12:15
> To: "dev"
> Subject: Re: how to make carbon run faster
>
>
>
> Is there a way to show how carbon uses the index: the number of rows
> scanned and the time used? I would expect a filter on multiple indexed
> columns to be quicker, e.g. a_id = 1 and b_id = 2 and c_id = 3 and
> day = '2017-01-01'. With the same SQL on ORC, where day is the partition
> column and the *_id columns have no index, ORC is faster or nearly equal.
> I think carbon should be better in this case, so I want to know how to
> see carbon's index usage, or whether my loading of the CSV data into
> carbon is wrong, so the index is not used?
>
> 2017-01-03 11:08 GMT+08:00 Jay <2550062...@qq.com>:
>
> > hi, beidou
> >
> > 1. The amount of your data is 36GB; at 1GB per block, 40 cores is
> > enough, but I think every task may take too long, so I suggest
> > increasing parallelism (for example, change --executor-cores from 1 to
> > 5); then enable.blocklet.distribution=true may have more effect.
> > 2. Try not to use the date function: change "date(a.create_time) >=
> > '2016-11-01'" to "a.create_time >= '2016-11-01 00:00:00'", something
> > like this.
> >
> > regards
> > Jay
> >
> >
> > ------------------ Original Message ------------------
> > From: "??" (sender name garbled in the archive)
> > Date: 2017-01-02 (Mon) 9:35
> > To: "dev"
> > Subject: Re: how to make carbon run faster
> >
> >
> >
> > 1.
> > You can also add the date as a filter condition, for example: select *
> > from test_carbon where status = xx (give a specific value) and
> > date(a.create_time) >= '2016-11-01' and date(a.create_time) <=
> > '2016-12-26'.
> >
> > This case was tested before; it is slower than ORC.
> >
> > What are your exact business cases? Partitions and indexes are both
> > good ways to improve performance; I suggest increasing the data set to
> > more than 1 billion rows and trying again.
> >
> > 2. Each machine only has one CPU core?
> > --
> > Yes, for