[DISCUSSION] Whether Carbondata should work with Presto in the next release version(1.2.0)

2017-06-11 Thread Bhavya Aggarwal
Hi All,

We can add Presto integration as one of the items for the 1.2.0 release: we
can add support for Presto to read from CarbonData, as Presto is used by
many people for query execution. Please vote and discuss Presto
integration in this mail thread.


Thanks and regards
Bhavya


[DISCUSSION] Whether Carbondata should support Hive in the next release version(1.2.0)

2017-06-11 Thread Bhavya Aggarwal
Hi Guys,

Should we add Hive integration with CarbonData in release 1.2.0? It would be
good if we can come up with the features that need to be supported in the Hive
integration. Please vote and give your comments in this discussion.


Thanks and regards
Bhavya


Re: can't apply mappartitions to dataframe generated from carboncontext

2017-06-11 Thread Erlu Chen
Hi, Mic sun

Could you paste your error message directly?

It seems I can't get access to your appendix.


Thanks in advance.

Regards.
Chenerlu.





can't apply mappartitions to dataframe generated from carboncontext

2017-06-11 Thread sunerhan1...@sina.com
Hi,
The appendix contains the full error message.
I tried to modify DataFrame rows in a non-SQL way and got a "Task not
serializable" error. My test procedure is as follows:
1. val df = cc.sql("select * from t1")
2. def function1(iterator: Iterator[Row]): Iterator[Row] = {
     val list = scala.collection.mutable.ListBuffer[Row]()
     while (iterator.hasNext) {
       val r = iterator.next()
       if (r.getAs[String]("col1").equalsIgnoreCase(r.getAs[String]("col2")))
         list += r
     }
     list.iterator
   }
3. df.mapPartitions(r => function1(r))
   Applying function1 to a DataFrame generated from sqlContext works fine,
so I believe CarbonContext is referring to some outer variables that are not
serializable.
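
A minimal sketch of one possible workaround, assuming Spark 1.x with
CarbonContext (cc, t1, col1 and col2 are taken from the snippet above;
RowFilter is a hypothetical helper object). Keeping the partition function in
a standalone serializable object means the closure shipped to executors
captures only that object, not the enclosing (non-serializable) context:

import org.apache.spark.sql.Row

// Hypothetical helper: defining function1 here keeps the CarbonContext
// out of the task closure that Spark serializes to executors.
object RowFilter extends Serializable {
  def function1(iterator: Iterator[Row]): Iterator[Row] =
    iterator.filter { r =>
      r.getAs[String]("col1").equalsIgnoreCase(r.getAs[String]("col2"))
    }
}

val df = cc.sql("select * from t1")
// Going through df.rdd also avoids pulling the DataFrame's context into the closure.
val filtered = df.rdd.mapPartitions(RowFilter.function1)

For a pure column comparison like this one, a DataFrame filter expression such
as df.filter("lower(col1) = lower(col2)") would avoid the closure entirely.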





sunerhan1...@sina.com


[jira] [Created] (CARBONDATA-1153) Can not add column because it is aborted

2017-06-11 Thread cen yuhai (JIRA)
cen yuhai created CARBONDATA-1153:
-

 Summary: Can not add column because it is aborted
 Key: CARBONDATA-1153
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1153
 Project: CarbonData
  Issue Type: Bug
  Components: spark-integration
Affects Versions: 1.2.0
Reporter: cen yuhai


Why can't I add a column? No one is altering the table...
{code}
scala> carbon.sql("alter table temp.yuhai_carbon add columns(test1 string)")
17/06/11 22:09:13 AUDIT [org.apache.spark.sql.execution.command.AlterTableAddColumns(207) -- main]: [sh-hadoop-datanode-250-104.elenet.me][master][Thread-1]Alter table add columns request has been received for temp.yuhai_carbon
17/06/11 22:10:22 ERROR [org.apache.spark.scheduler.TaskSetManager(70) -- task-result-getter-3]: Task 0 in stage 0.0 failed 4 times; aborting job
17/06/11 22:10:22 ERROR [org.apache.spark.sql.execution.command.AlterTableAddColumns(141) -- main]: main Alter table add columns failed :Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, sh-hadoop-datanode-368.elenet.me, executor 7): java.lang.RuntimeException: Dictionary file test1 is locked for updation. Please try after some time
at scala.sys.package$.error(package.scala:27)
at org.apache.carbondata.spark.util.GlobalDictionaryUtil$.loadDefaultDictionaryValueForNewColumn(GlobalDictionaryUtil.scala:857)
at org.apache.carbondata.spark.rdd.AlterTableAddColumnRDD$$anon$1.<init>(AlterTableAddColumnRDD.scala:83)
at org.apache.carbondata.spark.rdd.AlterTableAddColumnRDD.compute(AlterTableAddColumnRDD.scala:68)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:331)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:295)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:88)
at org.apache.spark.scheduler.Task.run(Task.scala:104)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:351)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
{code}
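
One hedged thing worth checking, not part of the original report: the "locked
for updation" message means a dictionary lock could not be acquired, and a
stale or mismatched lock type can produce it even when nothing else is
altering the table. A minimal sketch, assuming the table store sits on HDFS
(carbon.lock.type is an existing CarbonData property; HDFSLOCK is one of its
documented values):

{code}
import org.apache.carbondata.core.util.CarbonProperties

// Assumption: store is on HDFS, so locks should be HDFS-based rather than
// local-file-based. Set this before creating the Carbon session/context.
CarbonProperties.getInstance()
  .addProperty("carbon.lock.type", "HDFSLOCK")
{code}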


