Re: Re: Getting [Problem in loading segment blocks] error after doing multi update operations

2018-03-30 Thread BabuLal
Hi yixu2001

Could you please verify your issue against PR
https://github.com/apache/carbondata/pull/2097 ?
The PR targets branch-1.3 because you are using CarbonData 1.3.

Let me know if the issue still exists.

Thanks
Babu





Re: Re: Getting [Problem in loading segment blocks] error after doing multi update operations

2018-03-23 Thread BabuLal
Hi 

The issue is fixed and a PR has been raised.

1. PR :- https://github.com/apache/carbondata/pull/2097

2. The PR handles the following situations:
a. Skip zero-byte deletedelta files.
b. Previously, if HDFS threw an error on OutputStream close/flush (space quota
exceeded, no lease, etc.), the exception was not propagated to the caller, so
the delete appeared successful even though it had actually failed. This is now
handled and the exception is thrown to the caller (a minimal sketch of the
intended behaviour follows below).

3. Since I simulated the issue using SpaceQuota, could you please share your
full executor logs (where the exception from HDFS appears) so we can confirm
that the fix handles that exception.
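
For reference, here is a minimal sketch (my own illustration, not the code in
the PR) of the behaviour described in 2.b: the delete-delta stream close must
surface the HDFS failure to the caller. The helper name and the deltaPath
parameter are hypothetical.

import java.io.{IOException, OutputStream}

// Close the delete-delta stream; any HDFS error on flush/close (space quota
// exceeded, lost lease, ...) is wrapped and re-thrown so the delete/update
// is reported as failed instead of silently succeeding.
def closeOrFail(out: OutputStream, deltaPath: String): Unit = {
  try {
    out.flush()
    out.close()
  } catch {
    case e: IOException =>
      throw new IOException("Failed to write delete delta file " + deltaPath, e)
  }
}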

Thanks
Babu





Re: Re: Getting [Problem in loading segment blocks] error after doing multi update operations

2018-02-26 Thread yixu2001
dev 
1. This script is run on a YARN cluster.
2. There is no special configuration in carbon.properties, so the default
configuration is used.
3. Filtering the data location, I found a deletedelta file of size 0 (see the
sketch after the shell script for another way to locate such files):
  hdfs dfs -du -h /user/ip_crm/public/c_compact1/*/*/*/*.deletedelta | grep "0  /"
  0  /user/ip_crm/public/c_compact1/Fact/Part0/Segment_1/part-0-0_batchno0-0-1519639964744.deletedelta
4. After deleting this deletedelta file the table can be selected, but update
operations fail with "Multiple input rows matched for same row."
5. The shell script is as follows:
/usr/lib/spark-2.1.1-bin-hadoop2.7/bin/spark-shell \
 --driver-memory 3g \
 --executor-memory 3g \
 --executor-cores 1 \
 --jars carbondata_2.11-1.3.0-shade-hadoop2.7.2.jar \
 --driver-class-path /home/ip_crm/testdata/ojdbc14.jar \
 --queue ip_crm \
 --master yarn \
 --deploy-mode client \
 --keytab  /etc/security/keytabs/ip_crm.keytab \
 --principal ip_crm \
 --files /usr/hdp/2.4.0.0-169/hadoop/conf/hdfs-site.xml \
 --conf "spark.driver.extraJavaOptions=-server -XX:+AggressiveOpts 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=512m -XX:+AlwaysPreTouch 
-XX:+UseG1GC -XX:+ScavengeBeforeFullGC -Djava.net.preferIPv4Stack=true -Xss16m 
-Dhdp.version=2.4.0.0-169 
-Dcarbon.properties.filepath=/home/ip_crm/testdata/carbon.conf" \
--conf "spark.executor.extraJavaOptions=-server -XX:+AggressiveOpts 
-XX:MaxMetaspaceSize=256m -XX:CompressedClassSpaceSize=512m -XX:+AlwaysPreTouch 
-XX:+UseG1GC -XX:+ScavengeBeforeFullGC -Djava.net.preferIPv4Stack=true -Xss16m 
-Dhdp.version=2.4.0.0-169 
-Dcarbon.properties.filepath=/home/ip_crm/testdata/carbon.conf" \
 --conf "spark.dynamicAllocation.enabled=true" \
 --conf "spark.network.timeout=300" \
 --conf "spark.sql.shuffle.partitions=200" \
 --conf "spark.default.parallelism=200" \
 --conf "spark.serializer=org.apache.spark.serializer.KryoSerializer" \
 --conf "spark.kryo.referenceTracking=false" \
 --conf "spark.kryoserializer.buffer.max=1g" \
 --conf "spark.debug.maxToStringFields=1000" \
 --conf "spark.dynamicAllocation.executorIdleTimeout=30" \
 --conf "spark.dynamicAllocation.maxExecutors=30" \
 --conf "spark.dynamicAllocation.minExecutors=1" \
 --conf "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=1s" \
 --conf "spark.yarn.executor.memoryOverhead=2048" \
 --conf "spark.yarn.driver.memoryOverhead=1024" \
 --conf "spark.speculation=true" \
 --conf "spark.sql.warehouse.dir=/apps/hive/warehouse" \
 --conf "spark.rpc.askTimeout=300" \
 --conf "spark.locality.wait=0"


yixu2001
 
From: sounak
Date: 2018-02-26 17:04
To: dev
Subject: Re: Getting [Problem in loading segment blocks] error after doing 
multi update operations
Hi,
 
I tried to reproduce the issue, but it runs fine. Are you running this script
on a cluster, and have you set any special configuration in carbon.properties?
 
The script was run almost 200 times, and no problem was observed.
 
 
On Sun, Feb 25, 2018 at 1:59 PM, 杨义 <yixu2...@163.com> wrote:
 
> I'm using CarbonData 1.3 + Spark 2.1.1 + Hadoop 2.7.1 and am doing
> multi-update operations. Here are the steps to reproduce:
>
> import org.apache.spark.sql.SparkSession
> import org.apache.spark.sql.CarbonSession._
> val cc = SparkSession.builder().config(sc.getConf).
>   getOrCreateCarbonSession("hdfs://ns1/user/ip_crm")
> // create table
> cc.sql("CREATE TABLE IF NOT EXISTS public.c_compact3 (id string,qqnum string,nick string,age string,gender string,auth string,qunnum string,mvcc string) STORED BY 'carbondata' TBLPROPERTIES ('SORT_COLUMNS'='id')").show;
> // data prepare
> import org.apache.spark.sql.types._
> import org.apache.spark.sql.Row
> val schema = StructType(
>   StructField("id", StringType, true) ::
>   StructField("qqnum", StringType, true) ::
>   StructField("nick", StringType, true) ::
>   StructField("age", StringType, true) ::
>   StructField("gender", StringType, true) ::
>   StructField("auth", StringType, true) ::
>   StructField("qunnum", StringType, true) ::
>   StructField("mvcc", IntegerType, true) :: Nil)
> val data = cc.sparkContext.parallelize(1 to 5000, 4).map { i =>
>   Row.fromSeq(Seq(
>     i.toString,
>     i.toString.concat("").concat(i.toString),
>     "2009-05-27",
>     i.toString.concat("c").concat(i.toString),
>     "1",
>     "1",
>     i.toString.concat("dd").concat(i.toString),
>     1))
> }
> cc.createDataFrame(data, schema).createOrReplaceTempView("ddd")
> cc.sql("insert into public.c_compact3 select * from ddd").show;
>
> // update table multi times in while loop
> import scala.util.Random
>
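
(The quoted message is cut off here. Purely as an illustration of the kind of
workload being described, and not the reporter's actual script, a repeated
update loop against the table created above might look like the sketch below;
the iteration count and the updated values are made up.)

import scala.util.Random

// Hypothetical multi-update loop: repeatedly updates a random row of
// public.c_compact3. This is an assumed reconstruction, not the original.
var round = 0
while (round < 100) {
  val id = Random.nextInt(5000) + 1
  cc.sql(s"update public.c_compact3 set (nick) = ('nick_$round') " +
         s"where id = '$id'").show
  round += 1
}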