Re: ClassNotFound error when insert carbontable from hive table
Just a configuration error: the carbondata jar was not at the path configured in spark-defaults.conf. After correcting the path, it was fixed.

thanks,
Lionel

On Tue, Aug 22, 2017 at 3:17 PM, Liang Chen wrote:
> Hi lionel
>
> Can you share with us how you fixed this issue?
>
> Regards
> Liang
>
> lionel061201 wrote:
>> This issue had been fixed.
>>
>> On Mon, Aug 21, 2017 at 4:04 PM, Lu Cao <whucaolu@> wrote:
>>> Hi dev,
>>>
>>> I'm trying to insert data from a hive table into a carbon table:
>>>
>>>     cc.sql("insert into carbon_test select * from target_table where pt = '20170101'")
>>>
>>> Anyone know how to fix this error?
>>>
>>> [Stage 8:> (0 + 4) / 156] 17/08/21 15:59:01 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 8.0 (TID 48, , executor 16): java.lang.ClassNotFoundException: org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD
>>>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>     at java.lang.Class.forName0(Native Method)
>>>     at java.lang.Class.forName(Class.java:270)
>>>     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>>>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>>>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>>     at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>>>     at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:85)
>>>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>>>     at org.apache.spark.scheduler.Task.run(Task.scala:99)
>>>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> Caused by: java.lang.ClassNotFoundException: org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD
>>>     at java.lang.ClassLoader.findClass(ClassLoader.java:531)
>>>     at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.scala:26)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>     at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:34)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>     at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.scala:30)
>>>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:77)
>>>     ... 23 more
>>>
>>> [Stage 8:> (0 + 4) / 156] 17/08/21 15:59:02 ERROR scheduler.TaskSetManager: Task 1 in stage 8.0 failed 4 times; aborting job
>>>
>>> 17/08/21 15:59:02 ERROR util.GlobalDictionaryUtil$: main generate global dictionary failed
>>>
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 8.0 failed 4 times, most recent failure: Lost task 1.3 in stage 8.0 (TID 61, scsp00382.saicdt.com, executor 16): java.lang.ClassNotFoundException: org.apache.carbondata.spark.rdd.CarbonBlockDistinctValuesCombineRDD
>>>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>     at java.lang.Class.forName0(Native Method)
>>>     at java.lang.Class.forName(Class.java:270)
>>>     at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>>>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>>>     at java.io.Obje
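The fix above was to make the jar's location on disk match what spark-defaults.conf points at. A quick way to catch this class of misconfiguration is to check that every jar path on the configured extra classpath actually exists. The sketch below was not posted in the thread; the property names (`spark.driver.extraClassPath` / `spark.executor.extraClassPath`) are standard Spark keys, but the file paths are illustrative assumptions:

```shell
# Sketch: verify that jars referenced in spark-defaults.conf exist on disk.
# All paths here are illustrative, not taken from the original thread.

# Build a sample spark-defaults.conf: one jar path that exists (a temp file
# standing in for the carbondata jar), one that does not.
jar=$(mktemp)
conf=$(mktemp)
cat > "$conf" <<EOF
spark.driver.extraClassPath   $jar
spark.executor.extraClassPath /opt/spark/carbonlib/carbondata.jar
EOF

# Report OK/MISSING for each configured classpath entry.
report=$(awk '/extraClassPath/ {print $2}' "$conf" | while read -r path; do
  if [ -e "$path" ]; then
    echo "OK      $path"
  else
    echo "MISSING $path"
  fi
done)
echo "$report"
```

Note that a missing executor-side jar does not stop the driver from starting; the failure only surfaces later inside tasks as a `ClassNotFoundException`, exactly as in the trace above, so a check like this on every node saves a confusing debugging session.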
Compatibility among carbon[1.1.1][1.1.0]+spark[1.6][2.1.0]
Hi, all

Recently I wanted to test carbon1.1.1+spark2.1.0 performance. The test environment already has carbon1.1.0+spark1.6 and carbon1.1.1+spark1.6 deployed, with carbon tables already generated. Using carbon1.1.1+spark2.1.0, I roughly tried reading the carbon tables generated by carbon1.1.0+spark1.6 and carbon1.1.1+spark1.6, as well as a delete-in-subquery on data generated by carbon1.1.1+spark1.6, and everything worked OK.

My question is: do these combinations have compatibility problems that may impair performance, or should I regenerate the carbon tables using carbon1.1.1+spark2.1.0? I break my question into three parts:

1. Will IUD and other operations (like sub-queries) work fine if I use carbon1.1.1+spark2.1.0 to operate on tables generated by carbon1.1.0+spark1.6?
2. Will using carbon1.1.1+spark2.1.0 to operate on carbon1.1.0+spark1.6 data impair performance, or should I regenerate the data with carbon1.1.1+spark2.1.0 when testing carbon1.1.1+spark2.1.0?
3. Will using carbon1.1.1+spark2.1.0 to operate on carbon1.1.1+spark1.6 data impair performance, or should I regenerate the data with carbon1.1.1+spark2.1.0 when testing carbon1.1.1+spark2.1.0?

sunerhan1...@sina.com