Re: Encountering BufferUnderflowException when querying from Phoenix
Dug into one of the rows that was hitting a similar problem, throwing
IllegalArgumentException [1] instead of BufferUnderflowException, but both
appear to be a data issue where the varchar array is stored in HBase in an
unexpected format. The row looks like:

*A_VARCHAR_OF_170_CHARS*\x00\x00\x00\x80\x01\x00\x00\x02\xAD\x00\x00\x00

I could not make sense of it based on the 4.13 encoding (hence Phoenix is
throwing an exception), and I looked back at 4.8 and it doesn't seem to
match the old format either... Does anyone recognize the hex encoding by
any chance, or is this some sort of data corruption? [A sketch for dumping
the raw cell bytes follows below the quoted thread.]

Thanks,
- Will

[1]
java.lang.IllegalArgumentException
at java.nio.Buffer.position(Buffer.java:244)
at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1025)
at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:609)
at sqlline.Rows$Row.<init>(Rows.java:183)
at sqlline.BufferedRows.<init>(BufferedRows.java:38)
at sqlline.SqlLine.print(SqlLine.java:1660)
at sqlline.Commands.execute(Commands.java:833)
at sqlline.Commands.sql(Commands.java:732)
at sqlline.SqlLine.dispatch(SqlLine.java:813)
at sqlline.SqlLine.begin(SqlLine.java:686)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:291)

On Wed, Oct 17, 2018 at 3:21 PM William Shen wrote:

> Thanks, Jaanai.
>
> At first we thought it was a data issue too, but when we restored the
> table from a snapshot to a separate schema on the same cluster to triage,
> the exception no longer happens... Does that give any further clue as to
> what the issue might've been?
>
> 0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D FROM SCHEMA.TABLE where A = 13100423;
> java.nio.BufferUnderflowException
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
> at java.nio.ByteBuffer.get(ByteBuffer.java:715)
> at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
> at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
> at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
> at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
> at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
> at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:609)
> at sqlline.Rows$Row.<init>(Rows.java:183)
> at sqlline.BufferedRows.<init>(BufferedRows.java:38)
> at sqlline.SqlLine.print(SqlLine.java:1660)
> at sqlline.Commands.execute(Commands.java:833)
> at sqlline.Commands.sql(Commands.java:732)
> at sqlline.SqlLine.dispatch(SqlLine.java:813)
> at sqlline.SqlLine.begin(SqlLine.java:686)
> at sqlline.SqlLine.start(SqlLine.java:398)
> at sqlline.SqlLine.main(SqlLine.java:291)
>
> 0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D FROM SCHEMA.CORRUPTION where A = 13100423;
> +-----------+-------+----+-------------+
> |     A     |   B   | C  |      D      |
> +-----------+-------+----+-------------+
> | 13100423  | 5159  | 7  | ['female']  |
> +-----------+-------+----+-------------+
> 1 row selected (1.76 seconds)
>
> On Sun, Oct 14, 2018 at 8:39 PM Jaanai Zhang wrote:
>
>> It looks like a bug where the remaining bytes in the ByteBuffer are
>> fewer than the length being retrieved; there may be a problem with the
>> ByteBuffer's position or with the length of the target byte array.
>>
>> Jaanai Zhang
>> Best regards!
>>
>> William Shen wrote on Fri, Oct 12, 2018 at 11:53 PM:
>>
>>> Hi all,
>>>
>>> We are running Phoenix 4.13, and periodically we would encounter the
>>> following exception when querying from Phoenix in our staging
>>> environment. Initially, we thought we had some incompatible client
>>> version connecting and creating data corruption, but after ensuring
>>> that we are only connecting with 4.13 clients, we still see this issue
>>> come up from time to time. So far, fortunately, since it is in staging,
>>> we are able to identify and delete the data to restore service.
>>>
>>> However, I would like to ask for guidance on what else we could look
>>> for to identify the cause of this exception. Could this perhaps be
>>> caused by something other than data corruption?
>>>
>>> Thanks in advance!
>>>
>>> The exception looks like:
>>>
>>> 18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage
>>> 14.0 (TID 1275, ...datanode..., executor 82):
>>> java.nio.BufferUnderflowException
>>>
>>> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
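A minimal sketch of the byte dump mentioned above, for pulling the raw cell
bytes of a suspect row so the stored encoding can be compared against
Phoenix's array formats. The table name, column family "0", qualifier "D",
and the row-key encoding are all placeholder assumptions, not taken from
the thread; Bytes.toStringBinary prints the same \xNN notation quoted above.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class DumpCellBytes {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("SCHEMA.TABLE"))) {
            // Placeholder row key: the real bytes depend on how Phoenix
            // encodes the table's primary key columns.
            Get get = new Get(Bytes.toBytes(13100423));
            Result result = table.get(get);
            // "0" and "D" are placeholder column family / qualifier names.
            byte[] value = result.getValue(Bytes.toBytes("0"), Bytes.toBytes("D"));
            // Prints unprintable bytes as \xNN, matching the row quoted above.
            System.out.println(Bytes.toStringBinary(value));
        }
    }
}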
Re: Encountering BufferUnderflowException when querying from Phoenix
Thanks, Jaanai.

At first we thought it was a data issue too, but when we restored the table
from a snapshot to a separate schema on the same cluster to triage, the
exception no longer happens... Does that give any further clue as to what
the issue might've been? [An Admin API sketch of the snapshot/clone step
follows below the quoted thread.]

0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D FROM SCHEMA.TABLE where A = 13100423;
java.nio.BufferUnderflowException
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
at java.nio.ByteBuffer.get(ByteBuffer.java:715)
at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
at org.apache.phoenix.jdbc.PhoenixResultSet.getString(PhoenixResultSet.java:609)
at sqlline.Rows$Row.<init>(Rows.java:183)
at sqlline.BufferedRows.<init>(BufferedRows.java:38)
at sqlline.SqlLine.print(SqlLine.java:1660)
at sqlline.Commands.execute(Commands.java:833)
at sqlline.Commands.sql(Commands.java:732)
at sqlline.SqlLine.dispatch(SqlLine.java:813)
at sqlline.SqlLine.begin(SqlLine.java:686)
at sqlline.SqlLine.start(SqlLine.java:398)
at sqlline.SqlLine.main(SqlLine.java:291)

0: jdbc:phoenix:journalnode,test> SELECT A, B, C, D FROM SCHEMA.CORRUPTION where A = 13100423;
+-----------+-------+----+-------------+
|     A     |   B   | C  |      D      |
+-----------+-------+----+-------------+
| 13100423  | 5159  | 7  | ['female']  |
+-----------+-------+----+-------------+
1 row selected (1.76 seconds)

On Sun, Oct 14, 2018 at 8:39 PM Jaanai Zhang wrote:

> It looks like a bug where the remaining bytes in the ByteBuffer are fewer
> than the length being retrieved; there may be a problem with the
> ByteBuffer's position or with the length of the target byte array.
>
> Jaanai Zhang
> Best regards!
>
> William Shen wrote on Fri, Oct 12, 2018 at 11:53 PM:
>
>> Hi all,
>>
>> We are running Phoenix 4.13, and periodically we would encounter the
>> following exception when querying from Phoenix in our staging
>> environment. Initially, we thought we had some incompatible client
>> version connecting and creating data corruption, but after ensuring that
>> we are only connecting with 4.13 clients, we still see this issue come
>> up from time to time. So far, fortunately, since it is in staging, we
>> are able to identify and delete the data to restore service.
>>
>> However, I would like to ask for guidance on what else we could look for
>> to identify the cause of this exception. Could this perhaps be caused by
>> something other than data corruption?
>>
>> Thanks in advance!
>>
>> The exception looks like:
>>
>> 18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage
>> 14.0 (TID 1275, ...datanode..., executor 82):
>> java.nio.BufferUnderflowException
>> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
>> at java.nio.ByteBuffer.get(ByteBuffer.java:715)
>> at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
>> at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
>> at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
>> at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
>> at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
>> at org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:525)
>> at org.apache.phoenix.spark.PhoenixRecordWritable$$anonfun$readFields$1.apply$mcVI$sp(PhoenixRecordWritable.scala:96)
>> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>> at org.apache.phoenix.spark.PhoenixRecordWritable.readFields(PhoenixRecordWritable.scala:93)
>> at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:168)
>> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:174)
>> at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1596)
>> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
>> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:229)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> at java.lang.Thread.run(Thread.java:748)
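The snapshot-restore triage described in this message can also be scripted
against the HBase Admin API rather than the shell. A rough sketch, with the
snapshot name and table names as placeholders; note that cloning only copies
the HBase-level data, so matching Phoenix DDL for the copy still has to be
created before it can be queried through sqlline.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class CloneForTriage {
    public static void main(String[] args) throws Exception {
        try (Connection conn =
                 ConnectionFactory.createConnection(HBaseConfiguration.create());
             Admin admin = conn.getAdmin()) {
            // Snapshot the live table, then materialize the snapshot under a
            // different table name so both copies can be queried side by side.
            admin.snapshot("table_triage_snap", TableName.valueOf("SCHEMA.TABLE"));
            admin.cloneSnapshot("table_triage_snap",
                                TableName.valueOf("SCHEMA.CORRUPTION"));
        }
    }
}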
Re: Encountering BufferUnderflowException when querying from Phoenix
It looks like a bug where the remaining bytes in the ByteBuffer are fewer
than the length being retrieved; there may be a problem with the
ByteBuffer's position or with the length of the target byte array. [A short
demonstration of this failure mode follows below the quoted message.]

Jaanai Zhang
Best regards!

William Shen wrote on Fri, Oct 12, 2018 at 11:53 PM:

> Hi all,
>
> We are running Phoenix 4.13, and periodically we would encounter the
> following exception when querying from Phoenix in our staging
> environment. Initially, we thought we had some incompatible client
> version connecting and creating data corruption, but after ensuring that
> we are only connecting with 4.13 clients, we still see this issue come up
> from time to time. So far, fortunately, since it is in staging, we are
> able to identify and delete the data to restore service.
>
> However, I would like to ask for guidance on what else we could look for
> to identify the cause of this exception. Could this perhaps be caused by
> something other than data corruption?
>
> Thanks in advance!
>
> The exception looks like:
>
> 18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage
> 14.0 (TID 1275, ...datanode..., executor 82):
> java.nio.BufferUnderflowException
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
> at java.nio.ByteBuffer.get(ByteBuffer.java:715)
> at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
> at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
> at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
> at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
> at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
> at org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:525)
> at org.apache.phoenix.spark.PhoenixRecordWritable$$anonfun$readFields$1.apply$mcVI$sp(PhoenixRecordWritable.scala:96)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> at org.apache.phoenix.spark.PhoenixRecordWritable.readFields(PhoenixRecordWritable.scala:93)
> at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:168)
> at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:174)
> at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1596)
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
> at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
> at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:229)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:748)
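To make the failure mode concrete: a bulk get that asks a ByteBuffer for
more bytes than it has remaining throws exactly this exception. A minimal,
self-contained demonstration, not tied to Phoenix internals:

import java.nio.ByteBuffer;

public class UnderflowDemo {
    public static void main(String[] args) {
        // Only 4 bytes remain in the buffer.
        ByteBuffer buffer = ByteBuffer.wrap(new byte[4]);
        // Asking for 8 throws java.nio.BufferUnderflowException -- the same
        // failure createPhoenixArray hits when a stored offset or length
        // disagrees with the actual number of bytes in the cell.
        buffer.get(new byte[8]);
    }
}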
Encountering BufferUnderflowException when querying from Phoenix
Hi all,

We are running Phoenix 4.13, and periodically we would encounter the
following exception when querying from Phoenix in our staging environment.
Initially, we thought we had some incompatible client version connecting
and creating data corruption, but after ensuring that we are only
connecting with 4.13 clients, we still see this issue come up from time to
time. So far, fortunately, since it is in staging, we are able to identify
and delete the data to restore service.

However, I would like to ask for guidance on what else we could look for to
identify the cause of this exception. Could this perhaps be caused by
something other than data corruption?

Thanks in advance!

The exception looks like:

18/10/12 15:45:58 WARN scheduler.TaskSetManager: Lost task 32.2 in stage
14.0 (TID 1275, ...datanode..., executor 82):
java.nio.BufferUnderflowException
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:151)
at java.nio.ByteBuffer.get(ByteBuffer.java:715)
at org.apache.phoenix.schema.types.PArrayDataType.createPhoenixArray(PArrayDataType.java:1028)
at org.apache.phoenix.schema.types.PArrayDataType.toObject(PArrayDataType.java:375)
at org.apache.phoenix.schema.types.PVarcharArray.toObject(PVarcharArray.java:65)
at org.apache.phoenix.schema.types.PDataType.toObject(PDataType.java:1011)
at org.apache.phoenix.compile.ExpressionProjector.getValue(ExpressionProjector.java:75)
at org.apache.phoenix.jdbc.PhoenixResultSet.getObject(PhoenixResultSet.java:525)
at org.apache.phoenix.spark.PhoenixRecordWritable$$anonfun$readFields$1.apply$mcVI$sp(PhoenixRecordWritable.scala:96)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.phoenix.spark.PhoenixRecordWritable.readFields(PhoenixRecordWritable.scala:93)
at org.apache.phoenix.mapreduce.PhoenixRecordReader.nextKeyValue(PhoenixRecordReader.java:168)
at org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:174)
at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
at org.apache.spark.util.Utils$.getIteratorSize(Utils.scala:1596)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:1157)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1870)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:229)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
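One way to narrow a failing job down to specific rows, as was eventually
done by hand in sqlline upthread, is a plain Phoenix JDBC scan that logs
the keys whose array column fails to deserialize. A minimal sketch; the
connection URL, table, and column names are placeholders, and column A is
assumed to be numeric:

import java.nio.BufferUnderflowException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class FindBadRows {
    public static void main(String[] args) throws Exception {
        // Requires the Phoenix client jar on the classpath.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-quorum");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT A, D FROM SCHEMA.TABLE")) {
            while (rs.next()) {
                long key = rs.getLong(1);
                try {
                    // getObject on the VARCHAR ARRAY column is where
                    // PVarcharArray.toObject deserializes the stored bytes,
                    // per the stack trace above.
                    rs.getObject(2);
                } catch (BufferUnderflowException | IllegalArgumentException e) {
                    System.err.println("bad row: A = " + key);
                }
            }
        }
    }
}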