[ https://issues.apache.org/jira/browse/THRIFT-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Mollitor updated THRIFT-5288:
-----------------------------------
Description:
{code:java|title=TCompactProtocol.java}
/**
 * Write a byte array, using a varint for the size.
 */
public void writeBinary(ByteBuffer bin) throws TException {
  int length = bin.limit() - bin.position();
  writeBinary(bin.array(), bin.position() + bin.arrayOffset(), length);
}
{code}
I was working on something with Parquet and this code was causing some issues:
{code:java}
java.lang.Exception: java.nio.ReadOnlyBufferException
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.nio.ReadOnlyBufferException
	at java.nio.ByteBuffer.array(ByteBuffer.java:996)
	at shaded.parquet.org.apache.thrift.protocol.TCompactProtocol.writeBinary(TCompactProtocol.java:375)
	at org.apache.parquet.format.InterningProtocol.writeBinary(InterningProtocol.java:135)
	at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:945)
	at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:820)
	at org.apache.parquet.format.ColumnIndex.write(ColumnIndex.java:728)
	at org.apache.parquet.format.Util.write(Util.java:372)
	at org.apache.parquet.format.Util.writeColumnIndex(Util.java:69)
	at org.apache.parquet.hadoop.ParquetFileWriter.serializeColumnIndexes(ParquetFileWriter.java:1087)
	at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1050)
{code}
This happens because not all {{Buffer}} implementations allow direct access to the backing array; for example, a {{ByteBuffer}} mapped to a file has no accessible array. Read-only (immutable) {{ByteBuffer}} instances likewise do not allow access to the array, since the array could then be modified.

There are two approaches here:
# Assert that the backing array is accessible and throw an exception if it is not
# Deal natively with the {{ByteBuffer}}

I propose the latter.
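A minimal sketch of what "dealing natively" with the buffer could look like. The class {{BinaryWriteSketch}} and its {{byte[]}}-based {{writeBinary}} stand-in are hypothetical illustrations, not Thrift code; the real fix may differ. The idea is to keep the fast array path when {{hasArray()}} is true and fall back to copying the contents otherwise:

```java
import java.nio.ByteBuffer;

public class BinaryWriteSketch {
    // Hypothetical stand-in for the protocol's writeBinary(byte[], int, int);
    // records what was written so the behavior can be observed.
    static byte[] lastWritten;

    static void writeBinary(byte[] buf, int offset, int length) {
        lastWritten = new byte[length];
        System.arraycopy(buf, offset, lastWritten, 0, length);
    }

    // Sketch of the proposed handling: use the backing array when it is
    // accessible, otherwise copy (read-only and direct buffers both report
    // hasArray() == false).
    static void writeBinary(ByteBuffer bin) {
        if (bin.hasArray()) {
            // Fast path: write straight from the backing array.
            writeBinary(bin.array(), bin.position() + bin.arrayOffset(),
                    bin.remaining());
        } else {
            // Naive path: copy the contents; reading from a duplicate
            // leaves the caller's position untouched.
            byte[] copy = new byte[bin.remaining()];
            bin.duplicate().get(copy);
            writeBinary(copy, 0, copy.length);
        }
    }

    public static void main(String[] args) {
        // A read-only buffer throws ReadOnlyBufferException on array(),
        // but the copy path handles it fine.
        ByteBuffer ro = ByteBuffer.wrap("hello".getBytes()).asReadOnlyBuffer();
        writeBinary(ro);
        System.out.println(new String(lastWritten)); // prints "hello"
    }
}
```

Copying via {{duplicate()}} also avoids mutating the caller's buffer position, which a plain {{get()}} on the original buffer would advance.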
However, the initial, naive approach I propose is to "deal natively" with the {{ByteBuffer}} by making a copy of the contents.

was:
{code:java|title=TCompactProtocol.java}
/**
 * Write a byte array, using a varint for the size.
 */
public void writeBinary(ByteBuffer bin) throws TException {
  int length = bin.limit() - bin.position();
  writeBinary(bin.array(), bin.position() + bin.arrayOffset(), length);
}
{code}
I was working on something with Parquet and this code was causing some issues:
{code}
java.lang.Exception: java.nio.ReadOnlyBufferException
	at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.nio.ReadOnlyBufferException
	at java.nio.ByteBuffer.array(ByteBuffer.java:996)
	at shaded.parquet.org.apache.thrift.protocol.TCompactProtocol.writeBinary(TCompactProtocol.java:375)
	at org.apache.parquet.format.InterningProtocol.writeBinary(InterningProtocol.java:135)
	at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:945)
	at org.apache.parquet.format.ColumnIndex$ColumnIndexStandardScheme.write(ColumnIndex.java:820)
	at org.apache.parquet.format.ColumnIndex.write(ColumnIndex.java:728)
	at org.apache.parquet.format.Util.write(Util.java:372)
	at org.apache.parquet.format.Util.writeColumnIndex(Util.java:69)
	at org.apache.parquet.hadoop.ParquetFileWriter.serializeColumnIndexes(ParquetFileWriter.java:1087)
	at org.apache.parquet.hadoop.ParquetFileWriter.end(ParquetFileWriter.java:1050)
{code}
This happens because not all {{Buffer}} implementations allow direct access to the backing array; for example, a {{ByteBuffer}} mapped to a file has no accessible array. Read-only (immutable) {{ByteBuffer}} instances likewise do not allow access to the array, since the array could then be modified.

There are two approaches here:
# Assert that the backing array is accessible and throw an exception if it is not
# "Deal directly" with the {{ByteBuffer}}

I propose the latter.
However, the initial, naive approach I propose is to "deal directly" with the {{ByteBuffer}} by making a copy of the contents.

> Better Support for ByteBuffer in Compact Protocol
> -------------------------------------------------
>
>                 Key: THRIFT-5288
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5288
>             Project: Thrift
>          Issue Type: Improvement
>            Reporter: David Mollitor
>            Assignee: David Mollitor
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)