laskoviymishka opened a new issue, #337:
URL: https://github.com/apache/arrow-go/issues/337

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I'm implementing a streaming sink (https://github.com/transferia/iceberg/pull/3) from Kafka-like sources into Iceberg, using iceberg to create Parquet files, and I'm hitting a strange panic in the `pqarrow.FileWriter.Write` method (a simplified sketch of how the writer is driven follows the stack trace below):
   
   ```
   panic: runtime error: invalid memory address or nil pointer dereference
   [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x3e2f7b4]

   goroutine 415 [running]:
   github.com/apache/arrow-go/v18/arrow/memory.(*Buffer).Bytes(...)
      /home/runner/go/pkg/mod/github.com/apache/arrow-go/[email protected]/arrow/memory/buffer.go:106
   github.com/apache/arrow-go/v18/parquet/file.(*page).Data(...)
      /home/runner/go/pkg/mod/github.com/apache/arrow-go/[email protected]/parquet/file/page_reader.go:90
   github.com/apache/arrow-go/v18/parquet/file.(*columnWriter).TotalBytesWritten(...)
      /home/runner/go/pkg/mod/github.com/apache/arrow-go/[email protected]/parquet/file/column_writer.go:203
   github.com/apache/arrow-go/v18/parquet/file.(*rowGroupWriter).Close(0xc0015267e0)
      /home/runner/go/pkg/mod/github.com/apache/arrow-go/[email protected]/parquet/file/row_group_writer.go:237 +0x8b
   github.com/apache/arrow-go/v18/parquet/pqarrow.(*FileWriter).Close(0xc000612690)
      /home/runner/go/pkg/mod/github.com/apache/arrow-go/[email protected]/parquet/pqarrow/file_writer.go:303 +0x38
   github.com/apache/arrow-go/v18/parquet/pqarrow.(*FileWriter).Write(0xc000612690, {0x699b5c0, 0xc001928360})
      /home/runner/go/pkg/mod/github.com/apache/arrow-go/[email protected]/parquet/pqarrow/file_writer.go:243 +0x2a5
   github.com/transferia/iceberg.writeFile({0xc00170bab0?, 0x61?}, 0xc00060c2a0, {0xc001553b00, 0x1, 0x1})
      /home/runner/work/iceberg/iceberg/s3_writer.go:59 +0x43b
   github.com/transferia/iceberg.(*SinkStreaming).writeBatch(0xc00184e1b0, 0xc00060c2a0, {0xc001553b00, 0x1, 0x1})
      /home/runner/work/iceberg/iceberg/sink_streaming.go:173 +0xde
   github.com/transferia/iceberg.(*SinkStreaming).writeDataToTable(...)
      /home/runner/work/iceberg/iceberg/sink_streaming.go:162
   github.com/transferia/iceberg.(*SinkStreaming).processTable(0xc00184e1b0, {0xc001553b00, 0x1, 0x1})
      /home/runner/work/iceberg/iceberg/sink_streaming.go:105 +0x195
   github.com/transferia/iceberg.(*SinkStreaming).Push(0xc00184e1b0, {0xc0015539e0, 0x1, 0x4aeb8e?})
      /home/runner/work/iceberg/iceberg/sink_streaming.go:80 +0x327
   github.com/transferia/transferia/pkg/middlewares.(*errorTracker).Push(0xc0010837d0, {0xc0015539e0?, 0xc000af16f0?, 0xc0015539e0?})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/error_tracker.go:35 +0x29
   github.com/transferia/transferia/pkg/middlewares.(*outputDataMetering).Push(0xc000567810?, {0xc0015539e0, 0x1, 0x1})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/metering.go:65 +0x2a
   github.com/transferia/transferia/pkg/middlewares.(*statistician).Push(0xc001ce63c0, {0xc0015539e0, 0x1, 0x1})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/statistician.go:58 +0x92
   github.com/transferia/transferia/pkg/middlewares.(*filter).Push(0xc001852810, {0xc0015539e0, 0x1, 0x1})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/filter.go:76 +0xa3
   github.com/transferia/transferia/pkg/middlewares.(*nonRowSeparator).Push(0xc000583e30, {0xc0015539e0, 0x1, 0x1})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/nonrow_separator.go:50 +0x37f
   github.com/transferia/transferia/pkg/middlewares.(*inputDataMetering).Push(0xc000af18f0?, {0xc0015539e0, 0x1, 0x1})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/metering.go:43 +0x2a
   github.com/transferia/transferia/pkg/middlewares/async.(*synchronizer).AsyncPush(0xc001852840, {0xc0015539e0?, 0x1, 0xc0001bc060?})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/async/synchronizer.go:61 +0xe5
   github.com/transferia/transferia/pkg/middlewares/async.(*measurer).AsyncPush(0xc000c8b820, {0xc0015539e0, 0x1, 0x1})
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/middlewares/async/measurer.go:59 +0x146
   github.com/transferia/transferia/pkg/parsequeue.(*ParseQueue[...]).pushLoop(0x69c73e0)
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/parsequeue/parsequeue.go:88 +0x13a
   created by github.com/transferia/transferia/pkg/parsequeue.New[...] in goroutine 405
      /home/runner/go/pkg/mod/github.com/transferia/[email protected]/pkg/parsequeue/parsequeue.go:161 +0x1f7
   
   ```
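   For context, here is roughly how my `writeFile` drives pqarrow (a simplified sketch, not the exact code from the PR above; the writer properties and the `io.Writer` backing the S3 object are placeholders):

   ```go
   package sink // illustrative package name, not the actual s3_writer.go

   import (
       "io"

       "github.com/apache/arrow-go/v18/arrow"
       "github.com/apache/arrow-go/v18/parquet"
       "github.com/apache/arrow-go/v18/parquet/pqarrow"
   )

   // writeRecord sketches the pqarrow calls: build a FileWriter for the
   // record's schema, write the single record, then close the file.
   // Writer properties here are defaults; the real code writes to an
   // S3-backed io.Writer.
   func writeRecord(out io.Writer, rec arrow.Record) error {
       fw, err := pqarrow.NewFileWriter(
           rec.Schema(),
           out,
           parquet.NewWriterProperties(),
           pqarrow.DefaultWriterProps(),
       )
       if err != nil {
           return err
       }
       // The panic in the stack trace above surfaces from inside this
       // Write call.
       if err := fw.Write(rec); err != nil {
           fw.Close()
           return err
       }
       return fw.Close()
   }
   ```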
   
   Here is some diagnostic output that I collected:
   
   ```
   Converting 1 items to Arrow Record with schema: schema:
     fields: 8
       - id: type=int32, nullable
       - level: type=utf8, nullable
       - caller: type=utf8, nullable
       - msg: type=utf8, nullable
       - _timestamp: type=timestamp[us, tz=UTC]
       - _partition: type=binary
       - _offset: type=int64
       - _idx: type=int64
   Processing field 0: id (type: int32)
   Item 0, Field id: Value type is int32
   Processing field 1: level (type: utf8)
   Item 0, Field level: Value type is string
   Processing field 2: caller (type: utf8)
   Item 0, Field caller: Value type is string
   Processing field 3: msg (type: utf8)
   Item 0, Field msg: Value type is string
   Processing field 4: _timestamp (type: timestamp[us, tz=UTC])
   Item 0, Field _timestamp: Value type is time.Time
   Processing field 5: _partition (type: binary)
   Item 0, Field _partition: Value type is string
   Processing field 6: _offset (type: int64)
   Item 0, Field _offset: Value type is uint64
   Processing field 7: _idx (type: int64)
   Item 0, Field _idx: Value type is uint32
   Writing record with 1 rows and 8 columns to s3://warehouse/streaming/topic1/data/00000-0-7711175e-7cbe-48a0-a534-4142d4bacede-0-00003.parquet
   Recovered from panic in Write: runtime error: invalid memory address or nil pointer dereference
   Record details:
     NumRows: 1
     NumCols: 8
     Column 0: id (type: int32)
     Column 1: level (type: utf8)
     Column 2: caller (type: utf8)
     Column 3: msg (type: utf8)
     Column 4: _timestamp (type: timestamp[us, tz=UTC])
     Column 5: _partition (type: binary)
     Column 6: _offset (type: int64)
     Column 7: _idx (type: int64)
   ```
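   For reference, a record with the same schema can be assembled like this (an illustrative, self-contained sketch with placeholder values, not my actual conversion code; as logged above, the `_partition`/`_offset`/`_idx` source values arrive as string/uint64/uint32 and are converted before appending):

   ```go
   package main

   import (
       "fmt"
       "time"

       "github.com/apache/arrow-go/v18/arrow"
       "github.com/apache/arrow-go/v18/arrow/array"
       "github.com/apache/arrow-go/v18/arrow/memory"
   )

   // buildRecord assembles a single-row record with the same schema as in
   // the diagnostics above. All field values are placeholders.
   func buildRecord(mem memory.Allocator) arrow.Record {
       schema := arrow.NewSchema([]arrow.Field{
           {Name: "id", Type: arrow.PrimitiveTypes.Int32, Nullable: true},
           {Name: "level", Type: arrow.BinaryTypes.String, Nullable: true},
           {Name: "caller", Type: arrow.BinaryTypes.String, Nullable: true},
           {Name: "msg", Type: arrow.BinaryTypes.String, Nullable: true},
           {Name: "_timestamp", Type: arrow.FixedWidthTypes.Timestamp_us}, // timestamp[us, tz=UTC]
           {Name: "_partition", Type: arrow.BinaryTypes.Binary},
           {Name: "_offset", Type: arrow.PrimitiveTypes.Int64},
           {Name: "_idx", Type: arrow.PrimitiveTypes.Int64},
       }, nil)

       bldr := array.NewRecordBuilder(mem, schema)
       defer bldr.Release()

       bldr.Field(0).(*array.Int32Builder).Append(1)
       bldr.Field(1).(*array.StringBuilder).Append("INFO")
       bldr.Field(2).(*array.StringBuilder).Append("main.go:42")
       bldr.Field(3).(*array.StringBuilder).Append("hello")

       ts, _ := arrow.TimestampFromTime(time.Now().UTC(), arrow.Microsecond) // source value is time.Time
       bldr.Field(4).(*array.TimestampBuilder).Append(ts)

       // Source values arrive as string / uint64 / uint32 (see log above)
       // and are converted before appending.
       bldr.Field(5).(*array.BinaryBuilder).Append([]byte("topic1-0"))
       var offset uint64 = 12345
       bldr.Field(6).(*array.Int64Builder).Append(int64(offset))
       var idx uint32 = 0
       bldr.Field(7).(*array.Int64Builder).Append(int64(idx))

       return bldr.NewRecord()
   }

   func main() {
       rec := buildRecord(memory.DefaultAllocator)
       defer rec.Release()
       fmt.Printf("rows=%d cols=%d\n", rec.NumRows(), rec.NumCols()) // rows=1 cols=8
   }
   ```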
   
   ### Component(s)
   
   Parquet

