[jira] [Commented] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()

2022-09-20 Thread Matthew Topol (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17607395#comment-17607395
 ] 

Matthew Topol commented on ARROW-17169:
---

[~Purdom] I accidentally ended up being able to replicate this and narrowed 
down the cause. I've got a fix that works for this and I'm going to try to see 
if I can construct a test for it. This issue will get updated when I put the PR 
up for it.

> [Go] goPanicIndex in firstTimeBitmapWriter.Finish()
> ---
>
> Key: ARROW-17169
> URL: https://issues.apache.org/jira/browse/ARROW-17169
> Project: Apache Arrow
> Issue Type: Bug
> Components: Go
> Affects Versions: 9.0.0, 8.0.1
> Environment: go (1.18.3), Linux, AMD64
> Reporter: Robert Purdom
> Priority: Critical
>
> I'm working with complex parquet files with 500+ "root" columns where some 
> fields are lists of structs, internally referred to as 'topics'. Some of 
> these structs have hundreds of columns. When reading a particular topic, I get 
> an index panic at the line indicated below. This error occurs when the value 
> for the topic is null, that is, when this particular root record has no data 
> for this topic. The root is household data and the topic is auto, so the error 
> occurs when the household has no autos. The auto field is a nullable list of 
> structs.
>  
> {code:go}
> /* Finish() was called from defLevelsToBitmapInternal.
> Data values when the panic occurs:
> bw.length == 17531
> bw.bitMask == 1
> bw.pos == 3424
> len(bw.buf) == 428
> cap(bw.buf) == 448
> bw.byteOffset == 428
> bw.curByte == 0
> */
> // bitmap_writer.go
> func (bw *firstTimeBitmapWriter) Finish() {
>   // store curByte into the bitmap
>   if bw.length > 0 && bw.bitMask != 0x01 || bw.pos < bw.length {
>     bw.buf[int(bw.byteOffset)] = bw.curByte // <-- index panic here
>   }
> }
> {code}
> In every case, when the panic occurs, bw.byteOffset == len(bw.buf). I tested 
> the modification below and it does prevent the panic. However, it's probably 
> only masking the actual bug.
> {code:go}
> // Test version: No Panic
> func (bw *firstTimeBitmapWriter) Finish() {
>   // store curByte into the bitmap
>   if bw.length > 0 && bw.bitMask != 0x01 || bw.pos < bw.length {
>     if int(bw.byteOffset) == len(bw.buf) {
>       bw.buf = append(bw.buf, bw.curByte)
>     } else {
>       bw.buf[int(bw.byteOffset)] = bw.curByte
>     }
>   }
> }
> {code}
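
For anyone who wants to see the reported state fail outside of Arrow, here is a minimal sketch. It assumes only the field values listed in the description; `mockWriter`, `tryFinish` and `reportedState` are hypothetical names made up for illustration and are not the real Arrow types. It shows that the store panics even though cap(buf) is 448, because Go bounds-checks an index against the length of a slice, not its capacity.

```go
package main

import "fmt"

// mockWriter is a hypothetical stand-in holding only the fields that the
// quoted Finish() touches. It is not the Arrow firstTimeBitmapWriter.
type mockWriter struct {
	buf        []byte
	pos        int
	length     int
	byteOffset int
	bitMask    uint8
	curByte    uint8
}

// finish copies the body of the quoted Finish() method.
func (w *mockWriter) finish() {
	if w.length > 0 && w.bitMask != 0x01 || w.pos < w.length {
		w.buf[w.byteOffset] = w.curByte
	}
}

// tryFinish runs finish and reports whether it panicked.
func tryFinish(w *mockWriter) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	w.finish()
	return false
}

// reportedState builds a writer from the values listed in the issue.
func reportedState() *mockWriter {
	return &mockWriter{
		buf:        make([]byte, 428, 448), // len 428, cap 448
		pos:        3424,
		length:     17531,
		byteOffset: 428,
		bitMask:    1,
		curByte:    0,
	}
}

func main() {
	fmt.Println("panicked:", tryFinish(reportedState()))
}
```

If length is set to 3424 instead (428 bytes * 8 bits), neither half of the condition holds and finish does nothing, which fits the idea that the writer's length, rather than Finish() itself, may be where things go wrong.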



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()

2022-08-12 Thread Matthew Topol (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17579003#comment-17579003
 ] 

Matthew Topol commented on ARROW-17169:
---

[~Purdom] I'll leave this open for now. If you're able to create a reproducer 
for this, I'll definitely take a look into it.



[jira] [Commented] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()

2022-08-10 Thread Robert Purdom (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17578149#comment-17578149
 ] 

Robert Purdom commented on ARROW-17169:
---

[~zeroshade] I'm having difficulty reproducing the issue. When I created a 
little test program with generated data, the panic didn't happen.



[jira] [Commented] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()

2022-08-09 Thread Matthew Topol (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17577569#comment-17577569
 ] 

Matthew Topol commented on ARROW-17169:
---

[~Purdom] any updates here?



[jira] [Commented] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()

2022-07-25 Thread Robert Purdom (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17571079#comment-17571079
 ] 

Robert Purdom commented on ARROW-17169:
---

Yeah, I can.  Will be a couple of days before I can come back to it.  Perhaps 
later in the week.



[jira] [Commented] (ARROW-17169) [Go] goPanicIndex in firstTimeBitmapWriter.Finish()

2022-07-25 Thread Matthew Topol (Jira)


[ 
https://issues.apache.org/jira/browse/ARROW-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17571022#comment-17571022
 ] 

Matthew Topol commented on ARROW-17169:
---

[~Purdom] Would you be able to construct a reproducer or something I could use 
to take a look at this? Looking at the code and the variables you've provided, 
it looks like the length used to initialize the writer was larger than 
expected, rather than corresponding to the length of the buffer that was 
provided.

I.e., len(defLevels) was significantly larger than len(out.ValidBits)*8, which 
resulted in the check `bw.pos < bw.length` being true when it should have been 
false, leading to the panic you saw. If you can provide a reproducer of some 
kind, I'll take a look and see if I can figure out how it got into that state 
and determine a good fix for this.
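
To make the arithmetic concrete, here is a small sketch that plugs the values reported in the issue into the same condition. `finishWouldStore` is a hypothetical helper written for this illustration, not a function in the Arrow code base, and the sketch assumes the reported values are accurate.

```go
package main

import "fmt"

// finishWouldStore mirrors the condition in the quoted Finish() method.
// In Go, && binds tighter than ||, so the condition reads as
// (length > 0 && bitMask != 0x01) || pos < length.
func finishWouldStore(length, pos int, bitMask uint8) bool {
	return length > 0 && bitMask != 0x01 || pos < length
}

func main() {
	// Values reported in the issue at the time of the panic.
	const (
		length     = 17531 // bits the writer was told to expect
		pos        = 3424  // bits written so far
		bufLen     = 428   // len(bw.buf) in bytes
		byteOffset = 428   // index Finish() would write to
	)
	bitMask := uint8(1)

	capacityBits := bufLen * 8 // 428 bytes hold exactly 3424 bits
	fmt.Println("buffer capacity in bits:", capacityBits)
	fmt.Println("length exceeds capacity:", length > capacityBits)
	fmt.Println("Finish would store curByte:", finishWouldStore(length, pos, bitMask))
	fmt.Println("store index out of range:", byteOffset >= bufLen)
}
```

With these numbers the buffer is exactly full at pos == 3424, so bitMask has wrapped back to 1 and the first half of the condition is false. Only `pos < length` makes Finish() attempt the store, and it does so at index 428 of a 428-byte slice.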
