[jira] [Commented] (DRILL-5530) IntVectors are sometimes not zero-filled on allocation

2020-02-21 Thread Oleg Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041670#comment-17041670
 ] 

Oleg Zinoviev commented on DRILL-5530:
--

[~Paul.Rogers]
Thank you very much for the detailed explanations.


> IntVectors are sometimes not zero-filled on allocation
> --
>
> Key: DRILL-5530
> URL: https://issues.apache.org/jira/browse/DRILL-5530
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Much of Drill's vector code is based on the assumption that vectors are 
> initialized to 0s.
> For example, consider the "bit" (is-set) vector used for nullable types. The 
> vector is presumed initialized to 0 so that all fields are set to not-set 
> (e.g. null) by default. For example, to allocate a new vector the code 
> (slightly edited) is:
> {code}
>   private void allocateBytes(final long size) {
> final int curSize = (int) size;
> clear();
> data = allocator.buffer(curSize);
> data.setZero(0, data.capacity()); // Zero whole vector
> allocationSizeInBytes = curSize;
>   }
> {code}
> The code to reallocate (grow) the vector zeros the new slots:
> {code}
>   public void reAlloc() {
> final long newAllocationSize = allocationSizeInBytes * 2L;
> final int curSize = (int)newAllocationSize;
> final DrillBuf newBuf = allocator.buffer(curSize);
> newBuf.setZero(0, newBuf.capacity());
> newBuf.setBytes(0, data, 0, data.capacity()); // Zero new part
> data.release();
> data = newBuf;
> allocationSizeInBytes = curSize;
>   }
> {code}
> However, it turns out that other vectors omit the initial zero step. Consider 
> the {{IntVector}}:
> {code}
>   private void allocateBytes(final long size) {
> final int curSize = (int)size;
> clear();
> data = allocator.buffer(curSize);
> // Notice no call here to zero the buffer
> data.readerIndex(0);
> allocationSizeInBytes = curSize;
>   }
> {code}
> Now things start to get strange. If the buffer is newly allocated by Netty 
> from the OS, it will contain zeros. Thus, most test cases will get a zero 
> buffer. But, [Netty does *not* zero the 
> buffer|http://netty.io/wiki/using-as-a-generic-library.html#performance] if 
> the buffer is recycled from Netty's free list.
> Further, the reallocation for ints *does* zero the new half of the buffer:
> {code}
>   public void reAlloc() {
> final long newAllocationSize = allocationSizeInBytes * 2L;
> final DrillBuf newBuf = allocator.buffer((int)newAllocationSize);
> newBuf.setBytes(0, data, 0, data.capacity());
> final int halfNewCapacity = newBuf.capacity() / 2;
> newBuf.setZero(halfNewCapacity, halfNewCapacity); // Zero new bytes
> newBuf.writerIndex(data.writerIndex());
> data.release(1);
> data = newBuf;
> allocationSizeInBytes = (int)newAllocationSize;
>   }
> {code}
> The result is that, most times, the bytes of the vector will be zero. But, 
> the first allocation may be non-zero if Netty reuses an existing block from 
> Netty's free list.
> If any code assumes that values default to zero, that code will fail -- but 
> only for the first few bytes and only if run long enough for the buffer to be 
> a "recycled" "dirty" one.
> This was found by a test that filled vectors with runs of 'X' values for one 
> test, followed by another test that allocated an int vector and inspected its 
> contents.
> This asymmetry is just asking for bugs to appear. To make clear that only bit 
> vectors are zero filled:
> * Remove the {{setZero()}} call when reallocating vectors.
> * When assertions are on, fill the buffer with dummy data (such as 
> 0b1010_1010) to flush out any code that happens to depend on zero values.
> * Code that needs to "back-fill" values (such as when columns are missing in 
> some rows in a reader) should do the back-filling explicitly rather than 
> assuming zeros.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-5530) IntVectors are sometimes not zero-filled on allocation

2020-02-20 Thread Paul Rogers (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041252#comment-17041252
 ] 

Paul Rogers commented on DRILL-5530:


Hi [~le.louch] thanks for the update! I'd forgotten about this old issue.

Since I filed this, I've learned more about how the allocator works. 
Zero-filling a buffer takes time – time which is actually (in all but one case) 
unnecessary. Considerable care (and bug fixes!) went into vector code to ensure 
that each value of a non-nullable vector is set to some value. For nullable 
vectors, null values are undefined in the "values" vector and can contain 
garbage.

The one place where Drill requires zero-fill is in the "bits" vector for 
nullable types. Here the assumption is we zero-fill up front so we don't have 
to set the "isSet" flag to 0 for null values. For things like JSON, doing so 
would be very hard since an input record is free to completely omit a column.

It is easy to make a mistake and either intentionally or accidentally write 
code that assumes zero fill. When Drill first allocates a buffer from Unsafe, 
Unsafe zero-fills it. However, when the buffer is freed, it goes into the Netty 
free list and Netty *does not* zero-fill buffers on reallocation. (Like Drill, 
Netty does not want to pay the cost of zero-fill unnecessarily.)

When running test, you almost always get "virgin" buffers direct from Unsafe, 
so your code looks good. Only later, when the code runs in the unit test suite, 
and you get dirty buffers, do things fail. This can be *very* hard to debug. 
(It is like going back to the C memory-overwrite days.)

To help with this issue, the "row set" tests have gone one step further. The 
{{SubOperatorTest}} fixture deliberately dirties a large amount of memory so 
that tests always run using dirtied buffers. This has found several bugs that 
were then resolved.

So, this leads to a question: what is your use case in which you need a 
zero-filled buffer? If for vectors, then the vector (and {{ColumnWriter}} code 
handles dirty buffers. If for some other use, you can call a method to 
zero-fill the buffer on allocation.

> IntVectors are sometimes not zero-filled on allocation
> --
>
> Key: DRILL-5530
> URL: https://issues.apache.org/jira/browse/DRILL-5530
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Priority: Major
>
> Much of Drill's vector code is based on the assumption that vectors are 
> initialized to 0s.
> For example, consider the "bit" (is-set) vector used for nullable types. The 
> vector is presumed initialized to 0 so that all fields are set to not-set 
> (e.g. null) by default. For example, to allocate a new vector the code 
> (slightly edited) is:
> {code}
>   private void allocateBytes(final long size) {
> final int curSize = (int) size;
> clear();
> data = allocator.buffer(curSize);
> data.setZero(0, data.capacity()); // Zero whole vector
> allocationSizeInBytes = curSize;
>   }
> {code}
> The code to reallocate (grow) the vector zeros the new slots:
> {code}
>   public void reAlloc() {
> final long newAllocationSize = allocationSizeInBytes * 2L;
> final int curSize = (int)newAllocationSize;
> final DrillBuf newBuf = allocator.buffer(curSize);
> newBuf.setZero(0, newBuf.capacity());
> newBuf.setBytes(0, data, 0, data.capacity()); // Zero new part
> data.release();
> data = newBuf;
> allocationSizeInBytes = curSize;
>   }
> {code}
> However, it turns out that other vectors omit the initial zero step. Consider 
> the {{IntVector}}:
> {code}
>   private void allocateBytes(final long size) {
> final int curSize = (int)size;
> clear();
> data = allocator.buffer(curSize);
> // Notice no call here to zero the buffer
> data.readerIndex(0);
> allocationSizeInBytes = curSize;
>   }
> {code}
> Now things start to get strange. If the buffer is newly allocated by Netty 
> from the OS, it will contain zeros. Thus, most test cases will get a zero 
> buffer. But, [Netty does *not* zero the 
> buffer|http://netty.io/wiki/using-as-a-generic-library.html#performance] if 
> the buffer is recycled from Netty's free list.
> Further, the reallocation for ints *does* zero the new half of the buffer:
> {code}
>   public void reAlloc() {
> final long newAllocationSize = allocationSizeInBytes * 2L;
> final DrillBuf newBuf = allocator.buffer((int)newAllocationSize);
> newBuf.setBytes(0, data, 0, data.capacity());
> final int halfNewCapacity = newBuf.capacity() / 2;
> newBuf.setZero(halfNewCapacity, halfNewCapacity); // Zero new bytes
> newBuf.writerIndex(data.writerIndex());
> data.release(1);
> data = newBuf;
> allocationSizeInBytes = (int)newAllocationSize;
>   }
> {code}
> The 

[jira] [Commented] (DRILL-5530) IntVectors are sometimes not zero-filled on allocation

2020-02-20 Thread Oleg Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17041050#comment-17041050
 ] 

Oleg Zinoviev commented on DRILL-5530:
--

[~paul-rogers]
I accidentally discovered a similar behavior when sequentially allocating and 
freeing a buffer. Here is a test that demonstrates this behavior.


{code:java}
 try (OperatorFixture operatorFixture = new 
OperatorFixture.Builder(baseDirTestWatcher).build()) {
  DrillBuf buffer = operatorFixture.allocator().buffer(8);
  buffer.setByte(0, 1);
  long addr = buffer.memoryAddress();

  buffer.release();
  buffer = operatorFixture.allocator().buffer(8);

  Assert.assertEquals(addr, buffer.memoryAddress());// same address
  Assert.assertEquals(0, buffer.getByte(0)); // <== fail, since 
buffer.getByte(0) return 1
}
{code}

In my case, a non-zero buffer is returned from PooledByteBufAllocatorL :: 
threadCache () :: directArena :: allocate (...)

> IntVectors are sometimes not zero-filled on allocation
> --
>
> Key: DRILL-5530
> URL: https://issues.apache.org/jira/browse/DRILL-5530
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Priority: Major
>
> Much of Drill's vector code is based on the assumption that vectors are 
> initialized to 0s.
> For example, consider the "bit" (is-set) vector used for nullable types. The 
> vector is presumed initialized to 0 so that all fields are set to not-set 
> (e.g. null) by default. For example, to allocate a new vector the code 
> (slightly edited) is:
> {code}
>   private void allocateBytes(final long size) {
> final int curSize = (int) size;
> clear();
> data = allocator.buffer(curSize);
> data.setZero(0, data.capacity()); // Zero whole vector
> allocationSizeInBytes = curSize;
>   }
> {code}
> The code to reallocate (grow) the vector zeros the new slots:
> {code}
>   public void reAlloc() {
> final long newAllocationSize = allocationSizeInBytes * 2L;
> final int curSize = (int)newAllocationSize;
> final DrillBuf newBuf = allocator.buffer(curSize);
> newBuf.setZero(0, newBuf.capacity());
> newBuf.setBytes(0, data, 0, data.capacity()); // Zero new part
> data.release();
> data = newBuf;
> allocationSizeInBytes = curSize;
>   }
> {code}
> However, it turns out that other vectors omit the initial zero step. Consider 
> the {{IntVector}}:
> {code}
>   private void allocateBytes(final long size) {
> final int curSize = (int)size;
> clear();
> data = allocator.buffer(curSize);
> // Notice no call here to zero the buffer
> data.readerIndex(0);
> allocationSizeInBytes = curSize;
>   }
> {code}
> Now things start to get strange. If the buffer is newly allocated by Netty 
> from the OS, it will contain zeros. Thus, most test cases will get a zero 
> buffer. But, [Netty does *not* zero the 
> buffer|http://netty.io/wiki/using-as-a-generic-library.html#performance] if 
> the buffer is recycled from Netty's free list.
> Further, the reallocation for ints *does* zero the new half of the buffer:
> {code}
>   public void reAlloc() {
> final long newAllocationSize = allocationSizeInBytes * 2L;
> final DrillBuf newBuf = allocator.buffer((int)newAllocationSize);
> newBuf.setBytes(0, data, 0, data.capacity());
> final int halfNewCapacity = newBuf.capacity() / 2;
> newBuf.setZero(halfNewCapacity, halfNewCapacity); // Zero new bytes
> newBuf.writerIndex(data.writerIndex());
> data.release(1);
> data = newBuf;
> allocationSizeInBytes = (int)newAllocationSize;
>   }
> {code}
> The result is that, most times, the bytes of the vector will be zero. But, 
> the first allocation may be non-zero if Netty reuses an existing block from 
> Netty's free list.
> If any code assumes that values default to zero, that code will fail -- but 
> only for the first few bytes and only if run long enough for the buffer to be 
> a "recycled" "dirty" one.
> This was found by a test that filled vectors with runs of 'X' values for one 
> test, followed by another test that allocated an int vector and inspected its 
> contents.
> This asymmetry is just asking for bugs to appear. To make clear that only bit 
> vectors are zero filled:
> * Remove the {{setZero()}} call when reallocating vectors.
> * When assertions are on, fill the buffer with dummy data (such as 
> 0b1010_1010) to flush out any code that happens to depend on zero values.
> * Code that needs to "back-fill" values (such as when columns are missing in 
> some rows in a reader) should do the back-filling explicitly rather than 
> assuming zeros.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)