Re: assistance needed debugging drill + daffodil

2023-12-07 Thread Paul Rogers
Hi Mike,

I wonder if you've got an array in there somewhere? Either in the data, or
you're creating an array in your code in response to the data?

If you have just scalars, then all you need to do is start a row, write the
scalars, and end the row. The starting and ending are done automagically by
the framework. If your row has a (non-repeated) map, the same rules apply.
This pattern works because every row has zero or one values for each scalar
(zero values means the value is null or default).
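
To make the scalar-only pattern concrete, here is a small self-contained
sketch. It is a toy stand-in, not Drill's row set loader: the class
ScalarRowSketch and its methods startRow, setInt and endRow are hypothetical
names, and the start/end calls are written out explicitly only so the row
boundaries are visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only (hypothetical names): each column holds zero or one
// value per row, and a column that is never written stays null.
final class ScalarRowSketch {
  private final String[] columns;
  private Integer[] current;                      // null while no row is open
  private final List<Integer[]> rows = new ArrayList<>();

  ScalarRowSketch(String... columns) { this.columns = columns; }

  void startRow() { current = new Integer[columns.length]; }

  void setInt(String column, int value) {
    if (current == null) throw new IllegalStateException("no open row");
    int idx = Arrays.asList(columns).indexOf(column);
    if (idx < 0) throw new IllegalArgumentException("unknown column " + column);
    current[idx] = value;
  }

  void endRow() {
    if (current == null) throw new IllegalStateException("no open row");
    rows.add(current);
    current = null;
  }

  List<Integer[]> rows() { return rows; }

  public static void main(String[] args) {
    ScalarRowSketch w = new ScalarRowSketch("a", "b");
    w.startRow(); w.setInt("a", 1); w.setInt("b", 2); w.endRow();
    w.startRow(); w.setInt("a", 3); w.endRow();   // "b" never written
    for (Integer[] r : w.rows()) {
      System.out.println(Arrays.toString(r));     // [1, 2] then [3, null]
    }
  }
}
```

No per-value boundary call is needed here because a scalar column can never
hold more than one value in a row.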

However, if you create an array, you need to help the row set loader a bit:
you have to tell it where one array element ends and another begins. Thus,
you must call the end element method on each array for each element. If you
have nested arrays, you must handle the events for each layer of array. In
this case, if you have a repeated map, you have an array in which each
element is a map: you have to tell the array where one map ends and the
next one begins.
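
A rough sketch of why the per-element event matters for a repeated map. Again
this is a toy model with hypothetical names (RepeatedMapSketch, setInt,
endElement), not the actual Drill array writer: without the boundary call,
the writer has no way to know that a second map has started.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names): an array whose elements are maps.
final class RepeatedMapSketch {
  private final List<Map<String, Integer>> elements = new ArrayList<>();
  private Map<String, Integer> pending = new LinkedHashMap<>();

  void setInt(String field, int value) { pending.put(field, value); }

  // Marks the boundary: closes the current map element and opens the next.
  void endElement() {
    elements.add(pending);
    pending = new LinkedHashMap<>();
  }

  List<Map<String, Integer>> elements() { return elements; }

  public static void main(String[] args) {
    RepeatedMapSketch good = new RepeatedMapSketch();
    good.setInt("x", 1); good.setInt("y", 2); good.endElement();
    good.setInt("x", 3); good.setInt("y", 4); good.endElement();
    System.out.println(good.elements());  // [{x=1, y=2}, {x=3, y=4}]

    // Same writes, but the first element boundary is missing, so the
    // second map silently overwrites the first.
    RepeatedMapSketch bad = new RepeatedMapSketch();
    bad.setInt("x", 1); bad.setInt("y", 2);
    bad.setInt("x", 3); bad.setInt("y", 4); bad.endElement();
    System.out.println(bad.elements());   // [{x=3, y=4}]
  }
}
```

With nested arrays the same boundary call would be needed at every level.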

So, as you describe it, your row is a couple of scalars plus one
(non-repeated) map holding a couple of scalar entries. With that layout you
should not be hitting the array code shown in your message. The fact that
you are suggests to me that something is being read as an array. Either
a) change it to read as a non-repeated map, or b) insert the required array
events.
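
As an illustration only, and not a diagnosis of the code in the PR, here is a
toy state machine showing one way an array event can reach a writer that is
still idle. The class WriterStateSketch and its methods are hypothetical; only
the state names IDLE and IN_ROW and the endArrayValue name are borrowed from
the snippet in your message. The sketch throws an IllegalStateException where
that snippet uses an assert, so the failure shows up even with assertions
disabled.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical class): events propagate from a parent
// writer to its children, and array events are legal only inside a row.
final class WriterStateSketch {
  enum State { IDLE, IN_ROW }

  private State state = State.IDLE;
  private final List<WriterStateSketch> children = new ArrayList<>();

  WriterStateSketch addChild() {
    WriterStateSketch child = new WriterStateSketch();
    children.add(child);
    return child;
  }

  void startRow() {
    state = State.IN_ROW;
    for (WriterStateSketch c : children) c.startRow();
  }

  void endArrayValue() {
    if (state != State.IN_ROW) {
      throw new IllegalStateException("endArrayValue in state " + state);
    }
    for (WriterStateSketch c : children) c.endArrayValue();
  }

  State state() { return state; }

  public static void main(String[] args) {
    WriterStateSketch row = new WriterStateSketch();
    row.addChild();                 // present when the row starts
    row.startRow();
    row.addChild();                 // added later: never saw startRow
    try {
      row.endArrayValue();
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());  // endArrayValue in state IDLE
    }
  }
}
```

In this toy the parent passes its own check and the failure surfaces in the
child that never received the start-row event.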

Take a look at the many tests for arrays and nested arrays for the required
calls.

Thanks,

- Paul


On Thu, Dec 7, 2023 at 2:37 PM Mike Beckerle wrote:

> I am blocked on getting a test (testComplexQuery3) to work that contains a
> row of a couple int columns plus a map column where that map contains 2
> additional int fields.
> Rows containing just simple integer fields work. The next step is to let
> a column of the top-level row be a map holding a pair of additional
> fields, and that's failing.
>
> The test fails in the assert here:
>
> @Override
> public void endArrayValue() {
>   assert state == State.IN_ROW;  // FAILS HERE WITH State.IDLE
>   for (AbstractObjectWriter writer : writers) {
>     writer.events().endArrayValue();
>   }
> }
>
> (That is at line 306 of AbstractTupleWriter.java)
>
> This recursively calls endArrayValue on the child writers, and the
> state of the first of these is IDLE, not IN_ROW, so it fails the
> assert.
>
> This must mean I am doing something wrong with the setup/creation of
> the metadata for the map column (line 193 of
> DrillDaffodilSchemaVisitor.java) ...
>
> and/or creating and populating the data for this map column (line 177
> of DaffodilDrillInfosetOutputter.java).
>
> Any insights would be helpful.
>
> The PR is here: https://github.com/apache/drill/pull/2836
>
> My fork is here: https://github.com/mbeckerle/drill/tree/daffodil-2835
> (that's branch daffodil-2835)
>
> Note this fork works with the current 3.7.0-SNAPSHOT version of Apache
> Daffodil, but the features in Daffodil it needs are not yet in an
> "official" release.
>
> On Linux, running 'sbt publishM2' in daffodil before rebuilding drill
> should do it, once you have everything needed to build daffodil installed
> (see BUILD.md in Daffodil).
>
> Mike Beckerle
> Apache Daffodil PMC | daffodil.apache.org
> OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> Owl Cyber Defense | www.owlcyberdefense.com
>


assistance needed debugging drill + daffodil

2023-12-07 Thread Mike Beckerle
I am blocked on getting a test (testComplexQuery3) to work that contains a
row of a couple int columns plus a map column where that map contains 2
additional int fields.
Rows containing just simple integer fields work. The next step is to let
a column of the top-level row be a map holding a pair of additional
fields, and that's failing.

The test fails in the assert here:

@Override
public void endArrayValue() {
  assert state == State.IN_ROW;  // FAILS HERE WITH State.IDLE
  for (AbstractObjectWriter writer : writers) {
    writer.events().endArrayValue();
  }
}

(That is at line 306 of AbstractTupleWriter.java)

This recursively calls endArrayValue on the child writers, and the
state of the first of these is IDLE, not IN_ROW, so it fails the
assert.

This must mean I am doing something wrong with the setup/creation of
the metadata for the map column (line 193 of
DrillDaffodilSchemaVisitor.java) ...

and/or creating and populating the data for this map column (line 177
of DaffodilDrillInfosetOutputter.java).

Any insights would be helpful.

The PR is here: https://github.com/apache/drill/pull/2836

My fork is here: https://github.com/mbeckerle/drill/tree/daffodil-2835
(that's branch daffodil-2835)

Note this fork works with the current 3.7.0-SNAPSHOT version of Apache
Daffodil, but the features in Daffodil it needs are not yet in an
"official" release.

On Linux, running 'sbt publishM2' in daffodil before rebuilding drill
should do it, once you have everything needed to build daffodil installed
(see BUILD.md in Daffodil).

Mike Beckerle
Apache Daffodil PMC | daffodil.apache.org
OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
Owl Cyber Defense | www.owlcyberdefense.com