Hi Marc,
That's great! What you have to do is:
1. Fork Drill into your own GitHub account.
2. Create a branch with your changes.
3. Then you should be able to create a pull request with your changes. Here are the contribution guidelines (https://drill.apache.org/docs/apache-drill-contribution-guidelines/). Please name your pull request DRILL-XXXX to reflect the JIRA ticket you are addressing.
4. Once the CI passes, a committer will review and merge the PR.
Thanks!
-- C

> On Dec 29, 2022, at 7:39 AM, marc nicole <[email protected]> wrote:
>
> Hi,
>
> Thanks for your reply.
> I would like to submit my changes, but I am denied permission to push
> to the Drill repo. How do I create the pull request in Git? Are there any
> permissions I need to obtain before pushing to the repo?
>
>
> On Wed, Dec 28, 2022 at 15:46, Charles Givre <[email protected]> wrote:
>> Hi Marc,
>> Thanks for this. Here's the thing... Let's say you have JSON that looks
>> like this:
>>
>> {
>>   "foo": null
>> },{
>>   "foo": 3.5
>> }
>>
>> If you take the approach that `null` is treated like a string, you will get
>> a schema change exception when you read the next row. Our current approach
>> is to basically ignore fields whose data type Drill cannot determine.
>> Once Drill encounters a typed value, it assigns a data type to that
>> column. See the example below which is from DRILL-5033.
>> I added a second row to demonstrate what happens once Drill is able to
>> determine a data type. Note that for the columns with a defined value in
>> the second row, Drill returns 'null' as the first row's value.
>>
>>
>> [{
>>   "intKey" : null,
>>   "bgintKey": null,
>>   "strKey": null,
>>   "boolKey": null,
>>   "fltKey": null,
>>   "dblKey": null,
>>   "timKey": null,
>>   "dtKey": null,
>>   "tmstmpKey": null,
>>   "intrvldyKey": null,
>>   "intrvlyrKey": null
>> },
>> {
>>   "intKey" : 1,
>>   "bgintKey": 3666565464,
>>   "strKey": "hithere",
>>   "boolKey": true,
>>   "fltKey": 3.5,
>>   "dblKey": 4.2,
>>   "timKey": null,
>>   "dtKey": null,
>>   "tmstmpKey": null,
>>   "intrvldyKey": null,
>>   "intrvlyrKey": null
>> }]
>>
>>
>> select * from dfs.test.`nulls.json`;
>> +--------+---------------+---------+---------+--------+--------+--------+-------+-----------+-------------+-------------+
>> | intKey | bgintKey      | strKey  | boolKey | fltKey | dblKey | timKey | dtKey | tmstmpKey | intrvldyKey | intrvlyrKey |
>> +--------+---------------+---------+---------+--------+--------+--------+-------+-----------+-------------+-------------+
>> | null   | null          | null    | null    | null   | null   | []     | []    | []        | []          | []          |
>> | 1.0    | 3.666565464E9 | hithere | true    | 3.5    | 4.2    | []     | []    | []        | []          | []          |
>> +--------+---------------+---------+---------+--------+--------+--------+-------+-----------+-------------+-------------+
>> 2 rows selected (0.232 seconds)
>>
>> You are definitely welcome to submit a pull request, however this area is
>> extremely complex, and I'd suspect that what you propose will break other
>> unit tests. Another option which you might not be aware of is providing a
>> schema. If you do that from the beginning, then Drill will know what data
>> types to expect.
>>
>> Best,
>> -- C
>>
>>
>> > On Dec 28, 2022, at 8:57 AM, marc nicole <[email protected]> wrote:
>> >
>> > Hello Drillers :)
>> >
>> > I came across the aforementioned bug (DRILL-5033) and wanted to contribute.
>> > My attempt is to treat a *null* token as a *string* and print "null"
>> > as the column value instead of omitting the key in the output
>> > result set. Details of the attempted fix are below:
>> >
>> >
>> > *1)* In JsonReader.java (java-exec/drill-exec/vector/complex/fn/) at line
>> > 283 I add the following:
>> >
>> >> ...
>> >> case VALUE_NULL:
>> >>   // handle null as string
>> >>   handleString(parser, map, fieldName);
>> >>   break;
>> >> ...
>> >
>> >
>> > *2)* Then at line 415, handleString() becomes:
>> >
>> >> private void handleString(JsonParser parser, MapWriter writer, String fieldName) throws IOException {
>> >>   try {
>> >>     // added the following if; getCurrentToken() inspects the current token,
>> >>     // whereas nextToken() would advance the parser past the value
>> >>     if (parser.getCurrentToken() == JsonToken.VALUE_NULL)
>> >>       writer.varChar(fieldName)
>> >>           .writeVarChar(0, workingBuffer.prepareVarCharHolder("null"),
>> >>               workingBuffer.getBuf());
>> >>     else
>> >>       writer.varChar(fieldName)
>> >>           .writeVarChar(0,
>> >>               workingBuffer.prepareVarCharHolder(parser.getText()),
>> >>               workingBuffer.getBuf());
>> >>   } catch (IllegalArgumentException e) {
>> >>     if (parser.getText() == null || parser.getText().isEmpty()) {
>> >>       // return;
>> >>     }
>> >>     throw e;
>> >>   }
>> >> }
>> >
>> >
>> >
>> > Is this a possible fix to the mentioned bug?
>> > If so, should I open a pull request?
>> >
>> > Thanks.
>>
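To make the schema-change concern raised in this thread concrete, here is a toy model in Java. It is not Drill's reader code: `SchemaChangeSketch`, `ColumnType`, and `inferColumnType` are hypothetical names invented for illustration, and Drill's real type system and exception differ. Under those assumptions, it only shows why committing to a string type on `null` conflicts with a later `3.5`, while deferring the decision does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of per-column type inference; NOT Drill's actual JSON reader. */
final class SchemaChangeSketch {

    /** Hypothetical stand-ins for column types. */
    enum ColumnType { UNKNOWN, VARCHAR, FLOAT8 }

    /**
     * Walks the values of one column in row order (a Java null models JSON null)
     * and returns the inferred type. Throws if a later value disagrees with the
     * type already assigned, which models a schema change.
     */
    static ColumnType inferColumnType(List<Object> values, boolean nullAsString) {
        ColumnType current = ColumnType.UNKNOWN;
        for (Object value : values) {
            ColumnType seen;
            if (value == null) {
                if (!nullAsString) {
                    continue; // defer: a null says nothing about the type
                }
                seen = ColumnType.VARCHAR; // the proposed "null is a string" rule
            } else if (value instanceof Double) {
                seen = ColumnType.FLOAT8;
            } else {
                seen = ColumnType.VARCHAR;
            }
            if (current != ColumnType.UNKNOWN && current != seen) {
                throw new IllegalStateException(
                    "schema change: " + current + " -> " + seen);
            }
            current = seen;
        }
        return current;
    }

    public static void main(String[] args) {
        // The "foo" column from the two-row example: null, then 3.5.
        List<Object> foo = new ArrayList<>();
        foo.add(null);
        foo.add(3.5);

        System.out.println("deferred: " + inferColumnType(foo, false));
        try {
            inferColumnType(foo, true);
        } catch (IllegalStateException e) {
            System.out.println("null as string: " + e.getMessage());
        }
    }
}
```

In this sketch the deferred variant settles on the numeric type once the second row arrives, while the null-as-string variant fails on the same input. Supplying a schema up front sidesteps the inference altogether.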
