Hello Prabhakar,
Good questions. I think you want to understand the internal logic of the
JSON loader.
Drill actually includes two JSON loaders: the old version and the new
revision.
I suggest debugging from the following parts of the code base:
1. Unit tests for the new JSON loader:
/drill-java-exec/src/test/java/org/apache/drill/exec/store/easy/json/loader
2. The new JSON loader as used in the HTTP storage plugin:
https://github.com/apache/drill/blob/bf2b0d79e43bf65448557510a7b39f17c428df78/contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpBatchReader.java#L103
3. Tests for the JSON record reader:
org.apache.drill.exec.store.json.TestJsonRecordReader
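
To make the failure mode concrete, here is a toy sketch in plain Java. It is not Drill code. It assumes that a reader decides a column's type one batch at a time, and that a batch in which the column holds only nulls falls back to a default nullable INT type. As far as I know that matches the behavior of the old JSON reader, but please verify it against the classes listed above. The class, enum, and method names are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of per-batch type inference for a single column. Not Drill code.
class BatchTypeInferenceSketch {

    // Hypothetical stand-in for a column type; not a Drill class.
    enum ColType { NULLABLE_INT, VARCHAR }

    // Picks a type for one batch of values of a single column.
    // A batch containing only nulls carries no type evidence, so this sketch
    // falls back to NULLABLE_INT (an assumed default).
    static ColType inferBatchType(List<String> batch) {
        for (String value : batch) {
            if (value != null) {
                return ColType.VARCHAR;
            }
        }
        return ColType.NULLABLE_INT;
    }

    // Splits the rows into fixed-size batches and reports whether the
    // inferred type differs between any two consecutive batches.
    static boolean hasSchemaChange(List<String> rows, int batchSize) {
        ColType previous = null;
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            ColType current = inferBatchType(rows.subList(start, end));
            if (previous != null && previous != current) {
                return true;
            }
            previous = current;
        }
        return false;
    }

    public static void main(String[] args) {
        // First batch is all nulls, second batch holds strings.
        System.out.println(hasSchemaChange(Arrays.asList(null, null, "a", "b"), 2)); // true
        // Only two rows, so the null and the string share one batch.
        System.out.println(hasSchemaChange(Arrays.asList(null, "a"), 2));            // false
        // First value is an empty string instead of null.
        System.out.println(hasSchemaChange(Arrays.asList("", null, "a", "b"), 2));   // false
    }
}
```

Under those assumptions, the first data set corresponds to the reported error, the second to Case 2, and the third to Case 3. The batch size of 2 is only for illustration; real batches typically hold many more rows.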
> On Sep 24, 2021, at 7:36 PM, Prabhakar Bhosaale <[email protected]> wrote:
>
> Hi Team,
> I am getting the following error while querying a JSON file:
>
> "(java.lang.Exception) UNSUPPORTED_OPERATION ERROR: Schema changes not
> supported in External Sort."
>
> I have identified the root cause: one of the columns has a NULL value for
> certain rows and a STRING value for others.
>
> I am trying to find out how Drill decides the data type of columns and
> identifies schema changes.
>
> I tried the following changes to the data and got different results.
>
> Case 1 - If I remove the ORDER BY clause from the query, I don't get the
> error. Note that this specific column is not part of the ORDER BY clause,
> but it is part of the select list.
>
> Case 2 - If I keep only two rows in the file, one with NULL and the other
> with STRING data for the given column, there is no error. The query
> returns the data successfully.
>
> Case 3 - If I change the value of the given column in the first row from
> NULL to an empty string (that is, two double quotes), there is no error.
>
> I previously read somewhere that Drill reads a certain number of initial
> rows of the JSON and decides the data type from them, but I am not able to
> find that in the documentation now.
>
> Thanks and Regards
> Prabhakar