[jira] [Updated] (DRILL-8283) Add a configurable recursive file listing size limit
[ https://issues.apache.org/jira/browse/DRILL-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Turton updated DRILL-8283:

    Description:
Currently a malicious or merely unwitting user can crash their Drill foreman by sending
{code:java}
select * from dfs.huge_workspace limit 10
{code}
causing the query planner to recurse over every file in huge_workspace and culminating in
{code:java}
2022-08-09 15:13:22,251 [1d0da29f-e50c-fd51-43d9-8a5086d52c4e:foreman] ERROR o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred, exiting. Information message: Unable to handle out of memory condition in Foreman.
java.lang.OutOfMemoryError: null
{code}
if there are enough files in huge_workspace. A SHOW FILES command can produce the same effect.

This issue proposes a new BOOT option named drill.exec.storage.file.recursive_listing_max_size with a default value of, say 10 000. If a file listing task exceeds this limit then the initiating operation is terminated with a UserException preventing runaway resource usage.

  was:
Currently a malicious, or merely an unwitting user can crash their Drill foreman by sending
{code:java}
select * from dfs.huge_workspace limit 10
{code}
causing the query planner to recurse over every file in huge_workspace and culminating in
{code:java}
2022-08-09 15:13:22,251 [1d0da29f-e50c-fd51-43d9-8a5086d52c4e:foreman] ERROR o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred, exiting. Information message: Unable to handle out of memory condition in Foreman.
java.lang.OutOfMemoryError: null
{code}
if there are enough files in huge_workspace. A SHOW FILES command can produce the same effect.

This issue proposes a new BOOT option named drill.exec.storage.file.max_listing_size with a default value of, say 10 000. If a file listing task exceeds this limit then the current operation is terminated with a UserException and runaway resource usage is prevented.
> Add a configurable recursive file listing size limit
>
>
> Key: DRILL-8283
> URL: https://issues.apache.org/jira/browse/DRILL-8283
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Other
> Affects Versions: 1.20.2
> Reporter: James Turton
> Assignee: James Turton
> Priority: Minor
> Fix For: 1.20.3
>
>
> Currently a malicious or merely unwitting user can crash their Drill foreman
> by sending
> {code:java}
> select * from dfs.huge_workspace limit 10
> {code}
> causing the query planner to recurse over every file in huge_workspace and
> culminating in
> {code:java}
> 2022-08-09 15:13:22,251 [1d0da29f-e50c-fd51-43d9-8a5086d52c4e:foreman] ERROR
> o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred,
> exiting. Information message: Unable to handle out of memory condition in
> Foreman.
> java.lang.OutOfMemoryError: null
> {code}
> if there are enough files in huge_workspace. A SHOW FILES command can produce
> the same effect.
>
> This issue proposes a new BOOT option named
> drill.exec.storage.file.recursive_listing_max_size with a default value of,
> say 10 000. If a file listing task exceeds this limit then the initiating
> operation is terminated with a UserException preventing runaway resource
> usage.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
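The proposal above (abort a file listing once it passes a configured size) could be sketched roughly as follows. This is a minimal illustration and not Drill's implementation: BoundedFileLister, listRecursively and ListingLimitExceededException are hypothetical names, the exception is only a stand-in for Drill's UserException, the maxSize parameter plays the role of the proposed BOOT option, and the sketch walks a local directory with java.nio rather than a Drill workspace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-in for Drill's UserException; not the real class. */
class ListingLimitExceededException extends RuntimeException {
  ListingLimitExceededException(String message) {
    super(message);
  }
}

/** Sketch of a recursive file listing that fails fast at a size limit. */
final class BoundedFileLister {
  private BoundedFileLister() {}

  /**
   * Collects the non-directory entries under root, descending into
   * subdirectories. maxSize stands in for the proposed BOOT option: a listing
   * of exactly maxSize files is allowed, one more aborts the whole operation.
   * Symbolic link cycles are not handled in this sketch.
   */
  static List<Path> listRecursively(Path root, int maxSize) {
    List<Path> files = new ArrayList<>();
    Deque<Path> pending = new ArrayDeque<>();
    pending.push(root);
    while (!pending.isEmpty()) {
      Path dir = pending.pop();
      try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
        for (Path entry : entries) {
          if (Files.isDirectory(entry)) {
            pending.push(entry); // visit later instead of recursing
          } else if (files.size() >= maxSize) {
            // Stop before the listing can grow without bound.
            throw new ListingLimitExceededException(
                "File listing under " + root + " exceeds the limit of " + maxSize);
          } else {
            files.add(entry);
          }
        }
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
    return files;
  }
}
```

With a limit of 10 000, a call such as BoundedFileLister.listRecursively(dir, 10_000) would return normally for a small directory and throw as soon as entry 10 001 is seen, so memory use stays bounded by the limit rather than by the size of the workspace.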
[jira] [Updated] (DRILL-8283) Add a configurable recursive file listing size limit
[ https://issues.apache.org/jira/browse/DRILL-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Turton updated DRILL-8283:

    Summary: Add a configurable recursive file listing size limit  (was: Implement a configurable file listing size limit)

> Add a configurable recursive file listing size limit
>
>
> Key: DRILL-8283
> URL: https://issues.apache.org/jira/browse/DRILL-8283
> Project: Apache Drill
> Issue Type: Improvement
> Components: Storage - Other
> Affects Versions: 1.20.2
> Reporter: James Turton
> Assignee: James Turton
> Priority: Minor
> Fix For: 1.20.3
>
>
> Currently a malicious, or merely an unwitting user can crash their Drill
> foreman by sending
> {code:java}
> select * from dfs.huge_workspace limit 10
> {code}
> causing the query planner to recurse over every file in huge_workspace and
> culminating in
> {code:java}
> 2022-08-09 15:13:22,251 [1d0da29f-e50c-fd51-43d9-8a5086d52c4e:foreman] ERROR
> o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred,
> exiting. Information message: Unable to handle out of memory condition in
> Foreman.
> java.lang.OutOfMemoryError: null
> {code}
> if there are enough files in huge_workspace. A SHOW FILES command can produce
> the same effect.
>
> This issue proposes a new BOOT option named
> drill.exec.storage.file.max_listing_size with a default value of, say 10 000.
> If a file listing task exceeds this limit then the current operation is
> terminated with a UserException and runaway resource usage is prevented.
[jira] [Closed] (DRILL-8284) Apache SQL Query failing while accessing the Json with complex data model
[ https://issues.apache.org/jira/browse/DRILL-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Charles Givre closed DRILL-8284.

    Resolution: Not A Bug

> Apache SQL Query failing while accessing the Json with complex data model
> -------------------------------------------------------------------------
>
> Key: DRILL-8284
> URL: https://issues.apache.org/jira/browse/DRILL-8284
> Project: Apache Drill
> Issue Type: Bug
> Reporter: SHUBHAM KUMAR
> Priority: Major
>
> Apache SQL Query failing while accessing the Json with complex data model.
> Complex Json:
> Map object inside another map object then Array Object.
> Case1: When we have nested objects within array map, and map within map.
> {"attributes": [
>     {
>       "name": "webBrandName",
>       "value": {
>         "en-US": "Smashbox"
>       }
>     },
>     {
>       "name": "startDate",
>       "value": "2011-07-25T15:30:00.000Z"
>     }
>   ]
> }
> Case2: Having array with multiple map items with diff data types. eg. String
> and Boolean both type.
> {"attributes": [
>     {
>       "name": "startDate",
>       "value": "2011-07-25T15:30:00.000Z"
>     },
>     {
>       "name": "hasCBD",
>       "value": false
>     }
>   ]
> }
> Query:
> select flatten(attributes) as Var from dfs.`/filepath/filename.json`
>
> Error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IndexOutOfBoundsException: readerIndex: 0, writerIndex: 1764642048 (expected:
> 0 <= readerIndex <= writerIndex <= capacity(0)) Fragment: 0:0 Please, refer
> to logs for more information. [Error Id: c5a3b8fa-cad1-4c9a-8673-de5745e9170b
> on GGNUWT461535L.ad.infosys.com:31010]
[jira] [Commented] (DRILL-8284) Apache SQL Query failing while accessing the Json with complex data model
[ https://issues.apache.org/jira/browse/DRILL-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583946#comment-17583946 ]

Charles Givre commented on DRILL-8284:

[~shubhamsmvdu] This is normal behavior for Drill. The issue you are encountering is a schema change exception on the `value` field. In both cases, Drill first encounters one data type and creates a vector for it, then in the next row encounters the same field with a different data type and throws an exception.

There are a few options:
# If you use the V1 JSON reader, you can enable the UNION data type, which allows heterogeneous data types. We are working on enabling this for the V2 JSON reader, but for the moment it is not supported there. This is a variable which must be set at the system level.
# Provide a schema: You can provide a schema for the field `value` and set `mode` to JSON. I'd have to dig up the documentation for this, but what this does is force the field to a string. If JSON objects are encountered, those will be rendered as a string.

I'm going to close this as this is expected behavior. Please use GitHub issues or Slack to continue the conversation.

> Apache SQL Query failing while accessing the Json with complex data model
> -------------------------------------------------------------------------
>
> Key: DRILL-8284
> URL: https://issues.apache.org/jira/browse/DRILL-8284
> Project: Apache Drill
> Issue Type: Bug
> Reporter: SHUBHAM KUMAR
> Priority: Major
>
> Apache SQL Query failing while accessing the Json with complex data model.
> Complex Json:
> Map object inside another map object then Array Object.
> Case1: When we have nested objects within array map, and map within map.
> {"attributes": [
>     {
>       "name": "webBrandName",
>       "value": {
>         "en-US": "Smashbox"
>       }
>     },
>     {
>       "name": "startDate",
>       "value": "2011-07-25T15:30:00.000Z"
>     }
>   ]
> }
> Case2: Having array with multiple map items with diff data types. eg. String
> and Boolean both type.
> {"attributes": [
>     {
>       "name": "startDate",
>       "value": "2011-07-25T15:30:00.000Z"
>     },
>     {
>       "name": "hasCBD",
>       "value": false
>     }
>   ]
> }
> Query:
> select flatten(attributes) as Var from dfs.`/filepath/filename.json`
>
> Error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IndexOutOfBoundsException: readerIndex: 0, writerIndex: 1764642048 (expected:
> 0 <= readerIndex <= writerIndex <= capacity(0)) Fragment: 0:0 Please, refer
> to logs for more information. [Error Id: c5a3b8fa-cad1-4c9a-8673-de5745e9170b
> on GGNUWT461535L.ad.infosys.com:31010]
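The behavior described in the comment above (the first value seen fixes a column's type, a later value of another type is rejected, and forcing everything to a string avoids the conflict) can be illustrated with a toy model. This is only a sketch of the idea, not Drill's value vectors or JSON reader: TypedColumnSketch, SchemaChangeSketchException and the coerceToString flag are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for a schema change error; not a Drill class. */
class SchemaChangeSketchException extends RuntimeException {
  SchemaChangeSketchException(String message) {
    super(message);
  }
}

/** Toy column whose type is fixed by the first value appended to it. */
final class TypedColumnSketch {
  private final boolean coerceToString;
  private final List<Object> values = new ArrayList<>();
  private Class<?> type; // set by the first appended value

  /** With coerceToString set, every value is stored as its string form. */
  TypedColumnSketch(boolean coerceToString) {
    this.coerceToString = coerceToString;
  }

  void append(Object value) {
    Objects.requireNonNull(value, "nulls are out of scope for this sketch");
    Object stored = coerceToString ? String.valueOf(value) : value;
    if (type == null) {
      type = stored.getClass();
    } else if (type != stored.getClass()) {
      // Same field, different data type: the situation in Case1 and Case2.
      throw new SchemaChangeSketchException(
          "column holds " + type.getSimpleName()
              + " but got " + stored.getClass().getSimpleName());
    }
    values.add(stored);
  }

  List<Object> values() {
    return values;
  }
}
```

Appending the Case2 values "2011-07-25T15:30:00.000Z" and false to new TypedColumnSketch(false) fails on the second append, while new TypedColumnSketch(true) accepts both and stores the boolean as the string "false", which loosely mirrors the provided-schema option of rendering mixed values as strings.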
[jira] [Commented] (DRILL-8284) Apache SQL Query failing while accessing the Json with complex data model
[ https://issues.apache.org/jira/browse/DRILL-8284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583896#comment-17583896 ]

SHUBHAM KUMAR commented on DRILL-8284:

Json sample for case1:
{"attributes": [
    {
      "name": "webBrandName",
      "value": {
        "en-US": "Smashbox"
      }
    },
    {
      "name": "startDate",
      "value": "2011-07-25T15:30:00.000Z"
    }
  ]
}

Json sample for case2:
{"attributes": [
    {
      "name": "startDate",
      "value": "2011-07-25T15:30:00.000Z"
    },
    {
      "name": "hasCBD",
      "value": false
    }
  ]
}

> Apache SQL Query failing while accessing the Json with complex data model
> -------------------------------------------------------------------------
>
> Key: DRILL-8284
> URL: https://issues.apache.org/jira/browse/DRILL-8284
> Project: Apache Drill
> Issue Type: Bug
> Reporter: SHUBHAM KUMAR
> Priority: Major
>
> Apache SQL Query failing while accessing the Json with complex data model.
> Complex Json:
> Map object inside another map object then Array Object.
> Case1: When we have nested objects within array map, and map within map.
> {"attributes": [
>     {
>       "name": "webBrandName",
>       "value": {
>         "en-US": "Smashbox"
>       }
>     },
>     {
>       "name": "startDate",
>       "value": "2011-07-25T15:30:00.000Z"
>     }
>   ]
> }
> Case2: Having array with multiple map items with diff data types. eg. String
> and Boolean both type.
> {"attributes": [
>     {
>       "name": "startDate",
>       "value": "2011-07-25T15:30:00.000Z"
>     },
>     {
>       "name": "hasCBD",
>       "value": false
>     }
>   ]
> }
> Query:
> select flatten(attributes) as Var from dfs.`/filepath/filename.json`
>
> Error:
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
> IndexOutOfBoundsException: readerIndex: 0, writerIndex: 1764642048 (expected:
> 0 <= readerIndex <= writerIndex <= capacity(0)) Fragment: 0:0 Please, refer
> to logs for more information. [Error Id: c5a3b8fa-cad1-4c9a-8673-de5745e9170b
> on GGNUWT461535L.ad.infosys.com:31010]
[jira] [Created] (DRILL-8284) Apache SQL Query failing while accessing the Json with complex data model
SHUBHAM KUMAR created DRILL-8284:

    Summary: Apache SQL Query failing while accessing the Json with complex data model
    Key: DRILL-8284
    URL: https://issues.apache.org/jira/browse/DRILL-8284
    Project: Apache Drill
    Issue Type: Bug
    Reporter: SHUBHAM KUMAR

Apache SQL Query failing while accessing the Json with complex data model.

Complex Json:
Map object inside another map object then Array Object.

Case1: When we have nested objects within array map, and map within map.
{"attributes": [
    {
      "name": "webBrandName",
      "value": {
        "en-US": "Smashbox"
      }
    },
    {
      "name": "startDate",
      "value": "2011-07-25T15:30:00.000Z"
    }
  ]
}

Case2: Having array with multiple map items with diff data types. eg. String and Boolean both type.
{"attributes": [
    {
      "name": "startDate",
      "value": "2011-07-25T15:30:00.000Z"
    },
    {
      "name": "hasCBD",
      "value": false
    }
  ]
}

Query:
select flatten(attributes) as Var from dfs.`/filepath/filename.json`

Error:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 1764642048 (expected: 0 <= readerIndex <= writerIndex <= capacity(0)) Fragment: 0:0 Please, refer to logs for more information. [Error Id: c5a3b8fa-cad1-4c9a-8673-de5745e9170b on GGNUWT461535L.ad.infosys.com:31010]
[jira] [Created] (DRILL-8283) Implement a configurable file listing size limit
James Turton created DRILL-8283:

    Summary: Implement a configurable file listing size limit
    Key: DRILL-8283
    URL: https://issues.apache.org/jira/browse/DRILL-8283
    Project: Apache Drill
    Issue Type: Improvement
    Components: Storage - Other
    Affects Versions: 1.20.2
    Reporter: James Turton
    Assignee: James Turton
    Fix For: 1.20.3

Currently a malicious, or merely an unwitting user can crash their Drill foreman by sending
{code:java}
select * from dfs.huge_workspace limit 10
{code}
causing the query planner to recurse over every file in huge_workspace and culminating in
{code:java}
2022-08-09 15:13:22,251 [1d0da29f-e50c-fd51-43d9-8a5086d52c4e:foreman] ERROR o.a.drill.common.CatastrophicFailure - Catastrophic Failure Occurred, exiting. Information message: Unable to handle out of memory condition in Foreman.
java.lang.OutOfMemoryError: null
{code}
if there are enough files in huge_workspace. A SHOW FILES command can produce the same effect.

This issue proposes a new BOOT option named drill.exec.storage.file.max_listing_size with a default value of, say 10 000. If a file listing task exceeds this limit then the current operation is terminated with a UserException and runaway resource usage is prevented.
[jira] [Commented] (DRILL-7856) Add lgtm badge to Drill and fix alerts
[ https://issues.apache.org/jira/browse/DRILL-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583648#comment-17583648 ]

ASF GitHub Bot commented on DRILL-7856:

cgivre closed pull request #2187: DRILL-7856 Add lgtm badge to Drill and fix alerts
URL: https://github.com/apache/drill/pull/2187

> Add lgtm badge to Drill and fix alerts
> --------------------------------------
>
> Key: DRILL-7856
> URL: https://issues.apache.org/jira/browse/DRILL-7856
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.18.0
> Reporter: Vitalii Diravka
> Priority: Trivial
> Labels: badge, github
>
> Consider adding new badges to Drill github, for instance _lgtm_ badges (code
> quality and alerts number):
> [https://lgtm.com/projects/g/apache/drill/context:java]
> As an example please check:
> [https://github.com/kaitoy/pcap4j]
> As a separate ticket can be considered decreasing the number of alerts of
> Drill project:
> https://lgtm.com/projects/g/apache/drill/alerts/?mode=list
[jira] [Commented] (DRILL-7856) Add lgtm badge to Drill and fix alerts
[ https://issues.apache.org/jira/browse/DRILL-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17583647#comment-17583647 ]

ASF GitHub Bot commented on DRILL-7856:

cgivre commented on PR #2187:
URL: https://github.com/apache/drill/pull/2187#issuecomment-1224126726

LGTM is closing in Dec, 2022.
https://github.blog/2022-08-15-the-next-step-for-lgtm-com-github-code-scanning/

> Add lgtm badge to Drill and fix alerts
> --------------------------------------
>
> Key: DRILL-7856
> URL: https://issues.apache.org/jira/browse/DRILL-7856
> Project: Apache Drill
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: 1.18.0
> Reporter: Vitalii Diravka
> Priority: Trivial
> Labels: badge, github
>
> Consider adding new badges to Drill github, for instance _lgtm_ badges (code
> quality and alerts number):
> [https://lgtm.com/projects/g/apache/drill/context:java]
> As an example please check:
> [https://github.com/kaitoy/pcap4j]
> As a separate ticket can be considered decreasing the number of alerts of
> Drill project:
> https://lgtm.com/projects/g/apache/drill/alerts/?mode=list
[jira] [Updated] (DRILL-8282) Upgrade to hadoop-common 3.2.4 due to CVE
[ https://issues.apache.org/jira/browse/DRILL-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Turton updated DRILL-8282:

    Summary: Upgrade to hadoop-common 3.2.4 due to CVE  (was: upgrade to hadoop-common 3.2.4 due to cve )

> Upgrade to hadoop-common 3.2.4 due to CVE
> -----------------------------------------
>
> Key: DRILL-8282
> URL: https://issues.apache.org/jira/browse/DRILL-8282
> Project: Apache Drill
> Issue Type: Improvement
> Reporter: PJ Fanning
> Priority: Major
>
> https://github.com/advisories/GHSA-8wm5-8h9c-47pc
> * this change requires some reload4j dependency changes too - see broken
> build - https://github.com/apache/drill/pull/2628