Raxx55 opened a new issue, #15566:
URL: https://github.com/apache/pinot/issues/15566
### **_The query contains JSON_EXTRACT_SCALAR in the SELECT clause, a LEFT
JOIN between two tables, and a column from the right table used in the WHERE
clause, which causes an error in Pinot._**
**Note:** I am working with **Pinot 1.2.0** version
### **Query:** Not working
select s.name, s.createdby, **json_extract_scalar(s.attributes,
'$.42fb8f5f-9a2f-4e06-aa28-0318df4e632d', 'STRING', 'null') AS
'42fb8f5f-9a2f-4e06-aa28-0318df4e632d'**
from student s **left join** class c on s.classId = c.id
where (s.professorId = '870e3f0d-25ac-45b6-bc42-841cdeda8b78') and **(c.id =
'0c9b1cea-5b98-4ac0-89dc-1ad914b5bf13')** limit 50000
### **Response:**
```
Error Code: 200
QueryExecutionError:
Unable to execute query plan for request: 1129558639000000402 on server:
my-pinot-release-server-0.my-pinot-release-server-headless.pinot.svc.cluster.local@{51314,18307},
ERROR: java.util.concurrent.ExecutionException: java.lang.RuntimeException:
Caught exception while submitting request: 1129558639000000402, stage: 3
at
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
at
org.apache.pinot.query.service.server.QueryServer.submit(QueryServer.java:156)
at
org.apache.pinot.common.proto.PinotQueryWorkerGrpc$MethodHandlers.invoke(PinotQueryWorkerGrpc.java:284)
...
Caused by: java.lang.RuntimeException: Caught exception while submitting
request: 1129558639000000402, stage: 3
at
org.apache.pinot.query.service.server.QueryServer.lambda$submit$1(QueryServer.java:144)
at
java.base/java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1804)
... 3 more
Caused by: java.util.concurrent.ExecutionException:
java.lang.IllegalStateException: Cannot find function with name:
JSON_EXTRACT_SCALAR
at
java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
at
java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2096)
at
org.apache.pinot.query.service.server.QueryServer.lambda$submit$1(QueryServer.java:141)
... 4 more
...
Caused by: java.lang.IllegalStateException: Cannot find function with name:
JSON_EXTRACT_SCALAR
at
org.apache.pinot.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:604)
at
org.apache.pinot.query.runtime.operator.operands.FunctionOperand.<init>(FunctionOperand.java:51)
at
org.apache.pinot.query.runtime.operator.operands.TransformOperandFactory.getTransformOperand(TransformOperandFactory.java:81)
at
org.apache.pinot.query.runtime.operator.operands.TransformOperandFactory.getTransformOperand(TransformOperandFactory.java:33)
org.apache.pinot.query.service.dispatch.QueryDispatcher.submit(QueryDispatcher.java:198)
org.apache.pinot.query.service.dispatch.QueryDispatcher.submitAndReduce(QueryDispatcher.java:95)
org.apache.pinot.broker.requesthandler.MultiStageBrokerRequestHandler.handleRequest(MultiStageBrokerRequestHandler.java:219)
org.apache.pinot.broker.requesthandler.BaseBrokerRequestHandler.handleRequest(BaseBrokerRequestHandler.java:133)
```
### **Observation:**
1. ### **The same query without right tables column in where clause(here its
class id):** Working
select s.name, s.createdby, **json_extract_scalar(s.attributes,
'$.42fb8f5f-9a2f-4e06-aa28-0318df4e632d', 'STRING', 'null') AS
'42fb8f5f-9a2f-4e06-aa28-0318df4e632d'**
from student s **left join** class c on s.classId = c.id
where (s.professorId = '870e3f0d-25ac-45b6-bc42-841cdeda8b78') limit 50000
**Response:**

2. ### **The same query but with "join" instead of "left join":** Working
select s.name, s.createdby, **json_extract_scalar(s.attributes,
'$.42fb8f5f-9a2f-4e06-aa28-0318df4e632d', 'STRING', 'null') AS
'42fb8f5f-9a2f-4e06-aa28-0318df4e632d'**
from student s **join** class c on s.classId = c.id
where (s.professorId = '870e3f0d-25ac-45b6-bc42-841cdeda8b78') **and (c.id =
'0c9b1cea-5b98-4ac0-89dc-1ad914b5bf13')** limit 50000
**Response:**

### Few more details like schema, tables and actual data are as below,
**Student Schema:**
```
{
"schemaName": "student",
"primaryKeyColumns": [
"id"
],
"dimensionFieldSpecs": [
{
"name": "id",
"dataType": "STRING",
"maxLength": 64
},
{
"name": "name",
"dataType": "STRING",
"maxLength": 256
},
{
"name": "professorId",
"dataType": "STRING",
"maxLength": 64
},
{
"name": "classId",
"dataType": "STRING",
"maxLength": 64
},
{
"name": "createdBy",
"dataType": "STRING",
"maxLength": 64
},
{
"name": "updatedBy",
"dataType": "STRING",
"maxLength": 64
},
{
"name": "attributes",
"dataType": "JSON"
}
],
"dateTimeFieldSpecs": [
{
"name": "creationTime",
"dataType": "LONG",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:MILLISECONDS",
"defaultNullValue": 0
},
{
"name": "updationTime",
"dataType": "LONG",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:MILLISECONDS",
"defaultNullValue": 0
}
]
}
```
**Student table:**
```
{
"tableName": "student",
"tableType": "REALTIME",
"tenants": {
"broker": "DefaultTenant",
"server": "DefaultTenant"
},
"segmentsConfig": {
"schemaName": "student",
"timeColumnName": "updationTime",
"timeType": "MILLISECONDS",
"replication": "1",
"replicasPerPartition": "1"
},
"tableIndexConfig": {
"loadMode": "MMAP",
"invertedIndexColumns": [],
"createInvertedIndexDuringSegmentGeneration": false,
"rangeIndexColumns": [],
"sortedColumn": [],
"bloomFilterColumns": [],
"bloomFilterConfigs": null,
"noDictionaryColumns": [],
"onHeapDictionaryColumns": [],
"varLengthDictionaryColumns": [],
"enableDefaultStarTree": false,
"starTreeIndexConfigs": null,
"enableDynamicStarTreeCreation": false,
"segmentPartitionConfig": null,
"columnMinMaxValueGeneratorMode": null,
"aggregateMetrics": false,
"nullHandlingEnabled": false
},
"metadata": {},
"ingestionConfig": {
"filterConfig": null,
"transformConfigs": null,
"streamIngestionConfig": {
"streamConfigMaps": [
{
"streamType": "kafka",
"stream.kafka.topic.name": "student_entity_data",
"stream.kafka.broker.list": "{{kafka.broker.list}}",
"stream.kafka.consumer.type": "lowlevel",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
"stream.kafka.decoder.prop.format": "JSON",
"key.serializer":
"org.apache.kafka.common.serialization.ByteArraySerializer",
"value.serializer":
"org.apache.kafka.common.serialization.ByteArraySerializer",
"realtime.segment.flush.threshold.rows": "30000",
"realtime.segment.flush.threshold.time": "12h",
"topic.consumption.rate.limit": "6000"
}
]
}
},
"quota": {
"storage": null,
"maxQueriesPerSecond": null
},
"task": {
"taskTypeConfigsMap": {
}
},
"routing": {
"segmentPrunerTypes": null,
"instanceSelectorType": "strictReplicaGroup"
},
"query": {
"timeoutMs": null
},
"fieldConfigList": null,
"upsertConfig": {
"mode": "FULL",
"deleteRecordColumn": "deleted"
},
"tierConfigs": null
}
```
**Student table data:**
```
{
"id": "6f02ae8c-e7ca-4671-b759-5e18cb97c74f",
"name": "Alen",
"professorId": "870e3f0d-25ac-45b6-bc42-841cdeda8b78",
"classId": "0c9b1cea-5b98-4ac0-89dc-1ad914b5bf13",
"createdBy": "870e3f0d-25ac-45b6-bc42-841cdeda8b78",
"creationTime": 1740995630168,
"updatedBy": null,
"updationTime": null,
"attributes": {"42fb8f5f-9a2f-4e06-aa28-0318df4e632d":"Male",
"d30ef901-8acc-4872-887c-d048073fe5d4":5}
},
{
"id": "2743b65e-7ec3-4072-90db-1921fe5728d3",
"name": "Lana",
"professorId": "870e3f0d-25ac-45b6-bc42-841cdeda8b78",
"classId": "0c9b1cea-5b98-4ac0-89dc-1ad914b5bf13",
"createdBy": "870e3f0d-25ac-45b6-bc42-841cdeda8b78",
"creationTime": 1741064094982,
"updatedBy": "870e3f0d-25ac-45b6-bc42-841cdeda8b78",
"updationTime": 1741064099981,
"attributes": {"42fb8f5f-9a2f-4e06-aa28-0318df4e632d":"Female",
"d30ef901-8acc-4872-887c-d048073fe5d4":7}
},
{
"id": "cb18e2c6-6095-49c7-b991-70fc5c981a0e",
"name": "James",
"professorId": "dbac7d66-e443-473d-829c-43bb50a796f0",
"classId": "e2428f7d-565e-4b3b-9423-549a850a2ef5",
"creationTime": 1741253546022,
"createdBy": "dbac7d66-e443-473d-829c-43bb50a796f0",
"updatedBy": null,
"updationTime": null,
"attributes": {"42fb8f5f-9a2f-4e06-aa28-0318df4e632d":"Male",
"d30ef901-8acc-4872-887c-d048073fe5d4":3}
},
{
"id": "74eb710e-63c8-4524-943e-9586055fa840",
"name": "orry",
"professorId": "dbac7d66-e443-473d-829c-43bb50a796f0",
"classId": null,
"creationTime": 1741253946013,
"createdBy": "dbac7d66-e443-473d-829c-43bb50a796f0",
"updatedBy": null,
"updationTime": null,
"attributes": {"42fb8f5f-9a2f-4e06-aa28-0318df4e632d":"Female"}
}
```
**Class schema:**
```
{
"schemaName": "class",
"primaryKeyColumns": [
"id"
],
"dimensionFieldSpecs": [
{
"name": "id",
"dataType": "STRING",
"maxLength": 64
},
{
"name": "name",
"dataType": "STRING",
"maxLength": 256
},
{
"name": "displayOrder",
"dataType": "INT"
}
],
"metricFieldSpecs": [
],
"dateTimeFieldSpecs": [
{
"name": "dateTime",
"dataType": "LONG",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:MILLISECONDS"
}
]
}
```
**Class table:**
```
{
"tableName": "class",
"tableType": "REALTIME",
"tenants": {
"broker": "DefaultTenant",
"server": "DefaultTenant"
},
"segmentsConfig": {
"schemaName": "class",
"timeColumnName": "dateTime",
"timeType": "MILLISECONDS",
"replication": "1",
"replicasPerPartition": "1"
},
"tableIndexConfig": {
"loadMode": "MMAP",
"invertedIndexColumns": [],
"createInvertedIndexDuringSegmentGeneration": false,
"rangeIndexColumns": [],
"sortedColumn": [],
"bloomFilterColumns": [],
"bloomFilterConfigs": null,
"noDictionaryColumns": [],
"onHeapDictionaryColumns": [],
"varLengthDictionaryColumns": [],
"enableDefaultStarTree": false,
"starTreeIndexConfigs": null,
"enableDynamicStarTreeCreation": false,
"segmentPartitionConfig": null,
"columnMinMaxValueGeneratorMode": null,
"aggregateMetrics": false,
"nullHandlingEnabled": false
},
"metadata": {},
"ingestionConfig": {
"filterConfig": null,
"transformConfigs": null,
"streamIngestionConfig": {
"streamConfigMaps": [
{
"streamType": "kafka",
"stream.kafka.topic.name": "class_entity_data",
"stream.kafka.broker.list": "{{kafka.broker.list}}",
"stream.kafka.consumer.type": "lowlevel",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.consumer.factory.class.name":
"org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.decoder.class.name":
"org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
"stream.kafka.decoder.prop.format": "JSON",
"key.serializer":
"org.apache.kafka.common.serialization.ByteArraySerializer",
"value.serializer":
"org.apache.kafka.common.serialization.ByteArraySerializer",
"realtime.segment.flush.threshold.rows": "30000",
"realtime.segment.flush.threshold.time": "12h",
"topic.consumption.rate.limit": "6000"
}
]
}
},
"quota": {
"storage": null,
"maxQueriesPerSecond": null
},
"task": {
"taskTypeConfigsMap": {
}
},
"routing": {
"segmentPrunerTypes": null,
"instanceSelectorType": "strictReplicaGroup"
},
"query": {
"timeoutMs": null
},
"fieldConfigList": null,
"upsertConfig": {
"mode": "FULL",
"deleteRecordColumn": "deleted"
},
"tierConfigs": null
}
```
**Class table data:**
```
{
"id": "0c9b1cea-5b98-4ac0-89dc-1ad914b5bf13",
"name": "class A",
"displayOrder": 10000,
"dateTime": 1741695499859
},
{
"id": "e2428f7d-565e-4b3b-9423-549a850a2ef5",
"name": "class B",
"displayOrder": 20000,
"dateTime": 1741695499869
},
{
"id": "5724f202-3cfb-41f5-894f-b44b0389878e",
"name": "class C",
"displayOrder": 30000,
"dateTime": 1741695499879
},
{
"id": "5ff9b60b-d012-4ba7-a02b-4f74dcbc79ac",
"name": "class D",
"displayOrder": 40000,
"dateTime": 1741695499889
},
{
"id": "9a9af0ec-334c-4212-b5ff-51390f909dde",
"name": "class E",
"displayOrder": 50000,
"dateTime": 1741695499899
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]