tsekityam opened a new pull request, #15315:
URL: https://github.com/apache/pinot/pull/15315

   This PR fixes #15300.
   
   ## What does this PR do?
   
   Adds a Spark connector option, `authorization`, that lets users set the 
authorization header. If the option is set, its value is sent as the 
`Authorization` header on requests to Pinot endpoints.
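
The behaviour described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the connector's actual code: `build_request`, the `options` dict, and the placeholder URL are hypothetical, and only the `authorization` option name comes from this PR.

```python
from urllib.request import Request


def build_request(url, options):
    """Build a GET request, adding an Authorization header only when the
    (hypothetical) options dict carries an `authorization` value."""
    request = Request(url, method="GET")
    auth_value = options.get("authorization")
    if auth_value:
        # Pass the option value through unchanged, e.g. "Basic xxxxx".
        request.add_header("Authorization", auth_value)
    return request


# Placeholder URL; no network call is made here.
with_auth = build_request("http://pinot.example/placeholder", {"authorization": "Basic xxxxx"})
without_auth = build_request("http://pinot.example/placeholder", {})
```

When the option is absent, the request is built exactly as before, so existing jobs that hit unprotected endpoints would be unaffected under this assumption.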
   
   ## Before this PR
   
   I have a Pinot instance in StarTree, and the endpoint of my instance is 
protected by authorization. I tried to use pinot-spark-3-connector to 
fetch data from Pinot into Databricks with the following code:
   ```scala
   var df = spark
           .read
           .format("pinot")
           .option("controller", "xxxxx")
           .option("broker", "xxxxx")
           .option("table", "xxxxx")
           .option("tableType", "hybrid")
           .option("useGrpcServer", "true")
           .option("usePushDownFilters", "true")
           .load()
   
   display(df)
   ```
   
   I saw this error
   
   ```
   com.databricks.backend.daemon.driver.scalakernel.SparkException: An error 
occurred while getting Pinot schema for table 'xxxxxx'
   ```
   
   ## After this PR
   
   I used the following code to fetch the data from pinot to databricks
   
   ```py
   df = (
       spark
       .read
       .format("pinot")
       .option("controller", "xxxxx")
       .option("broker", "xxxxx")
       .option("table", "xxxxx")
       .option("tableType", "offline")
       .option("authorization", "Basic xxxxx")
       .load()
   )
   
   display(df)
   ```
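
If the endpoint uses standard HTTP Basic authentication, the value passed to `authorization` is the word `Basic` followed by the Base64 encoding of `username:password`. A minimal sketch for building that value, assuming Basic auth is what the endpoint expects; `basic_auth_value` is a hypothetical helper and the credentials are placeholders:

```python
import base64


def basic_auth_value(username, password):
    """Return an HTTP Basic authorization value for the given credentials."""
    # Basic auth encodes "username:password" as Base64.
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


# Placeholder credentials; the result can be passed to .option("authorization", ...).
auth_value = basic_auth_value("user", "pass")
```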
   
   I can now get the schema from the `table/schema` endpoint, the instance 
info from the `/instances` endpoint, and the routing table from the 
`debug/routingTable/sql` endpoint.
   
   
   ## Note
   
   However, my Spark job still failed at the end because of a dependency 
issue, the same one as https://github.com/typelevel/cats/issues/3628. I am 
working on another PR to replace the `cats` library with `jackson`. 
   
   Given that the dependency issue is not part of #15300, I will raise another 
issue and PR about it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

