xiangfu0 opened a new pull request, #16666:
URL: https://github.com/apache/pinot/pull/16666

   ## 🚀 Overview
   
   This PR adds comprehensive **Pinot Proxy** and **gRPC configuration 
support** to the pinot-spark-3-connector, enabling secure production 
deployments where the proxy is the only exposed endpoint.
   
   **Reference Implementation:** Based on [Trino PR 
#13015](https://github.com/trinodb/trino/pull/13015/files) for feature parity.
   
   ## 🔧 Pinot Proxy Support
   
   ### New Configuration Options
   - `proxy.enabled` - Use Pinot Proxy for controller and broker requests 
(default: false)
   
   ### Key Features
   - ✅ HTTP proxy forwarding with `FORWARD_HOST` and `FORWARD_PORT` headers
   - ✅ Proxy routing for all controller and broker API requests
   - ✅ Secure cluster access through proxy-only endpoints
   - ✅ Works with existing HTTPS and authentication features
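   
   As a rough illustration of the header forwarding listed above, the sketch below derives the two headers from a `host:port` target. The `ProxyHeaders` object and its `forTarget` method are hypothetical names for illustration only; the connector's actual logic lives in `HttpUtils.sendGetRequestWithProxyHeaders()` and may differ.
   
   ```scala
   // Hypothetical helper, not the connector's actual API: builds the
   // forwarding headers for a request that is sent to the proxy but is
   // meant for the controller or broker at `target`.
   object ProxyHeaders {
     val ForwardHost = "FORWARD_HOST"
     val ForwardPort = "FORWARD_PORT"
   
     /** Splits "host:port" and returns the two forwarding headers. */
     def forTarget(target: String): Map[String, String] = {
       val idx = target.lastIndexOf(':')
       require(idx > 0 && idx < target.length - 1, s"expected host:port, got '$target'")
       val host = target.substring(0, idx)
       val port = target.substring(idx + 1)
       require(port.forall(_.isDigit), s"port must be numeric, got '$port'")
       Map(ForwardHost -> host, ForwardPort -> port)
     }
   }
   ```
   
   For example, `ProxyHeaders.forTarget("pinot-broker:8099")` yields a map with `FORWARD_HOST` set to `pinot-broker` and `FORWARD_PORT` set to `8099`; the request itself would still be addressed to the proxy endpoint.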
   
   ### Usage Example
   ```scala
   val data = spark.read
     .format("pinot")
     .option("table", "airlineStats")
     .option("tableType", "offline")
     .option("controller", "pinot-proxy:8080")  // Proxy endpoint
     .option("proxy.enabled", "true")
     .load()
   ```
   
   ## 🚀 Comprehensive gRPC Configuration
   
   ### New Configuration Options
   | Option | Description | Default |
   |--------|-------------|---------|
   | `grpc.port` | Pinot gRPC port | 8090 |
   | `grpc.max-inbound-message-size` | Max inbound message bytes | 128MB |
   | `grpc.use-plain-text` | Use plain text for gRPC communication | true |
   | `grpc.tls.keystore-type` | TLS keystore type | JKS |
   | `grpc.tls.keystore-path` | TLS keystore file location | None |
   | `grpc.tls.keystore-password` | TLS keystore password | None |
   | `grpc.tls.truststore-type` | TLS truststore type | JKS |
   | `grpc.tls.truststore-path` | TLS truststore file location | None |
   | `grpc.tls.truststore-password` | TLS truststore password | None |
   | `grpc.tls.ssl-provider` | SSL provider | JDK |
   | `grpc.proxy-uri` | Pinot Rest Proxy gRPC endpoint URI | None |
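   
   One way to picture how these options and defaults fit together is the sketch below, which parses them from a plain option map. The `GrpcOptions` case class and its field names are assumptions made for illustration (the real fields live in `PinotDataSourceReadOptions`), and it assumes the message size is given as a raw byte count, with the 128MB default taken as 128 * 1024 * 1024 bytes.
   
   ```scala
   // Hypothetical holder for the gRPC options in the table above; the
   // connector's real class and field names may differ.
   final case class GrpcOptions(
       port: Int = 8090,
       maxInboundMessageSize: Int = 128 * 1024 * 1024, // assumed: 128MB in bytes
       usePlainText: Boolean = true,
       keystoreType: String = "JKS",
       keystorePath: Option[String] = None,
       keystorePassword: Option[String] = None,
       truststoreType: String = "JKS",
       truststorePath: Option[String] = None,
       truststorePassword: Option[String] = None,
       sslProvider: String = "JDK",
       proxyUri: Option[String] = None)
   
   object GrpcOptions {
     /** Reads the documented option keys, falling back to the table defaults. */
     def fromMap(opts: Map[String, String]): GrpcOptions = {
       val d = GrpcOptions()
       GrpcOptions(
         port = opts.get("grpc.port").map(_.toInt).getOrElse(d.port),
         maxInboundMessageSize = opts.get("grpc.max-inbound-message-size")
           .map(_.toInt).getOrElse(d.maxInboundMessageSize),
         usePlainText = opts.get("grpc.use-plain-text")
           .map(_.toBoolean).getOrElse(d.usePlainText),
         keystoreType = opts.getOrElse("grpc.tls.keystore-type", d.keystoreType),
         keystorePath = opts.get("grpc.tls.keystore-path"),
         keystorePassword = opts.get("grpc.tls.keystore-password"),
         truststoreType = opts.getOrElse("grpc.tls.truststore-type", d.truststoreType),
         truststorePath = opts.get("grpc.tls.truststore-path"),
         truststorePassword = opts.get("grpc.tls.truststore-password"),
         sslProvider = opts.getOrElse("grpc.tls.ssl-provider", d.sslProvider),
         proxyUri = opts.get("grpc.proxy-uri"))
     }
   }
   ```
   
   Options that are absent from the map keep the defaults from the table, so an empty map produces the plain-text configuration on port 8090.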
   
   ### Key Features
   - ✅ Complete TLS/SSL configuration for secure gRPC communication
   - ✅ gRPC proxy support with `FORWARD_HOST` and `FORWARD_PORT` metadata
   - ✅ Configurable message sizes and connection pooling
   - ✅ Support for multiple SSL providers (JDK, OPENSSL)
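   
   To make the TLS options more concrete, here is a minimal sketch of loading the keystore that `grpc.tls.keystore-path`, `grpc.tls.keystore-password`, and `grpc.tls.keystore-type` would point at, using only the JDK's `java.security.KeyStore`. The `KeystoreLoader` object is a hypothetical name, and how `GrpcUtils` actually wires the keystore into the gRPC channel is not shown in this description.
   
   ```scala
   import java.io.FileInputStream
   import java.security.KeyStore
   
   // Hypothetical helper for illustration; not part of the connector.
   object KeystoreLoader {
     /** Loads a keystore of the given type (for example "JKS") from a file. */
     def load(path: String, password: String, storeType: String = "JKS"): KeyStore = {
       val keystore = KeyStore.getInstance(storeType)
       val in = new FileInputStream(path)
       // Always close the stream, even if the password or format is wrong.
       try keystore.load(in, password.toCharArray)
       finally in.close()
       keystore
     }
   }
   ```
   
   A truststore could be loaded the same way from the `grpc.tls.truststore-*` options; a wrong password or a corrupt file surfaces as an `IOException` from `KeyStore.load`.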
   
   ### Usage Example
   ```scala
   // gRPC with TLS and proxy
   val data = spark.read
     .format("pinot")
     .option("table", "airlineStats")
     .option("tableType", "offline")
     .option("proxy.enabled", "true")
     .option("grpc.proxy-uri", "pinot-proxy:8094")
     .option("grpc.use-plain-text", "false")
     .option("grpc.tls.keystore-path", "/path/to/grpc-keystore.jks")
     .option("grpc.tls.keystore-password", "keystore-password")
     .load()
   ```
   
   ## 🏗️ Architecture Changes
   
   ### New Components
   - **`GrpcUtils`** - Complete gRPC channel management and proxy metadata 
handling
   - **`HttpUtils.sendGetRequestWithProxyHeaders()`** - Proxy-aware HTTP 
requests
   
   ### Enhanced Components
   - **`PinotDataSourceReadOptions`** - 12 new configuration fields for proxy 
and gRPC
   - **`PinotClusterClient`** - All API methods now support proxy parameters
   - **`PinotServerDataFetcher`** - gRPC proxy configuration integration
   - **All Spark DataSource V2 components** - Updated to pass proxy parameters
   
   ### Files Changed
   - 🆕 `GrpcUtils.scala` (125 lines) - gRPC utilities
   - 🆕 `GrpcUtilsTest.scala` (165 lines) - gRPC testing
   - 📝 12 files modified with comprehensive proxy and gRPC support
   
   ## 🧪 Testing
   
   ### Test Results
   - ✅ **39/39 tests passing** (including 8 new proxy/gRPC tests)
   - ✅ Configuration parsing validation tests
   - ✅ gRPC channel creation and proxy metadata tests
   - ✅ Error handling and edge case tests
   - ✅ **Full backward compatibility** - existing code works unchanged
   
   ### Test Coverage
   - Proxy configuration parsing and validation
   - gRPC configuration with TLS settings
   - Integration with existing HTTPS and authentication features
   - Error handling for invalid configurations
   
   ## 📚 Documentation
   
   ### Enhanced README
   - ✅ **Comprehensive Pinot Proxy Support** section with examples
   - ✅ **Detailed gRPC Configuration** section with TLS examples
   - ✅ **Security Best Practices** for production deployments
   - ✅ **Real-world usage examples** combining proxy + gRPC + HTTPS + 
authentication
   
   ## Production Benefits
   
   ### Security
   - 🔒 **Secure proxy-only access** to Pinot clusters
   - 🔒 **TLS/SSL support** for both HTTP and gRPC
   - 🔒 **Works with existing authentication** (Bearer tokens, API keys)
   
   ### Performance
   - 🚀 **Optimized gRPC** with configurable message sizes
   - 🚀 **Connection pooling** for high-throughput scenarios
   - 🚀 **Efficient proxy routing** with minimal overhead
   
   ### Compatibility
   - 🔄 **100% backward compatible** - existing code continues to work
   - 🔄 **Feature parity** with Trino's implementation
   - 🔄 **Production-ready** with comprehensive error handling
   
   ## 🔗 Integration Example
   
   ```scala
   // Complete production example: proxy + gRPC + TLS + authentication
   val data = spark.read
     .format("pinot")
     .option("table", "airlineStats")
     .option("tableType", "offline")
     .option("controller", "pinot-proxy:8080")
     .option("proxy.enabled", "true")
     .option("useHttps", "true")
     .option("authToken", "my-secure-token")
     .option("grpc.proxy-uri", "pinot-proxy:8094")
     .option("grpc.use-plain-text", "false")
     .option("grpc.tls.keystore-path", "/path/to/grpc-keystore.jks")
     .option("grpc.tls.keystore-password", "keystore-pass")
     .load()
   ```
   
   This enables **secure, scalable Pinot cluster access** through proxy 
infrastructure with full gRPC and TLS support for production environments! 🎉


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

