GavinRay97 commented on issue #12618:
URL: https://github.com/apache/arrow/issues/12618#issuecomment-1072431572


   FWIW, there is already a pretty solid integration with JDBC `ResultSet` 
objects and conversion between JDBC types and Arrow types:
   
   - https://github.com/apache/arrow/blob/09497a976604c1960c5934e8f05dd8203700efd6/java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
   
   > What does a high level dataframe API bring that JDBC does not already give 
us?
   
   I guess maybe the easiest way to answer this is with a code sample.
   
   Say that I have an Arrow FlightSQL service, and I need to respond to 
`GetTables` with a list of Arrow objects describing the tables in my schema.
   
   In the FlightSQL example, this is the current implementation:
   
   - https://github.com/apache/arrow/blob/b4143d309c71c0247c056471a27fa7f03034cc76/java/flight/flight-sql/src/test/java/org/apache/arrow/flight/sql/example/FlightSqlExample.java#L436-L530
   
   This is only my personal opinion, so I wouldn't count it as an argument, but I found that code really hard to follow.
   
   But on a more realistic level -- what if you wanted to return a set of table 
descriptions from data that wasn't a JDBC `ResultSet`?
   
   Something like the below seems a lot easier to approach and more versatile (to me), at the cost of being less performant:
   
   ```java
   record FlightSQLGetTablesSchemaPOJO(String catalogName, String schemaName,
                                       String tableName, String tableType,
                                       DataFrame dataFrame) {
       public VectorSchemaRoot toArrowVectorSchemaRoot() {
           DataFrame.Builder builder = DataFrame.builder();
           builder.addColumn("catalog_name", Types.MinorType.VARCHAR.getType(), false);
           builder.addColumn("db_schema_name", Types.MinorType.VARCHAR.getType(), false);
           builder.addColumn("table_name", Types.MinorType.VARCHAR.getType(), false);
           builder.addColumn("table_type", Types.MinorType.VARCHAR.getType(), false);
           builder.addColumn("table_schema", Types.MinorType.VARBINARY.getType(), false);
   
           Map<String, Object> row = new HashMap<>();
           row.put("catalog_name", catalogName);
           row.put("db_schema_name", schemaName);
           row.put("table_name", tableName);
           row.put("table_type", tableType);
           row.put("table_schema",
                   new Schema(dataFrame.columns()).toByteArray());
   
           builder.addRow(row);
           DataFrame df = builder.build();
   
           return df.toArrowVectorSchemaRoot();
       }
   }
   
   public void getStreamTables(FlightSql.CommandGetTables command, CallContext context,
           ServerStreamListener listener) {
       try {
           DataFrame userSchema = DataFrame.builder()
                   .addColumn("id", Types.MinorType.INT.getType(), false)
                   .addColumn("name", Types.MinorType.VARCHAR.getType(), false)
                   .build();
   
           DataFrame todoSchema = DataFrame.builder()
                   .addColumn("id", Types.MinorType.INT.getType(), false)
                   .addColumn("description", Types.MinorType.VARCHAR.getType(), false)
                   .addColumn("completed", Types.MinorType.BIT.getType(), false)
                   .build();
   
           FlightSQLGetTablesSchemaPOJO userTable = new FlightSQLGetTablesSchemaPOJO(
                   "catalog1", "schema1", "user", "TABLE", userSchema);
   
           FlightSQLGetTablesSchemaPOJO todoTable = new FlightSQLGetTablesSchemaPOJO(
                   "catalog1", "schema1", "todo", "TABLE", todoSchema);
   
           // Use the same method name the record declares above.
           VectorSchemaRoot userVectorSchema = userTable.toArrowVectorSchemaRoot();
           VectorSchemaRoot todoVectorSchema = todoTable.toArrowVectorSchemaRoot();
   
           VectorSchemaRoot merged = DataFrame
                   .mergeDataFrames(true,
                           DataFrame.fromVectorSchemaRoot(userVectorSchema),
                           DataFrame.fromVectorSchemaRoot(todoVectorSchema))
                   .toArrowVectorSchemaRoot();
   
           listener.start(merged);
           listener.putNext();
       } catch (Exception e) {
           listener.error(e);
           e.printStackTrace();
       } finally {
           listener.completed();
       }
   }
   ```
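
To make the "less performant" trade-off concrete, here is a minimal sketch of what a row-oriented builder like the one imagined above has to do internally: collect rows as maps, then pivot them into one list per column. It uses only the JDK. `RowFrameSketch`, its `Builder`, and the use of plain `Class<?>` values as column types are hypothetical names invented for illustration, not Arrow APIs, and every column is assumed to be non-nullable (which is how I read the `false` argument in the example above).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical row-to-column builder; NOT part of Arrow. */
public class RowFrameSketch {

    /** Collects rows as maps, then pivots them into one list per column. */
    public static final class Builder {
        // Insertion-ordered so columns come out in declaration order.
        private final Map<String, Class<?>> columnTypes = new LinkedHashMap<>();
        private final List<Map<String, Object>> rows = new ArrayList<>();

        public Builder addColumn(String name, Class<?> type) {
            if (columnTypes.putIfAbsent(name, type) != null) {
                throw new IllegalArgumentException("Duplicate column: " + name);
            }
            return this;
        }

        public Builder addRow(Map<String, Object> row) {
            // Assumption: every column is non-nullable, so a missing or
            // wrongly typed value is rejected (isInstance(null) is false).
            for (Map.Entry<String, Class<?>> column : columnTypes.entrySet()) {
                Object value = row.get(column.getKey());
                if (!column.getValue().isInstance(value)) {
                    throw new IllegalArgumentException(
                            "Bad or missing value for column: " + column.getKey());
                }
            }
            rows.add(new LinkedHashMap<>(row)); // defensive copy
            return this;
        }

        /** The pivot: one pass over all rows for each declared column. */
        public Map<String, List<Object>> build() {
            Map<String, List<Object>> columns = new LinkedHashMap<>();
            for (String name : columnTypes.keySet()) {
                List<Object> values = new ArrayList<>(rows.size());
                for (Map<String, Object> row : rows) {
                    values.add(row.get(name));
                }
                columns.put(name, values);
            }
            return columns;
        }
    }

    public static Builder builder() {
        return new Builder();
    }

    public static void main(String[] args) {
        Map<String, List<Object>> columns = builder()
                .addColumn("table_name", String.class)
                .addColumn("table_type", String.class)
                .addRow(Map.of("table_name", "user", "table_type", "TABLE"))
                .addRow(Map.of("table_name", "todo", "table_type", "TABLE"))
                .build();
        // prints {table_name=[user, todo], table_type=[TABLE, TABLE]}
        System.out.println(columns);
    }
}
```

The extra map per row, the boxed values, and the second pass in `build()` are where the overhead comes from compared with writing straight into vectors. A real implementation would presumably write each column into an Arrow vector at the end instead of a `List<Object>`.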
   



