Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-28 Thread via GitHub


alamb merged PR #10531:
URL: https://github.com/apache/datafusion/pull/10531


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-27 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1616109221


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -512,6 +572,62 @@ pub fn to_substrait_rel(
 }
 }
 
+fn to_substrait_named_struct(schema: ) -> Result {
+// Substrait wants a list of all field names, including nested fields from 
structs,
+// also from within lists and maps. However, it does not want the list and 
map field names
+// themselves - only structs are considered to have useful names.
+fn names_dfs(dtype: ) -> Result> {
+match dtype {
+DataType::Struct(fields) => {
+let mut names = Vec::new();
+for field in fields {
+names.push(field.name().to_string());
+names.extend(names_dfs(field.data_type())?);
+}
+Ok(names)
+}
+DataType::List(l) => names_dfs(l.data_type()),

Review Comment:
   I think so too, let's keep it as a TODO.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-27 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1616085373


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -512,6 +572,62 @@ pub fn to_substrait_rel(
 }
 }
 
+fn to_substrait_named_struct(schema: ) -> Result {
+// Substrait wants a list of all field names, including nested fields from 
structs,
+// also from within lists and maps. However, it does not want the list and 
map field names
+// themselves - only structs are considered to have useful names.
+fn names_dfs(dtype: ) -> Result> {
+match dtype {
+DataType::Struct(fields) => {
+let mut names = Vec::new();
+for field in fields {
+names.push(field.name().to_string());
+names.extend(names_dfs(field.data_type())?);
+}
+Ok(names)
+}
+DataType::List(l) => names_dfs(l.data_type()),

Review Comment:
   Nice! I added that in 
https://github.com/apache/datafusion/pull/10531/commits/4b8babe5353cff2740c1211cff6973dc83264098,
 though it seems that arrow_cast doesn't [yet support 
structs](https://github.com/apache/datafusion/blob/2e6beb434d45693bc8c7ba636b0407c26c04fc37/datafusion/functions/src/core/arrow_cast.rs#L194),
 so the test still wouldn't have caught this. Unless you can think of some 
workaround? I'm not familiar at all with arrow_cast and the like  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-27 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1616026851


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -512,6 +572,62 @@ pub fn to_substrait_rel(
 }
 }
 
+fn to_substrait_named_struct(schema: ) -> Result {
+// Substrait wants a list of all field names, including nested fields from 
structs,
+// also from within lists and maps. However, it does not want the list and 
map field names
+// themselves - only structs are considered to have useful names.
+fn names_dfs(dtype: ) -> Result> {
+match dtype {
+DataType::Struct(fields) => {
+let mut names = Vec::new();
+for field in fields {
+names.push(field.name().to_string());
+names.extend(names_dfs(field.data_type())?);
+}
+Ok(names)
+}
+DataType::List(l) => names_dfs(l.data_type()),

Review Comment:
   `arrow_cast` is a solution .
   ```sql
   VALUES(arrow_cast([1,2], 'LargeList(Int64)'));
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-27 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1615825150


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -512,6 +572,62 @@ pub fn to_substrait_rel(
 }
 }
 
+fn to_substrait_named_struct(schema: ) -> Result {
+// Substrait wants a list of all field names, including nested fields from 
structs,
+// also from within lists and maps. However, it does not want the list and 
map field names
+// themselves - only structs are considered to have useful names.
+fn names_dfs(dtype: ) -> Result> {
+match dtype {
+DataType::Struct(fields) => {
+let mut names = Vec::new();
+for field in fields {
+names.push(field.name().to_string());
+names.extend(names_dfs(field.data_type())?);
+}
+Ok(names)
+}
+DataType::List(l) => names_dfs(l.data_type()),

Review Comment:
   Is there a way to create a LargeList using VALUES for testing, without 
writing a billion values in there?  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-27 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1615824249


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -512,6 +572,62 @@ pub fn to_substrait_rel(
 }
 }
 
+fn to_substrait_named_struct(schema: ) -> Result {
+// Substrait wants a list of all field names, including nested fields from 
structs,
+// also from within lists and maps. However, it does not want the list and 
map field names
+// themselves - only structs are considered to have useful names.
+fn names_dfs(dtype: ) -> Result> {
+match dtype {
+DataType::Struct(fields) => {
+let mut names = Vec::new();
+for field in fields {
+names.push(field.name().to_string());
+names.extend(names_dfs(field.data_type())?);
+}
+Ok(names)
+}
+DataType::List(l) => names_dfs(l.data_type()),

Review Comment:
   Thanks, fixed 
https://github.com/apache/datafusion/pull/10531/commits/9c83ba469ece074a9611f4db871f2e0044652605



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-27 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1615803887


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -685,6 +685,19 @@ async fn roundtrip_literal_struct() -> Result<()> {
 .await
 }
 
+#[tokio::test]
+async fn roundtrip_values() -> Result<()> {
+assert_expected_plan(
+"VALUES (1, 'a', [[-213.1, NULL, 5.5, 2.0, 1.0], []], STRUCT(true, 1, 
CAST(NULL AS STRING))), (NULL, NULL, NULL, NULL)",
+"Values: (Int64(1), Utf8(\"a\"), List([[-213.1, , 5.5, 2.0, 1.0], 
[]]), Struct({c0:true,c1:1,c2:})), (Int64(NULL), Utf8(NULL), List(), 
Struct({c0:,c1:,c2:}))")
+.await
+}
+
+#[tokio::test]
+async fn roundtrip_empty_relation() -> Result<()> {
+roundtrip("SELECT * FROM (VALUES (1)) LIMIT 0").await

Review Comment:
   Good catch! Fixed in 
https://github.com/apache/datafusion/pull/10531/commits/24d07c9d92d90f6f3287caca4b70e0f6e2d971f6
 and added schema verification to most tests



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-25 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1614673020


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -685,6 +685,19 @@ async fn roundtrip_literal_struct() -> Result<()> {
 .await
 }
 
+#[tokio::test]
+async fn roundtrip_values() -> Result<()> {
+assert_expected_plan(
+"VALUES (1, 'a', [[-213.1, NULL, 5.5, 2.0, 1.0], []], STRUCT(true, 1, 
CAST(NULL AS STRING))), (NULL, NULL, NULL, NULL)",
+"Values: (Int64(1), Utf8(\"a\"), List([[-213.1, , 5.5, 2.0, 1.0], 
[]]), Struct({c0:true,c1:1,c2:})), (Int64(NULL), Utf8(NULL), List(), 
Struct({c0:,c1:,c2:}))")
+.await
+}
+
+#[tokio::test]
+async fn roundtrip_empty_relation() -> Result<()> {
+roundtrip("SELECT * FROM (VALUES (1)) LIMIT 0").await

Review Comment:
   We might need a test to verify the schemas. For this test, the plan str on 
both sides would just be `EmptyRelation`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-25 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1614671030


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -685,6 +685,19 @@ async fn roundtrip_literal_struct() -> Result<()> {
 .await
 }
 
+#[tokio::test]
+async fn roundtrip_values() -> Result<()> {
+assert_expected_plan(
+"VALUES (1, 'a', [[-213.1, NULL, 5.5, 2.0, 1.0], []], STRUCT(true, 1, 
CAST(NULL AS STRING))), (NULL, NULL, NULL, NULL)",
+"Values: (Int64(1), Utf8(\"a\"), List([[-213.1, , 5.5, 2.0, 1.0], 
[]]), Struct({c0:true,c1:1,c2:})), (Int64(NULL), Utf8(NULL), List(), 
Struct({c0:,c1:,c2:}))")
+.await
+}
+
+#[tokio::test]
+async fn roundtrip_empty_relation() -> Result<()> {
+roundtrip("SELECT * FROM (VALUES (1)) LIMIT 0").await

Review Comment:
   I've slightly modified this test and printed out the schema on the consumer 
side, but it seems to be incorrect.
   ```rust
   #[tokio::test]
   async fn roundtrip_empty_relation() -> Result<()> {
   roundtrip("SELECT * FROM (VALUES ([STRUCT(1 as a)], 2)) LIMIT 0").await
   }
   ```
   ```sh
   plan2 schema: DFSchema { inner: Schema { fields: [
   Field { name: "column1", data_type: List(Field { name: "item", data_type: 
Struct([Field { name: "c0", data_type: Int64, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }]), nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }), nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }, 
   # Its name should be "column2"
   Field { name: "a", data_type: Int64, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }], metadata: {} }, 
   field_qualifiers: [None, None], functional_dependencies: 
FunctionalDependencies { deps: [] } }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-25 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1614666495


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -512,6 +572,62 @@ pub fn to_substrait_rel(
 }
 }
 
+fn to_substrait_named_struct(schema: ) -> Result {
+// Substrait wants a list of all field names, including nested fields from 
structs,
+// also from within lists and maps. However, it does not want the list and 
map field names
+// themselves - only structs are considered to have useful names.
+fn names_dfs(dtype: ) -> Result> {
+match dtype {
+DataType::Struct(fields) => {
+let mut names = Vec::new();
+for field in fields {
+names.push(field.name().to_string());
+names.extend(names_dfs(field.data_type())?);
+}
+Ok(names)
+}
+DataType::List(l) => names_dfs(l.data_type()),

Review Comment:
   I think we also need to consider `DataType::LargeList` here because 
`to_substrait_type` supports it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-24 Thread via GitHub


Blizzara commented on PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#issuecomment-2129741689

   @jonahgao @alamb this last one is ready now too :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-23 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1612010377


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,6 +1407,56 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
+Some(LiteralType::List(l)) => {

Review Comment:
   Split those into https://github.com/apache/datafusion/pull/10615, 
https://github.com/apache/datafusion/pull/10622, 
https://github.com/apache/datafusion/pull/10640



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-23 Thread via GitHub


Blizzara commented on PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#issuecomment-2127582955

   @jonahgao Thanks for the review - I pushed a new version but it builds on 
top of https://github.com/apache/datafusion/pull/10622 and 
https://github.com/apache/datafusion/pull/10640 so will need those merged and 
this rebased first before this PRs looks clean. 
   
   This is mostly what it used to be with one more major change: I removed 
passing the datatype into from_substrait_literal and instead do the same thing 
with names as I did for from_substrait_type. That maybe seems a bit cleaner 
(even though I dislike the name handling, but I don't see a better way to deal 
with it based on how it works in Substrait :/)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-23 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1611984584


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,7 +1404,84 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
-Some(LiteralType::Null(ntype)) => from_substrait_null(ntype)?,
+Some(LiteralType::List(l)) => {
+let element_dt: Option<> = match dt {
+Some(DataType::List(l)) => Ok(Some(l.data_type())),

Review Comment:
   This was because I wasn't handling empty lists properly (producer created 
Lists rather than EmptyList) - that was fixed in the Lists PR. Thanks for 
catching!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1609182533


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -505,7 +507,35 @@ pub async fn from_substrait_rel(
 _ => Ok(t),
 }
 }
-_ => not_impl_err!("Only NamedTable reads are supported"),
+Some(ReadType::VirtualTable(vt)) => {
+let base_schema = read
+.base_schema
+.clone()
+.expect("expected schema to exist for virtual table");
+
+let fields = from_substrait_named_struct(_schema)?;
+let schema = 
DFSchemaRef::new(DFSchema::try_from(Schema::new(fields))?);

Review Comment:
   I once wanted to remove the schema, but we need to use it to pass the 
struct's names, so the current approach should be fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1609180143


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,7 +1404,84 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
-Some(LiteralType::Null(ntype)) => from_substrait_null(ntype)?,
+Some(LiteralType::List(l)) => {
+let element_dt: Option<> = match dt {
+Some(DataType::List(l)) => Ok(Some(l.data_type())),

Review Comment:
   Thank you for your explanation @Blizzara . My concern is that using two 
types for List simultaneously is a bit verbose unless it is necessary.
   
   I tested the following two cases: the first one panicked, and the second one 
also failed.
   ```rust
   
   #[tokio::test]
   async fn roundtrip_list_iteral() -> Result<()> {
   roundtrip("SELECT [] from data").await
   }
   
   #[tokio::test]
   async fn roundtrip_struct_iteral() -> Result<()> {
   roundtrip("select struct(1 as name0, 3.14 as name1, 'e', true as name3) 
from data")
   .await
   }
   ``` 
   ```sh
    cases::roundtrip_logical_plan::roundtrip_list_iteral stdout 
   thread 'cases::roundtrip_logical_plan::roundtrip_list_iteral' panicked at 
consumer.rs:1425:41:
   index out of bounds: the len is 0 but the index is 0
   ```
   I think lists and structs are more complex than the virtual table, so it's 
better for them to have separate PRs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608623644


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,7 +1404,84 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
-Some(LiteralType::Null(ntype)) => from_substrait_null(ntype)?,
+Some(LiteralType::List(l)) => {
+let element_dt: Option<> = match dt {
+Some(DataType::List(l)) => Ok(Some(l.data_type())),

Review Comment:
   Yeah, we already do that (look at the first element) if `dt` is not given. I 
think the reason I included `dt` was because Substrait types don't include the 
field names for structs, so not using `dt` would mean using something like 
`names` and `name_idx` instead (like in from_substrait_type_with_names).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608622303


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -505,7 +507,35 @@ pub async fn from_substrait_rel(
 _ => Ok(t),
 }
 }
-_ => not_impl_err!("Only NamedTable reads are supported"),
+Some(ReadType::VirtualTable(vt)) => {
+let base_schema = read
+.base_schema
+.clone()
+.expect("expected schema to exist for virtual table");
+
+let fields = from_substrait_named_struct(_schema)?;
+let schema = 
DFSchemaRef::new(DFSchema::try_from(Schema::new(fields))?);

Review Comment:
   It probably could, except for the column names - LogicalPlanBuilder::values 
names everything to just `column1` etc. Do you see some benefit to using that 
(and then handling the column names somehow) instead of passing in the schema 
explicitly?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608610498


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,6 +1407,56 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
+Some(LiteralType::List(l)) => {

Review Comment:
   Sure, will do!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608431571


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -505,7 +507,35 @@ pub async fn from_substrait_rel(
 _ => Ok(t),
 }
 }
-_ => not_impl_err!("Only NamedTable reads are supported"),
+Some(ReadType::VirtualTable(vt)) => {
+let base_schema = read
+.base_schema
+.clone()
+.expect("expected schema to exist for virtual table");
+
+let fields = from_substrait_named_struct(_schema)?;
+let schema = 
DFSchemaRef::new(DFSchema::try_from(Schema::new(fields))?);

Review Comment:
   I suspect that the schema here may not be necessary, perhaps it can be 
restored through 
[LogicalPlanBuilder::values](https://github.com/apache/datafusion/blob/f80794779ef251528ee983e22ec03c158deb42a6/datafusion/expr/src/logical_plan/builder.rs#L168).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608413505


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,6 +1407,56 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
+Some(LiteralType::List(l)) => {

Review Comment:
   If it is convenient, using separate PRs would make the review process 
easier. Perhaps one PR could focus on the virtual table, while the other two 
could address the list and struct separately.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608408244


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,7 +1404,84 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
-Some(LiteralType::Null(ntype)) => from_substrait_null(ntype)?,
+Some(LiteralType::List(l)) => {
+let element_dt: Option<> = match dt {
+Some(DataType::List(l)) => Ok(Some(l.data_type())),

Review Comment:
   I am wondering if `dt` is redundant here and can be removed. We still need 
to check whether it is a `LargeList`, just as we check 
`type_variation_reference` below.
   
   Is it possible to only extract `element_dt` from `elements[0]`?  I guess 
that `elements` are not empty because substrait has a special type called 
EmptyList.
   ```rust
pub struct List {
   /// A homogeneously-typed list of one or more expressions that 
form the
   /// list entries. To specify an empty list, use 
Literal.empty_list
   /// (otherwise type information would be missing).
   #[prost(message, repeated, tag="1")]
   pub values: ::prost::alloc::vec::Vec,
   }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1608039288


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -607,6 +608,15 @@ async fn qualified_catalog_schema_table_reference() -> 
Result<()> {
 roundtrip("SELECT a,b,c,d,e FROM datafusion.public.data;").await
 }
 
+#[tokio::test]
+async fn roundtrip_local_relation() -> Result<()> {
+let ctx = create_context().await?;
+let row1 = vec![lit(1), lit("a")];
+let row2 = vec![lit(2), lit("b")];
+let plan = LogicalPlanBuilder::values(vec![row1, row2])?.build()?;

Review Comment:
   Nice, that works - 18b7f0821799b6f446a5191de6038684ce79fc94.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-21 Thread via GitHub


Blizzara commented on PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#issuecomment-2122279768

   @jonahgao @alamb I think this is ready by me :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-20 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1607637374


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -607,6 +608,15 @@ async fn qualified_catalog_schema_table_reference() -> 
Result<()> {
 roundtrip("SELECT a,b,c,d,e FROM datafusion.public.data;").await
 }
 
+#[tokio::test]
+async fn roundtrip_local_relation() -> Result<()> {
+let ctx = create_context().await?;
+let row1 = vec![lit(1), lit("a")];
+let row2 = vec![lit(2), lit("b")];
+let plan = LogicalPlanBuilder::values(vec![row1, row2])?.build()?;

Review Comment:
   I agree. For Values plan, these aliases should be able to be ignored.
   I think we could use `assert_expected_plan` during testing to ignore 
aliases, for example:
   ```rust
   #[tokio::test]
   async fn roundtrip_values() -> Result<()> {
   assert_expected_plan(
   "VALUES (1, 'a', [[1,2,3], []], STRUCT(true, 1))",
   "Values: (Int64(1), Utf8(\"a\"), List([[1, 2, 3], []]), 
Struct({c0:true,c1:1}))",
   )
   .await
   }
   ```
   Additionally, aliases should be ignored when handling `LogicalPlan::Values` 
in `to_substrait_rel`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-20 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1607637374


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -607,6 +608,15 @@ async fn qualified_catalog_schema_table_reference() -> 
Result<()> {
 roundtrip("SELECT a,b,c,d,e FROM datafusion.public.data;").await
 }
 
+#[tokio::test]
+async fn roundtrip_local_relation() -> Result<()> {
+let ctx = create_context().await?;
+let row1 = vec![lit(1), lit("a")];
+let row2 = vec![lit(2), lit("b")];
+let plan = LogicalPlanBuilder::values(vec![row1, row2])?.build()?;

Review Comment:
   I agree. For Values plan, these aliases should be be ignored.
   I think we could use `assert_expected_plan` during testing to ignore 
aliases, for example:
   ```rust
   #[tokio::test]
   async fn roundtrip_values() -> Result<()> {
   assert_expected_plan(
   "VALUES (1, 'a', [[1,2,3], []], STRUCT(true, 1))",
   "Values: (Int64(1), Utf8(\"a\"), List([[1, 2, 3], []]), 
Struct({c0:true,c1:1}))",
   )
   .await
   }
   ```
   Additionally, aliases should be ignored when handling `LogicalPlan::Values` 
in `to_substrait_rel`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-20 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1606981891


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -607,6 +608,15 @@ async fn qualified_catalog_schema_table_reference() -> 
Result<()> {
 roundtrip("SELECT a,b,c,d,e FROM datafusion.public.data;").await
 }
 
+#[tokio::test]
+async fn roundtrip_local_relation() -> Result<()> {
+let ctx = create_context().await?;
+let row1 = vec![lit(1), lit("a")];
+let row2 = vec![lit(2), lit("b")];
+let plan = LogicalPlanBuilder::values(vec![row1, row2])?.build()?;

Review Comment:
   Ah, I'm blind, thanks!
   
   I tried switching to that, but it produces somewhat ugly plans as it uses 
`make_array` and `struct` as UDFs, making the assert fail:
   
   ```
 left: "Values: (Int64(1), Utf8(\"a\"), List([[-213.1, , 5.5, 2.0, 1.0], 
[]]) AS 
make_array(make_array(Float64(-213.1),NULL,Float64(5.5),Float64(2),Float64(1)),make_array()),
 Struct({c0:true,c1:1}) AS struct(Boolean(true),Int64(1))), (Int64(NULL), 
Utf8(NULL), List(), Struct({c0:,c1:}))"
right: "Values: (Int64(1), Utf8(\"a\"), List([[-213.1, , 5.5, 2.0, 1.0], 
[]]), Struct({c0:true,c1:1})), (Int64(NULL), Utf8(NULL), List(), 
Struct({c0:,c1:}))"
``` 
   The `AS` aliases it produces seem to get ignored (the column names are still 
column1, column2, column3, ..), so it doesn't seem to make sense to map them 
into Substrait (and I'm not even sure if that'd be possible). Any ideas how to 
go around that?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-20 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601980707


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -165,6 +168,53 @@ pub fn to_substrait_rel(
 }))),
 }))
 }
+LogicalPlan::Values(v) => {
+let names = v
+.schema
+.fields()
+.iter()
+.map(|f| f.name().to_owned())
+// todo: collect also nested names
+.collect();
+let field_types = r#type::Struct {
+types: v
+.schema
+.fields()
+.iter()
+.map(|f| to_substrait_type(f.data_type()).unwrap())
+.collect(),
+type_variation_reference: DEFAULT_TYPE_REF,
+nullability: Nullability::Unspecified as i32,
+};
+let values = v
+.values
+.iter()
+.map(|row| {
+let fields = row
+.iter()
+.map(|v| match v {
+Expr::Literal(sv) => 
to_substrait_literal(sv).unwrap(),
+_ => panic!(),

Review Comment:
   note for self: better error



##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -165,6 +168,53 @@ pub fn to_substrait_rel(
 }))),
 }))
 }
+LogicalPlan::Values(v) => {
+let names = v
+.schema
+.fields()
+.iter()
+.map(|f| f.name().to_owned())
+// todo: collect also nested names

Review Comment:
   note for self: todo



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-20 Thread via GitHub


jonahgao commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1606501759


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -607,6 +608,15 @@ async fn qualified_catalog_schema_table_reference() -> 
Result<()> {
 roundtrip("SELECT a,b,c,d,e FROM datafusion.public.data;").await
 }
 
+#[tokio::test]
+async fn roundtrip_local_relation() -> Result<()> {
+let ctx = create_context().await?;
+let row1 = vec![lit(1), lit("a")];
+let row2 = vec![lit(2), lit("b")];
+let plan = LogicalPlanBuilder::values(vec![row1, row2])?.build()?;

Review Comment:
It is implemented, and you can use it like this.
   ```rust
   roundtrip("VALUES (1, 'a', [[1,2,3], []], STRUCT(true, 1))").await
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-16 Thread via GitHub


alamb commented on PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#issuecomment-2116424197

   Thanks @Blizzara  -- I started the CI on this PR


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-16 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601979954


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,6 +1407,56 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
+Some(LiteralType::List(l)) => {

Review Comment:
   I don't necessarily need these for this PR, so if it's preferrable I'm happy 
to split them into separate PRs - I just added them while coding to confirm 
that lists and structs inside lists are handled correctly.
   
   Same applies for the ones in `producer.rs`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-15 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601981711


##
datafusion/substrait/tests/cases/roundtrip_logical_plan.rs:
##
@@ -607,6 +608,15 @@ async fn qualified_catalog_schema_table_reference() -> 
Result<()> {
 roundtrip("SELECT a,b,c,d,e FROM datafusion.public.data;").await
 }
 
+#[tokio::test]
+async fn roundtrip_local_relation() -> Result<()> {
+let ctx = create_context().await?;
+let row1 = vec![lit(1), lit("a")];
+let row2 = vec![lit(2), lit("b")];
+let plan = LogicalPlanBuilder::values(vec![row1, row2])?.build()?;

Review Comment:
   I didn't find `VALUES` in the DataFusion SQL - is it not implemented yet?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-15 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601980707


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -165,6 +168,53 @@ pub fn to_substrait_rel(
 }))),
 }))
 }
+LogicalPlan::Values(v) => {
+let names = v
+.schema
+.fields()
+.iter()
+.map(|f| f.name().to_owned())
+// todo: collect also nested names
+.collect();
+let field_types = r#type::Struct {
+types: v
+.schema
+.fields()
+.iter()
+.map(|f| to_substrait_type(f.data_type()).unwrap())
+.collect(),
+type_variation_reference: DEFAULT_TYPE_REF,
+nullability: Nullability::Unspecified as i32,
+};
+let values = v
+.values
+.iter()
+.map(|row| {
+let fields = row
+.iter()
+.map(|v| match v {
+Expr::Literal(sv) => 
to_substrait_literal(sv).unwrap(),
+_ => panic!(),

Review Comment:
   note for self: better error



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-15 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601980361


##
datafusion/substrait/src/logical_plan/producer.rs:
##
@@ -165,6 +168,53 @@ pub fn to_substrait_rel(
 }))),
 }))
 }
+LogicalPlan::Values(v) => {
+let names = v
+.schema
+.fields()
+.iter()
+.map(|f| f.name().to_owned())
+// todo: collect also nested names

Review Comment:
   note for self: todo



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-15 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601979954


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1277,6 +1407,56 @@ pub(crate) fn from_substrait_literal(lit: ) -> 
Result {
 s,
 )
 }
+Some(LiteralType::List(l)) => {

Review Comment:
   I don't necessarily need these for this PR, so if it's preferrable I'm happy 
to split them into separate PRs - I just added them while coding to confirm 
that lists and structs inside lists are handled correctly.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Support Substrait's VirtualTables [datafusion]

2024-05-15 Thread via GitHub


Blizzara commented on code in PR #10531:
URL: https://github.com/apache/datafusion/pull/10531#discussion_r1601978505


##
datafusion/substrait/src/logical_plan/consumer.rs:
##
@@ -1137,7 +1220,8 @@ fn from_substrait_type(dt: ::proto::Type) -> 
Result {
 from_substrait_type(list.r#type.as_ref().ok_or_else(|| {
 substrait_datafusion_err!("List type must have inner 
type")
 })?)?;
-let field = Arc::new(Field::new("list_item", inner_type, 
true));
+// "item" as the field name aligns with Arrow's default, see 
e.g. array_into_list_array
+let field = Arc::new(Field::new("item", inner_type, true));

Review Comment:
   this is in theory a breaking change I guess, but I don't know if there's any 
reason for anyone to be referring to the item field name



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org