[jira] [Commented] (SPARK-44111) Prepare Apache Spark 4.0.0

2024-04-25 Thread Florent BIVILLE (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17840788#comment-17840788
 ] 

Florent BIVILLE commented on SPARK-44111:
-

Are there going to be pre-releases for Spark 4 that library authors can try?

Or shall we build from the `master` branch and report back?

> Prepare Apache Spark 4.0.0
> --
>
> Key: SPARK-44111
> URL: https://issues.apache.org/jira/browse/SPARK-44111
> Project: Spark
> Issue Type: Umbrella
> Components: Build
> Affects Versions: 4.0.0
> Reporter: Dongjoon Hyun
> Priority: Critical
> Labels: pull-request-available
>
> For now, this issue aims to collect ideas for planning Apache Spark 4.0.0.
> We will add more items which will be excluded from Apache Spark 3.5.0 
> (Feature Freeze: July 16th, 2023).
> {code}
> Spark 1: 2014.05 (1.0.0) ~ 2016.11 (1.6.3)
> Spark 2: 2016.07 (2.0.0) ~ 2021.05 (2.4.8)
> Spark 3: 2020.06 (3.0.0) ~ 2026.xx (3.5.x)
> Spark 4: 2024.06 (4.0.0, NEW)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-44262) JdbcUtils hardcodes some SQL statements

2023-07-03 Thread Florent BIVILLE (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739511#comment-17739511
 ] 

Florent BIVILLE commented on SPARK-44262:
-

Thanks for sharing the document, much appreciated!
I know and co-maintain the Spark connector for Neo4j, which is indeed the most 
straightforward way to use Neo4j with Spark.
My investigation is specifically about JDBC, which is why I have not used the 
Spark connector for Neo4j here.

> JdbcUtils hardcodes some SQL statements
> ---
>
> Key: SPARK-44262
> URL: https://issues.apache.org/jira/browse/SPARK-44262
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Florent BIVILLE
> Priority: Major
>
> I am currently investigating an integration between the [Neo4j JDBC 
> driver|https://github.com/neo4j-contrib/neo4j-jdbc] and a Spark-based cloud 
> vendor SDK.
>
> This SDK relies on Spark's {{JdbcUtils}} to run queries and insert data.
> While {{JdbcUtils}} delegates some queries to 
> {{org.apache.spark.sql.jdbc.JdbcDialect}}, others are hardcoded as SQL, see:
>  * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#dropTable}}
>  * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getInsertStatement}}
>
> This works fine for relational databases but breaks for NoSQL stores that do 
> not support SQL translation (like Neo4j).
> Is there a plan to augment the {{JdbcDialect}} surface so that it is also 
> responsible for these currently hardcoded queries?
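
To make the request concrete, here is a minimal sketch of the delegation idea in Python. Every name in it (`Dialect`, `CypherDialect`, `drop_table`, `insert_statement`) is hypothetical and chosen for illustration only; none of them is a Spark or Neo4j API, and the Cypher text and `?` placeholders are assumptions rather than statements the Neo4j JDBC driver is known to accept.

```python
from typing import Sequence


class Dialect:
    """Hypothetical base dialect: produces the SQL a utility layer would otherwise hardcode."""

    def quote(self, identifier: str) -> str:
        return f'"{identifier}"'

    def drop_table(self, table: str) -> str:
        return f"DROP TABLE {table}"

    def insert_statement(self, table: str, columns: Sequence[str]) -> str:
        cols = ", ".join(self.quote(c) for c in columns)
        placeholders = ", ".join("?" for _ in columns)
        return f"INSERT INTO {table} ({cols}) VALUES ({placeholders})"


class CypherDialect(Dialect):
    """Hypothetical override for a store that does not speak SQL.

    The Cypher shapes and the '?' placeholder syntax are illustrative assumptions.
    """

    def drop_table(self, table: str) -> str:
        return f"MATCH (n:{table}) DETACH DELETE n"

    def insert_statement(self, table: str, columns: Sequence[str]) -> str:
        props = ", ".join(f"{c}: ?" for c in columns)
        return f"CREATE (:{table} {{{props}}})"


def get_insert_statement(dialect: Dialect, table: str, columns: Sequence[str]) -> str:
    # The utility layer asks the dialect instead of building SQL itself.
    return dialect.insert_statement(table, columns)


if __name__ == "__main__":
    print(get_insert_statement(Dialect(), "people", ["name", "age"]))
    print(get_insert_statement(CypherDialect(), "Person", ["name", "age"]))
```

The point of the sketch is only that the statement text comes from an overridable method, so a non-relational backend can substitute its own syntax without the calling code changing.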






[jira] [Comment Edited] (SPARK-44262) JdbcUtils hardcodes some SQL statements

2023-07-03 Thread Florent BIVILLE (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739496#comment-17739496
 ] 

Florent BIVILLE edited comment on SPARK-44262 at 7/3/23 8:46 AM:
-

Thanks for the quick reply. While I don't have any access to the 
(closed-source) SDK project, I'd be interested in learning more about this.

Maybe it's not the right place, but looking at 
{{org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider}}, I 
see some code paths using these hardcoded SQL statements (see 
{{org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider#createRelation}} 
for instance).

Are there any specific pointers in the codebase or documentation you could 
share? Thanks a lot!
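
For readers following along, the kind of code path meant here can be modelled roughly as below. This is a simplified Python sketch of how a JDBC write might be planned depending on save mode, not a transcription of Spark's implementation; the function name `plan_write`, the mode strings, and the step names are all hypothetical, and the actual behaviour may differ between Spark versions.

```python
from typing import List


def plan_write(mode: str, table_exists: bool, truncate: bool = False) -> List[str]:
    """Return the ordered (hypothetical) steps a JDBC write would take.

    Steps such as "drop_table" and "save_table" stand in for the places
    where a SQL statement would be issued against the target database.
    """
    if not table_exists:
        return ["create_table", "save_table"]
    if mode == "overwrite":
        if truncate:
            return ["truncate_table", "save_table"]
        return ["drop_table", "create_table", "save_table"]
    if mode == "append":
        return ["save_table"]
    if mode == "error_if_exists":
        raise ValueError("table already exists")
    if mode == "ignore":
        return []
    raise ValueError(f"unknown save mode: {mode}")


if __name__ == "__main__":
    # An overwrite of an existing table passes through both a drop and an insert step.
    print(plan_write("overwrite", table_exists=True))
```

In this model, an overwrite of an existing table reaches both the drop step and the insert step, which is where the hardcoded statements would matter for a non-SQL backend.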


was (Author: fbiville):
Thanks for the quick reply. While I don't have any access to the 
(closed-source) SDK project, I'd be interested in learning more about this.
Are there any specific pointers in the codebase or documentation you could 
share? Thanks a lot!







[jira] [Commented] (SPARK-44262) JdbcUtils hardcodes some SQL statements

2023-07-03 Thread Florent BIVILLE (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-44262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17739496#comment-17739496
 ] 

Florent BIVILLE commented on SPARK-44262:
-

Thanks for the quick reply. While I don't have any access to the 
(closed-source) SDK project, I'd be interested in learning more about this.
Are there any specific pointers in the codebase or documentation you could 
share? Thanks a lot!







[jira] [Updated] (SPARK-44262) JdbcUtils hardcodes some SQL statements

2023-06-30 Thread Florent BIVILLE (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-44262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Florent BIVILLE updated SPARK-44262:

Description: 
I am currently investigating an integration between the [Neo4j JDBC 
driver|https://github.com/neo4j-contrib/neo4j-jdbc] and a Spark-based cloud 
vendor SDK.

This SDK relies on Spark's {{JdbcUtils}} to run queries and insert data.

While {{JdbcUtils}} delegates some queries to 
{{org.apache.spark.sql.jdbc.JdbcDialect}}, others are hardcoded as SQL, see:
 * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#dropTable}}
 * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getInsertStatement}}

This works fine for relational databases but breaks for NoSQL stores that do 
not support SQL translation (like Neo4j).

Is there a plan to augment the {{JdbcDialect}} surface so that it is also 
responsible for these currently hardcoded queries?

  was:
I am currently investigating an integration with the [Neo4j JDBC 
driver|https://github.com/neo4j-contrib/neo4j-jdbc] and a Spark-based cloud 
vendor SDK.

 

This SDK relies on Spark's {{JdbcUtils}} to run queries and insert data.

While {{JdbcUtils}} partly delegates to {{org.apache.spark.sql.jdbc.JdbcDialect 
}}for some queries, some others are hardcoded to SQL, see:
 * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#dropTable}}
 * 
{{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getInsertStatement}}

 

This works fine for relational databases but breaks for NoSQL stores that do 
not support SQL translation (like Neo4j).

Is there a plan to augment the {{JdbcDialect}} surface so that it is also 
responsible for these currently-hardcoded queries?








[jira] [Created] (SPARK-44262) JdbcUtils hardcodes some SQL statements

2023-06-30 Thread Florent BIVILLE (Jira)
Florent BIVILLE created SPARK-44262:
---

Summary: JdbcUtils hardcodes some SQL statements
Key: SPARK-44262
URL: https://issues.apache.org/jira/browse/SPARK-44262
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.2.0
Reporter: Florent BIVILLE


I am currently investigating an integration between the [Neo4j JDBC 
driver|https://github.com/neo4j-contrib/neo4j-jdbc] and a Spark-based cloud 
vendor SDK.

 

This SDK relies on Spark's {{JdbcUtils}} to run queries and insert data.

While {{JdbcUtils}} partly delegates to 
{{org.apache.spark.sql.jdbc.JdbcDialect}} for some queries, some others are 
hardcoded to SQL, see:
 * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#dropTable}}
 * {{org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils#getInsertStatement}}

 

This works fine for relational databases but breaks for NoSQL stores that do 
not support SQL translation (like Neo4j).

Is there a plan to augment the {{JdbcDialect}} surface so that it is also 
responsible for these currently-hardcoded queries?


