[jira] [Assigned] (SPARK-42000) saveAsTable fail to find the default source (ReadwriterTests.test_insert_into)

2023-02-13 Thread Hyukjin Kwon (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-42000:


Assignee: Takuya Ueshin

> saveAsTable fail to find the default source (ReadwriterTests.test_insert_into)
> ------------------------------------------------------------------------------
>
> Key: SPARK-42000
> URL: https://issues.apache.org/jira/browse/SPARK-42000
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Takuya Ueshin
>Priority: Major
>
> {code}
> org.apache.spark.SparkClassNotFoundException: [DATA_SOURCE_NOT_FOUND] Failed to find the data source: . Please find packages at `https://spark.apache.org/third-party-projects.html`.
>   at org.apache.spark.sql.errors.QueryExecutionErrors$.dataSourceNotFoundError(QueryExecutionErrors.scala:739)
>   at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:646)
>   at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSourceV2(DataSource.scala:696)
>   at org.apache.spark.sql.DataFrameWriter.lookupV2Provider(DataFrameWriter.scala:860)
>   at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:559)
>   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handleWriteOperation(SparkConnectPlanner.scala:1426)
>   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:1297)
>   at org.apache.spark.sql.connect.service.SparkConnectStreamHandler.handleCommand(SparkConnectStreamHandler.scala:182)
>   at org.apache.spark.sql.connect.service.SparkConnectStreamHandler.handle(SparkConnectStreamHandler.scala:48)
>   at org.apache.spark.sql.connect.service.SparkConnectService.executePlan(SparkConnectService.scala:135)
>   at org.apache.spark.connect.proto.SparkConnectServiceGrpc$MethodHandlers.invoke(SparkConnectServiceGrpc.java:306)
>   at org.sparkproject.connect.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
>   at org.sparkproject.connect.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:352)
>   at org.sparkproject.connect.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
>   at org.sparkproject.connect.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>   at org.sparkproject.connect.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: .DefaultSource
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>   at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:632)
>   at scala.util.Try$.apply(Try.scala:213)
>   at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:632)
>   at scala.util.Failure.orElse(Try.scala:224)
>   at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:632)
>   ... 17 more
> pyspark/sql/tests/test_readwriter.py:159 (ReadwriterParityTests.test_insert_into)
> self = <pyspark.sql.tests.connect.test_parity_readwriter.ReadwriterParityTests testMethod=test_insert_into>
> def test_insert_into(self):
>     df = self.spark.createDataFrame([("a", 1), ("b", 2)], ["C1", "C2"])
>     with self.table("test_table"):
> >       df.write.saveAsTable("test_table")
> ../test_readwriter.py:163:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../../connect/readwriter.py:381: in saveAsTable
>     self._spark.client.execute_command(self._write.command(self._spark.client))
> ../../connect/client.py:478: in execute_command
>     self._execute(req)
> ../../connect/client.py:562: in _execute
>     self._handle_error(rpc_error)
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> self = <pyspark.sql.connect.client.SparkConnectClient object at 0x7fe0d069b5b0>
> rpc_error = <_MultiThreadedRendezvous of RPC that terminated with:
>   status = StatusCode.INTERNAL
>   details = ".DefaultSource"
>   debu...pv6:%5B::1%5D:15002 {created_time:"2023-01-12T11:27:46.698322+09:00", grpc_status:13, grpc_message:".DefaultSource"}"
> >
> def _handle_error(self, rpc_error: grpc.RpcError) -> NoReturn:
>     """
> 
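
The trace pins the failure down: the write command reaches the server with an empty data source name, so `DataSource.lookupDataSource` is asked to resolve `""` and ends up trying to load the class `.DefaultSource`, which produces the `ClassNotFoundException` above. A minimal reproduction-plus-workaround sketch, assuming a Spark Connect server at `sc://localhost:15002` (the endpoint is an assumption; adjust to your deployment). Naming the format explicitly keeps the source field non-empty and sidesteps the failing path:

{code}
from pyspark.sql import SparkSession

# Assumed Connect endpoint; the bug only triggers over a Connect session.
spark = SparkSession.builder.remote("sc://localhost:15002").getOrCreate()

df = spark.createDataFrame([("a", 1), ("b", 2)], ["C1", "C2"])

# Fails as in the report: no format was set, so the write command
# carries an empty data source name to the server.
# df.write.saveAsTable("test_table")

# Workaround sketch: an explicit format means the server never has to
# resolve "" into ".DefaultSource".
df.write.format("parquet").saveAsTable("test_table")
{code}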

[jira] [Assigned] (SPARK-42000) saveAsTable fail to find the default source (ReadwriterTests.test_insert_into)

2023-02-13 Thread Apache Spark (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42000:


Assignee: (was: Apache Spark)
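
The defect pattern behind this issue is generic to protobuf-backed APIs: a string field that was never set arrives as `""` rather than "absent", so the receiving side must map empty to the configured default. A minimal sketch of that guard in Python, with hypothetical names (the real code path is the Scala `SparkConnectPlanner.handleWriteOperation` shown in the trace):

{code}
# Hypothetical sketch: map a proto's empty-string default to the session's
# default data source instead of passing "" to lookupDataSource, which would
# otherwise attempt to load the class ".DefaultSource".
def resolve_source(requested: str, default_source: str = "parquet") -> str:
    # "parquet" stands in for spark.sql.sources.default here.
    return requested if requested else default_source

assert resolve_source("") == "parquet"   # unset field -> session default
assert resolve_source("orc") == "orc"    # explicit format passes through
{code}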


[jira] [Assigned] (SPARK-42000) saveAsTable fail to find the default source (ReadwriterTests.test_insert_into)

2023-02-13 Thread Apache Spark (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42000:


Assignee: Apache Spark
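
To verify a fix, the failing parity test can be run on its own, e.g. with Spark's dev runner (`python/run-tests --testnames ...`) or directly through unittest. A sketch of the latter; the module path `pyspark.sql.tests.connect.test_parity_readwriter` is an assumption based on the `ReadwriterParityTests` name in the trace:

{code}
# Requires pyspark (with the Connect dependencies) on the Python path;
# the parity test case manages its own Connect server.
import unittest

suite = unittest.defaultTestLoader.loadTestsFromName(
    "pyspark.sql.tests.connect.test_parity_readwriter."
    "ReadwriterParityTests.test_insert_into"
)
unittest.TextTestRunner(verbosity=2).run(suite)
{code}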
