Bingz2 opened a new issue, #2449: URL: https://github.com/apache/incubator-seatunnel/issues/2449
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

Error running the Spark Connector V2 example from a local IDE.

### SeaTunnel Version

dev

### SeaTunnel Config

```conf
env {
  # You can set spark configuration here
  # see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties
  #job.mode = BATCH
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
  spark.master = local
}

source {
  # This is a example input plugin **only for test and demonstrate the feature input plugin**
  FakeSource {
    result_table_name = "fake"
    field_name = "name,age,timestamp"
  }

  # You can also use other input plugins, such as hdfs
  # hdfs {
  #   result_table_name = "accesslog"
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog"
  #   format = "json"
  # }

  # If you would like to get more information about how to configure seatunnel and see full list of input plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/source-plugins/Fake
}

transform {
  # split data by specific delimiter
  # you can also use other transform plugins, such as sql
  sql {
    sql = "select name,age from fake"
    result_table_name = "sql"
  }

  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/transform-plugins/Split
}

sink {
  # choose stdout output plugin to output data to console
  LocalFile {
    format = "orc"
    path = "D:/workspace/test/st"
    file_name_expression = "orc"
  }

  # you can also you other output plugins, such as sql
  # hdfs {
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog_processed"
  #   save_mode = "append"
  # }

  # If you would like to get more information about how to configure seatunnel and see full list of output plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/sink-plugins/Console
}
```

### Running Command

```shell
Run the Spark Connector v2 example using a local IDE
```

### Error Exception

```log
22/08/17 22:49:03 INFO Executor: Starting executor ID driver on host localhost
22/08/17 22:49:03 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 53914.
22/08/17 22:49:03 INFO NettyBlockTransferService: Server created on GITV:53914
22/08/17 22:49:03 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/08/17 22:49:03 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO BlockManagerMasterEndpoint: Registering block manager GITV:53914 with 1965.3 MB RAM, BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6f95cd51{/metrics/json,null,AVAILABLE,@Spark}
22/08/17 22:49:04 WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data, otherwise Spark jobs will not get resources to process the received data.
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load SeaTunnelSource Plugin from D:\workspace\idea\seatunnel\incubator-seatunnel\seatunnel-common\connectors\seatunnel
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='source', pluginName='FakeSource'} from classpath
22/08/17 22:49:04 INFO SparkEnvironment: register plugins :[]
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load BaseSparkTransform Plugin from D:\workspace\idea\seatunnel\incubator-seatunnel\seatunnel-common\connectors\seatunnel
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='transform', pluginName='sql'} from classpath
22/08/17 22:49:04 INFO SparkEnvironment: register plugins :[]
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load SeaTunnelSink Plugin from D:\workspace\idea\seatunnel\incubator-seatunnel\seatunnel-common\connectors\seatunnel
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='sink', pluginName='LocalFile'} from classpath
22/08/17 22:49:04 INFO SparkEnvironment: register plugins :[]
22/08/17 22:49:04 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/D:/workspace/idea/seatunnel/incubator-seatunnel/spark-warehouse/').
22/08/17 22:49:04 INFO SharedState: Warehouse path is 'file:/D:/workspace/idea/seatunnel/incubator-seatunnel/spark-warehouse/'.
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7c8326a4{/SQL,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@77128dab{/SQL/json,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6f012914{/SQL/execution,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@18fdb6cf{/SQL/execution/json,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@720653c2{/static/sql,null,AVAILABLE,@Spark}
22/08/17 22:49:05 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
22/08/17 22:49:07 ERROR SparkApiTaskExecuteCommand: Run SeaTunnel on spark failed.
java.lang.RuntimeException: file_name_expression must contains transactionId when is_enable_transaction is true
	at org.apache.seatunnel.connectors.seatunnel.file.sink.config.TextFileSinkConfig.<init>(TextFileSinkConfig.java:112)
	at org.apache.seatunnel.connectors.seatunnel.file.sink.AbstractFileSink.getSinkConfig(AbstractFileSink.java:143)
	at org.apache.seatunnel.connectors.seatunnel.file.sink.AbstractFileSink.createAggregatedCommitter(AbstractFileSink.java:114)
	at org.apache.seatunnel.translation.spark.sink.SparkDataSourceWriter.<init>(SparkDataSourceWriter.java:48)
	at org.apache.seatunnel.translation.spark.sink.SparkSink.createWriter(SparkSink.java:67)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:254)
	at org.apache.seatunnel.core.starter.spark.execution.SinkExecuteProcessor.execute(SinkExecuteProcessor.java:75)
	at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.execute(SparkExecution.java:60)
	at org.apache.seatunnel.core.starter.spark.command.SparkApiTaskExecuteCommand.execute(SparkApiTaskExecuteCommand.java:54)
	at org.apache.seatunnel.core.starter.Seatunnel.run(Seatunnel.java:40)
	at org.apache.seatunnel.example.spark.v2.ExampleUtils.builder(ExampleUtils.java:43)
	at org.apache.seatunnel.example.spark.v2.SeaTunnelApiExample.main(SeaTunnelApiExample.java:28)
22/08/17 22:49:07 INFO SparkContext: Invoking stop() from shutdown hook
22/08/17 22:49:07 INFO AbstractConnector: Stopped Spark@9bd0fa6{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
22/08/17 22:49:07 INFO SparkUI: Stopped Spark web UI at http://GITV:4040
22/08/17 22:49:07 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/08/17 22:49:07 INFO MemoryStore: MemoryStore cleared
22/08/17 22:49:07 INFO BlockManager: BlockManager stopped
22/08/17 22:49:07 INFO BlockManagerMaster: BlockManagerMaster stopped
22/08/17 22:49:07 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/08/17 22:49:07 INFO SparkContext: Successfully stopped SparkContext
22/08/17 22:49:07 INFO ShutdownHookManager: Shutdown hook called
```

### Flink or Spark Version

_No response_

### Java or Scala Version

_No response_

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
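Judging by the exception thrown in `TextFileSinkConfig`, the `LocalFile` sink runs with `is_enable_transaction = true` by default, so `file_name_expression` must contain the transaction-id variable, while the config above sets it to the literal `"orc"`. A possible workaround (a sketch only — the `${transactionId}` placeholder name is taken from the error message, and the exact variable syntax should be checked against the LocalFile sink docs) would be:

```conf
sink {
  LocalFile {
    format = "orc"
    path = "D:/workspace/test/st"
    # Include the transaction id in the generated file name, as required
    # when is_enable_transaction is true (apparently the default).
    file_name_expression = "${transactionId}_orc"
  }
}
```

Alternatively, if the connector supports it, explicitly setting `is_enable_transaction = false` in the `LocalFile` block might avoid the check altogether.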
