Feng Zhang created SEDONA-568:
---------------------------------

             Summary: Refactor TestBaseScala to use a method instead of a 
class-level variable for sparkSession
                 Key: SEDONA-568
                 URL: https://issues.apache.org/jira/browse/SEDONA-568
             Project: Apache Sedona
          Issue Type: Improvement
            Reporter: Feng Zhang


Refactoring the base class (org.apache.sedona.sql.TestBaseScala) to use a 
method instead of a class-level variable for sparkSession can be a good idea 
for several reasons:

- {*}Lazy initialization{*}: Using a method allows for lazy initialization, 
which can be beneficial if the creation of the SparkSession is 
resource-intensive or if it should only be created when needed.

- {*}Flexibility{*}: It provides more flexibility for derived classes to 
customize or extend the initialization logic without having to override a 
class-level variable.

- {*}Testability{*}: It can improve testability by allowing the SparkSession to 
be created in a controlled manner, which can be useful for unit tests.

An example follows:
{code:scala}
import org.apache.sedona.spark.SedonaContext
import org.apache.spark.sql.SparkSession

trait SparkSessionBuilder {
  // Supplied by the concrete test class.
  protected val warehouseLocation: String
  protected val resourceFolder: String

  def createSparkSession(
      enableBroadcastJoin: Boolean,
      setInference: Boolean,
      enableMetrics: Boolean): SparkSession = {
    // SparkSession.Builder is mutable: each config(...) call below updates
    // this builder in place and returns the same instance.
    val builder = SedonaContext.builder()
      .master("local[*]")
      .appName("sedonasqlScalaTest")
      .config("spark.sql.warehouse.dir", warehouseLocation)

    if (enableBroadcastJoin) {
      builder.config("sedona.join.autoBroadcastJoinThreshold", "-1")
    }

    if (setInference) {
      builder.config("spark.kryoserializer.buffer.max", "64m")
        .config("spark.wherobots.inference.entrance", resourceFolder + "python/udfEntrance.py")
        .config("spark.wherobots.inference.files", resourceFolder + "python/udfDefinition.py")
        .config("spark.wherobots.inference.args", "3")
    }

    if (enableMetrics) {
      builder.config("spark.metrics.conf.*.sink.console.class", "org.apache.spark.metrics.sink.ConsoleSink")
    }

    builder.getOrCreate()
  }
}
{code}
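
The flexibility argument above can be illustrated in plain Scala, without Spark on the classpath. The sketch below is not Sedona code: FakeSession, EagerBase, MethodBase, createSession and the two test classes are hypothetical names made up for illustration, with FakeSession standing in for SparkSession. It shows one way a class-level val in a base trait can go wrong (it is initialized before the subclass's own fields, so it sees a null warehouseLocation), and how a method avoids that and can be overridden.

```scala
object SessionInitSketch {
  // Hypothetical stand-in for SparkSession, so the sketch runs without Spark.
  final case class FakeSession(warehouseDir: String)

  // Variant 1: the session is a class-level val. It is evaluated while the
  // trait is being initialized, before the subclass's fields are assigned.
  trait EagerBase {
    protected val warehouseLocation: String
    val session: FakeSession = FakeSession(warehouseLocation)
  }

  // Variant 2: the session is obtained through a method. Nothing is built
  // until a test asks for it, and subclasses can override the factory.
  trait MethodBase {
    protected val warehouseLocation: String
    protected def createSession(): FakeSession = FakeSession(warehouseLocation)
    def session: FakeSession = createSession()
  }

  class EagerTest extends EagerBase {
    protected val warehouseLocation: String = "/tmp/warehouse"
  }

  class MethodTest extends MethodBase {
    protected val warehouseLocation: String = "/tmp/warehouse"
    // Customizes the session without touching a field in the base trait.
    override protected def createSession(): FakeSession =
      FakeSession(warehouseLocation + "/custom")
  }

  def main(args: Array[String]): Unit = {
    // The base trait read warehouseLocation before EagerTest assigned it.
    println(new EagerTest().session.warehouseDir)  // prints null
    // The method runs after construction, so the field is already set.
    println(new MethodTest().session.warehouseDir) // prints /tmp/warehouse/custom
  }
}
```

With a real SparkSession, a method that ends in getOrCreate() would return the already-running session on later calls, so repeated calls should not start a new session each time.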
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
