[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/19861


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-01 Thread jiangxb1987
GitHub user jiangxb1987 opened a pull request:

https://github.com/apache/spark/pull/19861

[SPARK-22387][SQL] Propagate session configs to data source read/write 
options

## What changes were proposed in this pull request?

Introduce a new interface `ConfigSupport` for `DataSourceV2`, which helps propagate session configs with chosen key prefixes to the particular data source.
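
For illustration, a data source would opt in roughly like this (a minimal sketch against the `ConfigSupport` interface as initially proposed in this PR; `MyDataSource` and the prefix value are made up, and during review the interface is later renamed to `SessionConfigSupport` with a single `keyPrefix`):

```scala
import java.util.{Arrays, List => JList}

import org.apache.spark.sql.sources.v2.{ConfigSupport, DataSourceV2}

// Hypothetical example source; only the ConfigSupport part is shown.
class MyDataSource extends DataSourceV2 with ConfigSupport {
  // Session configs whose keys start with one of these prefixes are copied
  // into the options map handed to this source's read/write path.
  override def getConfigPrefixes: JList[String] =
    Arrays.asList("spark.datasource.mysource")
}
```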

## How was this patch tested?

Added a corresponding test case in `DataSourceV2Suite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiangxb1987/spark datasource-configs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/19861.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19861


commit 356c55eb5d50db1ab0fe9f15285cf31d993fad8a
Author: Xingbo Jiang 
Date:   2017-12-01T12:17:58Z

propagate session configs to data source options.

commit b076a69b0180825d1726019ef6213d1be2324c26
Author: Xingbo Jiang 
Date:   2017-12-01T13:36:22Z

fix scalastyle




---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r154503549
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -732,3 +743,25 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
   private val extraOptions = new scala.collection.mutable.HashMap[String, 
String]
 
 }
+
+private[sql] object DataFrameReader {
+
+  /**
+   * Helper method to filter session configs with config key that matches 
at least one of the given
+   * prefixes.
+   *
+   * @param cs the config key-prefixes that should be filtered.
+   * @param conf the session conf
+   * @return an immutable map that contains all the session configs that 
should be propagated to
+   * the data source.
+   */
+  def withSessionConfig(
--- End diff --

These helper functions need to be moved to the org.apache.spark.sql.execution.datasources.v2 package. This will be called by the SQL API code path.

Another, more straightforward option is to provide it via `ConfigSupport`.
WDYT? cc @cloud-fan


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-02 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r154503590
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -184,9 +188,16 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
 
 val cls = DataSource.lookupDataSource(source)
 if (classOf[DataSourceV2].isAssignableFrom(cls)) {
-  val options = new DataSourceV2Options(extraOptions.asJava)
+  val dataSource = cls.newInstance()
+  val options = dataSource match {
+case cs: ConfigSupport =>
+  val confs = withSessionConfig(cs, sparkSession.sessionState.conf)
+  new DataSourceV2Options((confs ++ extraOptions).asJava)
--- End diff --

What happens if they have duplicate names?


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-04 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r154687413
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -184,9 +188,16 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
 
 val cls = DataSource.lookupDataSource(source)
 if (classOf[DataSourceV2].isAssignableFrom(cls)) {
-  val options = new DataSourceV2Options(extraOptions.asJava)
+  val dataSource = cls.newInstance()
+  val options = dataSource match {
+case cs: ConfigSupport =>
+  val confs = withSessionConfig(cs, sparkSession.sessionState.conf)
+  new DataSourceV2Options((confs ++ extraOptions).asJava)
--- End diff --

Good catch! Should the confs in the `extraOptions` have a higher priority? 
WDYT @cloud-fan ?


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r154692737
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -184,9 +188,16 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
 
 val cls = DataSource.lookupDataSource(source)
 if (classOf[DataSourceV2].isAssignableFrom(cls)) {
-  val options = new DataSourceV2Options(extraOptions.asJava)
+  val dataSource = cls.newInstance()
+  val options = dataSource match {
+case cs: ConfigSupport =>
+  val confs = withSessionConfig(cs, sparkSession.sessionState.conf)
+  new DataSourceV2Options((confs ++ extraOptions).asJava)
--- End diff --

yea `extraOptions` needs higher priority.
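
For reference, Scala's `++` on maps keeps the right-hand side's value on duplicate keys, so `confs ++ extraOptions` already gives the per-operation options precedence. A small illustrative snippet (values made up, not code from the patch):

```scala
val sessionConfs = Map("path" -> "/from/session/conf")
val extraOptions = Map("path" -> "/from/option/call")

// The right operand of ++ wins on key collisions, so the explicitly passed
// option overrides the propagated session config.
val merged = sessionConfs ++ extraOptions
assert(merged("path") == "/from/option/call")
```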


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-04 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r154694199
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/ConfigSupport.java ---
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.InterfaceStability;
+import org.apache.spark.sql.sources.v2.reader.DataSourceV2Reader;
+
+import java.util.List;
+
+/**
+ * A mix-in interface for {@link DataSourceV2}. Data sources can implement 
this interface to
+ * propagate session configs with chosen key-prefixes to the particular 
data source.
+ */
+@InterfaceStability.Evolving
+public interface ConfigSupport {
+
+/**
+ * Create a list of key-prefixes, all session configs that match at 
least one of the prefixes
+ * will be propagated to the data source options.
+ */
+List getConfigPrefixes();
--- End diff --

We need to think about the current use cases and validate this API. E.g. the CSV and JSON data sources both accept an option `columnNameOfCorruptRecord`, or the session config `spark.sql.columnNameOfCorruptRecord`. From this we get the following information (a concrete example follows this list):

1. mostly, a session config maps to an existing option.
2. session configs are always prefixed with `spark.sql`; we should not ask the data source to always specify it.
3. do we really need to support more than one prefix?
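
For a concrete instance of points 1 and 2 (an active `spark` session is assumed; the path and column name are illustrative): the same knob is reachable today both as a per-read option and as a `spark.sql.*` session config.

```scala
// Either of these controls the corrupt-record column name for the built-in
// JSON source; the option takes effect per read, the conf per session.
spark.conf.set("spark.sql.columnNameOfCorruptRecord", "_broken")
spark.read.option("columnNameOfCorruptRecord", "_broken").json("/tmp/in.json")
```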


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-07 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r155693966
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method to propagate session configs with config key that 
matches at least one of the
+   * given prefixes to the corresponding data source options.
+   *
+   * @param cs the session config propagate help class
+   * @param source the data source format
+   * @param conf the session conf
+   * @return an immutable map that contains all the session configs that 
should be propagated to
+   * the data source.
+   */
+  def withSessionConfig(
+  cs: ConfigSupport,
+  source: String,
+  conf: SQLConf): immutable.Map[String, String] = {
+val prefixes = cs.getConfigPrefixes
+require(prefixes != null, "The config key-prefixes cann't be null.")
--- End diff --

double n


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-07 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r155693977
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,85 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method to propagate session configs with config key that 
matches at least one of the
+   * given prefixes to the corresponding data source options.
+   *
+   * @param cs the session config propagate help class
+   * @param source the data source format
+   * @param conf the session conf
+   * @return an immutable map that contains all the session configs that 
should be propagated to
+   * the data source.
+   */
+  def withSessionConfig(
+  cs: ConfigSupport,
+  source: String,
+  conf: SQLConf): immutable.Map[String, String] = {
+val prefixes = cs.getConfigPrefixes
+require(prefixes != null, "The config key-prefixes cann't be null.")
+val mapping = cs.getConfigMapping.asScala
+val validOptions = cs.getValidOptions
+require(validOptions != null, "The valid options list cann't be null.")
--- End diff --

double n


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156652932
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/ConfigSupport.java ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.InterfaceStability;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * A mix-in interface for {@link DataSourceV2}. Data sources can implement 
this interface to
+ * propagate session configs with chosen key-prefixes to the particular 
data source.
+ */
+@InterfaceStability.Evolving
+public interface ConfigSupport {
--- End diff --

nit: `SessionConfigSupport`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156653418
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/ConfigSupport.java ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.InterfaceStability;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * A mix-in interface for {@link DataSourceV2}. Data sources can implement 
this interface to
+ * propagate session configs with chosen key-prefixes to the particular 
data source.
--- End diff --

`propagate session configs with the specified key-prefix to all the 
read/write operations under this session`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156653492
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/ConfigSupport.java ---
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.InterfaceStability;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * A mix-in interface for {@link DataSourceV2}. Data sources can implement 
this interface to
+ * propagate session configs with chosen key-prefixes to the particular 
data source.
+ */
+@InterfaceStability.Evolving
+public interface ConfigSupport {
+
+/**
+ * Name for the specified data source, will extract all session 
configs that starts with
+ * `spark.datasource.$name`, turn `spark.datasource.$name.xxx -> 
yyy` into
+ * `xxx -> yyy`, and propagate them to all data source operations 
in this session.
+ */
+String name();
--- End diff --

how about `keyPrefix`
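
To make the contract concrete from the user side (a sketch assuming an active `spark` session; `mysource` stands for whatever the source returns from `name()`/`keyPrefix()`, and the class name is made up):

```scala
// Set once per session; the "spark.datasource.mysource." prefix is stripped,
// so the source sees it as the plain option "path" -> "/tmp/data".
spark.conf.set("spark.datasource.mysource.path", "/tmp/data")

// Any read (or write) through this source in the session then picks it up,
// alongside options passed explicitly on the reader.
spark.read.format("org.example.MyDataSource").load()
```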


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156653981
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
--- End diff --

How about just naming it `Util`? We may add more functions to it for other purposes in the future.


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156654124
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$name` into k/v pairs, the k/v pairs will be used 
to create data source
+   * options.
+   * A session config `spark.datasource.$name.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param name the data source name
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
+  name: String,
+  conf: SQLConf): immutable.Map[String, String] = {
--- End diff --

in Scala `Map` by default refers to `immutable.Map`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156654259
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$name` into k/v pairs, the k/v pairs will be used 
to create data source
+   * options.
+   * A session config `spark.datasource.$name.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param name the data source name
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
+  name: String,
+  conf: SQLConf): immutable.Map[String, String] = {
+require(name != null, "The data source name can't be null.")
+
+val pattern = Pattern.compile(s"spark\\.datasource\\.$name\\.(.*)")
--- End diff --

This can be a member variable, to avoid re-compiling it every time.


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156654638
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$name` into k/v pairs, the k/v pairs will be used 
to create data source
+   * options.
+   * A session config `spark.datasource.$name.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param name the data source name
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
+  name: String,
+  conf: SQLConf): immutable.Map[String, String] = {
+require(name != null, "The data source name can't be null.")
+
+val pattern = Pattern.compile(s"spark\\.datasource\\.$name\\.(.*)")
--- End diff --

nit: s"^spark..." to make sure the matched string starts with `spark...`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156654748
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$name` into k/v pairs, the k/v pairs will be used 
to create data source
+   * options.
+   * A session config `spark.datasource.$name.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param name the data source name
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
+  name: String,
+  conf: SQLConf): immutable.Map[String, String] = {
+require(name != null, "The data source name can't be null.")
+
+val pattern = Pattern.compile(s"spark\\.datasource\\.$name\\.(.*)")
+val filteredConfigs = conf.getAllConfs.filterKeys { confKey =>
+  confKey.startsWith(s"spark.datasource.$name")
+}
+filteredConfigs.map { entry =>
--- End diff --

can we use a `flatMap` to just loop the input once?
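
A sketch of the single-pass variant being suggested (mirroring the helper's signature above; not the committed code):

```scala
import java.util.regex.Pattern

import org.apache.spark.sql.internal.SQLConf

// One flatMap replaces the separate filterKeys + map passes.
def withSessionConfig(name: String, conf: SQLConf): Map[String, String] = {
  val pattern = Pattern.compile(s"spark\\.datasource\\.$name\\.(.*)")
  conf.getAllConfs.flatMap { case (key, value) =>
    val m = pattern.matcher(key)
    // Non-matching keys are dropped; matching keys lose the prefix.
    if (m.matches() && m.groupCount() > 0) Some(m.group(1) -> value) else None
  }
}
```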


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156874423
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$name` into k/v pairs, the k/v pairs will be used 
to create data source
+   * options.
+   * A session config `spark.datasource.$name.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param name the data source name
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
+  name: String,
+  conf: SQLConf): immutable.Map[String, String] = {
+require(name != null, "The data source name can't be null.")
+
+val pattern = Pattern.compile(s"spark\\.datasource\\.$name\\.(.*)")
--- End diff --

The regex contains `$keyPrefix`, so we are not able to remove it from the 
method.


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156881214
  
--- Diff: 
sql/core/src/main/java/org/apache/spark/sql/sources/v2/SessionConfigSupport.java
 ---
@@ -0,0 +1,39 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2;
+
+import org.apache.spark.annotation.InterfaceStability;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * A mix-in interface for {@link DataSourceV2}. Data sources can implement 
this interface to
+ * propagate session configs with the specified key-prefix to all data 
source operations in this
+ * session.
+ */
+@InterfaceStability.Evolving
+public interface SessionConfigSupport {
+
+/**
+ * Name for the specified data source, will extract all session 
configs that starts with
--- End diff --

nit: `Key prefix of the session configs to propagate. Spark will extract 
...`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156881519
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala ---
@@ -184,9 +187,16 @@ class DataFrameReader private[sql](sparkSession: 
SparkSession) extends Logging {
 
 val cls = DataSource.lookupDataSource(source)
 if (classOf[DataSourceV2].isAssignableFrom(cls)) {
-  val options = new DataSourceV2Options(extraOptions.asJava)
+  val dataSource = cls.newInstance()
+  val options = dataSource match {
+case cs: SessionConfigSupport =>
+  val confs = withSessionConfig(cs.keyPrefix, 
sparkSession.sessionState.conf)
--- End diff --

just call `Utils.withSessionConfig` here. Don't import `Utils._` at the beginning.


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156883149
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/Utils.scala
 ---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+
+private[sql] object Utils extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$keyPrefix` into k/v pairs, the k/v pairs will be 
used to create data source
+   * options.
+   * A session config `spark.datasource.$keyPrefix.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param keyPrefix the data source config key prefix to be matched
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
+  keyPrefix: String,
+  conf: SQLConf): Map[String, String] = {
+require(keyPrefix != null, "The data source config key prefix can't be 
null.")
+
+val pattern = 
Pattern.compile(s"^spark\\.datasource\\.$keyPrefix\\.(.*)")
--- End diff --

Can we move it out and make it a member variable?


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156883554
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala 
---
@@ -32,6 +34,8 @@ import org.apache.spark.sql.types.StructType
 class DataSourceV2Suite extends QueryTest with SharedSQLContext {
   import testImplicits._
 
+  private val dsName = "userDefinedDataSource"
--- End diff --

nit: `keyPrefix`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156884555
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2Suite.scala 
---
@@ -43,6 +47,23 @@ class DataSourceV2Suite extends QueryTest with 
SharedSQLContext {
 }
   }
 
+  test("simple implementation with config support") {
--- End diff --

This doesn't need to be here; just add a new test suite for the new `Utils`.


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread jiangxb1987
Github user jiangxb1987 commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156886333
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
--- End diff --

How about renaming it to `DataSourceV2Utils`, since the name `Utils` may conflict with `org.apache.spark.util.Utils`?


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156921078
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2ConfigSupport.scala
 ---
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import scala.collection.JavaConverters._
+import scala.collection.immutable
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.ConfigSupport
+
+private[sql] object DataSourceV2ConfigSupport extends Logging {
--- End diff --

yea good idea


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156969297
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -238,7 +239,16 @@ final class DataFrameWriter[T] private[sql](ds: 
Dataset[T]) {
 if (classOf[DataSourceV2].isAssignableFrom(cls)) {
   cls.newInstance() match {
 case ds: WriteSupport =>
-  val options = new DataSourceV2Options(extraOptions.asJava)
+  val dataSource = cls.newInstance()
--- End diff --

we can use `ds` directly here.


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156969856
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Utils.scala
 ---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+
+private[sql] object DataSourceV2Utils extends Logging {
+
+  /**
+   * Helper method that turns session configs with config keys that start 
with
+   * `spark.datasource.$keyPrefix` into k/v pairs, the k/v pairs will be 
used to create data source
+   * options.
+   * A session config `spark.datasource.$keyPrefix.xxx -> yyy` will be 
transformed into
+   * `xxx -> yyy`.
+   *
+   * @param keyPrefix the data source config key prefix to be matched
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def withSessionConfig(
--- End diff --

how about we do more things here?
```
def extractSessionConfigs(ds: DataSourceV2, conf: SQLConf) = ds match {
  case cs: SessionConfigSupport => ...
  case _ => Map.empty
}
```
and the caller side
```
val options = new DataSourceV2Options((extraOptions ++ 
extractSessionConfigs(ds, conf)).asJava)
```


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r156970195
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2UtilsSuite.scala
 ---
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2
+
+import org.apache.spark.sql.QueryTest
+import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.test.SharedSQLContext
+
+class DataSourceV2UtilsSuite extends QueryTest with SharedSQLContext {
+
+  private val keyPrefix = "userDefinedDataSource"
+
+  test("method withSessionConfig() should propagate session configs 
correctly") {
+// Only match configs with keys start with 
"spark.datasource.${keyPrefix}".
+withSQLConf(s"spark.datasource.$keyPrefix.foo.bar" -> "false",
--- End diff --

We don't need this heavy `withConf` stuff, just 
```
val conf = new SQLConf
conf.set(...)
...
```
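
A minimal sketch of the lighter setup being suggested (key names beyond `foo.bar` are made up; `withSessionConfig` is the helper from this PR as it stands at this point in the review):

```scala
import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils
import org.apache.spark.sql.internal.SQLConf

val keyPrefix = "userDefinedDataSource"
val conf = new SQLConf
conf.setConfString(s"spark.datasource.$keyPrefix.foo.bar", "false")
conf.setConfString(s"spark.datasource.$keyPrefix.whateverConfigName", "123")
conf.setConfString("spark.sql.unrelated.config", "true")

// Only keys under the data source prefix survive, with the prefix stripped.
val confs = DataSourceV2Utils.withSessionConfig(keyPrefix, conf)
assert(confs.get("foo.bar").contains("false"))
assert(confs.get("whateverConfigName").contains("123"))
assert(confs.size == 2)
```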


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r157017332
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2UtilsSuite.scala
 ---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2
+
+import org.apache.spark.sql.{QueryTest, SparkSession}
+import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils
+import org.apache.spark.sql.internal.SQLConf
+
+class DataSourceV2UtilsSuite extends QueryTest {
--- End diff --

this can just extend `SparkFunSuite`?


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r157017556
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/sources/v2/DataSourceV2UtilsSuite.scala
 ---
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.sources.v2
+
+import org.apache.spark.sql.{QueryTest, SparkSession}
+import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Utils
+import org.apache.spark.sql.internal.SQLConf
+
+class DataSourceV2UtilsSuite extends QueryTest {
+
+  protected var spark: SparkSession = null
+
+  private val keyPrefix = "userDefinedDataSource"
--- End diff --

to be safe: `keyPrefix = new DataSourceV2WithSessionConfig().keyPrefix`


---




[GitHub] spark pull request #19861: [SPARK-22387][SQL] Propagate session configs to d...

2017-12-14 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/19861#discussion_r157140363
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Utils.scala
 ---
@@ -0,0 +1,58 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources.v2
+
+import java.util.regex.Pattern
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.sources.v2.{DataSourceV2, SessionConfigSupport}
+
+private[sql] object DataSourceV2Utils extends Logging {
+
+  /**
+   * Helper method that extracts and transforms session configs into k/v 
pairs, the k/v pairs will
+   * be used to create data source options.
+   * Only extract when `ds` implements [[SessionConfigSupport]], in this 
case we may fetch the
+   * specified key-prefix from `ds`, and extract session configs with 
config keys that start with
+   * `spark.datasource.$keyPrefix`. A session config 
`spark.datasource.$keyPrefix.xxx -> yyy` will
+   * be transformed into `xxx -> yyy`.
+   *
+   * @param ds a [[DataSourceV2]] object
+   * @param conf the session conf
+   * @return an immutable map that contains all the extracted and 
transformed k/v pairs.
+   */
+  def extractSessionConfigs(ds: DataSourceV2, conf: SQLConf): Map[String, 
String] = ds match {
+case cs: SessionConfigSupport =>
+  val keyPrefix = cs.keyPrefix()
+  require(keyPrefix != null, "The data source config key prefix can't 
be null.")
+
+  val pattern = 
Pattern.compile(s"^spark\\.datasource\\.$keyPrefix\\.(.*)")
--- End diff --

nit: `(.*)` -> `(.+)`. Just to forbid a corner case like `spark.datasource.$keyPrefix.`
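
To spell out the corner case (assuming a `keyPrefix` of `mysource`; a config key with nothing after the trailing dot should not turn into an empty option name):

```scala
import java.util.regex.Pattern

val withStar = Pattern.compile("^spark\\.datasource\\.mysource\\.(.*)")
val withPlus = Pattern.compile("^spark\\.datasource\\.mysource\\.(.+)")

// (.*) accepts the degenerate key and would yield an empty option name,
// while (.+) rejects it.
assert(withStar.matcher("spark.datasource.mysource.").matches())
assert(!withPlus.matcher("spark.datasource.mysource.").matches())
```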


---
