[jira] [Commented] (SPARK-16893) Spark CSV Provider option is not documented

2016-08-05 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409270#comment-15409270
 ] 

Sean Owen commented on SPARK-16893:
---

Just omit the .format() call; that call is the problem. See the 2.0 examples 
that read CSV files.

> Spark CSV Provider option is not documented
> ---
>
> Key: SPARK-16893
> URL: https://issues.apache.org/jira/browse/SPARK-16893
> Project: Spark
> Issue Type: Documentation
> Affects Versions: 2.0.0
> Reporter: Aseem Bansal
> Priority: Minor
>
> I was working with the Databricks spark-csv library and came across an error. 
> I have logged the issue on their GitHub, but it would be good to document it 
> in Apache Spark's documentation as well.
> I hit it with CSV. Someone else hit it with JSON:
> http://stackoverflow.com/questions/38761920/spark2-0-error-multiple-sources-found-for-json-when-read-json-file
> Complete issue details are here:
> https://github.com/databricks/spark-csv/issues/367



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16893) Spark CSV Provider option is not documented

2016-08-05 Thread Aseem Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409260#comment-15409260
 ] 

Aseem Bansal commented on SPARK-16893:
--

Yes. I would expect it to work without the format function, since Spark's 
documentation says nothing about needing format when using the csv function.




[jira] [Commented] (SPARK-16893) Spark CSV Provider option is not documented

2016-08-05 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409201#comment-15409201
 ] 

Sean Owen commented on SPARK-16893:
---

You don't need that format line at all, right? It seems like that's the source 
of the ambiguity.




[jira] [Commented] (SPARK-16893) Spark CSV Provider option is not documented

2016-08-05 Thread Aseem Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409183#comment-15409183
 ] 

Aseem Bansal commented on SPARK-16893:
--

Reading a CSV causes an exception. The code used and the exception are below. 
Both are also in the GitHub issue that I have referenced here.

{code}
public static void main(String[] args) {
    SparkSession spark = SparkSession
            .builder()
            .appName("my app")
            .getOrCreate();

    Dataset<Row> df = spark.read()
            .format("com.databricks.spark.csv")
            .option("header", "true")
            .option("nullValue", "")
            .csv("/home/aseem/data.csv");

    df.show();
}
{code}

bq. Exception in thread "main" java.lang.RuntimeException: Multiple sources 
found for csv (org.apache.spark.sql.execution.datasources.csv.CSVFileFormat, 
com.databricks.spark.csv.DefaultSource15), please specify the fully qualified 
class name.

People need to use format("csv"). I think that is counterintuitive, given that 
I am already calling the csv method.
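
To make the error message above easier to read, here is a minimal sketch of how a short-name lookup can become ambiguous. It is a simplified model and not Spark's actual implementation: the ProviderLookup class and its register and resolve methods are hypothetical names made up for illustration. It assumes, as the exception text suggests, that the csv method resolves the reader by the short name "csv" and that both classes named in the exception register that short name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of a short-name data source lookup.
// It only mirrors the behaviour described by the exception text; it is not Spark code.
final class ProviderLookup {
    // Fully qualified provider class name -> short name that provider registers.
    private final Map<String, String> shortNames = new LinkedHashMap<>();

    void register(String className, String shortName) {
        shortNames.put(className, shortName);
    }

    // Accepts either a fully qualified class name or a short name such as "csv".
    String resolve(String nameOrClass) {
        if (shortNames.containsKey(nameOrClass)) {
            return nameOrClass; // a fully qualified name identifies exactly one provider
        }
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : shortNames.entrySet()) {
            if (entry.getValue().equalsIgnoreCase(nameOrClass)) {
                matches.add(entry.getKey());
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("Failed to find data source: " + nameOrClass);
        }
        if (matches.size() > 1) {
            // Two providers claim the same short name, so the lookup cannot choose.
            throw new RuntimeException("Multiple sources found for " + nameOrClass
                    + " (" + String.join(", ", matches)
                    + "), please specify the fully qualified class name.");
        }
        return matches.get(0);
    }

    public static void main(String[] args) {
        ProviderLookup lookup = new ProviderLookup();
        lookup.register("org.apache.spark.sql.execution.datasources.csv.CSVFileFormat", "csv");
        System.out.println(lookup.resolve("csv")); // only one provider so far

        lookup.register("com.databricks.spark.csv.DefaultSource15", "csv");
        try {
            lookup.resolve("csv"); // now two providers share the short name
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the short name resolves cleanly while only one provider is registered, and becomes ambiguous as soon as a second provider with the same short name is added, which is the situation when the external spark-csv package sits next to the built-in CSV source.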




[jira] [Commented] (SPARK-16893) Spark CSV Provider option is not documented

2016-08-04 Thread Aseem Bansal (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15407601#comment-15407601
 ] 

Aseem Bansal commented on SPARK-16893:
--

[~hyukjin.kwon] cc
