[jira] [Updated] (SPARK-2360) CSV import to SchemaRDDs
[ https://issues.apache.org/jira/browse/SPARK-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-2360:
    Issue Type: Sub-task (was: New Feature)
    Parent: SPARK-3247

CSV import to SchemaRDDs
    Key: SPARK-2360
    URL: https://issues.apache.org/jira/browse/SPARK-2360
    Project: Spark
    Issue Type: Sub-task
    Components: SQL
    Reporter: Michael Armbrust
    Assignee: Hossein Falaki

I think the first step is to design the interface that we want to present to users. Mostly this is defining options when importing. Off the top of my head:
- What is the separator?
- Provide column names or infer them from the first row.
- How to handle multiple files with possibly different schemas.
- Do we have a method to let users specify the datatypes of the columns, or are they just strings?
- What types of quoting / escaping do we want to support?

--
This message was sent by Atlassian JIRA (v6.2#6252)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2360) CSV import to SchemaRDDs
[ https://issues.apache.org/jira/browse/SPARK-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-2360:
    Target Version/s: 1.2.0 (was: 1.1.0)

CSV import to SchemaRDDs
    Key: SPARK-2360
    URL: https://issues.apache.org/jira/browse/SPARK-2360
    Project: Spark
    Issue Type: New Feature
    Components: SQL
    Reporter: Michael Armbrust
    Assignee: Hossein Falaki
    Priority: Minor
[jira] [Updated] (SPARK-2360) CSV import to SchemaRDDs
[ https://issues.apache.org/jira/browse/SPARK-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin updated SPARK-2360:
    Issue Type: New Feature (was: Bug)

CSV import to SchemaRDDs
    Key: SPARK-2360
    URL: https://issues.apache.org/jira/browse/SPARK-2360
    Project: Spark
    Issue Type: New Feature
    Components: SQL
    Reporter: Michael Armbrust
    Assignee: Hossein Falaki
    Priority: Minor
[jira] [Updated] (SPARK-2360) CSV import to SchemaRDDs
[ https://issues.apache.org/jira/browse/SPARK-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-2360:
    Assignee: Hossein Falaki

CSV import to SchemaRDDs
    Key: SPARK-2360
    URL: https://issues.apache.org/jira/browse/SPARK-2360
    Project: Spark
    Issue Type: Bug
    Components: SQL
    Reporter: Michael Armbrust
    Assignee: Hossein Falaki
    Priority: Minor
[jira] [Updated] (SPARK-2360) CSV import to SchemaRDDs
[ https://issues.apache.org/jira/browse/SPARK-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Armbrust updated SPARK-2360:
    Description:
I think the first step is to design the interface that we want to present to users. Mostly this is defining options when importing. Off the top of my head:
- What is the separator?
- Provide column names or infer them from the first row.
- How to handle multiple files with possibly different schemas.
- Do we have a method to let users specify the datatypes of the columns, or are they just strings?
- What types of quoting / escaping do we want to support?

CSV import to SchemaRDDs
    Key: SPARK-2360
    URL: https://issues.apache.org/jira/browse/SPARK-2360
    Project: Spark
    Issue Type: Bug
    Components: SQL
    Reporter: Michael Armbrust
    Priority: Minor
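To make the design questions in the issue description concrete, here is a minimal sketch of what such an options bundle could look like. It is not Spark's API and not what the assignee implemented: it uses only Python's standard csv module, and every name in it (CsvImportOptions, read_csv, and all field names) is hypothetical and chosen for illustration. It covers the separator, header handling, per-column datatypes, and quoting / escaping. The question of multiple files with differing schemas is deliberately left open.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class CsvImportOptions:
    """Hypothetical option bundle; none of these names come from Spark."""
    separator: str = ","                       # what is the separator?
    header: bool = True                        # is the first row a header row?
    column_names: Optional[List[str]] = None   # caller-provided names, if any
    # Optional per-column converters; columns not listed stay as strings.
    column_types: Optional[Dict[str, Callable[[str], object]]] = None
    quote_char: str = '"'                      # quoting
    escape_char: Optional[str] = None          # escaping (None disables it)


def read_csv(text: str, opts: CsvImportOptions) -> Tuple[List[str], List[List[object]]]:
    """Parse CSV text into (column names, typed rows) according to opts."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=opts.separator,
        quotechar=opts.quote_char,
        escapechar=opts.escape_char,
    )
    rows = list(reader)

    # Drop the header row from the data if the input has one.
    first_row = rows[0] if rows else []
    if opts.header and rows:
        rows = rows[1:]

    if opts.column_names is not None:
        names = list(opts.column_names)        # explicit names take priority
    elif opts.header:
        names = first_row                      # infer names from the first row
    else:
        names = [f"c{i}" for i in range(len(first_row))]  # positional fallback

    # Apply the requested converter per column, defaulting to plain strings.
    types = opts.column_types or {}
    typed = [
        [types.get(name, str)(value) for name, value in zip(names, row)]
        for row in rows
    ]
    return names, typed


sample = 'name;age\n"Smith; John";42\nDoe;7\n'
opts = CsvImportOptions(separator=";", column_types={"age": int})
names, rows = read_csv(sample, opts)
```

In this sketch the quoted field keeps its embedded separator, and only the age column is converted. Whether types should be declared by the user, as here, or inferred from the data is exactly the kind of interface decision the issue asks about.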