Re: Does Spark CSV accept a CSV String

2016-03-31 Thread Mich Talebzadeh
Well, my guess is to just pkunzip it and then either recompress it with bzip2 or leave it as it is. Databricks handles *.bz2 files; I know that. Anyway, that is the easy part :) Dr Mich Talebzadeh

Re: Does Spark CSV accept a CSV String

2016-03-30 Thread Benjamin Kim
Hi Mich, I forgot to mention that (this is the ugly part) the source data provider gives us (Windows) pkzip-compressed files. Will Spark uncompress these automatically? I haven’t been able to make it work. Thanks, Ben
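For readers following the thread: as far as I know, Spark's text readers rely on Hadoop compression codecs (gzip, bzip2 and the like), and zip archives are not among the defaults, which would explain why the pkzip files are not picked up automatically. Below is a minimal sketch of unpacking such an archive in memory with the JDK's `java.util.zip`, assuming the archive bytes have already been fetched (for example from S3) and that the entries are UTF-8 text. The names `ZipToStrings`, `unzipToStrings` and `zipOf` are hypothetical and only serve the illustration; this is not the code from the original post.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.nio.charset.StandardCharsets
import java.util.zip.{ZipEntry, ZipInputStream, ZipOutputStream}

object ZipToStrings {

  /** Reads every file entry of an in-memory zip archive and returns
    * (entry name, text) pairs. Assumption: entries are UTF-8 text. */
  def unzipToStrings(zipBytes: Array[Byte]): List[(String, String)] = {
    val zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))
    try {
      val results = List.newBuilder[(String, String)]
      var entry = zin.getNextEntry
      while (entry != null) {
        if (!entry.isDirectory) {
          // Copy the current entry's bytes into a buffer, then decode them.
          val out = new ByteArrayOutputStream()
          val buf = new Array[Byte](8192)
          var n = zin.read(buf)
          while (n != -1) {
            out.write(buf, 0, n)
            n = zin.read(buf)
          }
          results += ((entry.getName, new String(out.toByteArray, StandardCharsets.UTF_8)))
        }
        entry = zin.getNextEntry
      }
      results.result()
    } finally {
      zin.close()
    }
  }

  /** Hypothetical helper that builds a small zip archive in memory,
    * standing in for bytes downloaded from the data provider. */
  def zipOf(files: (String, String)*): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val zout = new ZipOutputStream(bytes)
    files.foreach { case (name, text) =>
      zout.putNextEntry(new ZipEntry(name))
      zout.write(text.getBytes(StandardCharsets.UTF_8))
      zout.closeEntry()
    }
    zout.close()
    bytes.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val archive = zipOf("a.csv" -> "id,name\n1,Ann\n")
    unzipToStrings(archive).foreach { case (name, text) =>
      println(s"$name: ${text.length} characters")
    }
  }
}
```

Note that `java.util.zip` only understands the stored and deflate methods, so an archive written with another pkzip compression method or with encryption would need a different library. Each resulting string could then be split into lines and handed to whatever CSV parsing step comes next.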

Re: Does Spark CSV accept a CSV String

2016-03-30 Thread Mich Talebzadeh
Hi Ben, Well, I have done it for standard CSV files downloaded from spreadsheets to a staging directory on HDFS and loaded from there. First, you may not need to unzip them. Databricks can read both plain and zipped files (it does in my case). Check this. Mine is slightly different from what you have, First I

Re: Does Spark CSV accept a CSV String

2016-03-30 Thread Benjamin Kim
Hi Mich, You are correct. I am talking about the Databricks package spark-csv you have below. The files are stored in S3, and I download, unzip, and store each one of them in a variable as a string using the AWS SDK (aws-java-sdk-1.10.60.jar). Here is some of the code. val filesRdd =

Re: Does Spark CSV accept a CSV String

2016-03-30 Thread Mich Talebzadeh
Just to clarify, are you talking about the Databricks CSV package? $SPARK_HOME/bin/spark-shell --packages com.databricks:spark-csv_2.11:1.3.0 Where are these zipped files? Are they copied to a staging directory in HDFS? HTH Dr Mich Talebzadeh

Does Spark CSV accept a CSV String

2016-03-30 Thread Benjamin Kim
I have a quick question. I have downloaded multiple zipped files from S3 and unzipped each one of them into strings. The next step is to parse using a CSV parser. I want to know if there is a way to easily use the spark csv package for this? Thanks, Ben