Re: can spark-csv package accept strings instead of files?

2016-04-15 Thread Benjamin Kim
Thanks!

I got this to work.

val csvRdd = sc.parallelize(data.split("\n"))
val df = new com.databricks.spark.csv.CsvParser()
  .withUseHeader(true)
  .withInferSchema(true)
  .csvRdd(sqlContext, csvRdd)
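A self-contained version of the same flow, for later readers. The sample `data` string here is invented for illustration, and it assumes spark-csv 1.x is on the classpath (e.g. --packages com.databricks:spark-csv_2.10:1.4.0):

import com.databricks.spark.csv.CsvParser

val data = "id,name\n1,alice\n2,bob"          // hypothetical CSV payload
val csvRdd = sc.parallelize(data.split("\n")) // one element per CSV line

val df = new CsvParser()
  .withUseHeader(true)   // treat the first line as column names
  .withInferSchema(true) // extra pass over the data to guess column types
  .csvRdd(sqlContext, csvRdd)

df.printSchema()
df.show()

A note for readers on current Spark: since 2.2, DataFrameReader.csv accepts a Dataset[String] directly, so the external package is no longer needed. A minimal sketch, assuming Spark >= 2.2 and a SparkSession named spark:

import spark.implicits._

val ds = data.split("\n").toSeq.toDS()
val df2 = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv(ds)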

> On Apr 15, 2016, at 1:14 PM, Hyukjin Kwon wrote:
> 
> Hi,
> 
> Would you try the code below?
> 
> val csvRDD = ... your processing to build the CSV RDD ...
> val df = new CsvParser().withUseHeader(true).csvRdd(sqlContext, csvRDD)
> 
> Thanks!



Re: can spark-csv package accept strings instead of files?

2016-04-15 Thread Benjamin Kim
Is this right?


import com.databricks.spark.csv.CsvParser

val csvRdd = data.flatMap(_.split("\n"))
val df = new CsvParser().withUseHeader(true).csvRdd(sqlContext, csvRdd)
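One caveat worth flagging, since the original post says each string carries its own header row: flatMap splices every copy of that header into the data. A hedged workaround, assuming all the strings share one header:

// Keep a single header line; drop the copies embedded in the other strings.
val header = data.first().split("\n").head
val rows   = data.flatMap(_.split("\n")).filter(_ != header)
val csvRdd = sc.parallelize(Seq(header)) ++ rows // header first, then data

The union keeps the header's partition first, so spark-csv still sees it as the first line of the RDD.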

Thanks,
Ben


> On Apr 15, 2016, at 1:14 PM, Hyukjin Kwon wrote:
> 
> Hi,
> 
> Would you try the code below?
> 
> val csvRDD = ... your processing to build the CSV RDD ...
> val df = new CsvParser().withUseHeader(true).csvRdd(sqlContext, csvRDD)
> 
> Thanks!



Re: can spark-csv package accept strings instead of files?

2016-04-15 Thread Hyukjin Kwon
Hi,

Would you try the code below?

val csvRDD = ... your processing to build the CSV RDD ...
val df = new CsvParser().withUseHeader(true).csvRdd(sqlContext, csvRDD)

Thanks!
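If the column types are already known, the same builder can take an explicit schema instead of inferring one, which saves a pass over the data. A sketch with invented column names, assuming the spark-csv 1.x builder API:

import com.databricks.spark.csv.CsvParser
import org.apache.spark.sql.types._

// Hypothetical two-column layout; replace with the real columns.
val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

val df = new CsvParser()
  .withUseHeader(true)
  .withSchema(schema)
  .csvRdd(sqlContext, csvRDD)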
On 16 Apr 2016 1:35 a.m., "Benjamin Kim" wrote:

> Hi Hyukjin,
>
> I saw that. I don’t know how to use it. I’m still learning Scala on my
> own. Can you help me to start?
>
> Thanks,
> Ben


Re: can spark-csv package accept strings instead of files?

2016-04-15 Thread Benjamin Kim
Hi Hyukjin,

I saw that. I don’t know how to use it. I’m still learning Scala on my own. Can 
you help me to start?

Thanks,
Ben

> On Apr 15, 2016, at 8:02 AM, Hyukjin Kwon wrote:
> 
> I hope it was not too late :).
> 
> It is possible.
> 
> Please check the csvRdd API here:
> https://github.com/databricks/spark-csv/blob/master/src/main/scala/com/databricks/spark/csv/CsvParser.scala#L150
> 
> Thanks!



can spark-csv package accept strings instead of files?

2016-04-01 Thread Benjamin Kim
Does anyone know if this is possible? I have an RDD loaded with CSV data strings; each string contains a header row and multiple rows of delimited data. I would like to feed each one through a CSV parser, convert the data into a DataFrame, and, ultimately, UPSERT a Hive/HBase table with this data.
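On the UPSERT piece: Hive at this point has no UPSERT/MERGE statement, so the HBase side is the more natural target, since an HBase Put is an upsert by definition (the newest cell version wins). A hedged sketch of writing a parsed DataFrame out that way; the table name "my_table", column family "cf", and the column names are invented:

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job

val conf = HBaseConfiguration.create()
conf.set(TableOutputFormat.OUTPUT_TABLE, "my_table")
val job = Job.getInstance(conf)
job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

// One Put per row: row key from "id", one cell per remaining field
// (assumes string-typed columns for simplicity).
val puts = df.rdd.map { row =>
  val put = new Put(Bytes.toBytes(row.getAs[String]("id")))
  put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"),
    Bytes.toBytes(row.getAs[String]("name")))
  (new ImmutableBytesWritable, put)
}
puts.saveAsNewAPIHadoopDataset(job.getConfiguration)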

Please let me know if you have any ideas.

Thanks,
Ben
-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org