Re: How Fault Tolerance is achieved in Spark ??

2017-12-12 Thread Naresh Dulam
Hi Nikhil, Fault tolerance means that data and computation are not lost in case of failures, and it is achieved differently in different systems. In HDFS, fault tolerance is achieved by replicating blocks across different nodes. In Spark, fault tolerance is achieved by having
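
Spark's mechanism here is lineage rather than replication: each RDD records the transformations that produced it, and Spark replays them to rebuild lost partitions. A minimal, illustrative sketch (names and numbers are made up):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("LineageDemo").getOrCreate()
val sc = spark.sparkContext

// Each transformation records its parent, building a lineage graph.
// If a partition of `evens` is lost with an executor, Spark re-runs map
// and filter on the surviving input partitions to rebuild just that piece.
val base = sc.parallelize(1 to 1000)
val squared = base.map(x => x * x)
val evens = squared.filter(_ % 2 == 0)

// toDebugString prints the recorded lineage chain.
println(evens.toDebugString)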

Access Array StructField inside StructType.

2017-12-12 Thread satyajit vegesna
Hi All, how do I iterate over the StructFields nested inside the after field? The schema is StructType(StructField(after,StructType(StructField(Alarmed,LongType,true), StructField(CallDollarLimit,StringType,true), StructField(CallRecordWav,StringType,true), StructField(CallTimeLimit,LongType,true),
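
One way to walk such a nested schema is to recurse on StructType. A small sketch (the schema below is a stand-in reconstructed from the fields quoted above, which are truncated in the original):

import org.apache.spark.sql.types._

// Stand-in for the nested schema quoted above.
val schema = StructType(Seq(
  StructField("after", StructType(Seq(
    StructField("Alarmed", LongType, nullable = true),
    StructField("CallDollarLimit", StringType, nullable = true),
    StructField("CallRecordWav", StringType, nullable = true),
    StructField("CallTimeLimit", LongType, nullable = true)
  )))
))

// Recurse into nested StructTypes, printing each leaf field with its full path.
def walk(st: StructType, prefix: String = ""): Unit = st.fields.foreach {
  case StructField(name, nested: StructType, _, _) => walk(nested, s"$prefix$name.")
  case StructField(name, dt, nullable, _) => println(s"$prefix$name: $dt (nullable = $nullable)")
}

walk(schema) // after.Alarmed: LongType (nullable = true), ...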

Re: How do I save the dataframe data as a pdf file?

2017-12-12 Thread Anthony Thomas
No problem. Assuming your data has been collected as "A = Array[Array[Double]]", something along the lines of "A.map(x => x.mkString(" & ")).mkString(" \n")" should do the trick. Another, somewhat more convoluted, option would be to write your data as a CSV or other delimited text file and
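
A minimal sketch of that first suggestion, with one caveat: LaTeX rows end with \\, so the row separator needs it in addition to the newline:

val A: Array[Array[Double]] = Array(
  Array(1.0, 2.0, 3.0),
  Array(4.0, 5.0, 6.0)
)

// Join cells with " & " and rows with " \\" plus a newline, as LaTeX tabular expects.
val body = A.map(row => row.mkString(" & ")).mkString(" \\\\\n")
// Wrap in a tabular environment; the column spec assumes three numeric columns.
val table = s"\\begin{tabular}{rrr}\n$body \\\\\n\\end{tabular}"
println(table)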

Re: How do I save the dataframe data as a pdf file?

2017-12-12 Thread anna stax
Thanks Anthony for the response. Yes, the data in the dataframe represents a report, and I want to create pdf files. I am using Scala, so I am hoping to find an easier solution in Scala; if not, I will try out your suggestion. On Tue, Dec 12, 2017 at 11:29 AM, Anthony Thomas

Re: Json to csv

2017-12-12 Thread Subhash Sriram
I was curious about this too, and found this. You may find it helpful: http://www.tegdesign.com/converting-a-nested-json-document-to-csv-using-scala-hadoop-and-apache-spark/ Thanks, Subhash Sent from my iPhone > On Dec 12, 2017, at 1:44 AM, Prabha K wrote: > > Any
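
The usual sticking point is that CSV cannot hold struct or array columns, so nested JSON has to be flattened first. A hedged sketch of that step in Spark (paths and field names here are hypothetical):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

val spark = SparkSession.builder().master("local[*]").appName("JsonToCsv").getOrCreate()

// Hypothetical input: {"id":1,"address":{"city":"X","zip":"12345"},"phones":["a","b"]}
val df = spark.read.json("/path/to/input.json")

// Pull nested struct fields up with dot paths, and turn arrays into rows,
// so every output column is a plain scalar that CSV can represent.
val flat = df.select(
  col("id"),
  col("address.city").as("city"),
  col("address.zip").as("zip"),
  explode(col("phones")).as("phone")
)

flat.write.option("header", "true").csv("/path/to/output_csv")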

Re: How do I save the dataframe data as a pdf file?

2017-12-12 Thread Anthony Thomas
Are you trying to produce a formatted table in a pdf file where the numbers in the table come from a dataframe, i.e., to present summary statistics or other aggregates? If so, I would guess your best bet would be to collect the dataframe as a Pandas dataframe and use the to_latex method. You can

How do I save the dataframe data as a pdf file?

2017-12-12 Thread shyla deshpande
Hello all, Is there a way to write the dataframe data as a pdf file? Thanks -Shyla

Unsubscribe

2017-12-12 Thread Olivier MATRAT
Unsubscribe

Re: unsubscribe

2017-12-12 Thread Malcolm Croucher
unsubscribe On Tue, Dec 12, 2017 at 5:16 PM, Divya Narayan wrote: >

unsubscribe

2017-12-12 Thread Divya Narayan

unsubscribe

2017-12-12 Thread Anshuman Kumar
unsubscribe

Re: RDD[internalRow] -> DataSet

2017-12-12 Thread Vadim Semenov
Not possible via the public API, but you can add your own object in your project to Spark's package, which gives you access to its private methods: package org.apache.spark.sql import org.apache.spark.rdd.RDD import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.execution.LogicalRDD import
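
A sketch of that trick, assuming the Spark 2.x internals of the time (Dataset.ofRows and LogicalRDD are private[sql] and their signatures shift between releases, so treat this as illustrative):

package org.apache.spark.sql

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.LogicalRDD
import org.apache.spark.sql.types.StructType

// Lives inside org.apache.spark.sql, so it can reach private[sql] members
// such as Dataset.ofRows and the LogicalRDD plan node.
object InternalRowDataset {
  def toDataFrame(spark: SparkSession, rdd: RDD[InternalRow], schema: StructType): DataFrame =
    Dataset.ofRows(spark, LogicalRDD(schema.toAttributes, rdd)(spark))
}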

unsubscribe

2017-12-12 Thread Felipe Gustavo
unsubscribe

Re: Union of RDDs Hung

2017-12-12 Thread Gerard Maas
Can you show us the code? On Tue, Dec 12, 2017 at 9:02 AM, Vikash Pareek wrote: > Hi All, > > I am unioning 2 RDDs (each of them having 2 records), but the union is > hanging. I found a workaround, which is caching both RDDs before performing > the union

Union of RDDs Hung

2017-12-12 Thread Vikash Pareek
Hi All, I am unioning 2 RDDs (each of them having 2 records), but the union is hanging. I found a workaround, which is caching both RDDs before performing the union, but I could not figure out the root cause of the job hanging. Does somebody know why this happens with union? Spark
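
For reference, a minimal sketch of the workaround described above: cache and materialize both RDDs before the union, so the union reads the cached partitions instead of recomputing both lineages:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("UnionDemo").getOrCreate()
val sc = spark.sparkContext

val rdd1 = sc.parallelize(Seq(1, 2))
val rdd2 = sc.parallelize(Seq(3, 4))

// cache() alone is lazy; count() forces each RDD to actually materialize.
rdd1.cache().count()
rdd2.cache().count()

val unioned = rdd1.union(rdd2)
println(unioned.collect().mkString(", "))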