Spark preserve timestamp
Do we have an option to tell Spark to preserve the timestamp while creating a struct? Regards, Sudhir
Timestamp changing while writing
Hello, I am using createDataFrame, passing a Java row RDD and a schema, but the time value changes when I write that DataFrame to a Parquet file. Can anyone help? Thank you, Sudhir
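[Editor's note] A likely cause, worth checking: Spark normalizes timestamps to UTC instants when writing Parquet and re-renders them in the reader's session time zone, so the displayed wall-clock value shifts whenever writer and reader zones differ, even though the instant is unchanged. Pinning `spark.sql.session.timeZone` (and the JVM time zone) to one zone, e.g. UTC, usually makes the values stable. A minimal stdlib sketch of the shift, with made-up example zones and times:

```python
from datetime import datetime, timezone, timedelta

# A wall-clock timestamp produced in, say, UTC-5.
eastern = timezone(timedelta(hours=-5))
local_ts = datetime(2018, 1, 1, 10, 30, tzinfo=eastern)

# Parquet stores the instant normalized to UTC...
utc_ts = local_ts.astimezone(timezone.utc)
assert utc_ts.hour == 15  # 10:30 -05:00 is 15:30 UTC

# ...and a reader in another zone re-renders that same instant,
# so the *display* changes while the instant does not.
tokyo = timezone(timedelta(hours=9))
assert utc_ts.astimezone(tokyo).hour == 0  # 00:30 next day in JST
assert utc_ts == local_ts                  # same instant throughout
```

The same instant printed in three zones explains the "changing" value; it is a rendering difference, not data corruption.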
Re: Custom line/record delimiter
Thanks for the update, Kwon.

Regards,

On Mon, Jan 1, 2018 at 7:54 PM Hyukjin Kwon <gurwls...@gmail.com> wrote:
> Hi,
>
> There's a PR - https://github.com/apache/spark/pull/18581 and JIRA - SPARK-21289
>
> Alternatively, you could check out the multiLine option for CSV and see if it is applicable.
>
> Thanks.
>
> 2017-12-30 2:19 GMT+09:00 sk skk <spark.s...@gmail.com>:
>> Hi,
>>
>> Do we have an option to write a CSV or text file with a custom record/line separator through Spark?
>>
>> I could not find any reference in the API. I have an issue while loading data into a warehouse, as one of the columns in the CSV has a newline character and the warehouse is not letting me escape that newline character.
>>
>> Thank you,
>> Sk
Custom line/record delimiter
Hi, Do we have an option to write a CSV or text file with a custom record/line separator through Spark? I could not find any reference in the API. I have an issue while loading data into a warehouse: one of the columns in the CSV contains a newline character, and the warehouse does not allow escaping it. Thank you, Sk
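[Editor's note] At the time of this thread a custom record separator for the CSV writer was still pending (the SPARK-21289 work linked in the reply). A common workaround is to sanitize the newline-bearing column before writing, so each record fits on one physical line; in Spark that would be a `regexp_replace` on the column before `df.write.csv(...)`. A stdlib-only sketch of the idea, with made-up rows:

```python
import csv, io

rows = [["1", "line one\nline two"], ["2", "plain"]]

# Workaround: replace embedded newlines in the offending column before
# writing, so each record occupies exactly one physical line.
cleaned = [[c.replace("\n", "\\n") for c in row] for row in rows]

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerows(cleaned)
out = buf.getvalue()

# Every record is now a single line the warehouse can split on '\n'.
assert out.splitlines() == ["1,line one\\nline two", "2,plain"]
```

The replacement token (`\n` as two literal characters) is arbitrary; pick whatever the downstream warehouse can un-escape, or strip the newlines entirely if the data allows it.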
SparkContext in a UDF
I have registered a UDF with SQLContext. When I try to read another Parquet file using SQLContext inside that same UDF, it throws a NullPointerException. Any help on how to access SQLContext inside a UDF? Regards, Sk
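[Editor's note] This NPE is expected: UDFs run on the executors, where the driver-side SparkContext/SQLContext is not available, so it cannot be used inside a UDF at all. The usual fix is to read the second dataset on the driver, collect it into a plain map (broadcasting it if it is large), and have the UDF close over that map instead of the context. A stdlib sketch of the pattern, with made-up lookup data:

```python
# Driver side: read the lookup dataset once and collect it to a dict
# (stand-in for sqlContext.read.parquet(...).collect()).
lookup_rows = [("a", 1), ("b", 2)]
lookup = dict(lookup_rows)   # in Spark, this is what you would broadcast

# The "UDF" closes over the plain dict, not over SQLContext,
# so it can run on executors without touching the driver-only context.
def enrich(key):
    return lookup.get(key, -1)

assert [enrich(k) for k in ["a", "b", "c"]] == [1, 2, -1]
```

If the second dataset is too large to collect, the per-row lookup should instead be expressed as a join between the two DataFrames rather than a UDF.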
Appending a column to a Parquet file
Hi, I have two Parquet files with different schemas. Based on a unique key, I have to fetch one column value from one file and append it to all rows of the other. I tried a join, but I guess it is not working due to the different schemas. I can use withColumn, but can we get a single column value out as a literal? If I register the file as a temp table and fetch that column, assigning the result to a string gives me a Row, not a literal. Is there a better way to handle this, or how can I get a literal value from a temporary table? Thank you, Sk
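[Editor's note] The query returns a Row because a DataFrame result is always row-shaped; the scalar has to be pulled out of the Row on the driver (e.g. `df.first()["label"]` or `df.collect()[0][0]` in Spark) and then attached with `withColumn("label", lit(value))`. A stdlib sketch of that extract-then-append pattern, with made-up tables:

```python
# Stand-ins for two datasets with different schemas.
small = [{"id": 7, "label": "gold"}]   # one-row source holding the value
big = [{"x": 1}, {"x": 2}, {"x": 3}]   # target rows

# Pull the scalar out of the row
# (Spark: label = df.first()["label"]).
label = small[0]["label"]

# Append it as a constant column to every row
# (Spark: big_df.withColumn("label", lit(label))).
result = [dict(row, label=label) for row in big]

assert all(r["label"] == "gold" for r in result)
assert result[0] == {"x": 1, "label": "gold"}
```

A crossJoin with the one-row DataFrame achieves the same thing without collecting to the driver, which matters if the "single value" is actually computed per partition or per key.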
Java RDD of String to DataFrame
Can we create a DataFrame from a Java pair RDD of String? I don't have a schema, as it will be dynamic JSON. I tried Encoders.STRING. Any help is appreciated! Thanks, SK
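[Editor's note] For dynamic JSON you generally do not need to supply a schema: Spark's JSON reader can infer one by scanning the string records (in Spark 2.x, `spark.read().json(...)` accepts a Dataset of JSON strings built with `Encoders.STRING()`). A stdlib sketch of what that inference does in miniature, with made-up records:

```python
import json

# Stand-in for an RDD/Dataset of dynamic JSON strings.
records = ['{"name": "a", "n": 1}', '{"name": "b", "extra": true}']

# Schema inference in miniature: union the fields across all records and
# record each value's type, roughly what spark.read.json does at scale.
schema = {}
for rec in records:
    for key, value in json.loads(rec).items():
        schema.setdefault(key, type(value).__name__)

assert schema == {"name": "str", "n": "int", "extra": "bool"}
```

Fields missing from some records simply come back null in the inferred result, which is usually what you want for heterogeneous JSON.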
How to fetch a schema from a dynamic nested JSON
Hi, I have a requirement where I have to read a dynamic nested JSON describing a schema and check data quality against it. That is, I get the details from a JSON, e.g. column 1 should be a string of a given length; this schema JSON is dynamic and nested, so traditionally I would loop over the JSON object to fetch all the rules. For the data, I have to read a JSON array where each object should be checked against that schema, i.e. for the first object in the array, the first column's data should match the expected string type and length. Looping over the schema JSON and, inside that, looping over the data array will hurt performance. Do we have any options or a better way to handle this? Thanks in advance. sk
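[Editor's note] A recursive walk avoids the nested-loop blow-up: visit schema and record together, so each data value is touched once regardless of nesting depth, and apply the same single-pass check to each element of the data array. A stdlib sketch, assuming a hypothetical schema layout where a leaf rule carries `"type"`/`"max_len"` and anything else is a nested object (adjust to the real schema JSON):

```python
import json

# Hypothetical schema format: leaf rules have "type" (and optionally
# "max_len"); any other dict value is a nested object of more fields.
schema = {
    "name": {"type": "str", "max_len": 5},
    "address": {"city": {"type": "str", "max_len": 10}},
}

def check(record, spec):
    """Walk schema and record together: one pass, recursing into nesting."""
    for field, rule in spec.items():
        value = record.get(field)
        if "type" in rule:                      # leaf rule
            if type(value).__name__ != rule["type"]:
                return False
            if "max_len" in rule and len(value) > rule["max_len"]:
                return False
        else:                                   # nested object
            if not isinstance(value, dict) or not check(value, rule):
                return False
    return True

data = json.loads('[{"name": "ab", "address": {"city": "Paris"}},'
                  ' {"name": "toolongname", "address": {"city": "Paris"}}]')
assert [check(r, schema) for r in data] == [True, False]
```

In Spark, the same `check` can run inside a `map` over the data array so the validation parallelizes per record; the schema dict is small and can simply be broadcast.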